Improve area efficiency

Programs inevitably contain fragments that cannot be parallelized. As Amdahl’s law suggests, speeding up the single core that executes the sequential portions of a program will remain important even in the era of many-core processors.  Because energy consumption and the resulting heat are among the most serious factors limiting clock frequency, efficiency, i.e., performance per unit of circuit area or energy, is the most important measure of recent processor cores.
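
To make the Amdahl's law argument concrete, the following is a minimal sketch in Python. The function name and the serial_core_speedup parameter are illustrative assumptions, not part of the original text; the latter models a faster core running only the sequential portion.

```python
def amdahl_speedup(parallel_fraction: float, cores: int,
                   serial_core_speedup: float = 1.0) -> float:
    """Overall speed-up under Amdahl's law.

    parallel_fraction   : share of the original run time that can be parallelized
    cores               : number of cores working on the parallel part
    serial_core_speedup : hypothetical factor by which a faster single core
                          accelerates the sequential part (assumption)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1 or serial_core_speedup <= 0.0:
        raise ValueError("cores must be >= 1 and serial_core_speedup > 0")
    serial_fraction = 1.0 - parallel_fraction
    # New run time, normalized to an original run time of 1.
    new_time = serial_fraction / serial_core_speedup + parallel_fraction / cores
    return 1.0 / new_time


# 90% parallel code on 64 cores: the sequential 10% dominates (about 8.8x).
baseline = amdahl_speedup(0.9, 64)
# Same program, but the sequential part runs on a 2x faster core (about 15.6x).
with_fast_core = amdahl_speedup(0.9, 64, serial_core_speedup=2.0)
print(f"{baseline:.1f}x -> {with_fast_core:.1f}x")
```

In this example, doubling the speed of the one core that runs the sequential 10% raises the overall speed-up far more than adding further cores could, since the 64-core speed-up is already close to the limit of 10x that the sequential fraction imposes.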

A superscalar processor core consists largely of multi-port RAMs. Reducing the number of ports on these RAMs while maintaining performance therefore leads to high efficiency.  We have proposed techniques such as (1) Dispatched image cache, (2) Non-latency-oriented register cache system, (3) Matrix scheduler, (4) Twin-tail architecture, and (5) Memory access order violation detection with Bloom Filter.
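
As a rough illustration of why port count matters, the sketch below uses a common first-order model in which the area of a multi-port RAM grows with the square of its number of ports. The quadratic model, the function name, and the port counts are assumptions made for illustration; they are not figures from this project.

```python
def relative_ram_area(ports: int, base_ports: int = 1) -> float:
    """Area of a multi-port RAM relative to one with base_ports ports.

    Assumes a first-order model: each port adds one wordline and one bitline
    to every cell, so cell width and height both grow linearly with the port
    count and cell area grows quadratically. Real designs deviate from this.
    """
    if ports < 1 or base_ports < 1:
        raise ValueError("port counts must be >= 1")
    return (ports / base_ports) ** 2


# Hypothetical register file: 12 ports (8 read + 4 write) versus 4 ports.
ratio = relative_ram_area(12, base_ports=4)
print(f"12-port array is about {ratio:.0f}x the area of a 4-port array")
```

Under this model, cutting the ports to one third shrinks the array to roughly one ninth of its area, which is why techniques that serve the same accesses through fewer ports can improve performance per unit of area.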

We advanced our Multibanked Register File technology and evaluated its performance, power efficiency, and footprint.