Fine-grain real-time reconfigurable pipelining
IBM Journal of Research and Development, Sep-Nov 2003, by Suhwan Kim, Conrad H. Ziesler, and Marios C. Papaefthymiou
In many computations, average data rates are often significantly lower than the peak rate possible. Consequently, VLSI systems capable of processing data at a maximum specified rate can be excessively dissipative when data rates are low. Such inefficiencies are particularly pronounced in heavily pipelined designs, in which registers account for the bulk of energy dissipation in a system. This paper describes a novel methodology for designing reconfigurable pipelines that achieve very low power dissipation by adapting their resources to their computational requirements. In our fine-grain reconfigurable pipelines, energy is saved by disabling and bypassing an appropriate number of pipeline stages whenever data rates are low. In contrast, coarse-grain approaches, such as dynamic voltage scaling, are often unable to capture savings from short-time-scale variations in throughput requirements because of the long time needed to reconfigure the voltage. To evaluate our methodology, we designed an inverse discrete cosine transform (IDCT) module for MPEG-2. Our IDCT included pipelined multipliers that were dynamically reconfigurable on the basis of the number of nonzero coefficients per block and picture size. In comparison with conventional multipliers in corresponding IDCT implementations, our reconfigurable multipliers dissipated about 12-65% less power.
1. Introduction
Pipelining enables the realization of high-speed, high-efficiency CMOS datapaths by allowing supply voltages to be reduced to the lowest possible levels while still satisfying throughput constraints. In deep pipelines, however, registers and their clock trees are responsible for an increasingly large fraction of total dissipation, no matter how efficiently they may have been implemented [1-4]. For example, the registers of the predictive vector quantization (PVQ) decoder described in [2] account for 90% of the total datapath dissipation. In general, these registers latch their inputs unconditionally, even if the input data does not change, and thus consume significant power [1, 3, 4].
This paper presents a methodology for designing fine-grain reconfigurable pipelined datapaths that can adapt their performance and dissipation to required data rates in real time. These datapaths can efficiently cope with the variability of data rate that is commonplace in numerous applications. Our reconfiguration methodology reduces energy dissipation by disabling and bypassing a select subset of registers. The number of register stages and corresponding clock trees to be disabled at any interval in the operation of the pipeline is periodically determined by the amount of computation that must be performed at the time. Reconfiguration can be performed "on the fly" while data is streaming through the datapath. The control hardware overhead associated with our approach is very low. For an n-stage pipeline, additional hardware is limited to O(n log n) state bits and O(n) multiplexers.
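The disable-and-bypass idea described above can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the authors' circuit: the class `ReconfigurablePipeline`, the helper `run`, and the three arithmetic stage functions are hypothetical names chosen for illustration. Each stage is a combinational function followed by a register; a disabled register is not clocked, and a bypass multiplexer forwards the stage output directly. The configuration is fixed for a whole run, so on-the-fly reconfiguration is not modeled, and neither is the longer clock period that merged stages would presumably need.

```python
from typing import Callable, List, Optional, Sequence


class ReconfigurablePipeline:
    """Behavioral model: each stage is logic followed by a bypassable register."""

    def __init__(self, stages: Sequence[Callable[[int], int]],
                 enabled: Optional[Sequence[bool]] = None) -> None:
        self.stages = list(stages)
        n = len(self.stages)
        # enabled[i] is True when register i is clocked, False when bypassed.
        self.enabled = [True] * n if enabled is None else list(enabled)
        self.regs: List[Optional[int]] = [None] * n
        self.clocked = 0  # total number of register clock events

    def step(self, x: Optional[int]) -> Optional[int]:
        """Advance one clock cycle and return the value seen at the output."""
        new_regs = list(self.regs)
        value = x
        for i, logic in enumerate(self.stages):
            if value is not None:
                value = logic(value)
            if self.enabled[i]:
                new_regs[i] = value    # register latches the stage output
                self.clocked += 1
                value = self.regs[i]   # downstream sees the previous contents
            # Otherwise the bypass multiplexer forwards `value` unchanged.
        self.regs = new_regs
        return value


def run(pipe: ReconfigurablePipeline, inputs: Sequence[int]) -> List[int]:
    """Stream inputs through the pipeline, then flush it with empty slots."""
    latency = sum(pipe.enabled)
    outputs = []
    for x in list(inputs) + [None] * latency:
        y = pipe.step(x)
        if y is not None:
            outputs.append(y)
    return outputs


# Hypothetical three-stage datapath computing 2*x - 1.
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

deep = ReconfigurablePipeline(stages)                            # all registers on
shallow = ReconfigurablePipeline(stages, [False, False, True])   # two bypassed

print(run(deep, [1, 2, 3, 4]), deep.clocked)        # [1, 3, 5, 7] 21
print(run(shallow, [1, 2, 3, 4]), shallow.clocked)  # [1, 3, 5, 7] 5
```

Both configurations produce the same results, but the shallow one clocks one register per cycle instead of three. In this model the count of register clock events stands in for register and clock-tree energy; it is a proxy only and says nothing about actual power figures.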
An application domain that naturally lends itself to our real-time fine-grain reconfiguration scheme is video processing, a key component of multimedia communications and a potentially integral part of next-generation portable devices. Currently, there are several video standards established for different purposes, including MPEG-1, MPEG-2, and H.261, and their implementations for mobile systems-on-a-chip (SoCs) should provide substantial computing capabilities at low energy consumption levels [5]. The building blocks of these standards include demanding computations such as the discrete cosine transform (DCT), inverse discrete cosine transform (IDCT), motion estimation, motion compensation, variable-length coding/decoding, quantization, and inverse quantization. Video streams are particularly suitable for low-power processing using our reconfiguration approach, because the required data rates of downstream components can be inferred by observing the output values of upstream components. Owing to the real-time reconfiguration capability of our scheme, variable data rates can be accommodated without interrupting the flow of data through the pipeline. In contrast, alternative dynamic adaptation schemes such as voltage scaling would require several cycles to reconfigure the system and thus result in unacceptable latencies for real-time latency-sensitive applications.
To evaluate the efficiency of our methodology, we applied it to the design of reconfigurable pipelined multipliers that were used in IDCT modules with varying degrees of parallelism. Our multipliers were dynamically reconfigured according to the number of nonzero DCT coefficients per block and the picture size. We compared the energy efficiency of the reconfigurable IDCTs with that of statically pipelined IDCTs with identical architecture and peak performance capability. In simulations with a 0.35-µm CMOS technology, our reconfigurable pipelined multipliers used to perform two-dimensional IDCT achieved power reductions of up to 65% relative to their nonreconfigurable counterparts.
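The text above does not state the rule that maps the number of nonzero coefficients to a pipeline configuration, so the following is only a hypothetical sketch of one plausible policy. It assumes one multiplication per nonzero coefficient, one multiplication issued per clock cycle, a clock period equal to the total multiplier logic delay divided by the number of enabled register stages, and a per-block time budget that would in practice follow from the picture size and frame rate. The function name `stages_needed` and all of its parameters are illustrative and do not come from the paper.

```python
def stages_needed(nonzero_coeffs: int, block_period_ns: float,
                  logic_delay_ns: float, max_stages: int) -> int:
    """Return the fewest enabled register stages that meet the block deadline.

    Hypothetical model: with `stages` registers enabled, the clock period is
    logic_delay_ns / stages, and a block needs one issue slot per nonzero
    coefficient plus (stages - 1) cycles to fill the pipeline.
    """
    for stages in range(1, max_stages + 1):
        clock_period_ns = logic_delay_ns / stages
        cycles = nonzero_coeffs + stages - 1
        if cycles * clock_period_ns <= block_period_ns:
            return stages  # shallowest configuration that is fast enough
    raise ValueError("deadline cannot be met even with all stages enabled")


# Assumed numbers: 8 ns of multiplier logic, 160 ns per block, 4 stages at most.
for count in (16, 32, 48, 64):
    print(count, stages_needed(count, 160.0, 8.0, 4))
# 16 1
# 32 2
# 48 3
# 64 4
```

Under these assumptions, sparse blocks run with most registers bypassed, and only blocks with many nonzero coefficients need the full pipeline depth.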