Department of Electrical and Computer Engineering                      University of Massachusetts Amherst

ECE 697M - Fall 2003 

Low Power Architecture

(focus on microprocessor design)


Instructor: Csaba Andras Moritz, Professor
email:, phone: 413-545-2442 Office: room KEB-309H
Secretary: June Daehler, phone: 413-545-3621

Office hours: by appointment

Class Time: TTh from 11:15AM
Place: HAS 130
NOTE: Research papers and other materials are distributed in class or posted on this page.

Course Abstract:
Power and energy consumption management has become a critical design goal in computer architectures. Power/energy reduction techniques (especially those that do not affect performance adversely) are widely recognized as representing the next phase in the advancement of microprocessors. This course addresses some of the recent developments in both industry and academia related to energy and power management and emphasizes some of the challenges in future-generation designs.

Power/energy reduction is important in both the embedded and general purpose domains. In the embedded domain many applications are battery driven and thus require low power and energy consumption. Emerging applications that merge wireless and Internet capabilities require high performance in addition to low energy. In the general purpose domain, power consumption affects packaging and heat dissipation related costs. This course will cover both memory systems and microprocessor pipelines. We will discuss techniques to reduce both dynamic and leakage power consumption across system layers such as circuits, microarchitecture, and compilation.
(3 credits)

This is a research-oriented seminar course and requires full participation. Its focus is to prepare students to do research in this area. Students are expected to give one 30-minute presentation of a research paper (selected by the instructor) and to participate actively in informal (brainstorming-style) research discussions. Projects are listed below. Each student is required to write one "wild idea" paper (no more than four pages) about one promising technique or idea that he or she comes up with. There is one exam. Grading: 10% class attendance, 10% presentation, 50% projects, 30% final exam (material will be selected by the instructor).

Prerequisites: Desire to understand how microprocessors are designed to reduce power and energy consumption.  Good (e.g., undergraduate level) knowledge of computer architecture and some understanding of compiler and VLSI implementation issues in microprocessors.

Materials from:



Suggested list of material to read and topics discussed in class - list will be updated as we go along...

Lecture 1: Introduction; MOS device power consumption; VLSI aspects; dynamic vs static power; leakage; technology scaling; tradeoffs between power and performance in circuits; energy consumption and energy-delay products

    [2] - Section 4.4, Power Consumption in CMOS gates
    [2] - Sections 3.1 and 3.4, Power dissipation in an inverter
    [5] - Section 1.6 Trends in power consumption and Section 4.1 Low-voltage low-threshold voltage circuit design
    and read MJI ISCA Tutorial introduction
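
As a rough illustration of the Lecture 1 topics, the sketch below evaluates the standard first-order CMOS relations for switching power, leakage power, energy, and energy-delay product. The function names and the numeric values in the usage lines are illustrative assumptions and are not taken from the course readings.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order switching power in watts: P = alpha * C * Vdd^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


def leakage_power(vdd, i_leak):
    """Static power in watts: supply voltage times total leakage current."""
    return vdd * i_leak


def energy(power, delay):
    """Energy in joules consumed at a given power over a given delay."""
    return power * delay


def energy_delay_product(power, delay):
    """Energy-delay product (joule-seconds): energy multiplied by delay."""
    return energy(power, delay) * delay


if __name__ == "__main__":
    # Placeholder operating point (assumed values, for illustration only).
    alpha, c_load, freq = 0.1, 1e-9, 1e9
    for vdd in (1.5, 0.75):
        p = dynamic_power(alpha, c_load, vdd, freq)
        print(f"Vdd = {vdd} V -> dynamic power = {p:.4f} W")
```

Because switching power depends on the square of the supply voltage, halving Vdd at a fixed frequency cuts the switching power to one quarter in this model. The model ignores the frequency reduction that a lower supply voltage usually forces.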

Lecture 2: Why is it that, with new process generations, chips tend to consume more power? Power density. Device issues. Circuit-level tradeoffs. Area, delay, power, noise margins in MOS. Clock gating. Comparison between Complementary CMOS, Ratioed Logic, and Dynamic Logic gates from the point of view of power consumption. (Student questions: discussion related to the impact of ISA design, e.g., RISC, CISC, VLIW, and other variable-length ISAs.)

    [2] Sections 4.2.2, 4.3.1, 4.4.1, (also recommended 4.4.2, 4.4.3, 4.4.4, 4.4.5 for those interested in VLSI aspects)
    and read (especially if you are interested in VLSI aspects) MJI ISCA Tutorial on power issues in VLSI gates

Lecture 3: Main components of a microprocessor (control-path, data-path, memory system). Where is power consumed in different classes of microprocessors? Discussion: some of the key architectural ideas and their impact on power: parallelism, pipelining, specialization/acceleration, static compiler-controlled mechanisms, and speculation.

Lecture 4: Continue discussion on key architectural components in microprocessors. Definition of the first mini-project for the class. Caches. RAM-tag vs. CAM-tag caches. Architectural techniques to reduce energy: banking, tag-check removal, way-prediction, way-resizing. Difference between CAM-tag and RAM-tag based designs (more on this to follow).

    [1] read related sections; details provided in class.
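
To make the way-prediction idea from Lecture 4 concrete, here is a minimal sketch of a set-associative cache that first probes only the most recently used (MRU) way of a set and falls back to probing the remaining ways on a misprediction. It is a generic MRU-based scheme with LRU replacement, chosen for illustration, and is not necessarily the scheme discussed in class. The class name, counters, and the use of tag comparisons as an energy proxy are all assumptions.

```python
class WayPredictedCache:
    """Toy set-associative cache with MRU way prediction and LRU replacement."""

    def __init__(self, num_sets, ways, line_bytes):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        # Each set holds its tags ordered from most to least recently used.
        self.sets = [[] for _ in range(num_sets)]
        self.tag_checks = 0       # tag comparisons with way prediction
        self.baseline_checks = 0  # tag comparisons if every way is always probed
        self.hits = 0
        self.accesses = 0

    def access(self, addr):
        """Access a byte address; return True on a hit, False on a miss."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        tags = self.sets[index]
        self.accesses += 1
        self.baseline_checks += self.ways

        if tags and tags[0] == tag:
            # The predicted (MRU) way hits: a single tag comparison suffices.
            self.tag_checks += 1
            self.hits += 1
            return True

        # Misprediction: one wasted probe plus the remaining ways - 1 probes.
        self.tag_checks += self.ways
        if tag in tags:
            tags.remove(tag)
            tags.insert(0, tag)  # promote to MRU
            self.hits += 1
            return True

        if len(tags) == self.ways:
            tags.pop()           # evict the LRU line
        tags.insert(0, tag)
        return False


if __name__ == "__main__":
    cache = WayPredictedCache(num_sets=4, ways=4, line_bytes=32)
    for _ in range(10):
        cache.access(0)
    print(cache.tag_checks, "tag checks vs", cache.baseline_checks, "without prediction")
```

Counting tag comparisons is only a crude stand-in for energy. In a real design, way prediction also avoids reading the data arrays of the non-predicted ways, and a misprediction typically costs an extra cycle.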

Lecture 5: Caches and Memory Systems. Design of low-power caches with CAM-tags and RAM-tags.
    Reading material: Zhang, Asanovic; Highly Associative Caches for Low-Power Processors [pdf] [ppt]

Lecture 6: Continue on cache designs. Circuit and architecture aspects.
    Additional reading material for those interested in VLSI aspects: MJI ISCA tutorial on memory systems

Lecture 7 (Sept 24, Thursday): Reducing cache power with compiler-architecture interaction. Exploiting application memory access patterns. Compiler's role.

Lecture 8: Energy efficiency related discussion of several recent proposals for multiprocessors-on-a-chip and SMTs

Lecture 9: Improving energy efficiency of a data cache
      Reading material: Raksit Ashok, Saurabh Chheda, and CA Moritz, "Cool_Mem: ...." ASPLOS 2002, [PDF], Raksit's thesis presentation/slides  [ppt]

Lecture 10: Improving energy efficiency in the instruction cache
      Reading: "Way Memoization to Reduce Fetch Energy in Instruction Caches", Albert Ma, Michael Zhang, and Krste Asanovic, Workshop on Complexity-Effective Design, ISCA-28, Goteborg, Sweden, June 2001. (PDF paper)

Lecture 10-11 (Oct 09): Improving energy efficiency in the fetch, issue, and decode units. IPC-driven fetch throttling.
    Reading: Osman S. Unsal, Israel Koren, C. Mani Krishna, and Csaba Andras Moritz, "Cool-Fetch: A Compiler-Based IPC Estimation Based Framework for Energy-Efficiency", ACM Computer Architecture Letters, May 2002. [PS]

Lecture 12: "Banked Multiported Register Files for High-Frequency Superscalar Microprocessors", Jessica Tseng and Krste Asanovic, 30th International Symposium on Computer Architecture (ISCA-30), San Diego, CA, June 2003. (PDF paper, PDF slides, PPT slides)

Lecture 13: "Energy-Exposed Instruction Sets", Krste Asanovic, Mark Hampton, Ronny Krashinsky, and Emmett Witchel, Power Aware Computing, Robert Graybill and Rami Melhem (Eds.), Kluwer Academic/Plenum Publishers, June 2002. (PDF paper, Book Website)

Lecture 14: "Energy-Efficient Register Access", Jessica Tseng and Krste Asanovic, XIII Symposium on Integrated Circuits and System Design (SBCCI2000), Manaus, Amazonas, Brazil, September 2000. (PDF paper, PDF slides, PPT slides)

Lecture 15: "Reducing the Complexity of the Register File in Dynamic Superscalar Processors", R. Balasubramonian, S. Dwarkadas, and D.H. Albonesi, 34th International Symposium on Microarchitecture, pp. 237-248, December 2001. [ppt]

Lecture 16: Osman S. Unsal, Israel Koren, C. Mani Krishna, and Csaba Andras Moritz, "The MiniMax Cache: An Energy-Efficient Framework for Media Processors", HPCA 2002, February 2002. [PS]

Lecture 17: Energy-Aware Hardware Data Prefetching. Yao Guo's work [ppt]

Lecture 18: Power Analysis of Control Path. Paper: Landman and Rabaey, "Activity-Sensitive Architectural Power Analysis for the Control Path". Nathir's presentation [ppt]

Tentative schedule for remaining part:

Lecture 19: The ARM11 architecture - paper distributed in class [ppt]

Lecture 20: Power issues in branch predictors [ppt]

Lecture 21: Discussion related to research papers

Lecture 22 (Dec 4, Thursday): Voltage scaling

Lecture 22 (Tuesday): Exam review (and leakage reduction: drowsy caches)

Lecture 23 (Thursday): Leakage reduction: leakage-biased bitlines

FINAL Exam: Friday, Dec 19, 2003, at 10AM. Place: Knowles, 3rd-floor conference room

Mini-project 1: Evaluate two versions of a 4-input NAND gate, a pseudo-NMOS one and a dynamic one, using Spice. Use the 130-nm and 100-nm BPTM technology nodes. Estimate delay, power consumption (dynamic, direct-path, and leakage), and area. Estimate worst-case and best-case scenarios depending on the inputs. Goals of the project: (1) understand how power consumption changes with reduced feature sizes; (2) see how circuit style affects power consumption; (3) understand the tradeoffs among area, delay, power, noise margins, etc.; (4) understand the impact of inputs and input transitions on the power consumed. Requirements: (i) a short research report describing your work and results, and (ii) an oral defense in which questions about the goals outlined above and your simulation will be asked. Deadline: Tuesday, October 7.

Mini-project 2: Cache design and simulation with Cacti or Spice. Use the latest Cacti tool to determine the configuration and the performance/power parameters for a 16 KByte and a 32 KByte cache. Compare your results with the results discussed in Lecture 5 or with Zhang's design. Alternatively, derive the power consumed by read/write operations in a 32 KByte CAM-tag cache with 8 banks and 64-way associativity; ignore the sense amplifier. Goals of the project: (1) understand the inner workings of a cache; (2) understand cache design issues; (3) understand the breakdown of power consumption in a cache. Deadline: October 28.
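
As a possible starting point for the CAM-tag alternative in Mini-project 2, the sketch below works out the geometry of a 32 KByte cache with 8 banks and 64-way associativity. The 32-byte line size and 32-bit address width are assumptions (the project text does not specify them), and the function name and the returned field names are hypothetical.

```python
def cam_cache_geometry(size_bytes, banks, ways, line_bytes, addr_bits):
    """Derive per-bank organization and address-field widths for a banked cache."""
    for value in (size_bytes, banks, ways, line_bytes):
        # The bit-width arithmetic below assumes power-of-two parameters.
        assert value > 0 and (value & (value - 1)) == 0, "power of two expected"

    bank_bytes = size_bytes // banks
    lines_per_bank = bank_bytes // line_bytes
    sets_per_bank = lines_per_bank // ways
    assert sets_per_bank >= 1, "associativity too high for this bank size"

    offset_bits = line_bytes.bit_length() - 1    # log2(line_bytes)
    bank_bits = banks.bit_length() - 1           # log2(banks)
    set_bits = sets_per_bank.bit_length() - 1    # log2(sets_per_bank)
    tag_bits = addr_bits - offset_bits - bank_bits - set_bits

    return {
        "bank_bytes": bank_bytes,
        "lines_per_bank": lines_per_bank,
        "sets_per_bank": sets_per_bank,
        "offset_bits": offset_bits,
        "bank_bits": bank_bits,
        "set_bits": set_bits,
        "tag_bits": tag_bits,
        # CAM cells that take part in one associative search of a set.
        "cam_bits_searched": ways * tag_bits,
    }


if __name__ == "__main__":
    geometry = cam_cache_geometry(32 * 1024, banks=8, ways=64,
                                  line_bytes=32, addr_bits=32)
    for name, value in geometry.items():
        print(f"{name}: {value}")
```

Under these assumptions each 4 KByte bank holds 128 lines arranged as two 64-way sets, and each search activates 64 tag entries of 23 bits. The per-access energy then follows from the per-cell search energy and the data-array read or write energy, which must come from your own circuit model.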

Mini-project 3: Microprocessor architecture power/energy simulation using the SimpleScalar/Wattch simulation environment. Goals of the project: (1) learn to work at the architectural level of abstraction; (2) use architectural simulation tools; (3) see how architectural configurations affect energy and power. Use three different architectural configurations: (1) a baseline out-of-order 4-way superscalar with default SimpleScalar parameters and 0.18-micron technology node assumptions; (2) double the L1 cache size; (3) increase the issue width to 8-way. Show a comparison of the energy consumed by the mpeg2 (Mediabench) application for the three configurations. Explain the differences. Deadline: November 18. If you have problems with the compilation, use Yao's version.
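
For the comparison step in Mini-project 3, one simple way to organize the results is to convert each run's average power and cycle count into energy and normalize against the baseline. The sketch below assumes you have already extracted those two numbers from each simulation run by hand. The function name, the input layout, and the values in the usage lines are hypothetical placeholders and are not measurements or simulator output.

```python
def compare_energy(runs, baseline, clock_hz):
    """Return {name: (energy_joules, energy_relative_to_baseline)}.

    runs maps a configuration name to (average_power_watts, cycles).
    Energy is average power multiplied by execution time (cycles / clock_hz).
    """
    energies = {
        name: avg_power * cycles / clock_hz
        for name, (avg_power, cycles) in runs.items()
    }
    base = energies[baseline]
    return {name: (e, e / base) for name, e in energies.items()}


if __name__ == "__main__":
    # Placeholder inputs only: replace with values from your own runs.
    placeholder_runs = {
        "baseline": (10.0, 1_000_000),
        "double_l1": (11.0, 950_000),
        "issue_8": (14.0, 900_000),
    }
    table = compare_energy(placeholder_runs, "baseline", clock_hz=1e9)
    for name, (joules, relative) in table.items():
        print(f"{name}: {joules:.6f} J ({relative:.2f}x baseline)")
```

Reporting energy rather than power matters here because a configuration that draws more power can still consume less energy if it finishes the application in fewer cycles.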

Wild idea paper: a short paper on how you would reduce power or energy in a processor of your own design. It will be evaluated using criteria similar to those used at conferences: (1) innovation; (2) organization and language; (3) evaluation methodology and the arguments used to convince the reader that the idea is feasible. The paper should state the problem, compare with the state of the art, cite related work, argue why the idea would work, and discuss the problems you expect (back-of-the-envelope reasoning is acceptable). Deadline: Dec 9. Private discussions: Dec 2. For other appointments, please send email. [Guide]


Last updated: 12/23/2003