ECE 697E  Trends in Software Exposed Architectures.
University of Massachusetts Amherst
Department of Electrical and Computer Engineering

Instructor: Csaba Andras Moritz, Professor
email:, phone: 413-545-2442

Office: room KEB-309H
Secretary: June Daehler, phone: 413-545-3621

Office hours: Thursday 2:30 - 4:00

Time: Tuesday-Thursday 1:00 - 2:15
Place: Marston 15
Schedule: see below

Homework, prerequisites, material: see below
ECE Academic Honesty Policy: see below
Important deadlines: Nov 30, research paper.

Course Abstract:
Recent advances in semiconductor technology will soon make it possible to design architectures with one billion transistors on a chip, enabling highly parallel microprocessors and the integration of processing and memory on the same chip. The promise of much higher performance is, however, somewhat overshadowed by the difficulty of extracting enough parallelism from applications and by the challenge of designing architectures that continue to scale with future technological advances. Furthermore, with the emergence of information appliances connected to the Internet (as well as other small embedded devices), the focus is quickly shifting toward architectures optimized for low power consumption and real-time behavior. To accommodate this wider spectrum of design goals more easily, compiler and runtime technologies are being developed to implement some of the resource customization and management in software rather than hardware. New architectures are being proposed that either expose their internal hardware structure to the software layers (software exposed architectures) or, more recently, fully replace complex hardware with software smarts, using the hardware mainly to speed up critical components (software dominated architectures, hardware-extended software-only solutions, or software enabled architectures). These architectures can therefore be more flexible and can often accommodate different design goals, such as low power, high performance, high predictability, and scalability, as well as new application domains.
In this class we discuss some of these proposals, with emphasis on approaches for memory systems, virtualization, caching, speculation, distribution, and reconfigurability. We will also look into emerging technologies for resource-constrained computer systems (such as wireless devices and information appliances connected to the Internet) and discuss some of the new compiler and modeling techniques used. The objective of the class is to stimulate discussion of new research ideas in this area. Several guest lectures are given by MIT researchers working in this research area.
(3 credits)

Students are expected to give a 30-40 minute presentation of one research paper (see guidelines here), and either to implement and evaluate a compiler technique used in FlexCache (to be released later) or to write a research paper (see guidelines here) based on a related research idea (see possible ideas here, and a tarball of a paper I wrote as an example of how to build one with LaTeX).

Grading is based on 60% research paper or project work and 40% presentation and class activity.

Prerequisites: some understanding of computer system architecture and software systems

Material: research papers available online

Schedule (preliminary! check regularly for updates):

Session: Date: Topic: Papers to read/comments:
Lecture 1 Sept 7 Introduction, Administrivia. Sign up for presentations.
Three papers handed out (come by 309H or download if you need a copy), along with a quick reference for using Unix and Emacs. The papers are used as examples of: (1) designing caches for low power, (2) the inflexibility of traditional caches, (3) the lack of predictability and determinacy in caches.
1. Johnson Kin et al., The Filter Cache: An Energy Efficient Memory System [download]
2. Keith D. Cooper et al., Compiler-Controlled Memory [download]
3. Philip J. Koopman, Jr., Perils of the PC Cache [download]
Lecture 2 Sept 12 Architecture: Overview Software Exposed Architectures
1. Overview, a random walk through the driving forces in computer system design. Motivation behind the FlexCache project. [download]
2. Andrew Glew, MLP yes! ILP no! (Intel Microcomputer Research Labs and University of Wisconsin, Madison) [download]
Lecture 3 Sept 14 Architecture: Compilers: Exploiting Memory and Instruction Level Parallelism and Specialization (Noosha Nayeri)
1. W. Lee, R. Barua, D. Srikrishna, J. Babb, V. Sarkar, and S. Amarasinghe, Space-Time Scheduling of Instruction-Level Parallelism on a Raw Machine, ASPLOS, October 1998. [download]
2. Jonathan Babb, Martin Rinard, Csaba Andras Moritz, Walter Lee, Matthew Frank, Rajeev Barua, and Saman Amarasinghe, Parallelizing Applications into Silicon, Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines '99 (FCCM '99), Napa Valley, CA, April 1999. [download]
Lecture 4 Sept 19 Architecture: Modeling: Tiled 1-billion-transistor architectures (Raksit Ashok)
1. Csaba Andras Moritz, D. Yeung, A. Agarwal, SimpleFit: A Framework for Analyzing Design Tradeoffs in Raw Architectures. Accepted for publication in IEEE Transactions on Parallel and Distributed Systems. [download]
Lecture 5 Sept 21 Architecture: Compilers: Software Managed Data Caching (Ravinder Rachala)
1. Csaba Andras Moritz, Matthew Frank, Walter Lee, and Saman Amarasinghe, Hot Pages: Software Caching for Raw Microprocessors, MIT-LCS Technical Memo LCS-TM-599, Aug 1999. [download]
Lecture 6 Sept 26 Architecture: Industry: Transmeta Crusoe
(Mandeep Singh)
Transmeta, The Technology Behind Crusoe(TM) Processors  [download] see also and ../dev
Lecture 7 Sept 28 Architecture: PIM: Processor in Memory Approaches (Saurabh Chedda)
1. "A Case for Intelligent DRAM: IRAM," IEEE Micro, April 1997. [download]
2. "Intelligent RAM (IRAM): the Industrial Setting, Applications, and Architecture," ICCD '97 International Conference on Computer Design, Austin, Texas, 10-12 October 1997. [download]
3. Mark Oskin, Frederic T. Chong, and Timothy Sherwood. Active Pages: A Model of Computation for Intelligent Memory. In the 1998 International Symposium on Computer Architecture, Barcelona, Spain.  [download]
4. Yi Kang, Michael Huang, Seung-Moon Yoo, Zhenzhou Ge, Diana Keen, Vinh Lam, Pratap Pattnaik, and Josep Torrellas, FlexRAM: Toward an Advanced Intelligent Memory System, International Conference on Computer Design (ICCD), October 1999. [download]
Lecture 8 Oct 3 Compilers: Pointer Analysis
Guest lecture by Radu Rugina, MIT
1. Radu Rugina and Martin Rinard, Pointer Analysis for Multithreaded Programs, in Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation, Atlanta, GA, May 1999. [download]
 1) Flow-insensitive pointer analysis:
       - Steensgaard
 2) Flow-sensitive pointer analysis:
       - Emami, Ghiya, Hendren
       - Wilson, Lam
 3) Multithreaded pointer analysis:
       - Rugina, Rinard
Lecture 9 Oct 5 Compilers: Bitwidth Analysis
Guest lecture by Mark Stephenson, MIT
1. M. Stephenson, J. Babb, and S. Amarasinghe, Bitwidth Analysis with Application to Silicon Compilation, in Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, Vancouver, British Columbia, June 2000. [download]
Lecture 10 Oct 10 Compilers: Applications: Streaming Applications
Guest lecture by Sam Larsen, MIT
1. S. Larsen and S. Amarasinghe, Exploiting Superword Level Parallelism with Multimedia Instruction Sets, MIT/LCS Technical Memo LCS-TM-601, November 18, 1999. [download]
2. Justin Hensley, Mark Oskin, Diana Keen, Lucian-Vlad Lita, and Frederic T. Chong, Active Page Architectures for Media Processing, First Workshop on Media Processors and DSPs, held with MICRO-32, November 1999. [download]
3. G. Cheong and M. Lam, An Optimizer for Multimedia Instruction Sets,  The Second SUIF compiler workshop,  August 21-23, 1997. [download]
Lecture 11 Oct 12 Architecture: Compilers: Flexible Retargetable Caches for Processors and FPGAs (Prashant Jain, Rajnish Prasad)
1. A Fully Associative Software-Managed Cache Design, ISCA 2000, Erik G. Hallnor and Steven K. Reinhardt [come by 309H for copy]
2. Reconfigurable Caches and their Application to Media Processing, ISCA 2000, Parthasarathy Ranganathan, Sarita Adve, and Norman Jouppi [download]
Lecture 12 Oct 17 Architecture: Compilers: Software Speculation
Guest lecture by Matt Frank, MIT
1. Matthew Frank, Csaba Andras Moritz, Benjamin Greenwald, Saman Amarasinghe, and Anant Agarwal, SUDS: Primitive Mechanisms for Memory Dependence Speculation, Technical Report LCS-TM-591, October 1998. [download]
2. M. Franklin and G. S. Sohi, ARB: A Hardware Mechanism for Dynamic Reordering of Memory References, IEEE Transactions on Computers, May 1996. [download]
3. Sridhar Gopal, T. N. Vijaykumar, J. E. Smith and G. S. Sohi, Speculative Versioning Cache, 4th Int'l. Symp. on High Performance Computer Architecture (HPCA-4), Feb 1998.  [download]
4. Jeffrey Oplinger, David Heine, Shih-Wei Liao, Basem A. Nayfeh, Monica S. Lam, and Kunle Olukotun, Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor, Stanford University Computer Systems Lab Technical Report CSL-TR-97-715, February 1997. [download] More resources: [here]
Lecture 13 Oct 19 Architecture: Compilers: Prefetching (Sriram Swaminathan)
1. Chi-Keung Luk and Todd Mowry, Cooperative Prefetching: Compiler and Hardware Support for Effective Instruction Prefetching in Modern Processors, Proceedings of MICRO-31, Nov 30-Dec 2, 1998, pp. 182-194. [download]
2. Ashley Saulsbury, Fredrik Dahlgren, Per Stenstrom, Recency-based TLB Preloading, ISCA 2000. [come by my office for copy]
Lecture 14 Oct 24 Architecture: Compilers: Software Instruction Caching (Prithvi Maddi)
1. Jason Miller, Software Based Instruction Caching for the Raw Architecture, MS Thesis, MIT. [download]

Lecture 15 Oct 26 Architecture: Stanford Smart Memories (Subramarian Venkatraman, Keerthy Thodima)
1. Smart Memories: A Modular Reconfigurable Architecture, Kenneth Mai, Timothy Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, and Mark Horowitz. [download]

Lecture 16 Oct 31 Architecture: Industry: EPIC Processors (Zhenlin Wang)
1. D. I. August, D. A. Connors, S. A. Mahlke, J. W. Sias, K. M. Crozier, B. Cheng, P. R. Eaton, Q. B. Olaniran, W. W. Hwu, Integrated Predicated and Speculative Execution in the IMPACT EPIC Architecture, ISCA 98. [download]
Lecture 17 Nov 2 Foundation: Research Compiler Infrastructure: SUIF, SUIF2, MachSUIF, MachSUIF2, Zephyr, Trimaran (John Cavazos)
SUIF Cookbook, SUIF Library Documentation
Lecture 18 Nov 7 Architecture: Compilers: Flexible Retargetable Caches for Processors and FPGAs (Prashant Jain)
2. Reconfigurable Caches and their Application to Media Processing, ISCA 2000, Parthasarathy Ranganathan, Sarita Adve, and Norman Jouppi [download]
Lecture 19 Nov 9 Architecture: Compilers: Virtual Memory Systems (Santosh Thampuran)
1. Bruce Jacob, Trevor Mudge, Software-Managed Address Translation, ISCA 97. [download]
2. Bruce Jacob, Trevor Mudge, Virtual Memory in Contemporary Microprocessors, IEEE Micro, Aug 1998. [download]
3. Ashley Saulsbury, Fredrik Dahlgren, Per Stenstrom, Recency-based TLB Preloading, ISCA 2000. [come by my office for copy]
Lecture 20 Nov 14
Architecture: Power Modeling, Low-power (Yujing Qing)
1. Wattch: A Framework for Architectural-Level Power Analysis and Optimizations, ISCA 2000, David Brooks, Vivek Tiwari, Margaret Martonosi
2. T-C. Lee, V. Tiwari, S. Malik, and M. Fujita, Power Analysis and Minimization Techniques for Embedded DSP Software, IEEE Transactions on VLSI Systems, March 1997. [download]
3. N. Vijaykrishnan et al., Energy-Driven Integrated Hardware-Software Optimizations Using SimplePower, ISCA 2000. [download]

Architecture: Compilers: Caches (Jeevan Chittamuru)
1. Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors, Nikolaos Bellas, Ibrahim Hajj, Constantine Polychronopoulos, and George Stamoulis, Int'l. Symp. on Low Power Electronics and Design, Monterey, CA, 1998.
2. Cache design trade-offs for power and performance optimization: A case study, Ching-Long Su and Alvin Despain, ISLPED '95.
3. Memory traffic and data cache behavior of an MPEG-2 software decoder, International Conference on Computer Design, Austin, Texas, October 12-15, 1997.

Lecture 21 Nov 16
1. Architecture: Resource Constrained Devices: Bluetooth, WAP, CLDC, KVM, app servers (Giyasettin Ozcan)
2. Trace caches (Aziz Mohammed)
Trace Processors: Moving to Fourth-Generation Microarchitectures, J. E. Smith and Sriram Vajapeyam, IEEE Computer, Special Issue on The Future of Microprocessors, September 1997.
Lecture 22 Nov 21 Reserved for individual project discussions  
Lecture 23 Nov 23 Reserved for individual project discussions  
Lecture 24 Nov 28 Reserved for individual project discussions  
Student projects Nov 30 Deadline for papers  
Student projects      
Course wrap-up      


ECE ACADEMIC HONESTY POLICY

A new Honor Code Policy is being instituted for all ECE students, the result of a joint initiative between students in Eta Kappa Nu (the ECE student honor society) and the Faculty of the ECE Department. The purpose of this policy is to emphasize engineering ethics as an important part of your education and career, and to enhance the value of your ECE degree from UMass. Simply put, the policy requires that each ECE student demonstrate high ethical standards by attesting to personal honesty and integrity for each examination taken. The policy fits within the framework of the existing Academic Honesty Policy of the University, and is similar to that used by other universities.

On the last page of your midterm exams and final exam, you will be expected to write out and sign your name to the Honor Code Pledge:

"On my honor, I have not given nor received aid on this exam."

This statement reflects your personal commitment to honesty and ethical practice in the taking of an exam. If you have not written and signed this, you will be contacted by the instructor.

Cheating will not be tolerated. A student found cheating on an exam will receive an automatic grade of F on the exam, and likely will fail the course as well.


Last updated: Sept 12, 2000