ECE 697E Trends in Software Exposed Architectures
University of Massachusetts Amherst
Department of Electrical and Computer Engineering
Instructor: Csaba Andras Moritz, Professor
email: andras@ecs.umass.edu, phone: 413-545-2442
Office: room KEB-309H
Secretary: June Daehler, phone: 413-545-3621
Office hours: Thursday 2:30 - 4:00
Time: Tuesday-Thursday 1:00 - 2:15
Place: Marston 15
Schedule: [see below or click]
Homework, prerequisites, material: [see below or click]
ECE Academic Honesty Policy: [see below or click]
Important deadlines: Nov 30, research paper.
Course Abstract:
Recent advances in semiconductor technology will soon make it possible to design architectures with one billion transistors on a chip, enabling highly parallel microprocessors and the integration of processing and memory on the same chip. The promise of much higher performance is, however, somewhat overshadowed by the difficulty of extracting enough parallelism from applications and by the challenge of designing architectures that continue to scale with future technological advances. Furthermore, with the emergence of information appliances connected to the Internet (as well as other small embedded devices), the focus is quickly shifting towards architectures optimized for low power consumption and real-time behavior. To accommodate this wider spectrum of design goals more easily, compiler and runtime technologies are being developed that implement some of the resource customization and management in software rather than hardware. New architectures are being proposed that either expose their internal hardware structure to the software layers (software exposed architectures) or, more recently, fully replace complex hardware with software smarts, using hardware mainly to speed up critical components (software dominated architectures, hardware-extended software-only solutions, or software enabled architectures). These architectures can therefore be more flexible and can often accommodate new application domains as well as different design goals such as low power, high performance, high predictability, and scalability.
In this class we discuss some of these proposals, emphasizing approaches to memory systems, virtualization, caching, speculation, distribution, and reconfigurability. We will also look into emerging technologies for resource-constrained computer systems (such as wireless devices and information appliances connected to the Internet) and discuss some of the new compiler and modeling techniques used. The objective of the class is to stimulate discussion of new, related research ideas. Several guest lectures are given by MIT researchers working in this area.
(3 credits)
Homework/Grading:
Students are expected to give a 30-40 minute presentation of one research paper (see guidelines here), and either to implement and evaluate a compiler technique used in FlexCache (to be released later) or to write a research paper (see guidelines here) based on a related research idea (see possible ideas here, and a tarball of a paper I wrote as an example of how to build one with LaTeX).
Grading is based on 60% research paper or project work and 40% presentation and class activity.
Prerequisites: some understanding of computer system architecture and software systems
Material: research papers available online
Schedule (preliminary! check regularly for updates):
| Session: | Date: | Topic: | Papers to read/comments: |
| Lecture 1 | Sept 7 | Introduction, Administrivia | Sign up for presentations. Three papers are handed out (come by 309H or download if you need a copy), as well as a quick reference for using Unix and emacs. The papers serve as examples of: (1) designing caches for low power, (2) the inflexibility of traditional caches, (3) the lack of predictability and determinacy in caches. 1. Johnson Kin, et al., The Filter Cache: An Energy Efficient Memory System [download] 2. Keith D. Cooper et al., Compiler-Controlled Memory [download] 3. Philip J. Koopman, Jr., Perils of the PC Cache [download] |
| Lecture 2 | Sept 12 | Architecture: Overview of Software Exposed Architectures | 1. Overview: a random walk through the driving forces in computer system design. Motivation behind the FlexCache project. [download] 2. Andrew Glew, MLP yes! ILP no! (Intel Microcomputer Research Labs and University of Wisconsin, Madison) [download] |
| Lecture 3 | Sept 14 | Architecture: Compilers: Exploiting Memory and Instruction Level Parallelism and Specialization (Noosha Nayeri) | 1. W. Lee, R. Barua, D. Srikrishna, J. Babb, V. Sarkar, and S. Amarasinghe, Space-Time Scheduling of Instruction-Level Parallelism on a Raw Machine, ASPLOS, October 1998. [download] 2. Jonathan Babb, Martin Rinard, C. Andras Moritz, Walter Lee, Matthew Frank, Rajeev Barua, and Saman Amarasinghe, Parallelizing Applications into Silicon, Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines '99 (FCCM '99), Napa Valley, CA, April 1999. [download] |
| Lecture 4 | Sept 19 | Architecture: Modeling: Tiled 1-Billion-Transistor Architectures (Raksit Ashok) | 1. Csaba Andras Moritz, D. Yeung, A. Agarwal. SimpleFit: A Framework for Analyzing Design Tradeoffs in Raw Architectures. Accepted for publication in IEEE Transactions on Parallel and Distributed Systems. [download] |
| Lecture 5 | Sept 21 | Architecture: Compilers: Software Managed Data Caching (Ravinder Rachala) | 1. Csaba Andras Moritz, Matthew Frank, Walter Lee, and Saman Amarasinghe, Hot Pages: Software Caching for Raw Microprocessors, MIT-LCS Technical Memo LCS-TM-599, Aug 1999. [download] |
| Lecture 6 | Sept 26 | Architecture: Industry: Transmeta Crusoe (Mandeep Singh) | Transmeta, The Technology Behind Crusoe(TM) Processors [download] see also http://www.transmeta.com/articles/ and ../dev |
| Lecture 7 | Sept 28 | Architecture: PIM: Processor in Memory Approaches (Saurabh Chedda) | 1. "A Case for Intelligent DRAM: IRAM," IEEE Micro, April 1997. [download] 2. "Intelligent RAM (IRAM): the Industrial Setting, Applications, and Architecture," ICCD '97 International Conference on Computer Design, Austin, Texas, 10-12 October 1997. [download] 3. Mark Oskin, Frederic T. Chong, and Timothy Sherwood. Active Pages: A Model of Computation for Intelligent Memory. In the 1998 International Symposium on Computer Architecture, Barcelona, Spain. [download] 4. FlexRAM: Toward an Advanced Intelligent Memory System, by Yi Kang, Michael Huang, Seung-Moon Yoo, Zhenzho Ge, Diana Keen, Vinh Lam, Prattap Pattnaik and Josep Torrellas, International Conference on Computer Design (ICCD), October 1999. [download] |
| Guest Lecture 8 | Oct 3 | Compilers: Pointer Analysis. Guest lecture by Radu Rugina, MIT | 1. Pointer Analysis for Multithreaded Programs, Radu Rugina and Martin Rinard. In the Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation, Atlanta, GA, May 1999. [download] 1) Flow-insensitive pointer analysis: - Steensgaard http://www.cag.lcs.mit.edu/~rugina/references/Ste96a.ps 2) Flow-sensitive pointer analysis: - Emami, Ghyia, Hendren http://www.cag.lcs.mit.edu/~rugina/references/EGH94.ps - Wilson, Lam http://www.cag.lcs.mit.edu/~rugina/references/WL95.ps 3) Multithreaded pointer analysis: - Rugina, Rinard http://www.cag.lcs.mit.edu/~rugina/papers/pldi99.ps |
| Guest Lecture 9 | Oct 5 | Compilers: Bitwidth Analysis. Guest lecture by Mark Stephenson, MIT | M. Stephenson, J. Babb, and S. Amarasinghe. Bitwidth Analysis with Application to Silicon Compilation. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, Vancouver, British Columbia, June 2000. [download] |
| Guest Lecture 10 | Oct 10 | Compilers: Applications: Streaming Applications. Guest lecture by Sam Larsen, MIT | 1. S. Larsen and S. Amarasinghe. Exploiting Superword Level Parallelism with Multimedia Instruction Sets. MIT/LCS Technical Memo, LCS-TM-601, November 18, 1999. [download] 2. Justin Hensley, Mark Oskin, Diana Keen, Lucian-Vlad Lita, and Frederic T. Chong. Active Page Architectures for Media Processing. First Workshop on Media Processors and DSPs, held with MICRO32, November 1999. [download] 3. G. Cheong and M. Lam, An Optimizer for Multimedia Instruction Sets, The Second SUIF Compiler Workshop, August 21-23, 1997. [download] |
| Lecture 11 | Oct 12 | Architecture: Compilers: Flexible Retargetable Caches for Processors and FPGAs (Prashant Jain, Rajnish Prasad) | 1. A Fully Associative Software-Managed Cache Design, ISCA 2000, Erik G. Hallnor, Steven K. Reinhardt [come by 309H for copy] 2. Reconfigurable Caches and their Application to Media Processing, ISCA 2000, Parthasarathy Ranganathan, Sarita Adve, Norman Jouppi [download] |
| Guest Lecture 12 | Oct 17 | Architecture: Compilers: Software Speculation. Guest lecture by Matt Frank, MIT | 1. Matthew Frank, Csaba Andras Moritz, Benjamin Greenwald, Saman Amarasinghe and Anant Agarwal, SUDS: Primitive Mechanisms for Memory Dependence Speculation, Technical Report LCS-TM-591, October 1998. [download] 2. M. Franklin and G. S. Sohi, ARB: A Hardware Mechanism for Dynamic Reordering of Memory References, IEEE Transactions on Computers, May 1996. [download] 3. Sridhar Gopal, T. N. Vijaykumar, J. E. Smith and G. S. Sohi, Speculative Versioning Cache, 4th Int'l. Symp. on High Performance Computer Architecture (HPCA-4), Feb 1998. [download] 4. Jeffery Oplinger, David Heine, Shih-Wei Liao, Basem A. Nayfeh, Monica S. Lam and Kunle Olukotun, Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor, Stanford University Computer Systems Lab Technical Report CSL-TR-97-715, February 1997. [download] More resources: here |
| Lecture 13 | Oct 19 | Architecture: Compilers: Prefetching (Sriram Swaminathan) | 1. Chi-Keung Luk and Todd Mowry, Cooperative Prefetching: Compiler and Hardware Support for Effective Instruction Prefetching in Modern Processors, Proceedings of Micro-31, Nov 30-Dec 2, 1998, pp. 182-194. [download] 2. Ashley Saulsbury, Fredrik Dahlgren, Per Stenstrom. Recency-based TLB Preloading. ISCA 2000. [come by my office for copy] |
| Lecture 14 | Oct 24 | Architecture: Compilers: Software Instruction Caching (Prithvi Maddi) | 1. Jason Miller. Software Based Instruction Caching for the Raw Architecture. MS Thesis, MIT. [download] |
| Lecture 15 | Oct 26 | Architecture: Stanford Smart Memories (Subramarian Venkatraman, Keerthy Thodima) | Smart Memories: A Modular Reconfigurable Architecture, Kenneth Mai, Timothy Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, Mark Horowitz. [download] |
| Lecture 16 | Oct 31 | Architecture: Industry: EPIC Processors (Zhenlin Wang) | 1. D. I. August, D. A. Connors, S. A. Mahlke, J. W. Sias, K. M. Crozier, B. Cheng, P. R. Eaton, Q. B. Olaniran, W. W. Hwu, Integrated Predicated and Speculative Execution in the IMPACT EPIC Architecture, ISCA 98. [download] |
| Lecture 17 | Nov 2 | Foundation: Research Compiler Infrastructure: SUIF, SUIF2, MachSUIF, MachSUIF2, Zephyr, Trimaran (John Cavazos) | SUIF Cookbook, SUIF Library Documentation |
| Lecture 18 | Nov 7 | Architecture: Compilers: Flexible Retargetable Caches for Processors and FPGAs (Prashant Jain) | 2. Reconfigurable Caches and their Application to Media Processing, ISCA 2000, Parthasarathy Ranganathan, Sarita Adve, Norman Jouppi [download] |
| Lecture 19 | Nov 9 | Architecture: Compilers: Virtual Memory Systems (Santosh Thampuran) | 1. Bruce Jacob, Trevor Mudge. Software-Managed Address Translation. ISCA 97. [download] 2. Bruce Jacob, Trevor Mudge. Virtual Memory in Contemporary Microprocessors. IEEE Micro, Aug 1998. [download] 3. Ashley Saulsbury, Fredrik Dahlgren, Per Stenstrom. Recency-based TLB Preloading. ISCA 2000. [come by my office for copy] |
| Lecture 20 | Nov 14 | Architecture: Power Modeling, Low-power (Yujing Qing); Architecture: Compilers: Caches (Jeevan Chittamuru) | Power modeling and low power: 1. Wattch: A Framework for Architectural-Level Power Analysis and Optimizations, ISCA 2000, David Brooks, Vivek Tiwari, Margaret Martonosi 2. T-C. Lee, V. Tiwari, S. Malik and M. Fujita, Power Analysis and Minimization Techniques for Embedded DSP Software, IEEE Transactions on VLSI Systems, March 1997. [download] 3. N. Vijaykrishnan, et al. Energy-Driven Integrated Hardware-Software Optimizations Using SimplePower, ISCA 2000 [download] Caches: 1. Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors, Int'l. Symp. on Low Power Electronics and Design, Nikolaos Bellas, Ibrahim Hajj, Constantine Polychronopoulos, and George Stamoulis, Monterey, CA, 1998 |
| Lecture 21 | Nov 16 | 1. Architecture: Resource Constrained Devices: Bluetooth, WAP, CLDC, KVM, app servers (Giyasettin Ozcan) 2. Trace caches (Aziz Mohammed) | Trace Processors: Moving to Fourth-Generation Microarchitectures, J. E. Smith, Sriram Vajapeyam, IEEE Computer, Special Issue on The Future of Microprocessors, September 1997 |
| Lecture 22 | Nov 21 | Reserved for individual project discussions | |
| Lecture 23 | Nov 23 | Reserved for individual project discussions | |
| Lecture 24 | Nov 28 | Reserved for individual project discussions | |
| Student projects | Nov 30 | Deadline for papers | |
| Student projects | | | |
| Course wrap-up | | | |
ECE Academic Honesty Policy:
A new Honor Code Policy is being instituted for all ECE students, the result of a joint initiative between students in Eta Kappa Nu (the ECE student honor society) and the Faculty of the ECE Department. The purpose of this policy is to emphasize engineering ethics as an important part of your education and career, and to enhance the value of your ECE degree from UMass. Simply put, the policy requires that each ECE student demonstrate high ethical standards by attesting to personal honesty and integrity for each examination taken. The policy fits within the framework of the existing Academic Honesty Policy of the University, and is similar to that used by other universities.
On the last page of your midterm exams and final exam, you will be expected to write out and sign your name to the Honor Code Pledge: "On my honor, I have not given nor received aid on this exam." This statement reflects your personal commitment to honesty and ethical practice in the taking of an exam. If you have not written and signed this, you will be contacted by the instructor. Cheating will not be tolerated. A student found cheating on an exam will receive an automatic grade of F on the exam, and likely will fail the course as well.
Last updated: Sept 12, 2000
andras@ecs.umass.edu