ECE 669 -- Parallel Computer Architecture.
University of Massachusetts Amherst
Department of Electrical and Computer Engineering
Instructor: Csaba Andras Moritz, Associate Professor
email: andras@ecs.umass.edu
phone: 413-545-2442
Office: Knowles, room KEB-309H
Secretary: June Daehler, phone: 413-545-3621
Office hours: Thursday 1:30 - 2:30 or by appointment
Teaching Assistant: JinWei Hong, email: jihong@ecs.umass.edu
Abstract:
Since the early 1980s, when the first commercially successful multiprocessors appeared, parallel processing has made a considerable impact on the computer marketplace. From highly parallel single-chip microprocessors to scalable enterprise multiprocessors, parallel machines are entering new commercial application domains every day. We predict that all future computer systems will be parallel to some extent. This course seeks to train students to specify, design, and evaluate parallel architectures and systems for both scientific/engineering and enterprise application domains. Equal weight is placed on three areas: the main architectural components of a parallel computer, covering midrange SMPs, high-performance scalable distributed-memory machines, and parallel single-chip designs; the programming models deployed on parallel machines (e.g., message passing and shared memory); and automatically parallelizing compilers (e.g., compiler techniques for memory disambiguation, mapping, partitioning, communication and computation scheduling, and various parallelism extraction techniques). Emerging trends and ideas in single-chip parallel machine design, proposed for billion-transistor multiprocessor-on-a-chip architectures, are also discussed, including coarse-grained chip multiprocessors, fine-grained reconfigurable tiled designs, and speculative parallel architectures. State-of-the-art parallel machine models are presented that can be used to explore the design space of current and future parallel systems and parallel algorithms, and to make the correct design choices for the grain size and balance of parallel computer systems.
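To make the two programming models named in the abstract concrete, here is a minimal Python sketch. It is an illustration only, not course material. The function names `shared_memory_sum` and `message_passing_sum` are hypothetical, and both versions use threads purely for convenience. A real message-passing program (for example, one written with MPI) would normally run its workers in separate address spaces.

```python
import queue
import threading


def shared_memory_sum(data, num_workers=2):
    """Shared-memory style: workers update one shared variable under a lock."""
    total = [0]                  # shared state visible to every worker
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)     # local computation needs no synchronization
        with lock:               # the update to shared state is synchronized
            total[0] += partial

    threads = [
        threading.Thread(target=worker, args=(data[i::num_workers],))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(data, num_workers=2):
    """Message-passing style: workers share no variables and send results."""
    mailbox = queue.Queue()      # stands in for a communication channel

    def worker(chunk):
        mailbox.put(sum(chunk))  # "send" the partial result as a message

    threads = [
        threading.Thread(target=worker, args=(data[i::num_workers],))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # "Receive" exactly one message per worker and combine them.
    return sum(mailbox.get() for _ in range(num_workers))
```

In the first version, communication is implicit (reads and writes to a common variable) and synchronization is explicit (the lock). In the second, communication is explicit (send and receive) and the synchronization comes with it.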
Textbook:
David E. Culler, Jaswinder Pal Singh, Anoop Gupta; Parallel Computer Architecture: A Hardware/Software Approach.
Additional reading will be from:
1. Research papers, which will be posted online.
2. F. Thomson Leighton; Introduction to Parallel Algorithms and Architectures.
3. Vipin Kumar, Ananth Grama, Anshul Gupta, George Karypis; Introduction to Parallel Computing.
4. Daniel Lenoski, Wolf-Dietrich Weber; Scalable Shared-Memory Multiprocessing.
Course Notes: provided online on the course website.
Requirements: one (research) project, two (or three) homeworks (based on cache and network multiprocessor simulators and Raw multiprocessor-on-a-chip simulators), and one exam (preliminary). The course will consist of 25 lectures and several additional discussion sessions for the research projects.
Grading (preliminary):
On-campus students: 40% project and discussions, 40% exam(s), 20% homeworks.
Off-campus students: 40% project, 40% exam(s), 20% homeworks.
Prerequisites: Introductory level Computer Architecture, basic Algorithms, some understanding of how compilers work.
Equipment:
On-Campus Computer Account: An ECS account or access to a SUN/Solaris computer is required. Off-campus students, please contact VIP at UMASS for ECS accounts.
Course URL will be: /ece/andras/courses/ECE669/index.html
Schedule (Fall 01):
Each entry lists, where available: topics; notes (PS or PDF formats); additional notes from class; textbook reading; problems to solve at home; additional reading; homework.

Lecture 1
  Topics: Introduction & Course Information
  Notes: Lecture 1
  Class notes: Notes on blue pad
  Reading: Ch1
  Problems: See notes

Lecture 2
  Topics: Fundamental Design Issues
  Notes: Lecture 2
  Reading: Ch1

Lecture 3
  Topics: Parallel Applications, Implementations under Various Programming Models
  Notes: Lecture 3
  Reading: Ch2; for algorithms see books 2, 3
  Homework: Project out

Lecture 4
  Topics: Implementations under Various Programming Models
  Notes: Slides from previous lecture
  Reading: Ch2

Lecture 5
  Topics: Parallel Programming Models, Commercial Applications
  Notes: Lecture 5
  Class notes: Notes on blue pad
  Reading: Ch3
  Problems: Implement a version of MxM in all 3 programming models
  Additional reading: PrgLang, Active Messages, OpenMP [1] [2]
  Homework: Project plans due ** (read note at bottom of page)

Lecture 6
  Topics: Commercial Applications, MPI, Performance Aspects
  Notes: Slides from previous lecture
  Class notes: Notes on blue pad, Sample1-PtToPtComm, Sample2-Scatter, Sample3-Gather
  Reading: Ch3
  Problems: Implement the equation solver from the textbook in MPI
  Additional reading: MPI [1][2]

Lecture 7
  Topics: Architecture of Midrange Bus Based SMPs, Snoop-Based Multiprocessing
  Notes: Lecture 7; Article
  Reading: Ch5

Lecture 8
  Topics: Architecture of Midrange Bus Based SMPs, Snoop-Based Multiprocessing
  Notes: Slides from previous lecture
  Class notes: Notes on blue pad
  Reading: Ch5
  Homework: Homework 1 out

Lecture 9
  Topics: Snoop-Based Multiprocessing
  Notes: Lecture 9
  Class notes: Notes on blue pad
  Reading: Ch6

Lecture 10
  Topics: Scalable Multiprocessors: Cache Coherence
  Notes: Lecture 10
  Class notes: Notes on blue pad
  Reading: Ch8

Lecture 11
  Topics: Scalable Multiprocessors: Cache Coherence
  Notes: Slides from previous lecture
  Reading: Ch8

Lecture 12
  Topics: Case Studies - Scalable Multiprocessors: Cache Coherence, limited pointer schemes
  Notes: Slides from previous lecture
  Reading: Ch8

Lecture 13
  Topics: Project Related Discussion - Last Part of Cache Coherence, memory consistency models
  Notes: Slides from previous lecture
  Class notes: Notes on blue pad
  Reading: Ch8 & papers
  Additional reading: Stenstrom Survey, False sharing [2], Dash [3]

Lecture 14
  Topics: Interconnection Networks
  Notes: Lecture 14
  Reading: Ch10
  Homework: Homework 1 due

Lecture 15
  Topics: Performance Modeling of Parallel Machines
  Notes: Slides from previous lecture
  Additional reading: Culler LogP [1][2], LogGP papers, MPI model
  Homework: Homework 2 out. Due Dec 11, 2001. Send to TA (email or paper copy). Late submissions will not be accepted.

Lecture 16
  Topics: Exam Review. Canceled (because of sickness).
  Note: Project descriptions, based on the discussion from L13, are due; paper copies should be handed in after class.

Lecture 16
  Topics: Exam Review and Network Topologies
  Notes: Review
  Note: If you need the solution for the homework, send me email; I can't put it up on the web!
  Additional reading: LoGPC, Frank LoPC, Agarwal KnC

Exam
  Midterm - Tuesday, Nov 5. Time & date: see Review.

Lecture 17
  Topics: Unloaded Performance in K-ary N-cubes
  Notes: same slides
  Class notes: Notes on blue pad
  Note: Off-campus exam date is Nov 20! VIP office will send info! PROJECTS ARE DUE DEC 21! No extension will be given.

Lecture 18
  Topics: Routing
  Notes: same slides

Lecture 19
  Topics: Alewife research paper
  Additional reading: Alewife paper [2] [3]

Lecture 20
  Topics: Alewife research paper
  Additional reading: Alewife paper [2] [3]

Lecture 21
  Topics: Network & Resource Contention
  Notes: slides
  Additional reading: LoGPC, Frank LoPC, Agarwal KnC

Lecture 22
  Topics: Estimating network contention in applications. The Dash multiprocessor.
  Notes: same slides
  Class notes: Notes on blue pad and Dash overview
  Additional reading: Daniel Lenoski, Wolf-Dietrich Weber; Scalable Shared-Memory Multiprocessing (book)

Lecture 22
  Topics: The Stanford Dash multiprocessor
  Note: THERE IS NO LECTURE TUESDAY, Dec 4, because of the Microarchitecture conference! To allow more time and emphasis on the projects, HW3 will not be required.

Lecture 23 (Dec 6)
  Topics: Microarchitectural trends; discussing Micro-34; overview of Raw and Hydra

Lecture 24 (Dec 11)
  Topics: MLP vs ILP & Raw Compiler
  Notes: slides; material will be distributed in class
  Additional reading: RawCC papers [1][2][3], DeepC paper, SUDS, HotPages
  Homework: Homework 2 is due Dec 11. Send to TA (email or paper copy). Late submissions will not be accepted.

Lecture 25 (Dec 13)
  Topics: Raw Architecture and Design Exploration. Billion transistor designs.
  Notes: slides
  Additional reading: Raw Design MS Thesis [2], SimpleFit paper
  Note: Last lecture!

Additional readings:
1. Alewife synchronization [1][2][3]
2. IEEE Computer 1998, Special Issue on Billion Transistor Architectures
** Note: off-campus students automatically get a one-week extension on ANY deadline that appears on this web page. Further extensions, if needed, will be granted, but send me email to work out the details.