ECE697LS/ECE597LS / Syllabus


Course Objectives

To study hardware acceleration techniques for machine learning across a range of hardware systems, from battery-powered mobile edge devices to the cloud.

Course Description

This course studies architectural techniques for efficient hardware design for machine learning (ML) systems, covering both training and inference. The course has three parts. The first part covers convolutional and deep neural network models. The second part covers parallelization techniques for improving the performance of ML algorithms. The third part covers hardware design, including acceleration techniques for improving the computational efficiency of ML kernels, such as locality and precision in matrix multiplication and convolution; the role of batch size, regularization, precision, and compression in the efficiency-versus-accuracy design space; and the evaluation of performance, energy efficiency, area, and memory hierarchy.
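As a small illustration of the precision-versus-accuracy trade-off mentioned above (this sketch is not course material, just a hypothetical example), the following NumPy snippet compares a matrix product computed in float64 against the same product in float16:

```python
import numpy as np

# Hypothetical illustration: how reduced numerical precision affects
# matrix multiplication accuracy, one of the efficiency-vs-accuracy
# trade-offs discussed in the third part of the course.
rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))

# Reference result in float64, then the same product in float16.
ref = a @ b
low = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float64)

# The relative error grows as precision shrinks; hardware designers trade
# this accuracy loss for smaller multipliers and lower memory bandwidth.
rel_err = np.linalg.norm(ref - low) / np.linalg.norm(ref)
print(f"relative error at float16: {rel_err:.2e}")
```

Accelerators exploit exactly this kind of trade-off: lower-precision arithmetic units are smaller, faster, and more energy-efficient, at the cost of a bounded loss in accuracy.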

The course includes machine learning labs in Python using NumPy, TensorFlow, and Keras, and hardware design labs in Verilog.


The class is open to computer engineering seniors and graduate students. Prior experience with Python is required.

Course Materials

  • Introduction to Machine Learning, 2nd Edition, Ethem Alpaydın, The MIT Press (Recommended)
  • Research papers
  • Lecture Notes

Late Policy

Project demos and lab reports are due as posted on the course web page. In general, late submissions will not be accepted; any that are accepted will be graded at the instructor's discretion. If you know that your project is running late, contact the instructor to make individual arrangements.