Deepak Unnikrishnan


I graduated with a Ph.D. in Electrical and Computer Engineering from the University of Massachusetts, Amherst. My advisor was Prof. Russell Tessier. I now work as a senior design engineer at Altera, San Jose, California. 


My Ph.D. research focused on building novel networking and distributed computing applications that make use of Field-programmable Gate Arrays (FPGAs). FPGAs are integrated circuits that can be reconfigured to adapt to the specialized needs of an application. These circuits have widespread utility in modern networking, scientific computing and hardware prototyping applications.

In my thesis, I developed heterogeneous systems that use FPGAs and general-purpose processors to improve the performance and flexibility of next-generation Internet infrastructure. These systems have applications in network virtualization and distributed cluster computing. Please click here for an extended research abstract.

I am interested in the design and automation of computer systems for networking and parallel computing applications. I have experience developing both hardware architecture and system software to realize complete end applications.


Reconfigurable Technologies for Next Generation Internet and Cluster Computing
Ph.D. Thesis, University of Massachusetts, September 2013

Application-specific Customization and Scalability of Soft Multiprocessors
M.S. Thesis, University of Massachusetts, May 2009


Please visit Google Scholar for an updated list.

High-Performance Hardware Monitors to Protect Network Processors from Data Plane Attacks
H. Chandrikakutty, D. Unnikrishnan, T. Wolf and R. Tessier
ACM/IEEE Design Automation Conference (DAC), Austin, TX, June 2013

Reconfigurable Dataplanes for Scalable Network Virtualization
D. Unnikrishnan, R. Vadlamani, Y. Liao, J. Crenne, L. Gao and R. Tessier
IEEE Transactions on Computers (Accepted/To appear)

Configurable Memory Security In Embedded Systems
R. Vaslin, G. Gogniat, J.-P. Diguet, R. Tessier, D. Unnikrishnan, and K. Gaj
ACM Transactions on Embedded Computing Systems, vol. 12, no. 3, March 2013.

ReClick - A Modular Dataplane Design Framework for FPGA-Based Network Virtualization
D. Unnikrishnan, J. Lu, L. Gao and R. Tessier
ACM/IEEE Conf. on Architectures for Networking and Communications Systems (ANCS), New York, Oct 2011

Customizing virtual networks with partial FPGA reconfiguration
D. Yin, D. Unnikrishnan, Y. Liao, L. Gao, R. Tessier
ACM Computer Communication Review (CCR), vol. 41, no. 1, January 2011.

Customizing virtual networks with partial FPGA reconfiguration (Best paper award)
D. Yin, D. Unnikrishnan, Y. Liao, L. Gao, R. Tessier
2nd ACM SIGCOMM Workshop on Virtualized Infrastructure Systems and Architectures (SIGCOMM VISA), New Delhi, Aug 2010

Scalable Network Virtualization using FPGAs
D. Unnikrishnan, R. Vadlamani, Y. Liao, J. Crenne, L. Gao and R. Tessier
ACM/IEEE/SIGDA Intl. Conf. on Field Programmable Gate Arrays (FPGA), Monterey, CA, Feb 2010

Application-specific customization and scalability of soft multiprocessors
D. Unnikrishnan, J. Zhao, R. Tessier
IEEE Intl. Conf. on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, April 2009

Memory Security Management for Reconfigurable Embedded Systems
R. Vaslin, G. Gogniat, J.-P. Diguet, R. Tessier, D. Unnikrishnan, and K. Gaj
IEEE Intl. Conf. on Field-Programmable Technology (FPT), Taipei, Taiwan, Dec 2008


Altera Toronto Technology Center, Toronto, Canada (Feb 2012--June 2012)
Intern in the timing and power modeling group. Developed a novel approach to correlate silicon timing (setup/hold) margins with timing models in Altera's Computer Aided Design (CAD) software.

Altera Corporation, San Jose, California (May 2010--Aug 2010)
Intern in the high-speed transceiver group. Developed RTL descriptions and intellectual property (IP) interfaces for XAUI serial protocol in Cyclone IV FPGA.

University of Massachusetts, Amherst (Sep 2007--Dec 2008)
Graduate teaching assistant for introductory digital electronics, data structures, Java programming, and reconfigurable computing courses. Won 2010 outstanding teaching assistant award.

Engineer from Wipro Technologies at Texas Instruments, India (2005--2007)
Developed middleware for a TI OMAP phone prototype.


Outstanding teaching assistant, Dept. of ECE, University of Massachusetts, Amherst.
Best SIGCOMM 2010 workshop paper.
University 2nd rank in Electronics and Communication Engineering in the undergraduate program.


Probability and Random Processes (Fall 2007)
Computer Architecture (Fall 2007)
VLSI Architectures (Spring 2008)
Embedded Systems Design (Spring 2008)
VLSI Design Principles (Fall 2008)
Reconfigurable Computing (Spring 2008)
Parallel Computer Architecture (Spring 2008)
Computer Networks (Fall 2009)
Online Social Networks (Spring 2010)
Compiler Techniques (Fall 2010)
Distributed Operating Systems (Fall 2010)


Almost all the work related to my Ph.D. thesis is built on top of NetFPGA and a port of NetFPGA to the Altera DE-4 platform. NetFPGA is an FPGA-based programmable platform for networking research from Stanford University. More information can be found here. At the University of Massachusetts, we have also developed an interface for the NetFPGA platform on the Altera DE-4 board, which features a Stratix IV FPGA. Please follow the project descriptions below to obtain the source code.

Network Virtualization using FPGAs
Network virtualization enables the physical network infrastructure (e.g. network routers and switches) to be shared among several logical networks that run diverse protocols and differentiated services. In this project, we demonstrate a high-performance heterogeneous network virtualization platform that integrates FPGAs and general-purpose CPUs. Salient features of this architecture include the ability to scale the number of virtual networks beyond what fits in an FPGA by using existing software-based network virtualization techniques, and the ability to map virtual networks to a combination of hardware and software resources on demand using virtual network migration. Evaluation of our system using a NetFPGA card demonstrates a one to two orders of magnitude improvement in throughput over state-of-the-art software-based network virtualization techniques.

ReClick - A modular design platform for FPGA-based network virtualization
ReClick is an open-source software framework for describing virtual network dataplane packet forwarding operations for the NetFPGA platform. In this framework, a user describes packet forwarding operations as a combination of simple packet processing primitives. ReClick compiles these descriptions to Register Transfer Level (RTL) descriptions that have standardized interfaces. The compiled descriptions are available as a library. Several components from the library can be stitched together to form a pipelined packet processing datapath. Connections are specified along the same lines as in Click, a rapid prototyping platform for software routers.
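The idea of stitching simple packet processing primitives into a pipeline can be modeled in a few lines of software. The sketch below is only an illustration in Python: it is not ReClick's description language and it produces no RTL. The `Packet` fields, the two primitives (`dec_ttl`, `make_lookup`), and the `pipeline` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Packet:
    """Hypothetical packet model with just the fields this sketch needs."""
    dst_ip: str
    ttl: int
    out_port: Optional[int] = None


# An element takes a packet and returns it (possibly modified),
# or returns None to signal that the packet is dropped.
Element = Callable[[Packet], Optional[Packet]]


def dec_ttl(pkt: Packet) -> Optional[Packet]:
    """Primitive: decrement the TTL, dropping packets whose TTL expires."""
    if pkt.ttl <= 1:
        return None
    pkt.ttl -= 1
    return pkt


def make_lookup(table: Dict[str, int]) -> Element:
    """Primitive factory: exact-match lookup from destination IP to port."""
    def lookup(pkt: Packet) -> Optional[Packet]:
        port = table.get(pkt.dst_ip)
        if port is None:
            return None  # no route: drop
        pkt.out_port = port
        return pkt
    return lookup


def pipeline(*elements: Element) -> Element:
    """Stitch elements together so each output feeds the next input."""
    def run(pkt: Packet) -> Optional[Packet]:
        current: Optional[Packet] = pkt
        for element in elements:
            current = element(current)
            if current is None:
                return None  # a stage dropped the packet
        return current
    return run


# Usage: a two-stage forwarding path for one virtual network.
forward = pipeline(dec_ttl, make_lookup({"10.0.0.2": 1}))
result = forward(Packet(dst_ip="10.0.0.2", ttl=5))
```

Because every element shares one interface (packet in, packet or drop out), stages can be reordered or replaced without touching their neighbors, which is the property that standardized interfaces provide for the compiled components.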

Maestro is an open-source distributed computing framework for executing iterative algorithms in clusters that have FPGAs and general-purpose CPUs. Iterative algorithms represent a pervasive class of data mining, web search and scientific computing applications. In iterative algorithms, a final result is derived by performing repetitive computations on an input data set (e.g. PageRank, Dijkstra's shortest path). Existing techniques to parallelize such algorithms typically use software frameworks such as MapReduce and Hadoop to distribute data for an iteration across multiple CPU-based workstations in a cluster and collect per-iteration results. These platforms must synchronize computations at iteration boundaries, which impedes system performance. Maestro uses asynchronous accumulative updates to break these synchronization barriers. These updates accumulate intermediate results for numerous data points without iteration-based barriers, allowing individual nodes in a cluster to make progress independently towards the final outcome. Computation is dynamically prioritized to accelerate algorithm convergence. A general class of iterative algorithms has been implemented on a cluster of four FPGAs. A speedup of 371X is demonstrated for a multi-node system of four Altera DE-4 boards versus an equivalent Hadoop-based CPU-workstation cluster.
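The asynchronous accumulative update idea can be illustrated in software using PageRank. The sketch below is a single-process Python model, not Maestro's code or its FPGA implementation. The function name `accumulative_pagerank`, the damping factor of 0.85, the tolerance, and the rule of processing the vertex with the largest pending delta first are assumptions made for illustration.

```python
import heapq
from typing import Dict, Hashable, List


def accumulative_pagerank(
    graph: Dict[Hashable, List[Hashable]],
    damping: float = 0.85,
    tol: float = 1e-10,
) -> Dict[Hashable, float]:
    """Delta-based PageRank with no per-iteration barrier.

    graph maps each vertex to its list of out-neighbours; every neighbour
    must also appear as a key. Vertex keys must be mutually comparable,
    because they break ties inside the heap. Scores are unnormalized.
    """
    value = {v: 0.0 for v in graph}            # accumulated result so far
    delta = {v: 1.0 - damping for v in graph}  # pending, not yet applied

    # Max-heap on pending delta (priorities negated for heapq).
    heap = [(-delta[v], v) for v in graph]
    heapq.heapify(heap)

    while heap:
        neg_priority, v = heapq.heappop(heap)
        pending = delta[v]
        # Skip entries that are out of date or too small to matter.
        if pending < tol or -neg_priority != pending:
            continue

        # Accumulate the pending change into the vertex's own result.
        value[v] += pending
        delta[v] = 0.0

        # Push the change to the out-neighbours immediately, without
        # waiting for any other vertex to finish its own update.
        neighbours = graph[v]
        if neighbours:
            share = damping * pending / len(neighbours)
            for u in neighbours:
                delta[u] += share
                heapq.heappush(heap, (-delta[u], u))

    return value


# Usage: a three-vertex graph with no dangling vertices.
ranks = accumulative_pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Each vertex carries a running value plus a pending delta, so updates can be applied in any order and still converge to the same fixed point. Processing the largest pending delta first is one simple way to model dynamic prioritization; in a cluster, each node would apply the same rule to its own partition of the vertices.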



© 2013 Deepak Unnikrishnan