Project Overview
The multiprocessor nature of network processor (NP) platforms makes
application development particularly difficult. Current software development
kits (SDKs) require an in-depth understanding of the hardware architecture of the NP
system (something that SDKs have traditionally abstracted away). Emerging NP
systems with a large number of heterogeneous processing resources will make this
problem increasingly difficult as the program developer will have to make
choices on which hardware units to use for which tasks. Such decisions can have
significant impact on the overall performance of the system as poor choices can
cause contention on resources. To alleviate this problem, we profile and analyze
NP applications to make decisions on how to assign processing tasks to a
particular NP architecture.
Our approach is to analyze the run-time characteristics of NP applications
and develop an abstract representation of the processing steps and their
dependencies. This creates an "annotated directed acyclic graph" (ADAG), which
is an architecture-independent representation of an application. The annotations
indicate the processing requirements of each block and the strength of the
dependency between blocks. The basic idea is that we build the application
representation "bottom-up" considering each individual data and control
dependency. The ADAG can then be used to determine an optimal
allocation of processing blocks to an arbitrary NP architecture. The focus of
this work is on:
- A methodology for automatically identifying processing blocks from a
run-time analysis of NP applications.
- An algorithm to group "cohesive" processing blocks into processing
clusters and a heuristic that efficiently approximates a solution to this
clustering problem, which is NP-complete (in the complexity-theory sense).
- A randomized mapping algorithm to dynamically allocate processing
clusters to processing resources on arbitrary network processor
architectures.
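As a rough illustration of the first two items, the sketch below stores an ADAG as a set of blocks annotated with a processing cost and a set of edges annotated with a dependency strength, then merges blocks greedily along the strongest dependency. The class name `ADAG`, the function `cluster`, the block names, and all numbers are hypothetical and chosen only for this example. The greedy merge is a simple stand-in, not the clustering algorithm developed in this project, and it does not check that the merged clusters still form an acyclic graph.

```python
from dataclasses import dataclass, field


@dataclass
class ADAG:
    """Annotated DAG: blocks carry a processing cost, edges a dependency strength."""
    cost: dict = field(default_factory=dict)  # block name -> processing cost
    dep: dict = field(default_factory=dict)   # (src, dst) -> dependency strength

    def add_block(self, name, cost):
        self.cost[name] = cost

    def add_dependency(self, src, dst, strength):
        self.dep[(src, dst)] = strength


def cluster(adag, num_clusters):
    """Greedy stand-in heuristic: merge the two clusters with the strongest
    total dependency between them until num_clusters remain."""
    owner = {b: b for b in adag.cost}      # block -> cluster representative
    members = {b: [b] for b in adag.cost}  # representative -> blocks in cluster
    while len(members) > num_clusters:
        # Sum the dependency strength between every pair of distinct clusters.
        between = {}
        for (src, dst), weight in adag.dep.items():
            a, b = owner[src], owner[dst]
            if a != b:
                key = (min(a, b), max(a, b))
                between[key] = between.get(key, 0) + weight
        if not between:
            break  # the remaining clusters do not depend on each other
        # Pick the strongest pair; ties are broken by name for determinism.
        (a, b), _ = max(between.items(), key=lambda kv: (kv[1], kv[0]))
        for block in members.pop(b):
            owner[block] = a
            members[a].append(block)
    return sorted(sorted(m) for m in members.values())


# Hypothetical four-block packet-processing application.
g = ADAG()
for name, cost in [("parse", 100), ("lookup", 300), ("classify", 150), ("forward", 50)]:
    g.add_block(name, cost)
g.add_dependency("parse", "lookup", 8)
g.add_dependency("lookup", "forward", 2)
g.add_dependency("parse", "classify", 1)
g.add_dependency("classify", "forward", 6)

print(cluster(g, 2))  # [['classify', 'forward'], ['lookup', 'parse']]
```

Strongly dependent blocks end up in the same cluster, so the weak dependencies are the ones that cross cluster boundaries and would later cross processing resources.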
The results from this work can be used to automate software optimization
tasks. The ADAG representation of the application is also a crucial input to the
performance modeling process.
An extension of this work is to consider run-time management of network
processors as a dynamic mapping and re-mapping process.
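The mapping step can be pictured with a minimal sketch: draw random assignments of clusters to processing resources and keep the one whose most heavily loaded resource finishes earliest. Everything here is an assumption made for illustration: the function `randomized_mapping`, the cost model (cluster cost divided by a relative resource speed, with communication cost ignored), and the example names and numbers. It is not the randomized mapping algorithm from this project. Under this view, re-mapping at run time would amount to calling the function again with updated cluster costs.

```python
import random


def randomized_mapping(cluster_cost, resources, trials=1000, seed=0):
    """Sample random cluster-to-resource assignments and keep the best one.

    cluster_cost: cluster name -> processing cost (assumed unit: instructions)
    resources:    resource name -> relative speed (assumed heterogeneity model)
    Returns (mapping, bottleneck), where bottleneck is the largest per-resource
    processing time under the returned mapping.
    """
    rng = random.Random(seed)  # private generator, so results are repeatable
    names = sorted(cluster_cost)
    units = sorted(resources)
    best_map, best_time = None, float("inf")
    for _ in range(trials):
        mapping = {c: rng.choice(units) for c in names}
        load = {u: 0.0 for u in units}
        for c, u in mapping.items():
            load[u] += cluster_cost[c] / resources[u]
        bottleneck = max(load.values())
        if bottleneck < best_time:
            best_map, best_time = mapping, bottleneck
    return best_map, best_time


# Hypothetical example: three clusters, one fast and one slow processing engine.
costs = {"c0": 400, "c1": 200, "c2": 100}
engines = {"fast_pe": 2.0, "slow_pe": 1.0}
mapping, bottleneck = randomized_mapping(costs, engines)
print(mapping, bottleneck)
```

Random sampling gives no optimality guarantee, but it makes no assumptions about the topology of the target architecture, which is why a sketch like this applies to arbitrary sets of resources.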
Publications
- Xin Huang and Tilman Wolf, “A methodology for
evaluating runtime support in network processors,” in Proc. of
ACM/IEEE Symposium on Architectures for Networking and Communication Systems
(ANCS), San Jose, CA, Dec. 2006.
- Tilman Wolf, Ning Weng, and Chia-Hui Tai, “Design
considerations for network processor operating systems,” in Proc. of
ACM/IEEE Symposium on Architectures for Networking and Communication Systems
(ANCS), Princeton, NJ, Oct. 2005.
- Ning Weng and Tilman Wolf, “Analytic
modeling of heterogeneous network processors for parallel workload mapping,”
to appear in ACM Transactions on Embedded Systems.
- Ramaswamy Ramaswamy, Ning Weng, and Tilman Wolf, “Analysis
of network processing workloads,” in Proc. of IEEE International
Symposium on Performance Analysis of Systems and Software (ISPASS),
Austin, TX, Mar. 2005.
- Ning Weng and Tilman Wolf, “Profiling and
mapping of parallel workloads on network processors,” in Proc. of The
20th Annual ACM Symposium on Applied Computing (SAC), Santa Fe, NM, Mar.
2005.
- Ning Weng and Tilman Wolf, “Pipelining vs.
multiprocessors - choosing the right network processor system topology,”
in Proc. of Advanced Networking and Communications Hardware Workshop
(ANCHOR 2004) in conjunction with The 31st Annual International
Symposium on Computer Architecture (ISCA 2004), Munich, Germany, June
2004.
- Ramaswamy Ramaswamy, Ning Weng, and Tilman Wolf, “Application
analysis and resource mapping for heterogeneous network processor
architectures,” in Proc. of Third Workshop on Network Processors and
Applications (NP-3) in conjunction with Tenth International Symposium
on High Performance Computer Architecture (HPCA-10), Madrid, Spain, Feb.
2004, pp. 103–119.
For a complete list of NSL publications, see the
publications page.