#### ECE 671 – Lecture 22

Specialized hardware and runtime support



## Packet processor

- How can packet processing be implemented?

ECE 671 © 2011 Tilman Wolf

- Tradeoffs between hardware and software
  - Custom hardware: ASIC
  - Software: workstation processor or network processor



- How can a processor be optimized for networking?


## Network processor

- System architecture:

[Figure: network processor block diagram. Labeled components: control processor, hardware accelerator, multiple processors, interconnect, off-chip memory, I/O interface, network interface, switch fabric interface]



## Example network processors

| Feature                    | Intel IXP2855 | Cisco QuantumFlow | Cavium CN5860 |
|----------------------------|---------------|-------------------|---------------|
| Maximum throughput         | 10 Gbps | 20 Gbps | 20 Gbps |
| Data path processors       | 16 32-bit RISC processors, 8 threads per processor, up to 1.5 GHz | 40 32-bit RISC processors, 4 threads per processor, up to 1.2 GHz | 16 64-bit RISC processors, up to 800 MHz |
| On-chip memory             | 32 kB instruction and 32 kB data memory per processor | 16 kB cache per processor, 256 kB shared cache | 32 kB instruction cache and 8 kB data cache per processor, 2 MB shared cache |
| Control processor          | 32-bit XScale RISC core, 32 kB instruction cache, 32 kB data cache, up to 750 MHz | Off-chip | Off-chip |
| External memory interfaces | 3 DRAM interfaces, 4 SRAM interfaces | DRAM, SRAM, and TCAM interfaces | DRAM and TCAM interfaces |
| Hardware accelerators      | Cryptographic co-processor | Classification, traffic policing, etc. | Cryptographic co-processor, TCP acceleration, regular expression matching, etc. |
| Maximum power dissipation  | 32 W | 80 W | 40 W |
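To get a feel for what these numbers imply, the following rough sketch estimates how many processor cycles are available per packet at line rate. The throughput, core count, and clock rate come from the table; the 64-byte packet size and the function name `cycles_per_packet` are assumptions made for illustration. The estimate ignores hardware threads, link-layer framing overhead, and memory stalls.

```python
def cycles_per_packet(throughput_bps, packet_bytes, cores, clock_hz):
    """Aggregate processor cycles available per packet at line rate."""
    packets_per_second = throughput_bps / (packet_bytes * 8)
    return cores * clock_hz / packets_per_second


# (throughput in bit/s, data path cores, clock in Hz) taken from the table.
# The 64-byte packet size used below is an assumption, not from the slides.
chips = {
    "Intel IXP2855": (10e9, 16, 1.5e9),
    "Cisco QuantumFlow": (20e9, 40, 1.2e9),
    "Cavium CN5860": (20e9, 16, 800e6),
}

for name, (bps, cores, hz) in chips.items():
    budget = cycles_per_packet(bps, 64, cores, hz)
    print(f"{name}: about {budget:.0f} cycles per 64-byte packet")
```

Under these assumptions the budget is on the order of a few hundred to about a thousand cycles per packet across all cores, which suggests why offloading expensive functions to accelerators can matter.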


## Processing workload

- What does packet processing look like?
  - Many components
  - Processing differs based on packet



- How can the workload be spread over processor cores?
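One possible answer, sketched here as an assumption rather than as the approach the slides prescribe, is to hash each packet's flow identifier to a core so that all packets of one flow are handled by the same core and stay in order. The function name `core_for_packet`, the 5-tuple key format, and the use of CRC32 as the hash are illustrative choices.

```python
import zlib


def core_for_packet(src_ip, dst_ip, src_port, dst_port, protocol, num_cores):
    """Map a flow 5-tuple to a core index; one flow always maps to one core."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    return zlib.crc32(key) % num_cores


# Hypothetical packets, given as (src IP, dst IP, src port, dst port, protocol).
packets = [
    ("10.0.0.1", "10.0.0.2", 1234, 80, 6),
    ("10.0.0.3", "10.0.0.2", 5678, 443, 6),
    ("10.0.0.1", "10.0.0.2", 1234, 80, 6),  # same flow as the first packet
]

assignments = [core_for_packet(*p, num_cores=16) for p in packets]
for packet, core in zip(packets, assignments):
    print(f"{packet} -> core {core}")
```

This keeps per-flow state local to one core, but a few heavy flows can leave the load uneven. A pipelined arrangement, in which each core handles one processing component, is a different tradeoff.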






## Hardware accelerators

- Custom logic components for network functions
  - Lookup and classification (e.g., TCAM)
  - Pattern matching
  - Cryptographic co-processor
  - Compression and decompression
  - XML processing
- Tradeoff between performance gain and chip space
  - An accelerator is worthwhile only if it is highly utilized
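The utilization point can be illustrated with Amdahl's law, which is brought in here as an illustration and is not named in the slides. In this minimal sketch, `offload_fraction` is the share of processing time the accelerator can take over and `accelerator_speedup` is how much faster it runs that share; both names and the 10x figure are assumptions.

```python
def overall_speedup(offload_fraction, accelerator_speedup):
    """Amdahl's-law speedup when only part of the work is accelerated."""
    remaining = 1.0 - offload_fraction
    return 1.0 / (remaining + offload_fraction / accelerator_speedup)


# Assumed example: an accelerator that is 10x faster on the work it handles.
for fraction in (0.05, 0.5, 0.9):
    speedup = overall_speedup(fraction, 10.0)
    print(f"{fraction:.0%} of work offloaded: {speedup:.2f}x overall")
```

When only a small share of the workload uses the accelerator, the overall gain stays close to 1x, so the chip space may be better spent elsewhere.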


