Lab 2

IP Forwarding and Packet Classification on the IXP1200

Due: Part I: 11/18/03 in class, Part II: 11/25/03 in class
Lab 2 instructions - Part I
Lab 2 instructions - Part II
IXP1200 Hardware Reference Manual
IXP1200 Development Tools User's Guide
Lab 2 code

The goal of this lab is to explore the basic functionalities of the IXP1200 software development kit and microengines. There are two parts to the project. Part I uses an existing sample application (IP forwarding) to collect a number of workload statistics from the IXP simulator. Part II requires the programming of a small application on the microengines to implement a simple packet classification mechanism.

Both parts require access to a machine that has the Intel SDK installed. You can remotely log on to the course machine (address and account information given in class) or you can use a machine in Marcus 15B. If you want, you can also request an installation CD for your own machine.

Part I: IPv4 Forwarding Simulation

In this part, you will run an implementation of IP forwarding on the IXP1200 simulator. All the code is provided to you. You will just need to collect a set of workload statistics that are reported by the simulator.

It is recommended that you go through the following steps before you explore the simulator on your own:

  1. Log on to the machine with the SDK installed. Please be aware that this machine is shared. Be careful not to destroy anything (yes, you do have Administrator privileges because you need them for the SDK). Also, please log out as soon as you are done with your simulations.
  2. Start “Developer Workbench”
  3. Click on File->Open Project to open the IP forwarding project. In order to ensure that nobody overwrites other people’s files, your personal copy of the forwarding code can be found at: C:\IXP1200\MicroCode\ECE697J\[your name]\L3fwd16. Open the project file L3fwd16.dwp in that directory.
  4. You should see the list of project files on the right. Expand “Assembler Source Files” and double-click on “rx_ether100m.uc” to access the IP forwarding code. Read through the code to get a basic understanding of the structure. Some functions (e.g., ip_verify) are implemented in “ip.uc”.
  5. Build the project by clicking on Build->Rebuild.
  6. Before starting the simulation, make sure that it will stop at some point. Click on Simulation->IX Bus Device Simulator->Options. In the “Stop Control” tab, check the option “Stop the simulation after the next 100 packets are received by the IXP from this bus” and click “OK”.
  7. Click Debug->Start Debugging.
  8. Click Debug->Run Control->Go. The simulation should start and run for several seconds. Then a window pops up that says “The simulation is being stopped because the IXP received 100 packets  from IX bus”. Click “OK”. At this point the simulation is interrupted and the simulation results are available. You can now access these statistics by looking at the appropriate windows.
  9. Click Simulation->Simulation Statistics. In the “Summary” tab, you can see the microengine utilization and the MIPS rating as well as memory statistics. The “Microengine” tab shows more detailed results for each thread. Expand by clicking on the “+” next to a microengine. The “All” tab has a lot of statistics with detailed event counts and percentages. You need to select the parameter that you want to have displayed. Note that “f0” corresponds to Microengine 0. There are also several statistics on memory performance (e.g., latency distributions for various shared resources: R-FIFOs, SRAM, etc.).
  10. Click on View->Debug Windows->History and make sure the checkmark is set. This will display a window that shows the thread status and memory queue status for the entire execution. You may need to close or resize other windows to ensure good visibility. At the top of the history window, there are two checkboxes: Threads and Queues. Check Threads and uncheck Queues. You should see a graph with a number of lines and symbols. Click “Legend…” to display a window with explanations. Look at microengine 0 (threads 0-3) and see how context switching occurs. Look at the solid and dotted lines that indicate the progress on resource requests. Find an example where one thread has an SDRAM request and an SRAM request outstanding in parallel. Does the microengine stall as a result of that request? Uncheck Threads and check Queues. This shows the queue lengths for the various types of memory requests. Identify the queues that are heavily used.
  11. Click Simulation->Execution Coverage… This will pop up a window that shows the source code of IP forwarding and a number on the left of some lines. This number indicates how often a particular instruction on a microengine was executed. Identify code blocks that are executed frequently and those that are not used at all.
  12. Explore other windows, run the simulation for more packets, etc.

Statistics that you should collect, present, and discuss are:

  - Microengine utilization for all microengines and detailed statistics of one thread from uE 0 and one from uE 5 (execution % and aborted %).
  - Processing power of microengines (in MIPS).
  - Memory utilization and bandwidth.
  - Latency distribution for SDRAM refs for microengine 0 and SRAM non-read_lock refs for microengine 0. You might want to run the simulation for a large number of packets (10000) before you collect these statistics. Show a graph that you generate from this data.
  - Show a screenshot (Alt+PrtScn and then paste into MS Paint or Word) of the thread history that shows overlapping SRAM and SDRAM requests by the same microengine. Identify the overall delay for either request (in cycles). What factors (queuing, actual access, …) contributed how much to the overall delay?
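
One possible way to turn the latency data into a graph is sketched below in Python. This is only an illustration: the (latency, count) pairs are made-up placeholders and not simulator output, and the function names latency_summary and text_histogram are hypothetical. The real bin values have to be copied from the simulator's statistics window.

```python
def latency_summary(bins):
    """Return (mean latency, [(latency, percent of references), ...]).

    bins is a list of (latency_in_cycles, reference_count) pairs.
    """
    total = sum(count for _, count in bins)
    if total == 0:
        raise ValueError("no references recorded")
    mean = sum(latency * count for latency, count in bins) / total
    percents = [(latency, 100.0 * count / total) for latency, count in bins]
    return mean, percents


def text_histogram(bins, width=40):
    """Render one line per bin; bar length is proportional to the bin's share."""
    _, percents = latency_summary(bins)
    peak = max(share for _, share in percents)
    lines = []
    for latency, share in percents:
        bar = "#" * round(width * share / peak)
        lines.append(f"{latency:>5} cycles | {bar} {share:.1f}%")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only, not measured data.
    sample = [(20, 10), (30, 50), (40, 30), (50, 10)]
    mean, _ = latency_summary(sample)
    print(f"mean latency: {mean:.1f} cycles")
    print(text_histogram(sample))
```

The same (latency, percent) pairs can be pasted into a spreadsheet or plotting tool if a graphical chart is preferred for the report.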

Write a short (3-4 pages) report (that requires some text!) that presents the above statistics (plus graphs and screenshots where indicated) and discusses them briefly (i.e., is there anything surprising, is it what you expect, etc.).

This report is due 11/18/03 in class.

Part II: Packet Classification

For this part, you will implement a simple packet classification mechanism on the IXP1200. This requires that you add a small part of microengine assembly to a given IP forwarding implementation.

The function of the classifier is to separate different types of packets to different outputs on the IXP. There are six types of traffic that are considered in this lab:

  - ARP traffic
  - UDP over IP traffic
  - Web traffic over TCP over IP
  - SSH traffic over TCP over IP
  - Non-web and non-SSH traffic over TCP over IP
  - Non-TCP and non-UDP IP traffic (e.g., IP-over-IP tunnel)

The IXP1200 should classify each packet according to these rules and send it to an output port as shown in the figure below. Note that the classification step replaces the IP destination address lookup.
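
The six classes can be told apart using three header fields: the Ethernet type, the IP protocol number, and the TCP port. The Python sketch below shows only the decision logic; it is not microengine assembly and not the skeleton code. The EtherType, protocol, and port numbers are the standard assigned values. Matching on the destination port alone, the class labels, and the "unclassified" fallback are assumptions made for illustration. The mapping from class to output port is left out on purpose, since it is defined by the figure.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
IPPROTO_TCP = 6
IPPROTO_UDP = 17
PORT_SSH = 22
PORT_HTTP = 80


def classify(ethertype, ip_proto=None, tcp_dst_port=None):
    """Return a class label for one packet (labels are hypothetical)."""
    if ethertype == ETHERTYPE_ARP:
        return "arp"
    if ethertype != ETHERTYPE_IPV4:
        # Not one of the six lab classes; assumed fallback.
        return "unclassified"
    if ip_proto == IPPROTO_UDP:
        return "udp"
    if ip_proto == IPPROTO_TCP:
        # Assumption: only the destination port is inspected.
        if tcp_dst_port == PORT_HTTP:
            return "tcp_web"
        if tcp_dst_port == PORT_SSH:
            return "tcp_ssh"
        return "tcp_other"
    # Any other IP protocol, e.g. IP-over-IP (protocol number 4).
    return "ip_other"


if __name__ == "__main__":
    print(classify(ETHERTYPE_IPV4, IPPROTO_TCP, 22))
    print(classify(ETHERTYPE_IPV4, 4))
```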

For this part, you should do the following:

  - Extend the given forwarding code to implement the classification as described above.
  - Determine the traffic mix. What fraction of the traffic belongs to each of the six classes?
  - Use the execution coverage window in the simulator to verify that the instruction coverage of your classifier matches the traffic mix results.
  - Assume that the classification step is critical for performance. In your implementation, you have a choice of making classification decisions in different orders (e.g., checking for UDP packets before checking the type of TCP packet). In what order should packets be classified given the traffic mix in this example? In general, if the traffic mix is known, in what order should classification be done?
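
To reason about the ordering question, it can help to compute the average number of tests per packet for a candidate order. The Python sketch below uses a simplified model that is an assumption, not part of the lab: classes are tested one after another in a flat chain, and the last class needs no test of its own because it is reached by elimination. The class names and the traffic mix are invented placeholders; a real classifier is hierarchical (Ethernet type, then IP protocol, then port), so treat the numbers as a rough guide only.

```python
def expected_checks(order, mix):
    """Average number of sequential tests per packet.

    order: class names in the order they are tested.
    mix:   class name -> fraction of traffic (fractions sum to 1).
    """
    n = len(order)
    total = 0.0
    for position, cls in enumerate(order, start=1):
        # The final class is decided by elimination after n - 1 tests.
        checks = min(position, n - 1)
        total += mix[cls] * checks
    return total


if __name__ == "__main__":
    # Placeholder mix for illustration only, not measured data.
    mix = {"a": 0.6, "b": 0.3, "c": 0.1}
    for order in (["a", "b", "c"], ["c", "b", "a"]):
        print(order, round(expected_checks(order, mix), 3))
```

Plugging in the measured traffic mix and trying several orders gives a quick way to compare them before changing the microcode.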

It is recommended that you go through the following steps:

  - Download the classifier skeleton code from here and unzip it such that the L3fwd16.classifier directory is in the same directory as the L3fwd16 directory that you used for Part I.
  - Open the project L3fwd16.dwp in the L3fwd16.classifier directory. Rebuild the code, start debugging, and run the simulator. Look at the IX Bus window and you will see that all packets arrive on port 0 and are sent back out on ports 0 and 1. When you have implemented the classifier there will be packets on outputs 1 through 6.
  - Open the file rx_ether100m.uc in your project. Look for the places where comments were added that include the word “697J”. That’s where you want to pay close attention. You need to add code in the section that does the classification. A basic structure that identifies TCP/IP traffic and SSH/TCP/IP traffic is already given. Extend this to consider all the necessary cases. Also, ensure that all the traffic goes to the right ports (this is done with the instruction move(output_intf, 0x0000000), where the hex value is the port number).
  - Determining the port number: This is a bit unusual. The hex value that you put into the instruction is the port number (0-7) left-shifted by three bits. Thus, port 0x1 becomes 0x8 (as in the example for TCP traffic), and port 0x3 becomes 0x18.
  - When you have implemented the classifier, you should see traffic on ports 1-6. Note that traffic is randomized, so you need to let the simulation run for a while (at least 40 or so packets) to make sure all packet types have appeared. To determine the traffic mix, set the simulator to stop after 1000 packets have been transmitted. Then check the number of packets on each port and determine the fraction.
  - Look at the execution coverage window. Note that only microengine 0 is used. Your classification code is data-dependent and thus not all instructions are executed for all packets (e.g., the distinction between SSH and web is only done for TCP packets, not for UDP or others). So, you should easily find the classification code by looking at the instruction frequency. Look at which instructions of your classification code are executed how many times. Does this match the traffic mix?
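
The port encoding described in the steps above (port number shifted left by three bits) can be checked with a few lines of Python. This is just arithmetic for verifying the constants that go into the move instruction; the function name encode_output_port is hypothetical and does not exist in the lab code.

```python
def encode_output_port(port):
    """Return the value to place in output_intf for a given port (0-7)."""
    if not 0 <= port <= 7:
        raise ValueError("port must be in the range 0-7")
    return port << 3  # left shift by three bits, i.e. multiply by 8


if __name__ == "__main__":
    for port in range(8):
        print(f"port {port} -> {encode_output_port(port):#x}")
```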

Your report should include:

  - The code that you wrote in the classification part of the rx_ether100m.uc file PLUS a brief description of how it works. Please don’t submit the entire rx_ether100m.uc file.
  - The traffic mix percentages for 3 different runs of the simulator.
  - A discussion of the execution coverage that you observed and how it matches the traffic mix.
  - An answer to the question on how to order the classification steps such that performance is optimized (see above).
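
For the traffic mix percentages, the per-port packet counts from each run only need to be divided by the run's total. A minimal Python sketch follows; the counts shown are invented placeholders, and the function name traffic_mix is hypothetical. Replace the dictionary with the counts read from the IX Bus window.

```python
def traffic_mix(counts):
    """Convert per-port packet counts into percentages of the total.

    counts: output port number -> packets seen on that port.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no packets counted")
    return {port: 100.0 * n / total for port, n in counts.items()}


if __name__ == "__main__":
    # Placeholder counts for one 1000-packet run, not measured data.
    run = {1: 100, 2: 200, 3: 300, 4: 150, 5: 150, 6: 100}
    for port, percent in sorted(traffic_mix(run).items()):
        print(f"port {port}: {percent:.1f}%")
```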

If you have any questions or problems, contact me or ask during the help session at the end of the lecture on Thursday, 11/20/03.

©2003 by Tilman Wolf