Compiling using SimpleScalar
This exercise involves writing C/C++ programs and using the SimpleScalar tools
to compile and simulate them.
You will begin by writing a number of simple C language programs. The main
performance indicator we need is the execution time of our programs. On modern
computers most tasks take a very short time and therefore cannot be measured
accurately enough for our purposes. To overcome this, we will often repeat the
task we are using for benchmarking many times.
1) Write a hello world program in C. Place the hello world print statement
after an empty 'for' loop that runs from 1 to 1,000,000 (one million). Now
compile it using home/simplescalar/bin/sslittle-na-sstrix-gcc. The command
format, shown here for the -O2 build, is
sslittle-na-sstrix-gcc -O2 -o hello_O2 hello.c
Then use sim-safe to simulate the resulting executable file (hello for the
unoptimized build). Run sim-safe so that both the simulation output and the
program output are saved, collect the statistics (simulation elapsed time and
total number of instructions), and make sure that the program executes
correctly, i.e., with no errors.
Table 1:
Hello | Total # of instructions | Total elapsed time |
No optimization | | |
-O2 | | |
-O3 | | |
Did you observe any difference between the unoptimized version and the -O2
version of hello world?
Briefly describe what -O2 and -O3 do. (Hint: search Google using the keywords "compiler optimization levels".)
Plot a histogram comparing the results for the different levels of optimization.
2) Now we will create two very simple synthetic benchmarks. For the first
benchmark (call it test1.c), write the following arithmetic expression in your
loop: a = b + c * d / e - f. The loop goes from 1 to 1,000,000 (one million).
Declare a, b, c, d, e and f as integers, and initialize b, c, d, e and f to
2, 3, 10, 5 and 8 respectively.
Then use sim-safe to simulate the executable file. Run sim-safe so that both
the simulation output and the program output are saved, collect the statistics
(elapsed time and total number of instructions), and make sure that the program
executes correctly, i.e., with no errors.
Report the total number of instructions executed and the total elapsed time to
execute test1. Fill in Table 2.
Now use sim-profile to simulate the executable file. Run sim-profile so that
both the simulation output and the program output are saved, collect the
statistics (integer operations), and make sure that the program executes
correctly, i.e., with no errors.
Report the total number of integer operations (IOPS) for test1. Fill in Table 2.
Repeat the above procedure using -O2 optimization. Fill in Table 2.
Repeat the above procedure using -O3 optimization. Fill in Table 2.
Table 2:
Test1 | Total # of instructions | Total elapsed time | Integer operations % | Floating-point operations % |
No optimization | | | | |
-O2 | | | | |
-O3 | | | | |
Did you observe any difference between the unoptimized version and the -O2
version of test1?
Did you observe any difference between the -O2 and -O3 versions of test1?
Plot a histogram comparing the results for the different levels of optimization.
3) For the second program (call it test2.c), change the integer arithmetic to floating-point arithmetic (you only need to change the variable declarations from int to float). Note that you cannot use the first program to obtain FLOPS (floating-point operations).
Repeat everything you did for test1.c, except that now you should report
floating-point operations per second (FLOPS) instead of IOPS. Fill in Table 3.
Repeat the above procedure using -O2 and -O3 optimization. Fill in Table 3.
Table 3:
Test2 | Total # of instructions | Total elapsed time | Floating-point operations % | Integer operations % |
No optimization | | | | |
-O2 | | | | |
-O3 | | | | |
Did you observe any difference between the unoptimized version and the -O2
version of test2?
Did you observe any difference between the unoptimized version and the -O3
version of test2?
Plot a histogram comparing the results for the different levels of optimization.