Mail Archives: pgcc/1998/06/09/13:46:32
Comparison of executable sizes and compression/decompression times for
bzip2-0.1pl1, run on a large tar file (32624640 bytes - linux sources +
object files).
Every variant of the executable has been linked with libgcc.a from gcc 2.7.2.3.
1.  Gcc 2.7.2.3, -m386 -O2 -malign-jumps=0 -malign-loops=0 -malign-functions=0
    -fno-strength-reduce
2.  Gcc 2.7.2.3, -m386 -O2 -malign-jumps=0 -malign-loops=0 -malign-functions=0
3.  Pgcc 1.0.2, no haifa, -mpentium -O2 -malign-jumps=0 -malign-loops=0
    -malign-functions=0
4.  Pgcc 1.0.2, no haifa, -mpentium -O2 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-strength-reduce
5.  Pgcc 1.0.2, no haifa, -mpentium -O3 -malign-jumps=0 -malign-loops=0
    -malign-functions=0
6.  Pgcc 1.0.2, no haifa, -mpentium -O3 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-inline-functions
6a. Pgcc 1.0.2, no haifa, -mpentium -O3 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-inline-functions -fno-strength-reduce
7.  Pgcc 1.0.2, no haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0
8.  Pgcc 1.0.2, no haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-inline-functions
8a. Pgcc 1.0.2, no haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-inline-functions -fno-strength-reduce
9.  Pgcc 1.0.2, no haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -fno-inline-functions -funroll-all-loops
10. Pgcc 1.0.2, no haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -funroll-all-loops
11. Pgcc 1.0.2 with haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -funroll-all-loops
12. Pgcc 1.0.2 with haifa, -mpentium -O4 -malign-jumps=0 -malign-loops=0
    -malign-functions=0 -funroll-all-loops -fno-inline-functions
Machine: Pentium 166 MMX
OS: linux-2.0.34
libc: 5.4.38
    Executable  | Execution times (seconds, user time only)
    size        |    level -1          level -5          level -9
    (bytes)     |  compr  decompr    compr  decompr    compr  decompr
1.        41808 | 167.66    40.33   184.87    45.13   199.93    45.66
2.        42304 | 173.75    40.68   189.03    45.43   203.85    45.94
3.        44420 | 168.78    42.87   179.12    47.14   193.21    47.66
4.        44036 | 166.72    41.39   180.00    47.08   192.85    47.48
5.        53844 | 163.09    42.48   173.02    47.28   186.83    47.33
6.        44324 | 167.09    42.69   178.59    46.76   190.71    47.33
6a.       43908 | 166.30    41.95   177.54    46.52   190.67    47.05
7.        56596 | 157.20    41.12   165.30    46.41   179.09    46.38
8.        46308 | 162.70    41.41   172.48    45.75   184.93    45.92
8a.       45764 | 164.42    41.23   172.53    45.54   186.06    45.47
9.        75428 | 151.59    40.18   161.68    45.13   176.30    45.27
10.       99508 | 150.21    40.86   163.66    45.44   177.69    45.48
11.       99476 | 147.11    39.38   162.86    44.13   176.66    44.68
12.       75396 | 149.11    39.35   161.22    44.10   175.99    44.34
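
To make the size/speed trade-off in the table easier to read, here is a small
sketch that expresses each variant relative to a chosen baseline. The numbers
are copied from the table (executable size and level -9 compression time); the
names RESULTS and relative_to are only illustrative and not part of the
original measurements.

```python
# Executable size (bytes) and level -9 compression time (user seconds),
# copied from the table above.
RESULTS = {
    "1": (41808, 199.93),
    "2": (42304, 203.85),
    "3": (44420, 193.21),
    "4": (44036, 192.85),
    "5": (53844, 186.83),
    "6": (44324, 190.71),
    "6a": (43908, 190.67),
    "7": (56596, 179.09),
    "8": (46308, 184.93),
    "8a": (45764, 186.06),
    "9": (75428, 176.30),
    "10": (99508, 177.69),
    "11": (99476, 176.66),
    "12": (75396, 175.99),
}


def relative_to(baseline, results=RESULTS):
    """Return {variant: (size growth %, time saved %)} versus the baseline."""
    base_size, base_time = results[baseline]
    rows = {}
    for name, (size, secs) in results.items():
        size_growth = 100.0 * (size - base_size) / base_size
        time_saved = 100.0 * (base_time - secs) / base_time
        rows[name] = (size_growth, time_saved)
    return rows


if __name__ == "__main__":
    # Variant 1 (gcc 2.7.2.3 with -fno-strength-reduce) is an arbitrary
    # choice of baseline.
    for name, (growth, saved) in relative_to("1").items():
        print(f"{name:>3}: size {growth:+7.1f}%  time saved {saved:+6.1f}%")
```

Any other variant can be passed as the baseline, e.g. relative_to("7") to
compare against plain -O4.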
The conclusions are quite different from those for the FPU intensive program
GAMESS. The option -fno-strength-reduce improves the performance of the
program (at lower optimization levels). Unfortunately, -fno-inline-functions
makes the program slower (unlike with GAMESS). Unrolling of loops gives a
large speedup, too.
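
The -fno-strength-reduce remark can be checked against the four pairs of
variants that differ only in that flag. The sketch below uses the compression
times from the table; the names COMPRESS, PAIRS and seconds_saved are made up
for illustration.

```python
# Compression times (user seconds) at levels -1, -5, -9, from the table.
COMPRESS = {
    "1": (167.66, 184.87, 199.93),
    "2": (173.75, 189.03, 203.85),
    "3": (168.78, 179.12, 193.21),
    "4": (166.72, 180.00, 192.85),
    "6": (167.09, 178.59, 190.71),
    "6a": (166.30, 177.54, 190.67),
    "8": (162.70, 172.48, 184.93),
    "8a": (164.42, 172.53, 186.06),
}

# (variant without the flag, variant with -fno-strength-reduce);
# all other options are identical within a pair.
PAIRS = [("2", "1"), ("3", "4"), ("6", "6a"), ("8", "8a")]


def seconds_saved(without, with_flag, times=COMPRESS):
    """Seconds saved per level by adding the flag (negative = slower)."""
    return tuple(
        round(a - b, 2) for a, b in zip(times[without], times[with_flag])
    )


if __name__ == "__main__":
    for without, with_flag in PAIRS:
        print(without, "->", with_flag, seconds_saved(without, with_flag))
```

With these figures the flag helps at every level for gcc at -O2, gives mixed
results for pgcc at -O2 and a small gain at -O3, and costs time at -O4.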
Decompression is pretty fast with gcc 2.7.2.3 - faster than with most
pgcc compiled executables.
I hate to say this, but those damned, bloated "9" and "10" things are the
fastest of the no-haifa builds.
Even worse - all bloated variants are _noticeably_ faster than the non-bloated
ones when compressing. On the other hand, combining
-finline-functions with -funroll-all-loops gives slightly faster code only
at compression level -1.
And what about haifa?
The code (tried for the fastest variants only) is a bit smaller - again a
difference in comparison with GAMESS. The program runs slightly faster - yet
another difference (but a different set of compilation options was used)...
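
For the record, the haifa effect can be put into numbers by pairing each
haifa build with the no-haifa build that uses the same options (10 with 11,
9 with 12). This is only a sketch over the table figures; BUILDS, HAIFA_PAIRS
and haifa_effect are illustrative names.

```python
# variant: (size in bytes, (compr, decompr) user seconds at levels -1, -5, -9)
BUILDS = {
    "9": (75428, (151.59, 40.18, 161.68, 45.13, 176.30, 45.27)),
    "10": (99508, (150.21, 40.86, 163.66, 45.44, 177.69, 45.48)),
    "11": (99476, (147.11, 39.38, 162.86, 44.13, 176.66, 44.68)),
    "12": (75396, (149.11, 39.35, 161.22, 44.10, 175.99, 44.34)),
}

# no-haifa variant -> haifa variant built with the same options
HAIFA_PAIRS = {"10": "11", "9": "12"}


def haifa_effect(plain, haifa, builds=BUILDS):
    """Return (size change in bytes, [time change in % per table column])."""
    size_plain, times_plain = builds[plain]
    size_haifa, times_haifa = builds[haifa]
    pct = [
        round(100.0 * (h - p) / p, 2)
        for p, h in zip(times_plain, times_haifa)
    ]
    return size_haifa - size_plain, pct


if __name__ == "__main__":
    for plain, haifa in HAIFA_PAIRS.items():
        print(plain, "->", haifa, haifa_effect(plain, haifa))
```

Negative values mean the haifa build is smaller or faster.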
Eh, life isn't perfect...
I'm going back to my FPU intensive programs.
Krzysztof