Mail Archives: pgcc/2000/06/13/22:26:36
Hello,
there is one flag name (for Athlon stepping 1) and there are two (for
Athlon stepping 2) missing in the current Linux kernel (2.4.0-test1-ac18).
Alan Cox and I would be very pleased if one of you had the AMD Athlon
Programming Ref handy!
Took some time to get the CD from AMD...
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 1
model name : AMD-K7(tm) Processor
stepping : 2
cpu MHz : 548.952604
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov 16 mmxext mmx 3dnowext 3dnow
bogomips : 1094.45
16 ?
Now Athlon 800 (nice thing:-)
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 2
model name : AMD Athlon(tm) Processor
stepping : 1
cpu MHz : 798.470512
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
cmov 16 pse36 mmxext mmx 24 3dnowext 3dnow
bogomips : 1592.52
16 & 24 ?
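To make the question concrete: the kernel prints a bare bit number wherever it has no name for a feature bit, so the unknown entries can be picked out of the flags line mechanically. Below is a minimal sketch in Python; the helper name unnamed_flags is made up for illustration and is not part of any kernel tool. (In the standard CPUID feature word, bits 16 and 24 are documented as PAT and FXSR; whether that is what these two entries mean on the Athlon is an assumption that the AMD reference would have to confirm.)

```python
def unnamed_flags(flags_line):
    """Return the numeric (unnamed) entries of a /proc/cpuinfo flags line.

    Hypothetical helper for illustration: a token made up only of digits
    is treated as a feature bit the kernel had no name for.
    """
    # Drop the "flags :" label if it is present.
    value = flags_line.split(":", 1)[-1]
    return [int(tok) for tok in value.split() if tok.isdigit()]


# The two flags lines quoted above, joined onto one line each.
k7_550 = ("flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca "
          "cmov 16 mmxext mmx 3dnowext 3dnow")
athlon_800 = ("flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca "
              "cmov 16 pse36 mmxext mmx 24 3dnowext 3dnow")

print(unnamed_flags(k7_550))      # [16]
print(unnamed_flags(athlon_800))  # [16, 24]
```

Names that merely contain digits (pse36, 3dnow, cx8) are left alone, because only all-digit tokens count as unnamed.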
Thanks,
Dieter
BTW, I found the best combination of optimization flags for the Athlon.
GCC 2.96 CVS didn't come close, either with the same flags or with
-mcpu=athlon and/or -march=athlon!!! :-( Why?
In particular, plain -O (no higher level), -mcpu=k6 and
-mpreferred-stack-boundary=2 (2 !!!) are needed.
!!! -fomit-frame-pointer is worse !!! Avoid it if you can...
This is the best for an MFLOPS test (dgemm from Quant-X, Alpha FPU test,
source available).
-O -mcpu=k6 -mpreferred-stack-boundary=2 -malign-functions=4
-fschedule-insns2 -fexpensive-optimizations
K7-550
gcc -O -funroll-loops -DMAIN -o dgemm dgemm.c
SunWave1>./dgemm-O
m:1000 n:1000 k:1000
Ail_max 24, Blj_max 12, A_row_block 85
Shimizu's DGEMM : 147.493 MFLOPS(13.560 sec)
Shimizu's DGEMM : 147.493 MFLOPS(13.560 sec)
Shimizu's DGEMM : 147.601 MFLOPS(13.550 sec)
gcc -O -mcpu=k6 -mpreferred-stack-boundary=2 -malign-functions=4
-fschedule-insns2 -fexpensive-optimizations -DMAIN -o dgemm dgemm.c
SunWave1>./dgemm-k6
m:1000 n:1000 k:1000
Ail_max 24, Blj_max 12, A_row_block 85
Shimizu's DGEMM : 213.447 MFLOPS( 9.370 sec)
Shimizu's DGEMM : 213.220 MFLOPS( 9.380 sec)
Shimizu's DGEMM : 213.220 MFLOPS( 9.380 sec)
The K7-800 got ~222 and ~288 MFLOPS with the same two builds.
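For anyone who wants to cross-check the numbers: the reported MFLOPS values are consistent with counting 2*m*n*k floating-point operations per run. That operation count is an assumption inferred from the output, not taken from the dgemm source. A small Python sketch (dgemm_mflops is a made-up helper name):

```python
def dgemm_mflops(m, n, k, seconds):
    """MFLOPS for an m x n x k matrix multiply.

    Assumes 2*m*n*k floating-point operations (one multiply and one add
    per inner-product term); the benchmark's own count may differ.
    """
    return 2.0 * m * n * k / seconds / 1e6


# Timings from the K7-550 runs above.
baseline = dgemm_mflops(1000, 1000, 1000, 13.560)  # -O -funroll-loops
tuned = dgemm_mflops(1000, 1000, 1000, 9.370)      # -O -mcpu=k6 ...

print(f"{baseline:.3f} {tuned:.3f} {tuned / baseline:.2f}x")
# 147.493 213.447 1.45x
```

So on the K7-550 the k6 flag set is about 1.45 times as fast as the plain -O -funroll-loops build.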
Any questions?
--
Dieter Nützel
Graduate Student, Computer Science
University of Hamburg
Department of Computer Science
Cognitive Systems Group
Vogt-Kölln-Straße 30
D-22527 Hamburg, Germany
email: nuetzel AT kogs DOT informatik DOT uni-hamburg DOT de
@home: dieter DOT nuetzel AT myokay DOT net