Mail Archives: pgcc/1998/09/05/13:46:30
This seems highly interesting. Could you post some example
code to show how you get the 200-300% improvement? Are
you talking about inline assembler code? Also, do
you refer to the Intel documentation to take advantage of
the hardware structure, or is this documented somewhere else?
DS
> Maybe my point of view is a little bit different. I have not been on
> this mailing list for long, so I apologize...
>
> As far as I have noticed, the improvement in speed is something like
> 10%. That may be impressive, but not so much. I found out that the
> real thing is to take advantage of the c o m p l e t e hardware
> structure (registers, L1, L2, latency etc.). By using these
> dependencies carefully, an improvement of 200-300% is possible.
> Since I am using my PC (AMD K5 based) for number crunching, the
> heart of all efforts is careful optimization of the whole system
> for carefully selected routines (kernels of routines).
>
> But this does not concern the compiler...
>
> Krzystof, maybe you can speed up your computations by using
> hand-optimized BLAS etc. for Pentium. Or have a look at ATLAS or
> PHiPAC.
>
> Yours,
> Michael
>
>
> +---------------------------------------------------------------+
> | Michael Hanke Royal Institute of Technology |
> | NADA |
> | S-10044 Stockholm |
> | Sweden |
> +---------------------------------------------------------------+
> | Visiting address: Lindstedtsvaegen 3 |
> | Phone: + (46) (8) 790 6278 |
> | Fax: + (46) (8) 790 0930 |
> | Email: hanke AT nada DOT kth DOT se |
> | na DOT mhanke AT na-net DOT ornl DOT gov |
> +---------------------------------------------------------------+
>
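
For readers wondering what "taking advantage of the hardware structure"
can look like in plain C, here is a minimal sketch of one common
technique, cache blocking (tiling) of a matrix multiply. It is not
Michael's code and it is not taken from BLAS, ATLAS or PHiPAC. The
function names, the row-major layout and the tile size BLOCK are all
assumptions made for illustration, and any speedup depends entirely on
the machine and the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size (an assumption): tune it so the tiles being
   worked on fit in the L1 data cache of the target CPU. */
#define BLOCK 32

/* Reference kernel: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Blocked kernel: same arithmetic, but the loops walk BLOCK x BLOCK
   tiles so data is reused while it is still in cache. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_size(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_size(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_size(jj + BLOCK, n);
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        /* Loaded once and reused across the inner loop,
                           so the compiler can keep it in a register. */
                        double aik = a[i * n + k];
                        /* Unit-stride access to both b and c. */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions accumulate into c, so the caller zeroes c first to get
a plain product. Timing the two versions on matrices much larger than
the L2 cache is the simplest way to see whether blocking helps on a
given system.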