Mail Archives: djgpp/1997/02/08/18:53:19
In article <5d916e$c1e AT hecate DOT umd DOT edu>, Jason Stratos Papadopoulos
<jasonp AT Glue DOT umd DOT edu> writes
>Hello. I'm writing a homebrew arbitrary precision arithmetic package,
>and was coding up a multiply that uses fast Fourier transforms. Everything
>is nice and fast up to array (of doubles) size 2^14, but from 2^15 on
>the computation time hits a brick wall! Before this point (2^10,11,12,etc)
>a given power of 2 only takes a bit more than twice as long as the one
>before. The program does no memory management, and I need to work with
>array sizes perhaps as large as 2^18 or 2^19.
This looks suspiciously like L2 cache thrashing. The 2^15 array would be
256K of doubles, which is a likely L2 cache size.
What you need to do is rearrange your code to access the arrays
sequentially in linear memory, for fft code that could prove a little
difficult of course ;) For fft code you may need to rearrange the entire
calculation to work on sections of the array at all butterfly levels
rather than doing each one for the entire array.
---
Paul Shirley: shuffle chocolat before foobar for my real email address
- Raw text -