Mail Archives: djgpp/1996/12/28/20:16:04
Salvador Eduardo Tropea (SET) wrote:
<SNIP> <ALTERNATIVE MEMSET CODE>
> That's practically equal to memcpy (you will save some little test at
> the beginning of memcpy => 6 or 10 cycles in 128000)
Partially true - but djgpp inlines memcpy only when the length is fixed at
compile time, so for variable-length buffers you get a function call. Inlined,
Mihai's code is considerably faster for *small* buffers (200 bytes - 2K),
while for 64K+ buffers you do indeed run up against the hardware limit of
memory access time. In my usage it was about as fast as memcpy() for
double buffers (slightly faster, in fact), but when I replaced memcpy() with
it in some polygon-filling code the run speed doubled.
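
To illustrate the point about inlining, here is a minimal sketch of an inlinable copy for small, variable-length buffers. It is not Mihai's code (that was snipped above); the name small_copy and its word-at-a-time approach are assumptions made only for this example. The idea is simply that a static inline routine avoids the call overhead that a variable-length memcpy() incurs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; not the snipped code. */
static inline void small_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Move one 32-bit word per iteration. The fixed-size memcpy calls
       are the kind a compiler can expand inline, and they avoid
       alignment and aliasing problems. */
    while (n >= sizeof(uint32_t)) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Copy the remaining 0-3 bytes one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

Whether this actually beats the library memcpy() depends on the compiler, the optimization flags and the buffer sizes involved, so time it in the real code (a span filler, say) before committing to it.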
If you know the buffer size in advance, the fastest method would probably
be an unrolled rep movsd loop. Not that it would make a world-shaking
difference: on my DX4-100 with slow (70ns) RAM the CPU idles 7 cycles
for every byte written through in burst mode, so it wouldn't matter a bit.