Date: Fri, 8 Apr 1994 17:02:40 -0400 (EDT)
From: "Chris Mr. Tangerine Man Tate"
To: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: memxxx(), Duff's Device, etc.

Clearly, I have too much spare time. :-)

I ran a quick-and-skanky benchmark comparing the library memxxx()
routines, naive byte-by-byte implementations of them, and Duff's Device
(unrolled) versions of them. The library version of memcpy() is 60%
faster than the Duff's Device implementation, which is in turn about 17%
faster than naive byte-by-byte. That gives you some idea of how good the
library routines are. :-)

On a related note, I'm curious about the Intel architecture.
Specifically, I'd like to know:

a) Does it have odd-address access restrictions?
b) Are accesses on longword (4 byte) boundaries faster than on word
   boundaries?

If the answer to (b) is yes, it would probably be worth writing a
somewhat more complex, optimized version of memset() and memcpy(),
provided the library versions of those routines currently work by byte
accesses. If they work by some block-move instruction, then give up;
it's already optimal. :-)

-- chris tate
   fixer AT faxcsl DOT dcrt DOT nih DOT gov
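For anyone who hasn't met the two hand-written variants being compared, here is roughly what they look like for memcpy(). This is only a minimal sketch, not the code used in the benchmark: the names naive_memcpy(), duff_memcpy(), and copies_match() are made up for illustration, and the unroll factor of 8 is an assumption. copies_match() simply checks both versions against the library memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive version: one byte per loop iteration. */
static void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Duff's Device: the loop body is unrolled 8 times, and the switch
   jumps into the middle of the first pass to handle the n % 8
   leftover bytes. The unroll factor of 8 is an assumption. */
static void *duff_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t passes;

    if (n == 0)              /* the do-while below always runs once */
        return dst;

    passes = (n + 7) / 8;    /* number of trips through the do-while */
    switch (n % 8) {
    case 0: do { *d++ = *s++;
    case 7:      *d++ = *s++;
    case 6:      *d++ = *s++;
    case 5:      *d++ = *s++;
    case 4:      *d++ = *s++;
    case 3:      *d++ = *s++;
    case 2:      *d++ = *s++;
    case 1:      *d++ = *s++;
            } while (--passes > 0);
    }
    return dst;
}

/* Hypothetical helper: returns 1 if both hand-written copies produce
   the same result as the library memcpy() for n bytes (n <= 256).
   Whole buffers are compared, so writing past n bytes is caught too. */
static int copies_match(size_t n)
{
    unsigned char src[256], a[256], b[256], c[256];
    size_t i;

    assert(n <= sizeof src);
    for (i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 31 + 7);
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    memset(c, 0, sizeof c);

    memcpy(a, src, n);
    naive_memcpy(b, src, n);
    duff_memcpy(c, src, n);
    return memcmp(a, b, sizeof a) == 0 && memcmp(a, c, sizeof a) == 0;
}
```

Both versions still move one byte per store, so neither takes advantage of wider (word or longword) accesses; the unrolled one only saves loop overhead.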