Date: Fri, 3 Apr 1998 22:25:39 -0800 (PST)
Message-Id: <199804040625.WAA17152@adit.ap.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Jeff Weeks, djgpp AT delorie DOT com
From: Nate Eldredge
Subject: Re: FPU memcpy slower on PII?
Precedence: bulk

At 06:06 4/2/1998 -0500, Jeff Weeks wrote:
>I just recently talked to Michal Mertl and was lucky enough to get a
>copy of his FPU memcpy code.
>
>It's quite excellent, but oddly enough, it's actually slower than the
>regular memcpy on my P2/233! Any ideas why (I'll post code later on)?
>As for my P233, it's always faster, but not as noticeable on smaller
>copies (to be expected).
>
>I figure this is some odd PII optimization quirk that I've missed.

Quite probably. Such are the vagaries of asm optimizations and different
processors. TANSTATFC (There Ain't No Such Thing As The Fastest Code).

>Anyway, here's the code:
>
>
>void memcpyfpu(void *destination,void *source, unsigned long length)
>{
>  __asm__ __volatile__ (
>    "push %%edx
>    andl $0xfffffff8,%%edx
>    xorl %%ecx,%%ecx
>_LoopPoint:

This should really be changed to use local labels. Otherwise it will fail if
you compile with `-O3' or have a variable in your program called LoopPoint.
Local labels are documented in the `as' manual in node "Symbols", section
"Symbol Names". They're quite simple really.

Nate Eldredge
eldredge AT ap DOT net
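
For illustration, here is a minimal sketch of a loop written with a local
label. It is not the FPU memcpy quoted above (that code is cut off before the
loop body); copy_bytes is a hypothetical helper, made up for this example,
that copies one byte at a time. The point is only the label syntax: `1:'
defines a local label, and `1b' refers to the nearest `1:' looking backward
(`1f' would look forward). The likely reason a named label breaks under `-O3'
is that the function gets inlined at several call sites, so the same symbol is
emitted more than once; a local label can be repeated safely. The sketch
assumes GCC-style extended asm on x86 and falls back to memcpy elsewhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only: a byte-wise copy whose loop
 * uses a GNU as local label instead of a named one such as _LoopPoint. */
static inline void copy_bytes(void *dst, const void *src, size_t n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char tmp;              /* scratch byte register */

    if (n == 0)
        return;                     /* the loop below runs at least once */

    __asm__ __volatile__ (
        "1:\n\t"                    /* local label: may appear many times */
        "movb (%1), %3\n\t"         /* tmp = *src */
        "movb %3, (%0)\n\t"         /* *dst = tmp */
        "inc %0\n\t"                /* dst++ */
        "inc %1\n\t"                /* src++ */
        "dec %2\n\t"                /* n--, sets ZF when n reaches 0 */
        "jnz 1b\n\t"                /* jump back to the nearest 1: */
        : "+r"(dst), "+r"(src), "+r"(n), "=&q"(tmp)
        :
        : "memory", "cc");
#else
    /* Assumption: on other compilers/targets just use the library copy. */
    memcpy(dst, src, n);
#endif
}
```

Because the label is local, calling copy_bytes from several places (or having
an unrelated symbol named LoopPoint elsewhere in the program) should not
produce duplicate-symbol errors from the assembler, even when the compiler
inlines every call.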