Mail Archives: djgpp/1998/04/04/01:26:16

Date: Fri, 3 Apr 1998 22:25:39 -0800 (PST)
Message-Id: <199804040625.WAA17152@adit.ap.net>
Mime-Version: 1.0
To: Jeff Weeks <info AT codex DOT nu>, djgpp AT delorie DOT com
From: Nate Eldredge <eldredge AT ap DOT net>
Subject: Re: FPU memcpy slower on PII?

At 06:06  4/2/1998 -0500, Jeff Weeks wrote:
>I just recently talked to Michal Mertl and was lucky enough to get a
>copy of his FPU memcpy code.
>
>It's quite excellent, but oddly enough, it's actually slower than the
>regular memcpy on my P2/233!  Any ideas why (I'll post code later on)? 
>As for my P233, it's always faster, but not as noticeable on smaller
>copies (to be expected).
>
>I figure this is some odd PII optimization quirk that I've missed.

Quite probably. Such are the vagaries of asm optimizations and different
processors.

TANSTATFC (There Ain't No Such Thing As The Fastest Code)

>Anyway, here's the code:
>
>
>void memcpyfpu(void *destination,void *source, unsigned long length)
>{
>  __asm__ __volatile__ (
>  "push    %%edx
>        andl     $0xfffffff8,%%edx
>        xorl     %%ecx,%%ecx
>_LoopPoint:

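This should really be changed to use local labels. As written, `_LoopPoint'
is an ordinary symbol, so the assembler will complain about a duplicate
definition if the function gets inlined in more than one place (which `-O3'
can do), or if you have a variable in your program called LoopPoint (DJGPP
prepends an underscore to C names, so the two collide). Local labels are
documented in the `as' manual, node "Symbols", section "Symbol Names".
They're quite simple really.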
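
To illustrate, here is a minimal sketch of a loop written with a local
label. It is NOT Michal's FPU routine: `copy_bytes' and `demo' are
hypothetical names, the loop is a plain byte copy, and the operand syntax
assumes a reasonably current GCC on x86. The point is only the `1:' label
and the `1b' reference ("nearest label 1, looking backward"), which may be
defined as many times as you like in one assembler file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copies n bytes with a
   loop whose branch target is the local label `1:'. */
static inline void copy_bytes(void *dst, const void *src, unsigned long n)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned char tmp;

    if (n == 0)
        return;                 /* the loop below runs at least once */

    __asm__ __volatile__ (
        "1:\n\t"                /* local label; safe to define repeatedly */
        "movb (%1), %3\n\t"     /* tmp = *src */
        "movb %3, (%0)\n\t"     /* *dst = tmp */
        "inc %1\n\t"            /* src++ */
        "inc %0\n\t"            /* dst++ */
        "dec %2\n\t"            /* n-- */
        "jnz 1b\n\t"            /* loop back to the nearest `1:' above */
        : "+r" (dst), "+r" (src), "+r" (n), "=&q" (tmp)
        :
        : "memory", "cc");
#else
    /* Not x86: fall back to the library so the sketch still builds. */
    memcpy(dst, src, n);
#endif
}

/* Two calls in one function. If both are inlined, the asm text appears
   twice in the same assembler file, which is exactly the case that a
   named label like _LoopPoint cannot survive. */
static int demo(void)
{
    char a[4], b[6];

    copy_bytes(a, "abc", 4);
    copy_bytes(b, "hello", 6);
    assert(strcmp(a, "abc") == 0);
    assert(strcmp(b, "hello") == 0);
    return 1;
}
```

Calling demo() from main and compiling with `-O3' should assemble without
complaint; swap `1:' and `1b' for a named label and you would expect a
"symbol already defined" error once both calls are inlined.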

Nate Eldredge
eldredge AT ap DOT net


