Message-ID: <363F7C75.5852E3B5@vlsi.com>
Date: Tue, 03 Nov 1998 14:58:13 -0700
From: Charles Marslett <charles DOT marslett AT vlsi DOT com>
X-Mailer: Mozilla 4.03 [en] (WinNT; U)
MIME-Version: 1.0
To: djgpp-workers AT delorie DOT com
CC: Kbwms AT aol DOT com, Zaretski AT delorie DOT com
Subject: Re: Inlining math functions and ANSI/Posix
References: <abd34bd8 DOT 363f3b8d AT aol DOT com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com

Kbwms AT aol DOT com wrote:
> 
> Dear Eli Zaretskii,
> 
> On 11-03-98 at 04:05:59 EST you wrote:
> >
> > I understand that GCC 2.8.x inlines some math functions under -O2 (`sqrt'
> > is one of them).
> >
> > If this is so, how does this influence the ANSI requirements that some
> > values of arguments should set errno?  Is GCC's inlining smart enough to
> > not break this?  If not, should we do something about this?
> >
> > (I cannot test this myself since I still don't have 2.8.1 installed.)
> 
> I wrote some simple code to do a `sqrt' a million times (see appended).
> The code purposely commits a blunder by calling `sqrt' with a negative
> argument.  Listed below are two versions of the assembler output.  The
> first is without optimization, the second is with -O2.
> 
> No optimization
> ---------------
> L9:
>         pushl -16(%ebp)
>         pushl -20(%ebp)
>         call _sqrt
>         addl $8,%esp
>         fstpl -12(%ebp)
>         fldl LC0
>         fstp %st(0)
>         fldl LC0
>         fldl -20(%ebp)
>         fsubp %st,%st(1)
>         fstpl -20(%ebp)
> 
> -O2 optimization
> ----------------
> L10:
>         fldl -8(%ebp)
>         fsqrt
>         fucom %st(0)
>         fnstsw %ax
***********************
>         andb $69,%ah
>         cmpb $64,%ah
***********************
>         je L11
>         fstp %st(0)
>         pushl -4(%ebp)
>         pushl -8(%ebp)
>         fstpt -32(%ebp)
>         call _sqrt
>         addl $8,%esp
>         fldt -32(%ebp)
>         fxch %st(1)
> 
> I am not familiar with this brand of assembler code, but in the
> optimized version it appears that the compiler checks the output of
> `fsqrt' and calls the library version of `sqrt' if something looks
> amiss.

Actually, I'm just as unfamiliar, but if the two lines I bracketed above
AND the FP status byte with 0x69 and then compare it with 0x64, the result
will never be equal and the call to _sqrt will be made every time.  In that
case the inline result is never used and the fsqrt should be eliminated.

Bit 2 will be turned off by the AND, yet it is required to be on for an
equal compare.  Or do I not understand something?

> Under the circumstances of this new code, the fact that it calls `sqrt'
> with a `bad' argument explains why the code seems to run at about the
> same speed regardless of optimization.
> 
> Under both new libraries (libm and Rudd's libc) errno is correctly set
> to 1 when `sqrt' is given a negative argument.  Under the old versions
> of the libraries, errno was not set.
> 
> K.B. Williams
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> #include <errno.h>
> #include <math.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
> 
> static double ElapsedTime;
> static clock_t First, Start, Stop;
> 
> int
> main ()
> {
>     int     k;
>     double  Ans, Arg;
> 
>     Arg = -1.0;
> 
>     /* ---------------------------------------------- */
>     /* Get First Reading of Next Tick to Start Timing */
>     /* ---------------------------------------------- */
>     for (First = clock (), Start = clock (); First == Start; Start = clock ())
>         ;
> 
>     for (k = 1; k <= 1000000; ++k)
>       {
>           Ans = sqrt(Arg);
>           Arg -= (1.0000001e-6);
>       }
>     /* ---------------------- */
>     /* Get Last Tick of Clock */
>     /* ---------------------- */
>     Stop = clock ();
>     ElapsedTime = (double) (Stop - Start);
> 
>     printf ("\nElapsed time: %.0f clocks\n", ElapsedTime);
> 
>     printf ("Arg = %f, Ans = sqrt(Arg) = %f, errno = %d\n",
>         Arg, Ans, errno);
>     return 0;
> }

--Charles