delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp-workers/2003/01/21/17:28:18

Date: Tue, 21 Jan 2003 16:28:03 -0600
From: Eric Rudd <rudd AT cyberoptics DOT com>
Subject: Re: math function verification
To: djgpp-workers AT delorie DOT com
Message-id: <3E2DC973.50588C38@cyberoptics.com>
Organization: CyberOptics
MIME-version: 1.0
X-Mailer: Mozilla 4.76 [en] (Win95; U)
X-Accept-Language: en,pdf
References: <200301181746 DOT h0IHkRf21623 AT brother DOT ludd DOT luth DOT se>
Reply-To: djgpp-workers AT delorie DOT com

Martin Str|mberg wrote:

> tsin without -lm gives (perhaps that is to be expected as its supposed to be fast instead of correct to the last
> bit)):

Are the new functions going into libc.a or libm.a?  I wrote the "double" functions that went into libc.a, but I
didn't write a tsin() function, and I don't find it in the current releases of either libc.a or libm.a.  I was
unaware that there were two versions being considered for the new release.

> sin(355) = -3.01443533594889079e-05 from math library.
> sin(355) = -3.01443533594884505e-05 is correct answer.
> About 7.09 bits are wrong out of 53.00 total bits.
> That is about 135.00 ULPs (Units Last Place) bits error.

First, a little numerical analysis: an important measure of the difficulty of computing a function is the relative
sensitivity.  This is defined as the relative error in the result divided by the relative error of the argument.
In math notation,

   S(f(x)) = (df/f)/(dx/x) = (df/dx)/(f/x),

where f(x) is the function being considered.  If S is large, a small relative error in the argument will give rise
to a large relative error in f(x).  For the sine, we have

   S(sin(x)) = cos(x)*x/sin(x)

and, in particular,

   S(sin(355)) = 1.18E7,

which is very large.  This means that in the above example, the relative error between the two values quoted above
(1.5E-14) can be explained by a relative error in their arguments of only 1.3E-21.  That error is on the order of
the error to be expected if the default coprocessor 64-bit value of pi is used for the argument reduction.  In the
"double" routines, I had used the same default value for argument reduction, arguing that the loss of accuracy was
of little practical consequence, since the argument is usually itself the product of prior finite-precision
computation, and is therefore not generally known to better accuracy than the unit roundoff for the wordlength
used.

In the new "long double" routines, I thought that more accurate (and time-consuming) range reduction had been used
(where one assumes that the argument is *exact*).  However, when I consulted QCALC, which is Stephen Moshier's
extended-precision calculator program (computing to 352 bits), it returned the following value:

   sin(355) = -3.014435335948844921433028E-5

This value also agreed with the Matlab symbolic variable-precision package, so I have reason to believe that it is
correct.  Assuming that it is, the first value from the "math library" is off by 1.52E-14 and the second "correct
answer" is off by 4.3E-17.

Moshier's qlib (the math function library used by QCALC) is publicly available

   http://www.moshier.net/qlib.zip

and is very useful for computing the error in less accurate routines.

-Eric Rudd
rudd AT cyberoptics DOT com

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019