Eli Zaretskii wrote:
>As someone who was involved, albeit as an interested bystander, in
>Eric's efforts to produce fast, reliable, and standard-conforming
>functions, I largely agree with these sentiments.
>
>So I think that it is not too awful that we will continue to have
>these deviations from C9x.
>
In the C99 standard, there is the following:

    6.10.8 Predefined macro names

    [#2] The following macro names are conditionally defined by the
    implementation:

    __STDC_IEC_559__   The decimal constant 1, intended to indicate
                       conformance to the specifications in annex F
                       (IEC 60559 floating-point arithmetic).
We can conform to the standard with the current math functions by not
defining this macro. Then, when someone has the inclination and energy
to modify the functions, they can define this macro.
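For illustration, a program that depends on annex F semantics could
then test for the macro at compile time. A minimal sketch (not
anything currently in DJGPP's headers):

    /* Refuse to build when the implementation does not claim
       conformance to annex F (IEC 60559 arithmetic). */
    #include <stdio.h>

    #ifndef __STDC_IEC_559__
    #error "IEC 60559 (annex F) floating-point arithmetic required"
    #endif

    int main(void)
    {
        puts("__STDC_IEC_559__ defined: annex F conformance claimed");
        return 0;
    }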
I have read Esa's posting, and can also see his point of view; I agree
that standards are important, and would tend to adhere to a standard
rather than personal preference. However, I can't even reconcile IEC
60559 with the rest of the C99 standard, in particular, with the
following clause:
    7.12.1 Treatment of error conditions

    1  The behavior of each of the functions in <math.h> is specified
       for all representable values of its input arguments, except
       where stated otherwise.  Each function shall execute as if it
       were a single operation without generating any externally
       visible exceptional conditions.

    2  For all functions, a domain error occurs if an input argument
       is outside the domain over which the mathematical function is
       defined.
Since, for example, inf/inf is outside the domain over which the arc
tangent function is defined, it would seem that a domain error should
occur for atan2(inf, inf) -- yet annex F specifies that atan2(+inf,
+inf) returns pi/4.
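To make the conflict concrete, here is a small test (my reading of the
two clauses, not normative): annex F says atan2(+inf, +inf) returns
pi/4 without raising an exception, while a strict reading of 7.12.1
would instead call for a domain error (errno set to EDOM and/or
FE_INVALID raised).

    #include <stdio.h>
    #include <math.h>
    #include <errno.h>
    #include <fenv.h>

    int main(void)
    {
        errno = 0;
        feclearexcept(FE_ALL_EXCEPT);

        double r = atan2(INFINITY, INFINITY);   /* pi/4 under annex F */

        printf("atan2(+inf, +inf) = %g\n", r);
        printf("errno == EDOM:     %s\n", errno == EDOM ? "yes" : "no");
        printf("FE_INVALID raised: %s\n",
               fetestexcept(FE_INVALID) ? "yes" : "no");
        return 0;
    }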
>>on the Pentium 4, the coprocessor handles NaNs, etc. through an
>>exception mechanism that is as much as 300 times slower than normal
>>execution.
>>
>>
>Do you have details about this? Like more precise timings, the
>reasons why it is so slow, etc.?
>
I doubt that the timings are completely consistent, but I modified an
FFT test harness to take the 2-D FFT of an array of NaNs, and got the
following for a 1024-by-1024 array on a 550-MHz Pentium 3: Valid floats,
30 ms; NaNs, 1.06 s -- a ratio of 35. On a 1.7-GHz Pentium 4 there was
even more of a difference: valid floats, 11 ms; NaNs, 2.6 s -- a ratio
of over 200.
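For anyone who wants to reproduce the effect, here is a minimal sketch
of the kind of measurement involved -- a plain multiply-add loop
rather than my FFT harness, with an arbitrary array size and clock()
for timing:

    #include <stdio.h>
    #include <math.h>
    #include <time.h>

    #define N (1 << 20)

    static double a[N];

    /* Fill the array, then time a multiply-add sweep over it. */
    static double time_sweep(double fill)
    {
        for (int i = 0; i < N; i++)
            a[i] = fill;

        clock_t t0 = clock();
        double s = 0.0;
        for (int i = 0; i < N; i++)
            s = s * 0.999 + a[i];
        clock_t t1 = clock();

        volatile double sink = s;   /* keep the loop from being optimized away */
        (void)sink;
        return (double)(t1 - t0) / CLOCKS_PER_SEC;
    }

    int main(void)
    {
        printf("valid floats: %.3f s\n", time_sweep(1.0));
        printf("NaNs:         %.3f s\n", time_sweep(NAN));
        return 0;
    }

The NaN pass is the one that presumably takes the coprocessor's slow
path.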
My guess is that Intel wisely decided to make the routine floating-point
computations more efficient, at the expense of unusual computations. I
don't know how much DJGPP is involved in handling these exceptions, but
the substantial differences between the Pentium 3 and Pentium 4 would
suggest that the hardware is causing most of these delays.
-Eric Rudd