Date: Sun, 13 Jun 1999 12:04:19 -0500 From: Eric Rudd Subject: Re: libm sources from cyberoptics To: Eli Zaretskii Cc: djgpp-workers AT delorie DOT com Message-id: <3763E492.DE5D8281@cyberoptics.com> Organization: CyberOptics MIME-version: 1.0 X-Mailer: Mozilla 4.05 [en] (Win95; U) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7bit References: Reply-To: djgpp-workers AT delorie DOT com Eli Zaretskii wrote: > 1) About asin and acos returning NaN: > > This difference is the reason for the NaN, because instead of 1-x on > the stack top after FSUBRP, like what you wanted, I get x-1. > > I'm not sure whether this is a bug in Gas or something else. One > thing that confuses me is that I don't understand how your code was > supposed to work. In many of the routines, I have comments that indicate the current contents of the coprocessor stack, in the order /* st(0) st(1) st(2), etc. */ > As far as I know, "fsubrp %st(1), %st" means > "compute st - st(1), store the result in st, then pop the stack". But > if this is true, then the result is gone after the stack is popped, > no? Am I missing something here? The fsubr, fsubrp, fdivr, and fdivrp instructions have been a source of exasperation for me. Here's the scoop: The entire problem, as far as I can tell, arose because of some bungled Intel docs. In Intel's notation, the destination operand is first, the source second. Thus in their manuals, fdiv st(0), st(1) is supposed to divide st(0) by st(1), and leave the result in st(0). Sometimes one wants the reciprocal of this result, so Intel provided the fdivr instruction. For instance, fdivr st(0), st(1) would replace st(0) with st(1)/st(0). Now, it is also possible to make a register other than the stack top the destination register. Thus, fdiv st(1), st(0) was supposed to replace st(1) with st(1)/st(0), and fdivr st(1), st(0) was supposed to replace st(1) with st(0)/st(1). Unfortunately, some Intel docs got this wrong. In some manuals, the descriptions themselves are completely consistent, but the processor does something else. In some later versions, the manuals are not even consistent in themselves as to what is described. All along, the processor itself has used the correct operands and placed the results in the correct places. By correct, I mean that the processor followed a consistent scheme for which bits in the opcode specified the source operand and which bits specified the destination operand, and the rule that the destination operand was operated on by the source operand, and that the corresponding opcodes described with "R" mnemonics negated or reciprocated the result. I first uncovered this problem many years ago on a 387, which was coming up with wrong answers, and I eventually found that these instructions simply were not doing what Intel claimed. Borland discovered this problem too, and made both TASM and its documentation consistent with the actual operation of the coprocessor, which I think was the right thing to do. Unfortunately, someone decided to change the gas mnemonics so as to coincide with certain incorrect Intel docs instead. This has been very confusing, and I groan whenever I need to use those instructions, because I always have to disassemble them, to see what gas has actually done. What's worse, gas has changed over the years, in an attempt to "fix" these problems. I view this as a grave mistake, since it breaks existing code. > Anyway, even if this is because of a bug in Gas 2.7, it would be nice > if the code could assemble with both versions, if that is possible > without paying too much. In view of the problems you have had, I think the only feasible solution is to emit the correct opcode bytes directly, thus bypassing the assembler altogether. This code would stand fast amidst the raging torrent of Intel and gas bugs. I hate to do this, but I know perfectly well what opcodes will do the job best, and it is an intolerable situation if I have to refrain from using them. > 2) exp2 and exp10: I checked again, and they indeed set errno for the > last two Elefunt tests. The original test program didn't do that > because I failed to see that your source defines __pow2, not pow2, I need to put in proper stubs for these. DJ and I had a discussion about this last year; I still don't understand exactly what's going on, but I know how to make stubs like the others. > and the one-pass operation of the linker got me by linking in the > old version from stock libc.a. I hope at least you're using -fno-builtin; otherwise some of my routines (such as sqrt, sin, and cos) don't get called at all! > Provided we find a solution to asin and acos, this leaves the > following issues: I'm still thinking about these issues; here are my current inclinations, based on recent discussions on djgpp-workers (I'm still open to discussion): > - should denormals raise ERANGE? Yes -- a programmer should be able to tell when accuracy has been compromised by underflow. > - should sqrt(-0.0) and log(-0.0) return the same as for positive > zero? Yes, because -0. compares equal to +0. Thus, there's no portable way for a calling program to distinguish them. > - should ceil, fabs, floor, and frexp set errno for NaN arguments? The current feeling in C9x is that NaN arguments should be silently passed on, but they are indeed outside the mathematical domain of definition, so I'm still leaning towards raising EDOM in those cases, even though it slows down the routines slightly. > - storing old control word on the stack as opposed to static data, > in those functions that modify the control word; If these functions get called from ISRs, it would be nice for them to be re-entrant, so I'll fix them. There was much discussion in djgpp-workers about pow(0.,0.). I think I will probably change it to return 1., but indicate a domain error. There's also the issue of whether the control word needs to be maintained in round-to-nearest. I'll look at the 387 manual, to see if we need to do that. It would be nice if we didn't, especially after seeing how much it cluttered the docs you wrote. In general, it's a bad idea for an app to mess with the control word and not put it back. For instance, another change to the control word would cause all results to be rounded to 24 bits, which would mess up a lot of things (including the inline "double" arithmetic that gcc issues). > I did whatever is necessary for the docs part. I send it for your > review in a separate message. Thanks. I've looked over them briefly, but will not give them a final examination until we have cleared up issues with exception cases, control word, etc. > - hypot doesn't set errno if the result overflows a double (e.g., if > both arguments are DBL_MAX). Is this intentional? If not, > perhaps the final value should be checked for being an Inf, and > errno set to ERANGE. The same applies to ldexp. These are minor warts I had overlooked. I'll fix them. > - log2 doesn't set errno to ERANGE for zero arguments, although log, > log10 and log1p all do. I think log2 should behave like the other > log's. Yes, of course it should. I don't know why log2 was different. I'll fix it. > - shouldn't pow set errno to ERANGE when the second argument is Inf > and the result is an Inf? My thinking was that ERANGE should signal a loss of significance, due to extreme range. Thus, pow(2., 1.E6) would signal ERANGE, since the exact result is finite, but couldn't be represented, but pow(2., +inf), for instance, is exactly infinity. Since infinity exists in the IEEE coding, this doesn't represent a loss of significance. > If we come to an agreement quickly enough, we might be in time for > v2.03 (which is all but ready, except for some last-minute > administrative changes in the build process that DJ asked me to do). -Eric Rudd