Mail Archives: cygwin/2009/11/17/09:18:26

Message-ID: <4B02B41A.5010806@gmail.com>
Date: Tue, 17 Nov 2009 14:32:58 +0000
From: Dave Korn <dave DOT korn DOT cygwin AT googlemail DOT com>
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: gcc -ffast-math defect with tan(x)
References: <AE5F3949C74C6547AEAEAB87D1AC7F1B057788A6 AT cos-us-mb01 DOT cos DOT agilent DOT com> <4B028ED0 DOT 5070600 AT gmail DOT com> <loom DOT 20091117T143159-735 AT post DOT gmane DOT org>
In-Reply-To: <loom.20091117T143159-735@post.gmane.org>

Eric Backus wrote:

> One experiment that I did, which confused me more than anything else, is 
> replace the calls to tan() with calls to log() (and change all the 0.0 values 
> to something OK for log() like 1.0).  The generated assembly code appears to 
> be identical except that _f_tan is replaced by _f_log, but the program works 
> correctly.  That would mean that the generated assembly code is correct, and 
> the defect is in _f_tan?

  Something I don't understand is going on in the x87 fpu.  Your STC again:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double d1 = 0.0;
    double d2 = 0.0;
    d1 = tan(d1);
    d2 = tan(d2);
    (void) printf("d1 = %lg, expecting 0 (or -0)\n", d1);
    (void) printf("d2 = %lg, expecting 0 (or -0)\n", d2);
    return 0;
}

  In this code, _f_tan is called twice, with a value of zero each time.  But
it behaves differently the second time.  The _f_tan code in assembly looks like:

> (gdb) disass 0x610ea500
> Dump of assembler code for function _f_tan:
> 0x610ea500 <_f_tan+0>:  push   %ebp
> 0x610ea501 <_f_tan+1>:  mov    %esp,%ebp
> 0x610ea503 <_f_tan+3>:  fldl   0x8(%ebp)
> 0x610ea506 <_f_tan+6>:  fptan
> 0x610ea508 <_f_tan+8>:  fincstp
> 0x610ea50a <_f_tan+10>: leave
> 0x610ea50b <_f_tan+11>: ret
> End of assembler dump.
> (gdb)

  Here's the first run through:

> Breakpoint 2, 0x004011e8 in _f_tan ()
> _f_tan () at /gnu/winsup/src/newlib/libm/machine/i386/f_tan.S:28
> 28		pushl ebp
> Current language:  auto; currently asm
> 29		movl esp,ebp
> 30		fldl 8(ebp)

This is the fp state just before executing the fldl above:

>   R7: Empty   0x00000000000000000000
>   R6: Empty   0x00016106e7800022c548
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
> =>R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff0000                                            
>                        TOP: 0
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffffff
> Instruction Pointer: 0x00:0x00000000
> Operand Pointer:     0xffff0000:0x00000000
> Opcode:              0x0000
> 31		fptan

This is the fp state just before executing the fptan.  Zero has been loaded
into r7 which is the current top-of-stack a.k.a. st(0):

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Empty   0x00016106e7800022c548
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff7fff
> Instruction Pointer: 0x1b:0x0040a8d2
> Operand Pointer:     0xffff0023:0x0175ffa0
> Opcode:              0xdf7d
> 32		fincstp

  This is the fp state immediately after the fptan and before the fincstp.  It
has replaced st(0) in r7 with zero (= tan 0.0) and pushed a constant +1 into
r6, the new top-of-stack (as is the documented behaviour of fptan):

>   R7: Zero    0x00000000000000000000 +0                         
> =>R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3000                                            
>                        TOP: 6
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x1b:0x00a50c88
> Operand Pointer:     0xffff0023:0x03f6f5c8
> Opcode:              0xddd8
> 34		leave

  FP stack pointer has been incremented and we return the result in ST(0):

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x00:0x00000000
> Operand Pointer:     0xffff0000:0x00000000
> Opcode:              0x0000
> Undefined command: "".  Try "help".

  So far so good.  Then it comes to the second execution:


> Breakpoint 2, 0x004011e8 in _f_tan ()
> _f_tan () at /gnu/winsup/src/newlib/libm/machine/i386/f_tan.S:28
> 28		pushl ebp
> 29		movl esp,ebp
> 30		fldl 8(ebp)

  State before the fldl, as before.

>   R7: Empty   0x00000000000000000000
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
> =>R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff0000                                            
>                        TOP: 0
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffcfff
> Instruction Pointer: 0x1b:0x00483622
> Operand Pointer:     0xffff0023:0x0012e7d8
> Opcode:              0xdd1c
> 31		fptan

  State after fldl, before fptan: r7 correctly loaded with zero, as before.

> =>R7: Zero    0x00000000000000000000 +0                         
>   R6: Valid   0x3fff8000000000000000 +1                         
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3800                                            
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffff4fff
> Instruction Pointer: 0x1b:0x00a50c88
> Operand Pointer:     0xffff0023:0x03f6f5c8
> Opcode:              0xddd8
> 32		fincstp

  WTF?  The fptan has returned two QNaNs for no apparent reason?

>   R7: Special 0xffffc000000000000000 Real Indefinite (QNaN)
> =>R6: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3241   IE                       SF      C1      
>                        TOP: 6
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffafff
> Instruction Pointer: 0x1b:0x007c74c7
> Operand Pointer:     0xffff0023:0x03e6fd5c
> Opcode:              0xddd8
> 34		leave

  ... and we increment the stack pointer and return the top one as the
function's result.

> =>R7: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R6: Special 0xffffc000000000000000 Real Indefinite (QNaN)
>   R5: Empty   0xd8021027c0001027bff8
>   R4: Empty   0xd8d0611dae800022bf20
>   R3: Empty   0x0c656120a86000000000
>   R2: Empty   0x001e0000007700000042
>   R1: Empty   0xc52c0022c530001a75de
>   R0: Empty   0x170800ce00cc507c0000
> 
> Status Word:         0xffff3841   IE                       SF              
>                        TOP: 7
> Control Word:        0xffff037f   IM DM ZM OM UM PM
>                        PC: Extended Precision (64-bits)
>                        RC: Round to nearest
> Tag Word:            0xffffafff
> Instruction Pointer: 0x1b:0x004042b1
> Operand Pointer:     0xffff0023:0x01490518
> Opcode:              0xd95f
> Continuing.
> 
> Program exited normally.

  Hmm.  With IE and SF both set, C1 = 1 indicates a stack overflow (C1 = 0
would mean underflow), and I think that happens because the r6 slot is still
tagged valid rather than empty the second time round, so fptan has nowhere to
push its +1.0; maybe _f_tan needs to be 'popping' (or in some way marking
empty) that unused +1.0 constant rather than just skipping the stack pointer
over it.  I'll see if that makes a difference; I'm not an x87 specialist, I
only know just enough to get by.  I see that the fincstp documentation does
warn that "this operation is not equivalent to popping the stack", so I may be
on the right track.

    cheers,
      DaveK



