delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin/2005/04/28/07:27:38

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
From: "Dave Korn" <dave DOT korn AT artimi DOT com>
To: "'Jean-Christophe Kablitz'" <jckablitz AT infonie DOT fr>, <cygwin AT cygwin DOT com>
Cc: <newlib AT sources DOT redhat DOT com>
Subject: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]]
Date: Thu, 28 Apr 2005 12:25:57 +0100
MIME-Version: 1.0
In-Reply-To: <004a01c54ab6$c37b2ce0$66779b53@maison>
Message-ID: <SERRANOIqtQvAauNjVu0000015f@SERRANO.CAM.ARTIMI.COM>
X-OriginalArrivalTime: 28 Apr 2005 11:26:13.0308 (UTC) FILETIME=[15AB77C0:01C54BE5]
Note-from-DJ: This may be spam

------=_NextPart_000_002C_01C54BED.6E31FFB0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

----Original Message----
>From: Jean-Christophe Kablitz
>Sent: 27 April 2005 00:22

> Hello,
> 
> I have noticed, that, while parsing {a float_value immediately followed by
> 'n' or 'N'} with the "%f%c" format, the sscanf function of cygwin-1.5.16-1
> behaves differently from the scanf function of cygwin-1.5.14-1.
> Until cygwin-1.5.14-1 (included), 'n' matches %c, while with
cygwin-1.5.15-1 
> and cygwin-1.5-16-1, 'n' is no more assigned to %c.
> 
> In the following test case, I would expect the progran to output
> i=2 x=1 m=a
> i=2 x=1 m=n
> 
> that was the case until cygwin-1.5.14-1 (included).
> 
> With cygwin-1.5.15-1 and cygwin-1.5-16-1, the program outputs instead
> i=2 x=1 m=a
> i=1 x=1 m=_
> 
> Maybe I have been misusing sscanf. Or there is a relationship with the
> NaN-parsing problem of the "newlib".

  No, your use of sscanf is perfectly correct!  Yes, there is a newly
exposed bug in the NaN parsing code, as you guessed; it falsely accepted the
N as part of 'NaN'.  Then, because it had begun by parsing a number, and
because it successfully parsed the number, it didn't go through the
'nan-parsing-has-failed-so-put-back-the-eaten-chars' bit that my last fix
introduced.

> --- beginning of test case ---
> jck:/sscanf> cat ssn.c
> #include <stdio.h>
> 
> int main()
> {
>     double x;
>     char   m;
>     int    i;
> 
>     x = 0.0;
>     m = '_';
>     i = sscanf("1.0a", "%lf%c", &x, &m);
>     printf("i=%d x=%g m=%c\n", i, x, m);
>     x = 0.0;
>     m = '_';
>     i = sscanf("1.0n", "%lf%c", &x, &m);
>     printf("i=%d x=%g m=%c\n", i, x, m);
>     return 0;
> }
> 
> jck:/sscanf> gcc -O0 ssn.c -o ssn.exe
> jck:/sscanf> ./ssn.exe
> i=2 x=1 m=a
> i=1 x=1 m=_
> --- end of test case ---

  Thank you for the simple test case; I was able to reproduce the problem
easily, although not exactly: the output I got was:

dk AT mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=0 x=0 m=_

  It turns out there has been an underlying bug that was exposed with my
earlier fix.  The problem is in /src/newlib/libc/stdio/vfscanf.c, function
__SVFSCANF_R, case CT_FLOAT, where it's parsing a float and sees an 'n':

		case 'n':
		case 'N':
	          if (nancount == 0
		      && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK)))
		    {
		      flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS);
		      nancount = 1;
		      goto fok;
		    }
		  else if (nancount == 2)
		    {
		      nancount = 3;
		      goto fok;
		    }
		  break;

  The condition at the top of the loop is meant to be testing to ensure we
haven't already parsed any of the other possible components of an FP number,
but what it actually tests is whether or not we've parsed *all* the other
possible components; that's the only case it'll refuse to accept an 'n' at
present.  The reason it used to work is because after bogusly parsing the
'n', the old version then hits this bit of code when it comes time to parse
the %c field (CT_CHAR):

	case CT_CHAR:
	  /* scan arbitrary characters (sets NOSKIP) */
	  if (width == 0)
	    width = 1;

  I don't understand what this is doing, but it looks like some kind of
kludge that's saying "If we got here, then we know there must have been a
char to parse, so if we don't have any, we must have bogusly consumed it
already, so pretend it's there anyway".  Or something; like I say, I don't
understand it, but it looks like a kludge to me.

  Anyway, the attached patch changes the bitwise-AND (&) to an equality (==)
operator, which genuinely tests that we haven't parsed anything else at all;
it's effectively verifying that the flags haven't changed from their initial
value before beginning to attempt to parse the possible 'NaN' string.  This
fixes the testcase for me: I now see

dk AT mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a
i=2 x=1 m=n

and indeed, with an expanded version of it, which also verifies the amount
of characters consumed, I see:

dk AT mace /test/sscanf> cat ssn.c
#include <stdio.h>
int main()
{
    double x;
    char   m;
    int    i, n;

    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0a", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0n", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0na", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0nan", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0e", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    x = 0.0;
    m = '_';
    n = -1;
    i = sscanf("1.0f", "%lf%c%n", &x, &m, &n);
    printf("i=%d x=%g m=%c n=%d\n", i, x, m, n);
    return 0;
}

dk AT mace /test/sscanf> gcc -ggdb -O0 ssn.c -o ssn
dk AT mace /test/sscanf> ./ssn.exe
i=2 x=1 m=a n=4
i=2 x=1 m=n n=4
i=2 x=1 m=n n=4
i=2 x=1 m=n n=4
i=2 x=1 m=e n=4
i=2 x=1 m=f n=4

so I reckon it's all good now.  Jean-Cristophe, you might want to try this
patch locally, or you can wait for it to be incorporated into newlib, at
which point it will start to show up in cygwin developer snapshots, or you
could just wait for the next cygwin dll release.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

------=_NextPart_000_002C_01C54BED.6E31FFB0
Content-Type: application/octet-stream;
	name="exposed-nan-bug.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="exposed-nan-bug.diff"

Index: newlib/libc/stdio/vfscanf.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/vfscanf.c,v
retrieving revision 1.29
diff -p -u -w -B -b -r1.29 vfscanf.c
--- newlib/libc/stdio/vfscanf.c	5 Apr 2005 18:38:09 -0000	1.29
+++ newlib/libc/stdio/vfscanf.c	28 Apr 2005 11:09:38 -0000
@@ -1050,7 +1050,7 @@ _DEFUN(__SVFSCANF_R, (rptr, fp, fmt0, ap
 		case 'n':
 		case 'N':
 	          if (nancount == 0
-		      && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK)))
+		      && (flags == (SIGNOK | NDIGITS | DPTOK | EXPOK)))
 		    {
 		      flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS);
 		      nancount = 1;


------=_NextPart_000_002C_01C54BED.6E31FFB0
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
------=_NextPart_000_002C_01C54BED.6E31FFB0--

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019