Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com From: "Dave Korn" To: "'Jean-Christophe Kablitz'" , Cc: Subject: [PATCH] Fix newly exposed bug [was RE: RFC: Fix partial NaN-parsing problem [was RE: sscanf problem]] Date: Thu, 28 Apr 2005 12:25:57 +0100 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_002C_01C54BED.6E31FFB0" In-Reply-To: <004a01c54ab6$c37b2ce0$66779b53@maison> Message-ID: X-OriginalArrivalTime: 28 Apr 2005 11:26:13.0308 (UTC) FILETIME=[15AB77C0:01C54BE5] Note-from-DJ: This may be spam ------=_NextPart_000_002C_01C54BED.6E31FFB0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit ----Original Message---- >From: Jean-Christophe Kablitz >Sent: 27 April 2005 00:22 > Hello, > > I have noticed, that, while parsing {a float_value immediately followed by > 'n' or 'N'} with the "%f%c" format, the sscanf function of cygwin-1.5.16-1 > behaves differently from the scanf function of cygwin-1.5.14-1. > Until cygwin-1.5.14-1 (included), 'n' matches %c, while with cygwin-1.5.15-1 > and cygwin-1.5-16-1, 'n' is no more assigned to %c. > > In the following test case, I would expect the progran to output > i=2 x=1 m=a > i=2 x=1 m=n > > that was the case until cygwin-1.5.14-1 (included). > > With cygwin-1.5.15-1 and cygwin-1.5-16-1, the program outputs instead > i=2 x=1 m=a > i=1 x=1 m=_ > > Maybe I have been misusing sscanf. Or there is a relationship with the > NaN-parsing problem of the "newlib". No, your use of sscanf is perfectly correct! Yes, there is a newly exposed bug in the NaN parsing code, as you guessed; it falsely accepted the N as part of 'NaN'. Then, because it had begun by parsing a number, and because it successfully parsed the number, it didn't go through the 'nan-parsing-has-failed-so-put-back-the-eaten-chars' bit that my last fix introduced. > --- beginning of test case --- > jck:/sscanf> cat ssn.c > #include > > int main() > { > double x; > char m; > int i; > > x = 0.0; > m = '_'; > i = sscanf("1.0a", "%lf%c", &x, &m); > printf("i=%d x=%g m=%c\n", i, x, m); > x = 0.0; > m = '_'; > i = sscanf("1.0n", "%lf%c", &x, &m); > printf("i=%d x=%g m=%c\n", i, x, m); > return 0; > } > > jck:/sscanf> gcc -O0 ssn.c -o ssn.exe > jck:/sscanf> ./ssn.exe > i=2 x=1 m=a > i=1 x=1 m=_ > --- end of test case --- Thank you for the simple test case; I was able to reproduce the problem easily, although not exactly: the output I got was: dk AT mace /test/sscanf> ./ssn.exe i=2 x=1 m=a i=0 x=0 m=_ It turns out there has been an underlying bug that was exposed with my earlier fix. The problem is in /src/newlib/libc/stdio/vfscanf.c, function __SVFSCANF_R, case CT_FLOAT, where it's parsing a float and sees an 'n': case 'n': case 'N': if (nancount == 0 && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK))) { flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS); nancount = 1; goto fok; } else if (nancount == 2) { nancount = 3; goto fok; } break; The condition at the top of the loop is meant to be testing to ensure we haven't already parsed any of the other possible components of an FP number, but what it actually tests is whether or not we've parsed *all* the other possible components; that's the only case it'll refuse to accept an 'n' at present. The reason it used to work is because after bogusly parsing the 'n', the old version then hits this bit of code when it comes time to parse the %c field (CT_CHAR): case CT_CHAR: /* scan arbitrary characters (sets NOSKIP) */ if (width == 0) width = 1; I don't understand what this is doing, but it looks like some kind of kludge that's saying "If we got here, then we know there must have been a char to parse, so if we don't have any, we must have bogusly consumed it already, so pretend it's there anyway". Or something; like I say, I don't understand it, but it looks like a kludge to me. Anyway, the attached patch changes the bitwise-AND (&) to an equality (==) operator, which genuinely tests that we haven't parsed anything else at all; it's effectively verifying that the flags haven't changed from their initial value before beginning to attempt to parse the possible 'NaN' string. This fixes the testcase for me: I now see dk AT mace /test/sscanf> ./ssn.exe i=2 x=1 m=a i=2 x=1 m=n and indeed, with an expanded version of it, which also verifies the amount of characters consumed, I see: dk AT mace /test/sscanf> cat ssn.c #include int main() { double x; char m; int i, n; x = 0.0; m = '_'; n = -1; i = sscanf("1.0a", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); x = 0.0; m = '_'; n = -1; i = sscanf("1.0n", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); x = 0.0; m = '_'; n = -1; i = sscanf("1.0na", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); x = 0.0; m = '_'; n = -1; i = sscanf("1.0nan", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); x = 0.0; m = '_'; n = -1; i = sscanf("1.0e", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); x = 0.0; m = '_'; n = -1; i = sscanf("1.0f", "%lf%c%n", &x, &m, &n); printf("i=%d x=%g m=%c n=%d\n", i, x, m, n); return 0; } dk AT mace /test/sscanf> gcc -ggdb -O0 ssn.c -o ssn dk AT mace /test/sscanf> ./ssn.exe i=2 x=1 m=a n=4 i=2 x=1 m=n n=4 i=2 x=1 m=n n=4 i=2 x=1 m=n n=4 i=2 x=1 m=e n=4 i=2 x=1 m=f n=4 so I reckon it's all good now. Jean-Cristophe, you might want to try this patch locally, or you can wait for it to be incorporated into newlib, at which point it will start to show up in cygwin developer snapshots, or you could just wait for the next cygwin dll release. cheers, DaveK -- Can't think of a witty .sigline today.... ------=_NextPart_000_002C_01C54BED.6E31FFB0 Content-Type: application/octet-stream; name="exposed-nan-bug.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="exposed-nan-bug.diff" Index: newlib/libc/stdio/vfscanf.c =================================================================== RCS file: /cvs/src/src/newlib/libc/stdio/vfscanf.c,v retrieving revision 1.29 diff -p -u -w -B -b -r1.29 vfscanf.c --- newlib/libc/stdio/vfscanf.c 5 Apr 2005 18:38:09 -0000 1.29 +++ newlib/libc/stdio/vfscanf.c 28 Apr 2005 11:09:38 -0000 @@ -1050,7 +1050,7 @@ _DEFUN(__SVFSCANF_R, (rptr, fp, fmt0, ap case 'n': case 'N': if (nancount == 0 - && (flags & (SIGNOK | NDIGITS | DPTOK | EXPOK))) + && (flags == (SIGNOK | NDIGITS | DPTOK | EXPOK))) { flags &= ~(SIGNOK | DPTOK | EXPOK | NDIGITS); nancount = 1; ------=_NextPart_000_002C_01C54BED.6E31FFB0 Content-Type: text/plain; charset=us-ascii -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/ ------=_NextPart_000_002C_01C54BED.6E31FFB0--