Mail Archives: cygwin/2009/04/22/06:31:05

Message-ID: <49EEF44D.1010908@gmail.com>
Date: Wed, 22 Apr 2009 11:41:17 +0100
From: Dave Korn <dave DOT korn DOT cygwin AT googlemail DOT com>
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: [1.7] Updated: cygwin-1.7.0-45
References: <20090331111757 DOT GA22043 AT calimero DOT vinschen DOT de> <200904031037 DOT n33Ab4Ma001073 AT mail DOT bln1 DOT bf DOT nsn-intra DOT net> <20090403145139 DOT GJ12738 AT calimero DOT vinschen DOT de> <200904211025 DOT n3LAPf7a022955 AT mail DOT bln1 DOT bf DOT nsn-intra DOT net> <20090421152334 DOT GH8722 AT calimero DOT vinschen DOT de> <20090421161337 DOT GG18867 AT trikaliotis DOT net> <20090421165642 DOT GK8722 AT calimero DOT vinschen DOT de> <20090421175436 DOT GA18266 AT calimero DOT vinschen DOT de> <49EE5D4D DOT 8030906 AT gmail DOT com> <49EE9A96 DOT 6040900 AT byu DOT net> <20090422083145 DOT GQ8722 AT calimero DOT vinschen DOT de>
In-Reply-To: <20090422083145.GQ8722@calimero.vinschen.de>

Corinna Vinschen wrote:
> On Apr 21 22:18, Eric Blake wrote:
>> The bug was that isblank(-1) was blindly treated as if it were equivalent
>> to isblank(0xff), which, in some locales, is flat out wrong
>> (isblank(EOF) should always be 0, even when isblank(0xff) is well-defined
>> as 1).  Broken apps can't tell the difference between isblank((char)0xff)
>> and isblank(EOF), but correct apps, like sed, CAN tell the difference
>> between 0xff and EOF in "int ch = getchar(); isblank(ch)" since getchar()
>> returns an int containing an unsigned char value (and not a char).
>>
>> Sed's infinite loop, then, was because of newlib/cygwin's bug - sed
>> reached the end of the file while trying to skip blanks, but because
>> isblank() was returning the wrong value for -1, sed thought that EOF was a
>> blank and kept trying to read the file instead of breaking out of the loop.
> 
> Thanks for the explanation.  Apparently I'm unable to explain this
> clearly enough.

  When you referred to "broken applications" passing the wrong input to the
ctype function, I thought you meant SED by that.

  Rereading your letter with the isFOO examples, I guess I should have been
able to infer it from that.  It would have been a bit clearer (to me, at any
rate) if the answer to the question "How does SED cope with this on glibc
systems" had not been the quote from the header, which describes something
that newlib also does, but had instead said something like "Glibc makes sure
that entry -1 has the correct flag values for EOF rather than for
0xff-incorrectly-promoted; newlib resolves the clash in favour of using the
flag values for 0xff-incorrectly-promoted".
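
  To make the clash concrete, here is a rough sketch in C.  It is not the
newlib or glibc source: every toy_* name is made up for illustration, the
"locale" in which byte 0xff counts as a blank is hypothetical, and it assumes
EOF is -1 (as it is on both libraries).  toy_isblank_correct gives EOF its own
answer, toy_isblank_folded folds -1 onto 0xff, and toy_skip_blanks is a
stand-in for a skip-blanks loop of the kind described above, with a read cap
so the "infinite" case can be observed.

```c
#include <stdio.h>   /* EOF */
#include <stddef.h>  /* size_t */

#define TOY_BLANK 0x01

/* Hypothetical 8-bit locale: space, tab and byte 0xff are blanks. */
int toy_flags(int byte)
{
    return (byte == ' ' || byte == '\t' || byte == 0xff) ? TOY_BLANK : 0;
}

/* EOF has its own entry and is never a blank. */
int toy_isblank_correct(int c)
{
    if (c == EOF)
        return 0;
    return toy_flags((unsigned char)c) != 0;
}

/* The clash resolved the other way: -1 lands on the 0xff entry. */
int toy_isblank_folded(int c)
{
    return toy_flags((unsigned char)c) != 0;
}

/* Minimal in-memory reader; returns an unsigned char value or EOF. */
struct toy_reader {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

int toy_getc(struct toy_reader *r)
{
    return r->pos < r->len ? r->data[r->pos++] : EOF;
}

/*
 * Read until a non-blank is seen.  Returns the number of reads performed,
 * or -1 if max_reads is hit first (our stand-in for looping forever).
 */
long toy_skip_blanks(struct toy_reader *r, int (*is_blank)(int),
                     long max_reads)
{
    long reads = 0;
    for (;;) {
        if (reads >= max_reads)
            return -1;
        int c = toy_getc(r);
        reads++;
        if (!is_blank(c))
            return reads;
    }
}
```

  Run over an input that ends in blanks, the first predicate stops on the
read that returns EOF, while the second treats every EOF as one more blank
and only stops when the cap is reached.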

    cheers,
      DaveK

