From: lhall AT rfk DOT com (Larry Hall)
Subject: Re: strcasecmp revisited
Date: 30 Nov 1998 11:47:00 -0800
Message-ID: <3.0.5.32.19981130144243.00976e60.cygnus.cygwin32.developers@pop.ma.ultranet.com>
References: <19981130132257 DOT B15656 AT cygnus DOT com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
To: Christopher Faylor, cygwin32-developers AT cygnus DOT com

At 01:22 PM 11/30/98 -0500, Christopher Faylor wrote:
>Someone on the gnu-win32 mailing list noted that strcasecmp does not return
>"the right thing" when comparing "a" to "_". Instead of returning the
>difference between "_" and "a", it returns the difference between "_" and "A".
>
>I was just about to check in a fix for this behavior when it occurred to
>me that I should check the Single UNIX Specification to see how they say
>this should be handled. Here's what they say:
>
>    The strcasecmp() function compares, while ignoring differences in case,
>    the string pointed to by s1 to the string pointed to by s2. The
>    strncasecmp() function compares, while ignoring differences in case, not
>    more than n bytes from the string pointed to by s1 to the string pointed
>    to by s2.
>
>    In the POSIX locale, strcasecmp() and strncasecmp() do upper to lower
>    conversions, then a byte comparison. The results are unspecified in
>    other locales.
>
>The newlib strcasecmp does a toupper on the string and ignores locales
>but, except for that, it seems to be complying with the spirit of the
>above paragraphs.
>
>My change detected the case where a non-alpha was being compared to an
>alpha and avoided doing a toupper in that case. I'm wondering if this
>is the correct thing to do given the above description?
>
>Does anybody have any opinions?
>
>cgf
>

Here's mine. The excerpt you've given states that there is a specific way
to do the conversion for POSIX locales and an unspecified way to do it for
other locales.
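
To make the quoted wording concrete, here is a minimal sketch of the two folding directions being discussed. It assumes an ASCII character set and the C/POSIX locale, and it is not newlib's actual source. The names cmp_fold_lower and cmp_fold_upper are hypothetical, chosen only for this illustration.

```c
#include <ctype.h>

/* Hypothetical helper: fold each byte to lower case, then compare bytes.
   This follows the "upper to lower conversions, then a byte comparison"
   wording quoted above for the POSIX locale. */
int cmp_fold_lower(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Advance while the folded bytes match and s1 has not ended. */
    while (*p1 != '\0' && tolower(*p1) == tolower(*p2)) {
        p1++;
        p2++;
    }
    return tolower(*p1) - tolower(*p2);
}

/* Hypothetical helper: the same loop, but folding to upper case instead,
   which is the direction the message above attributes to newlib. */
int cmp_fold_upper(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    while (*p1 != '\0' && toupper(*p1) == toupper(*p2)) {
        p1++;
        p2++;
    }
    return toupper(*p1) - toupper(*p2);
}
```

In ASCII, "_" (0x5F) sits between "A" (0x41) and "a" (0x61). Under the assumptions above, cmp_fold_lower("a", "_") is therefore positive while cmp_fold_upper("a", "_") is negative, which is the discrepancy reported on the gnu-win32 list. The two functions agree whenever the differing bytes are both letters.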
If newlib ignores the locale, that means to me that it does the same kind
of comparison regardless of the locale. If so, in order to be compliant
with the statement above, it would need to do one of 2 things:

   1. Perform a POSIX locale-style comparison for all locales.

   2. Perform a POSIX locale-style comparison for POSIX locales and any
      other kind of comparison for any other locale.

If you still agree with my reasoning, it seems to me that what you've done
already fits the latter half of my statement (2) above. It doesn't address
the first half nor does it address statement (1). To me, it seems like (1)
is the preferable and easiest thing to do from an implementational
perspective, although filling out the implementation for (2) would leverage
what's already there and shouldn't be much harder.

Larry Hall                              lhall AT rfk DOT com
RFK Partners, Inc.                      (781) 239-1053
8 Grove Street                          (781) 239-1655 - FAX
Wellesley, MA  02482-7797               http://www.rfk.com