| Subject: | Re: Invalid tm_zone from localtime() when TZ is not set |
| To: | cygwin AT cygwin DOT com |
| From: | Hans-Bernhard Bröker <HBBroeker AT t-online DOT de> |
| Message-ID: | <2eddaaf6-4e37-cd9b-aa9d-8a87234d0cf9@t-online.de> |
| Date: | Wed, 25 May 2016 22:02:50 +0200 |
On 25.05.2016 at 10:44, Corinna Vinschen wrote:
> On May 25 11:28, KOBAYASHI Shinji wrote:
>>
>> Any other comments on this topic? Let me explain my proposal again.
>>
>> The intention of the following code in tzsetwall() should be to pick
>> up UPPERCASE letters "in ASCII range":

Are you sure you're not mixing ASCII with '8-bit character' range there?

>> if (isupper(*src)) *dst++ = *src;
>>
>> NOTE: src is wchar_t *, dst is char *.
>>
>> As Csaba Raduly pointed out, isw*() functions should be the first
>> choice if they achieve the desired behavior (select uppercase AND
>> ASCII).

But it doesn't, so it's not.

>> However, iswupper() does not fit for this purpose, as it
>> returns 1 for L'\uff21' for example. And I could not find isw*()
>
> In that case, wouldn't it make sense to fix iswupper in the first place?

I don't believe it's been shown to be broken, so there's no need to fix it.

> Apart from that, we can workaround all problems in tzsetwall by just
> checking for
>
> if (*src >= L'A' && *src <= L'Z')

While that may be possible if it really is ASCII you're looking for,
it's perverting the whole reason <ctype.h> and <wctype.h> exist: to make
tests like this as independent of the actual character encoding as possible.

Here's what I wrote last week, but apparently only to Csaba Raduly:

On 20.05.2016 at 09:09, Csaba Raduly wrote:
> If the type of those members is WCHAR[] then using isascii() /
> isupper() on them is just plain wrong.

Absolutely. The argument type of isupper() and friends is 'int', not
'unsigned char'. But the _only_ allowed argument values are those in
the range of unsigned char, plus EOF. For typical systems, that means
the allowed argument range of is*() is -1 ... 255 inclusive. Calling
these Standard Library functions with any other argument causes
undefined behaviour.

That leaves three sensible ways of calling isupper() in portable code:

*) isupper(foo)                 # where type of foo is unsigned char
*) isupper((unsigned char)bar)  # where bar is signed char, or plain char
*) isupper(baz)                 # where baz was obtained from fgetc() or similar

All other call patterns are plainly and simply wrong, or at least
non-portable. In particular, passing a wchar_t to any of the <ctype.h>
functions is wrong every time.

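To make the second call pattern concrete, here is a minimal sketch. The helper name count_upper is made up for illustration and is not part of any library; the only point is the cast to unsigned char before the isupper() call.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count the upper-case characters in a plain char string.
   count_upper is a hypothetical helper, not a library function. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Plain char may be signed, so convert to unsigned char first.
           A negative argument other than EOF would be undefined behaviour. */
        if (isupper((unsigned char)*s))
            ++n;
    }
    return n;
}
```

Called as count_upper("W. Europe Daylight Time"), this counts the W, E, D and T. Without the cast, a byte above 127 in the string could turn into a negative int on platforms where plain char is signed.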
> The correct function to use would be iswupper().

Actually, is*upper() isn't even the actual problem here. The whole
idea of copying a wchar_t string into a char one, element by element, is
most likely nonsensical. A wchar_t cannot be assumed to just fit into a
char, regardless of whether iswupper() returned true on it or not. E.g.
what do we expect this to do with an upper-case Greek or Cyrillic letter?

A proper solution may have to be more like this:

    int mapped = wctob(*src);

    /* this call is safe now because of how wctob() works: */
    if (isupper(mapped)) {
        *dst++ = (unsigned char)mapped;
    }

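For anyone who wants to try this outside of tzsetwall(), the fragment above could be wrapped into a complete function roughly like the following. This is only a sketch under my own assumptions: the name copy_upper_initials, the dstsize parameter and the NUL termination are invented for illustration and are not the Cygwin code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>

/* Copy those upper-case letters of a wide string that map to a single
   byte in the current locale. Returns the number of bytes copied.
   copy_upper_initials is a hypothetical name, not the Cygwin code. */
size_t copy_upper_initials(char *dst, size_t dstsize, const wchar_t *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && dstsize > 0);
    for (; *src != L'\0' && n + 1 < dstsize; ++src) {
        /* wctob() returns EOF unless *src is a single-byte character in
           the current locale; otherwise it returns that byte as an
           unsigned char value, so the isupper() call below is well defined. */
        int mapped = wctob((wint_t)*src);

        if (isupper(mapped))
            dst[n++] = (char)mapped;
    }
    dst[n] = '\0';
    return n;
}
```

In the default "C" locale, passing L"W. Europe Daylight Time" leaves "WEDT" in dst, and a fullwidth L'\uff21' is skipped because wctob() reports EOF for it. Which wide characters map to a single byte in other locales depends on the locale's encoding.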
>> So, I propose to call isascii() to assure the wchar_t fits in the
>> range of ASCII before calling isupper().

Calling isascii() would be wrong for the same reasons calling isupper() is.