Mail Archives: cygwin/2017/05/22/08:50:29

In-Reply-To: <20170521042352.GA4045@dimstar.local.net>
References: <CAOTD34aCROSAQojYvV4rjwiWOfiALFP+P2wODoMV1dcaOhKPFQ AT mail DOT gmail DOT com> <20170521042352 DOT GA4045 AT dimstar DOT local DOT net>
From: Erik Bray <erik DOT m DOT bray AT gmail DOT com>
Date: Mon, 22 May 2017 14:50:09 +0200
Message-ID: <CAOTD34buK86Xdd-XV5DGtfD_xyMYT++-O6y+gpg7YC4YNLXLsQ@mail.gmail.com>
Subject: Re: Bug? wcsxfrm causing memory corruption
To: cygwin AT cygwin DOT com
X-IsSubscribed: yes

On Sun, May 21, 2017 at 6:23 AM, Duncan Roe wrote:
> On Wed, May 10, 2017 at 11:30:46AM +0200, Erik Bray wrote:
>> Greetings--
>>
>> In the process of fixing the Python test suite on Cygwin I ran across
>> one test that was consistently causing segfaults later on, not
>> directly local to that test.  The test involves wcsxfrm so that's
>> where I focused my attention.
>>
>> The attached test demonstrates the bug.  Given an output buffer of N
>> wide characters, wcsxfrm will cause bytes beyond the destination size
>> to be reversed. I believe it might actually be a bug in the underlying
>> LCMapStringW workhorse (this is on Windows 10; have not tested other
>> versions).
>>
>> According to its docs [1], the cchDest argument (size of the
>> destination buffer) is treated as a *byte* count when using
>> LCMAP_SORTKEY.  However, for the purposes of applying the
>> LCMAP_BYTEREV transformation it seems to be treating the output size
>> (in bytes) as character count.  So in the example I give, where the
>> output sort key is 7 bytes (including the null terminator), it swaps
>> *14* bytes--the bytes including the sort key as well as the next 7
>> adjacent bytes.  This is obviously a problem if the destination buffer
>> is allocated out of some larger memory pool.
>>
>> This definitely has to be a bug, right?  Or at least very poorly
>> documented on MS's part.  A workaround would either be to not use
>> LCMAP_BYTEREV and just swap the bytes manually, or in a second call to
>> LCMapStringW with LCMAP_BYTEREV and the correct character count...
>>
>> Thanks,
>> Erik
>>
>>
>> [1] https://msdn.microsoft.com/en-us/library/windows/desktop/dd318700(v=vs.85).aspx
>
>> #include <stdint.h>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <locale.h>
>> #include <wchar.h>
>> #include <string.h>
>> #include <windows.h>
>>
>> #define SIZE 32
>>
>>
>> void fill_bytes(uint8_t *a, int n) {
>>     int idx;
>>     for (idx=0; idx<n; idx++) {
>>         a[idx] = idx;
>>     }
>> }
>>
>>
>> void print_bytes(uint8_t *a, int n) {
>>     int idx;
>>     for (idx=0; idx<n; idx++) {
>>         printf("0x%02x ", ((uint8_t*)a)[idx]);
>>         if ((idx + 1) % 8 == 0) printf("\n");
>>     }
>> }
>>
>> int main(void) {
>>     wchar_t *a, *b;
>>     uint8_t *aa;
>>     size_t ret;
>>     LCID collate_lcid;
>>     int idx;
>>     collate_lcid = 1033;
>>     b = L"b";
>>     a = (wchar_t*) malloc(SIZE);
>>     aa = (uint8_t*) a;
>>
>>     setlocale(LC_ALL, "en_US.UTF-8");
>>
>>     printf("using wcsxfrm:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>     ret = wcsxfrm(a, b, 4);
>>     printf("after (%d):\n", (int) ret);
>>     print_bytes(aa, SIZE);
>>
>>     printf("\nusing LCMapStringW directly:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>
>>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY | LCMAP_BYTEREV, b, -1, a, 8);
>>     printf("after (%d):\n", (int) ret);
>>     print_bytes(aa, SIZE);
>>
>>     printf("\nwithout LCMAP_BYTEREV:\n");
>>     fill_bytes(aa, SIZE);
>>     printf("before:\n");
>>     print_bytes(aa, SIZE);
>>
>>     ret = LCMapStringW(collate_lcid, LCMAP_SORTKEY, b, -1, a, 8);
>>     printf("after (%d):\n", (int) ret);
>>     print_bytes(aa, SIZE);
>>     free(a);
>>
>>     return 0;
>> }
>
> Hi Erik,
>
> I get
>
> using wcsxfrm:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (3):
> 0x09 0x0e 0x01 0x01 0x01 0x01 0x00 0x00
> 0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
>
> using LCMapStringW directly:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (7):
> 0x09 0x0e 0x01 0x01 0x01 0x01 0x07 0x00
> 0x09 0x08 0x0b 0x0a 0x0d 0x0c 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
>
> without LCMAP_BYTEREV:
> before:
> 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f
> after (7):
> 0x0e 0x09 0x01 0x01 0x01 0x01 0x00 0x07
> 0x08 0x09 0x0a 0x0b 0x0c 0x0d 0x0e 0x0f
> 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
> 0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f

Yes, that's the same.  Thanks for giving it a try--I should have
included example output in my original message.

You can see in the last case that without LCMAP_BYTEREV it writes the sequence

0x0e 0x09 0x01 0x01 0x01 0x01 0x00

with a terminating 0x00.  Bytes after that remain unchanged.  In the
other two examples *with* LCMAP_BYTEREV, the terminating 0x00 gets
swapped with the 0x07 after it, but this is documented and expected
behavior of LCMapStringW, and is already accounted for in Cygwin's
wcsxfrm.  What is undocumented, and unexpected, is that it also
byte-swaps 3 more byte pairs (offsets 8 through 13) after the actual
sort key, which can silently corrupt adjacent memory.
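
For reference, here is a minimal sketch of the "swap the bytes manually"
workaround.  It assumes the sort key is first generated with plain
LCMAP_SORTKEY and that the return value is the key length in bytes,
terminator included (7 in the output above).  swap_byte_pairs is a
hypothetical helper name, not something that exists in Cygwin's wcsxfrm
or in the Windows API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap adjacent bytes in place over the first
 * nbytes bytes of buf.  An odd nbytes is rounded up to the next even
 * number so that the last key byte is paired with its neighbour, which
 * matches the swap of the terminating 0x00 seen in the output above.
 * The caller must therefore own nbytes rounded up to an even count. */
static void swap_byte_pairs(uint8_t *buf, size_t nbytes)
{
    size_t limit = nbytes + (nbytes & 1);   /* round up to an even count */
    size_t i;

    assert(buf != NULL);

    for (i = 0; i + 1 < limit; i += 2) {
        uint8_t tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

Applied with nbytes = 7 to the "without LCMAP_BYTEREV" bytes, this touches
only offsets 0 through 7 and leaves everything from offset 8 onward alone.
Applied with nbytes = 14, it reproduces the corrupted second row from the
LCMAP_BYTEREV output, which is consistent with the byte count being
treated as a character count.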

Thanks,
Erik
