Date: Tue, 30 Sep 2025 00:17:23 -0600
Subject: Re: cygwin 3.6.4 breaks mbrtowc
From: Brian Inglis via Cygwin
To: cygwin AT cygwin DOT com
Reply-To: cygwin AT cygwin DOT com

On 2025-09-29 16:52, Mark Geisert via Cygwin wrote:
> Hello Thomas,
> Apologies for the late response to your report.
>
> On 7/21/2025 8:25 PM, Thomas Wolff via Cygwin wrote:
>> mbrtowc is broken in 3.6.4 which breaks non-BMP display in mintty.
>> Test case below.
>> Thomas
>>
>> #include <locale.h>
>> #include <stdio.h>
>> #include <wchar.h>
>>
>> void mb(unsigned char c)
>> {
>>    wchar_t wc;
>>    int ret = mbrtowc(&wc, &c, 1, 0);
>>    printf("%02X -> %04X : %d\n", c, wc, ret);
>> }
>>
>> void main ()
>> {
>>    setlocale (LC_CTYPE, "");
>>
>>    mb(0xF0);
>>    mb(0x9F);
>>    mb(0x98);
>>    mb(0x8E);
>> }
>
> Running your testcase gives different output between 3.6.4 and 3.7.0-dev-139 but
> I'm unsure the latter is correct.  Can you comment please?
>
> On 3.6.4:
> ~ ./a
> F0 -> 0000 : -2
> 9F -> 0000 : -2
> 98 -> 0000 : -2
> 8E -> D83D : 3
>
> On 3.7.0-dev-139:
> ~ ./a
> F0 -> 0000 : -2
> 9F -> 0000 : -2
> 98 -> D83D : 1
> 8E -> DE0E : 1

A code point converter I have, run under 3.6.4, agrees with the latter values,
but the mbrtowc() return values should possibly be 2, 1 rather than 1, 1;
otherwise you don't know when the character is complete:

$ utf8cp $'\xf0\x9f\x98\x8e' 😎
😎      U+01f60e        f0 9f 98 8e     d83d de0e
😎      U+01f60e        f0 9f 98 8e     d83d de0e

expanded:

f0 == 1111 0/000 => 4 bytes - 3 bits       0 00   -- 0
9f == 10/01 1111 => 3       - 6         01 1111   -- 1f
98 == 10/01 1000 => 2       - 6         0110 00   -- 60
        hi 10 - 1101 10/00 0011 1101 == d8 3d
8e == 10/00 1110 => 1       - 6         00 1110   -- 0e
        lo 10 - 1101 11/10 0000 1110 == de 0e

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte
non pas lorsqu'il n'y a plus rien à ajouter
mais lorsqu'il n'y a plus rien à retrancher

Perfection is achieved
not when there is no more to add
but when there is no more to cut
                                -- Antoine de Saint-Exupéry
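
As a cross-check of the bit expansion above, here is a minimal C sketch of the same arithmetic: strip the UTF-8 marker bits from the four bytes, join the payload bits into a code point, then split it into a UTF-16 surrogate pair. The function names utf8_decode4 and utf16_surrogates are made up for this illustration; they are not part of Cygwin, newlib, or utf8cp, and this is not how mbrtowc() itself is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): decode one 4-byte UTF-8
   sequence into a code point.  Returns 0 on success, -1 if the bytes
   are not a well-formed 4-byte sequence. */
int utf8_decode4(const unsigned char s[4], uint32_t *cp)
{
    /* Lead byte must be 11110xxx; each trailing byte must be 10xxxxxx. */
    if ((s[0] & 0xF8) != 0xF0)
        return -1;
    for (size_t i = 1; i < 4; i++)
        if ((s[i] & 0xC0) != 0x80)
            return -1;

    /* 3 payload bits from the lead byte, 6 from each trailing byte. */
    uint32_t v = (uint32_t)(s[0] & 0x07) << 18
               | (uint32_t)(s[1] & 0x3F) << 12
               | (uint32_t)(s[2] & 0x3F) << 6
               | (uint32_t)(s[3] & 0x3F);

    /* Reject overlong encodings and values beyond U+10FFFF. */
    if (v < 0x10000 || v > 0x10FFFF)
        return -1;

    *cp = v;
    return 0;
}

/* Hypothetical helper (illustration only): split a supplementary-plane
   code point (U+10000..U+10FFFF) into a UTF-16 surrogate pair. */
void utf16_surrogates(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
    uint32_t v = cp - 0x10000;               /* 20 significant bits remain */
    *hi = (uint16_t)(0xD800 | (v >> 10));    /* top 10 bits    */
    *lo = (uint16_t)(0xDC00 | (v & 0x3FF));  /* bottom 10 bits */
}
```

Called with the bytes f0 9f 98 8e, utf8_decode4 gives U+1F60E and utf16_surrogates gives d83d de0e, matching the converter output quoted above.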