DMARC-Filter: | OpenDMARC Filter v1.4.2 delorie.com 55SAJZSg1853475 |
Authentication-Results: | delorie.com; dmarc=pass (p=none dis=none) header.from=cygwin.com |
Authentication-Results: | delorie.com; spf=pass smtp.mailfrom=cygwin.com |
DKIM-Filter: | OpenDKIM Filter v2.11.0 delorie.com 55SAJZSg1853475 |
Authentication-Results: | delorie.com; |
dkim=pass (1024-bit key, unprotected) header.d=cygwin.com header.i=@cygwin.com header.a=rsa-sha256 header.s=default header.b=iXVGf74p | |
X-Recipient: | archive-cygwin AT delorie DOT com |
DKIM-Filter: | OpenDKIM Filter v2.11.0 sourceware.org BEDAD38560BC |
DKIM-Signature: | v=1; a=rsa-sha256; c=relaxed/relaxed; d=cygwin.com; |
s=default; t=1751105973; | |
bh=zVRjxM51AAWnHFu8ytJEQtQo7yoAc+4TtuU9ISwaEOs=; | |
h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: | |
List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: | |
From; | |
b=iXVGf74pO7gnJkssh4gezhStCNG9C4ANL4Uf48DCFqJNgxFNgqVYEUP8HdKwInUKm | |
lYDUWDTfnDH9wuCU4mQp4kYYOzmObuRfz3DSUNvXTkm5Yy/3R2tT82/jFMMQZqyrXA | |
CjnKUeQ5gkuRVQrdzuvoR0D93o2XibuphigkzgRQ= | |
X-Original-To: | cygwin AT cygwin DOT com |
Delivered-To: | cygwin AT cygwin DOT com |
DMARC-Filter: | OpenDMARC Filter v1.4.2 sourceware.org 75091385781B |
ARC-Filter: | OpenARC Filter v1.0.0 sourceware.org 75091385781B |
ARC-Seal: | i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1751105944; cv=none; |
b=g09xW7EvTMCMtT//rSOU43OBB7z2Fws0Z0cpplx+GsssVJ2SZf90AbfmSmoQA7wFlvFPkixqSvFkB36hugnujq1YbS/AJCwmV7wrilYIwBWN+TeDv8Wh19lBeeiWmOQo6kKQZ4Ih/02fMeSs4tlp/lLUQzZLw1eDU7eQz0j0+3I= | |
ARC-Message-Signature: | i=1; a=rsa-sha256; d=sourceware.org; s=key; |
t=1751105944; c=relaxed/simple; | |
bh=F1OK78Q7wFHXoll8kU8rdiRh6W4bOhDZRf7MngBwwOc=; | |
h=From:Subject:To:Message-ID:Date:MIME-Version; | |
b=g+k4Ioxblq0DpkWyqKSQir8ra7Rw4FF6Am2KBr3LXmwHBHYjw50mDI5Bop3gUVXQqoMWTXdY8eiijWIr1CXz1gMoqT/uSKX1MbrCFWYm3XQ2kmpJdJtJ0N79wOcbAew7KPmkrQ9mEX99TQyUtIirT7x4vuANAGWK6uAlIerbZCo= | |
ARC-Authentication-Results: | i=1; server2.sourceware.org |
DKIM-Filter: | OpenDKIM Filter v2.11.0 sourceware.org 75091385781B |
Subject: | Re: readdir() returns inaccessible name if file was created with |
invalid UTF-8 | |
To: | cygwin AT cygwin DOT com |
References: | <96f2253b-791b-b8a0-97dd-8d257eefb9b1 AT t-online DOT de> |
<03c4fae7-7322-572c-ae72-52e300f0b438 AT t-online DOT de> | |
<aFxRfI4NdZ8y5IlK AT calimero DOT vinschen DOT de> | |
<f78c615c-aefe-b3d0-aada-5f9d0cf73a0a AT t-online DOT de> | |
<aF5y15iQ840LxLYJ AT calimero DOT vinschen DOT de> | |
<3295c8bd-2c09-76c7-8b5f-0106dc39dd96 AT t-online DOT de> | |
<aF6x55WXIS1t655i AT calimero DOT vinschen DOT de> | |
Message-ID: | <5fae4fcc-6847-ab19-b487-3a28c76d96e4@t-online.de> |
Date: | Sat, 28 Jun 2025 12:18:57 +0200 |
User-Agent: | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101 |
SeaMonkey/2.53.20 | |
MIME-Version: | 1.0 |
In-Reply-To: | <aF6x55WXIS1t655i@calimero.vinschen.de> |
X-TOI-EXPURGATEID: | 150726::1751105940-057F8536-6F7683DD/0/0 CLEAN NORMAL |
X-TOI-MSGID: | 54bf753f-892e-4dbe-b9ad-14dbce175f03 |
X-BeenThere: | cygwin AT cygwin DOT com |
X-Mailman-Version: | 2.1.30 |
List-Id: | General Cygwin discussions and problem reports <cygwin.cygwin.com> |
List-Unsubscribe: | <https://cygwin.com/mailman/options/cygwin>, |
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe> | |
List-Archive: | <https://cygwin.com/pipermail/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-request AT cygwin DOT com?subject=help> |
List-Subscribe: | <https://cygwin.com/mailman/listinfo/cygwin>, |
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe> | |
From: | Christian Franke via Cygwin <cygwin AT cygwin DOT com> |
Reply-To: | cygwin AT cygwin DOT com |
Cc: | Christian Franke <Christian DOT Franke AT t-online DOT de> |
Errors-To: | cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com |
Sender: | "Cygwin" <cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com> |
This is a multi-part message in MIME format.
--------------B19B623E207A25A37D693D40
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Corinna Vinschen via Cygwin wrote:
> On Jun 27 15:32, Christian Franke via Cygwin wrote:
>> $ touch $'t-\xef\x80\x80'
>> The name mapping is:
>> "t-\xEF\x80\x80" -(open, ...)-> L"t-\xDB59" -(readdir)-> "t-"
> Did you copy/paste this from the old mail, by any chance?

Sorry, I accidentally mixed up two cases with the same readdir() result:

"t-\xEF\x80\x80" -(open, ...)-> L"t-\xF000" -(readdir)-> "t-"
"t-\xED\xAD\x99" -(open, ...)-> L"t-\xDB59" -(readdir)-> "t-"

$ touch $'t-\xed\xad\x99'
$ touch $'t-\xef\x80\x80'
$ ls | uniq -c
      2 t-

This no longer occurs in 3.7.0-0.165.g1b60f4861b70, but see below.

> Using the latest test DLL the mapping is
>
> "t-\xEF\x80\x80" -(open, ...)-> L"t-\xF000"
>
> And that's basically correct, albeit it leads to problems.
>
> You know that we defined the area from 0xf000 to 0xf0ff as our private
> use area to create filenames with characters invalid in DOS filenames
> by transposing these chars into the private use area. When converting
> the filenames back, the 0xf0XX chars are transposed back to 0xXX.

Yes.

> But yeah, I found the bug here. The problem is that the transpose table
> incorrectly contains NUL as transposable character. So if you create
> L"t-\xF000", that's fine. However, when converting this name back to
> UTF-8, the filename becomes L"t-\0". Oops.
>
> I dropped the ASCII NUL from the list of transposable characters and
> now what you get is this:
>
> $ touch $'t-\xef\x80\x80'
> $ touch $'t-\xef\x80\x81'
> $ ls -l
> total 0
> -rw-r--r-- 1 corinna vinschen 0 Jun 27 16:49 't-'$'\001'
> -rw-r--r-- 1 corinna vinschen 0 Jun 27 16:49 't-'$'\357\200\200'
>
> Apart from the incorrect transposition of ASCII NUL, the transposition
> works transparently:
>
> $ echo foo > $'t-\xef\x80\x81'
> $ cat $'t-\xef\x80\x81'
> foo
> $ cat $'t-\x01'
> foo
>
> I'll apply the patch shortly.

$ touch $'t-\xed\xad\x90'
$ touch $'t-\xed\xad\x91'
$ touch $'t-\xed\xad\x92'
$ touch $'t-\xed\xad\x93'
$ touch $'t-\xed\xad\x94'
$ ls | uniq -c
      5 t-
$ ls -s
ls: cannot access 't-': No such file or directory
ls: cannot access 't-': No such file or directory
ls: cannot access 't-': No such file or directory
ls: cannot access 't-': No such file or directory
ls: cannot access 't-': No such file or directory
total 0
? t- ? t- ? t- ? t- ? t-

All failures found in several runs of the attached test program with
different seeds have one thing in common: the Windows path name contains
an invalid (unpaired) word in the UTF-16 high surrogate range:

$ ./randnames 42
$'t-\xEC\x9E\xB3\xEF\x82\x80\xEF\x83\xA0': access() failed, errno=2:
$'t-\xED\xA4\xA8\x80\xE0': original path
L"t-\xD928\xF080\xF0E0": Windows path

$'t-\xEE\x9E\xB3\xEF\x83\xA1': access() failed, errno=2:
$'t-\xED\xA6\xB0\xE1': original path
L"t-\xD9B0\xF0E1": Windows path

...
$'t-\xE7\xBE\xB3\xEF\x82\xB3': access() failed, errno=2:
$'t-\xED\xA2\x96\xB3': original path
L"t-\xD896\xF0B3": Windows path

-- 
Thanks,
Christian

--------------B19B623E207A25A37D693D40
Content-Type: text/plain; charset=UTF-8; name="randnames.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="randnames.c"

#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <wchar.h>
#include <windows.h>

static void print_c(FILE * f, const char * s)
{
  fputs("$'", f);
  char c;
  for (int i = 0; (c = s[i]); i++) {
    if (c == '\'')
      fputs("'\\'$'", f);
    else if (' ' <= c && c <= '~')
      fputc(c, f);
    else
      fprintf(f, "\\x%02X", c & 0xff);
  }
  fputc('\'', f);
}

static void print_w(FILE * f, const wchar_t * s)
{
  fputs("L\"", f);
  wchar_t c;
  for (int i = 0; (c = s[i]); i++) {
    if (c == L'"' || c == L'\\')
      fprintf(f, "\\%c", c);
    else if (L' ' <= c && c <= L'~')
      fputc(c, f);
    else
      fprintf(f, "\\x%04X", c & 0xffff);
  }
  fputc('"', f);
}

static void get_winname(wchar_t * name)
{
  WIN32_FIND_DATAW e;
  HANDLE h = FindFirstFileW(L"*", &e);
  if (h == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "FindFirstFileW(): Error=%u\n", GetLastError());
    exit(1);
  }
  int i = 0;
  do {
    if (!wcscmp(e.cFileName, L".") || !wcscmp(e.cFileName, L".."))
      continue;
    wcscpy(name, e.cFileName);
    i++;
  } while (FindNextFileW(h, &e));
  FindClose(h);
  if (i != 1) {
    fprintf(stderr, "Error: %d Win32 files found\n", i);
    exit(1);
  }
}

static void get_cygname(char * name)
{
  DIR * d = opendir(".");
  if (!d) {
    perror("opendir");
    exit(1);
  }
  int i = 0;
  const struct dirent * e;
  while ((e = readdir(d))) {
    if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
      continue;
    strcpy(name, e->d_name);
    i++;
  }
  closedir(d);
  if (i != 1) {
    fprintf(stderr, "Error: %d Cygwin files found\n", i);
    exit(1);
  }
}

static void randname(char * name, int maxlen)
{
  int len = 1 + rand() % (maxlen + 1 - 1);
  for (int i = 0; i < len; i++) {
    char c = 1 + rand() % (256 - 2 - 1);
    if (c >= '/')
      c++;
    if (c >= '\\')
      c++;
    name[i] = c;
  }
  name[len] = 0;
}

static int testname(const char * name)
{
  int fd = open(name, O_WRONLY|O_CREAT, 0644);
  if (fd < 0) {
    print_c(stdout, name); printf(": open() failed, errno=%d\n", errno);
    exit(1);
  }
  close(fd);

  char cygname[MAX_PATH];
  get_cygname(cygname);
  wchar_t winname[MAX_PATH];
  get_winname(winname);

  int rc = 1;
  if (access(cygname, 0)) {
    print_c(stdout, cygname); printf(": access() failed, errno=%d:\n", errno);
    print_c(stdout, name); printf(": original path\n");
    print_w(stdout, winname); printf(": Windows path\n\n");
    rc = 0;
  }

  if (unlink(name)) {
    print_c(stdout, name); printf(": unlink() failed, errno=%d\n", errno);
    print_w(stdout, winname); printf(": Windows path\n");
    exit(1);
  }
  return rc;
}

int main(int argc, char **argv)
{
  if (argc > 1)
    srand(atoi(argv[1]));

  const char * dir = "test.tmp";
  rmdir(dir);
  if (mkdir(dir, 0755)) {
    perror(dir); return 1;
  }
  if (chdir(dir)) {
    perror(dir); return 1;
  }

  int errs = 0;
  for (int i = 0; i < 100000; i++) {
    char name[8] = "t-";
    randname(name + 2, sizeof(name) - 1 - 2);
    if (!testname(name) && ++errs >= 10)
      break;
  }
  return 0;
}

--------------B19B623E207A25A37D693D40
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple

--------------B19B623E207A25A37D693D40--