Mail Archives: cygwin/2024/11/23/06:23:42

References: <CALXu0UcnZnQBbJQcSsbianeKiyB2vkOmvE1weGN_-EQSU=RNrQ AT mail DOT gmail DOT com>
In-Reply-To: <CALXu0UcnZnQBbJQcSsbianeKiyB2vkOmvE1weGN_-EQSU=RNrQ@mail.gmail.com>
Date: Sat, 23 Nov 2024 12:21:56 +0100
Message-ID: <CALXu0UfYmRP5yMG4J6znd4svqq1kbgEkpvHj-CWjB6APE8C3uw@mail.gmail.com>
Subject: Re: /bin/ls -l cannot handle printable Unicode characters outside the
BMP ...
To: cygwin AT cygwin DOT com
From: Cedric Blancher via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Cedric Blancher <cedric DOT blancher AT gmail DOT com>

On Sat, 23 Nov 2024 at 11:44, Cedric Blancher <cedric DOT blancher AT gmail DOT com> wrote:
>
> Good morning!
>
> /bin/ls -l cannot handle printable Unicode characters outside the BMP
>
> Example using '𝒯'
> bash -c 'printf "\U0001D4AF\n"' # MATHEMATICAL SCRIPT CAPITAL T
> (yes, our mathematicians want to use THAT as file name)
>
> On Linux:
> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
> ls -la
> total 8
> -rw-r--r--  1 ced staden  0 Nov 23 11:29 ΓΆΓΆΓΆΓΆΓΆΓΆΓΆ
> -rw-r--r--  2 ced staden  4 Nov 23 11:31 𝒯
> -rw-r--r--  2 ced staden  4 Nov 23 11:31 𝒯𝒯
>
> On Cygwin:
> LC_ALL=en_US.UTF-8 bash -c 't="$(printf "\U0001D4AF\n")" ; touch "$t" "$t$t"'
> $ ls -la
> -rw-r--r-- 1 ced staden  0 Nov 23 11:29  ΓΆΓΆΓΆΓΆΓΆΓΆΓΆ
> -rw-r--r-- 2 ced staden  4 Nov 23 11:31 ''$'\360\235\222\257'
> -rw-r--r-- 2 ced staden  4 Nov 23 11:31 ''$'\360\235\222\257\360\235\222\257'
>
> Looks like the Cygwin locale has a problem with non-BMP chars.
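
For reference, the octal escapes that ls prints on Cygwin can be checked against the UTF-8 encoding of U+1D4AF. The sketch below is not from the original report: the function name utf8_octal is made up for illustration, and it only handles code points in the 4-byte UTF-8 range (U+10000 to U+10FFFF). If the output matches the escapes above, the bytes stored in the file name are correct UTF-8 and only the display differs.

```shell
# Hypothetical helper (not part of any tool mentioned in this thread):
# print the UTF-8 bytes of a supplementary-plane code point as octal
# escapes, in the same style that ls uses for names it will not print.
utf8_octal() {
    cp=$(( $1 ))                           # accepts decimal or 0x... hex
    b1=$(( 0xF0 | (cp >> 18) ))            # lead byte: 11110xxx
    b2=$(( 0x80 | ((cp >> 12) & 0x3F) ))   # continuation byte: 10xxxxxx
    b3=$(( 0x80 | ((cp >> 6) & 0x3F) ))    # continuation byte: 10xxxxxx
    b4=$(( 0x80 | (cp & 0x3F) ))           # continuation byte: 10xxxxxx
    printf '\\%o\\%o\\%o\\%o\n' "$b1" "$b2" "$b3" "$b4"
}

utf8_octal 0x1D4AF    # prints: \360\235\222\257
```

The result is the same byte sequence as in ''$'\360\235\222\257', so ls appears to be escaping a valid UTF-8 name rather than reading a damaged one.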

find(1) is even worse:
$ find .
.
./ΓΆΓΆΓΆΓΆΓΆΓΆΓΆ
./????
./x??x

Windows Explorer shows the file names correctly, so IMO this
is not a Windows or Win32 API problem.
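
As background on why "outside the BMP" may matter here: in UTF-16, which Windows uses for file names, such a character takes two 16-bit code units (a surrogate pair) instead of one. Whether that is the actual cause of the ls and find behaviour is an assumption, not something this report establishes. A minimal sketch of the arithmetic, with the made-up function name utf16_units:

```shell
# Hypothetical helper: print the UTF-16 code unit(s) for a code point.
# BMP code points fit in one unit; anything above U+FFFF needs a pair.
utf16_units() {
    cp=$(( $1 ))                           # accepts decimal or 0x... hex
    if [ "$cp" -lt 65536 ]; then
        printf '%04X\n' "$cp"              # single 16-bit unit
        return
    fi
    v=$(( cp - 0x10000 ))                  # 20-bit offset into the supplementary planes
    hi=$(( 0xD800 + (v >> 10) ))           # high surrogate carries the top 10 bits
    lo=$(( 0xDC00 + (v & 0x3FF) ))         # low surrogate carries the bottom 10 bits
    printf '%04X %04X\n' "$hi" "$lo"
}

utf16_units 0x1D4AF   # prints: D835 DCAF
utf16_units 0x00F6    # prints: 00F6
```

Code that classifies characters one 16-bit unit at a time would see two surrogates rather than one printable letter, which is one possible explanation worth checking.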

Ced
-- 
Cedric Blancher <cedric DOT blancher AT gmail DOT com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

