Mail Archives: cygwin/2024/06/13/18:40:53

Message-ID: <f2641cc4-cfb4-4248-a50c-23e611b0bb9a@SystematicSW.ab.ca>
Date: Thu, 13 Jun 2024 16:39:47 -0600
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: unable to remove oddly-named directory
To: cygwin AT cygwin DOT com
References: <6f296b1c-ad43-81e3-163b-4bc4d1b8ed4c AT jdrake DOT com>
Organization: Systematic Software
In-Reply-To: <6f296b1c-ad43-81e3-163b-4bc4d1b8ed4c@jdrake.com>
From: Brian Inglis via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Brian Inglis <Brian DOT Inglis AT SystematicSW DOT ab DOT ca>

On 2024-06-13 14:59, Jeremy Drake via Cygwin wrote:
> Backstory: rust's test suite makes an oddly-named directory as part of a
> test:
> https://github.com/rust-lang/rust/blob/921645c737f1d6d107a0a10ca5ee129d364dcd7a/tests/run-make/non-unicode-in-incremental-dir/rmake.rs
> 
> When trying to clean up after a rust build/test with rm -rf, it results in
> a "Directory not empty" error.

I suggest trying the Rust uutils coreutils for this, since Rust created the
directory:

	https://uutils.github.io/coreutils/
	https://github.com/uutils/coreutils

> Thankfully, this can be simply reproduced with the following two bash
> commands (on cygwin 3.5.3):
> 
> mkdir -p foo/$'\uD800'
> rm -rf foo
> 
> This fails with: rm: cannot remove 'foo': Directory not empty
> when it should succeed.

That is questionable, as U+D800 is a reserved Unicode high surrogate: it is only 
meaningful as the first half of a UTF-16 surrogate pair encoding a code point 
beyond the BMP, and on its own it is not a valid character.

$ mkdir -p foo/$'\uD800'
$ rm -rf  foo # /$'\uD800'
/bin/rm: cannot remove 'foo': Directory not empty
$ rm -rf  foo/$'\uD800'
removed directory 'foo/'$'\355\240\200'
$ rm -rf  foo
removed directory 'foo'
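
The name rm printed above, $'\355\240\200', is the octal form of the bytes 
ED A0 80, which is what the generic three-byte UTF-8 bit layout gives for 
0xD800. A minimal sketch to check that arithmetic (the function name is made up 
for illustration, and strict UTF-8 does not permit encoding surrogates at all):

```sh
# Encode a code point in the range U+0800-FFFF with the three-byte layout
# 1110xxxx 10xxxxxx 10xxxxxx and print each byte as a backslash octal escape.
# This only illustrates the bit layout; it does not validate the code point.
utf8_3byte_octal() (
    cp=$(( 0x$1 ))
    printf '\\%o\\%o\\%o\n' \
        $(( 0xE0 | (cp >> 12) )) \
        $(( 0x80 | ((cp >> 6) & 0x3F) )) \
        $(( 0x80 | (cp & 0x3F) ))
)

utf8_3byte_octal D800    # prints \355\240\200
```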

These reserved surrogate values should probably either be blocked or, like the 
Windows reserved characters, be encoded at the file system interface layer into 
the BMP or supplementary-plane PUAs so that they can be round-tripped.

Reserved surrogate ranges are U+D800-DBFF (high) and U+DC00-DFFF (low).
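
As a rough sketch of what those two ranges mean in practice, the following 
classifies a hex value as a high or low surrogate and shows how a valid 
high/low pair combines into one supplementary code point. The function names 
are hypothetical, chosen only for this example:

```sh
# Report whether a hex code point is a high surrogate, a low surrogate, or neither.
surrogate_kind() (
    cp=$(( 0x$1 ))
    if [ $(( cp >= 0xD800 && cp <= 0xDBFF )) -eq 1 ]; then echo high
    elif [ $(( cp >= 0xDC00 && cp <= 0xDFFF )) -eq 1 ]; then echo low
    else echo none
    fi
)

# Combine a high and a low surrogate (both in hex) into the code point they
# encode in UTF-16. No range checking is done here.
combine_pair() (
    hi=$(( 0x$1 )); lo=$(( 0x$2 ))
    printf '%X\n' $(( 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00) ))
)

surrogate_kind D800       # prints high
surrogate_kind DC00       # prints low
combine_pair D83D DE00    # prints 1F600
```

A lone U+D800, as in the directory name here, has no low surrogate to pair 
with, so it does not correspond to any code point.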

Reserved noncharacters are U+FDD0-FDEF, the last two code points of the BMP, 
U+FFFE-FFFF, and the last two code points of each of the 16 supplementary 
planes, U+{1-10}FFFE-{1-10}FFFF (plane numbers in hex).

Allowed PUAs are U+E000-F8FF, U+F0000-FFFFD and U+100000-10FFFD.
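
To make the noncharacter and PUA ranges concrete, here is a small sketch that 
sorts a hex code point into one of those groups. The function name and the 
output labels are made up for illustration. The test for the last two code 
points of every plane relies on their low 16 bits always being FFFE or FFFF:

```sh
# Classify a hex code point as a noncharacter, private-use, or anything else.
classify_cp() (
    cp=$(( 0x$1 ))
    if [ $(( cp > 0x10FFFF )) -eq 1 ]; then
        echo out-of-range
    elif [ $(( (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE )) -eq 1 ]; then
        echo noncharacter
    elif [ $(( (cp >= 0xE000 && cp <= 0xF8FF) ||
               (cp >= 0xF0000 && cp <= 0xFFFFD) ||
               (cp >= 0x100000 && cp <= 0x10FFFD) )) -eq 1 ]; then
        echo private-use
    else
        echo other
    fi
)

classify_cp FDD0      # prints noncharacter
classify_cp 1FFFE     # prints noncharacter
classify_cp E000      # prints private-use
classify_cp 10FFFD    # prints private-use
classify_cp 41        # prints other
```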

Corinna? Opinions?

It would also be good to avoid the CSUR U+E000-E82F, U+F8A0-F8FF, U+F0000-F16AF:

	https://www.evertype.com/standards/csur/

and UCSUR U+E830-EDFF, U+F4C0-F4EF, U+F16B0-F1C9F, U+F1F00-F289F registry ranges:

	https://www.kreativekorp.com/ucsur/

as some applications use these, and some fonts may render them.
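
To illustrate the round-trip idea, here is a purely hypothetical sketch that 
shifts lone surrogates into the plane 15 PUA by adding 0xF0000, which lands 
them in U+FD800-FDFFF, outside the registry ranges listed above. The offset and 
the function names are my assumptions for the example, not anything Cygwin 
implements:

```sh
# Hypothetical offset: U+D800-DFFF maps to U+FD800-FDFFF in the plane 15 PUA.
PUA_OFFSET=$(( 0xF0000 ))

# Map a surrogate (hex) to its stand-in PUA code point; pass others through.
surrogate_to_pua() (
    cp=$(( 0x$1 ))
    if [ $(( cp >= 0xD800 && cp <= 0xDFFF )) -eq 1 ]; then
        cp=$(( cp + PUA_OFFSET ))
    fi
    printf '%X\n' "$cp"
)

# Reverse mapping: turn a stand-in PUA code point back into the surrogate.
pua_to_surrogate() (
    cp=$(( 0x$1 ))
    if [ $(( cp >= 0xD800 + PUA_OFFSET && cp <= 0xDFFF + PUA_OFFSET )) -eq 1 ]; then
        cp=$(( cp - PUA_OFFSET ))
    fi
    printf '%X\n' "$cp"
)

surrogate_to_pua D800     # prints FD800
pua_to_surrogate FD800    # prints D800
```

Any such scheme has the usual caveat that a name which already contains one of 
the stand-in code points would decode to a surrogate on the way back, so the 
chosen block would have to be treated as reserved.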

These registry folks are major contributors to Unicode standards, and these 
efforts bring order to supporting, managing, and using minority, minor 
historical or ancient, undeciphered, or constructed (e.g. Mormon, Shaw, Tolkien, 
Le Guin, Star Trek, Star Wars) language scripts or writing systems with glyphs 
not (yet) officially assigned in Unicode Standards.

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry

