From: Thomas Wolff
To: cygwin AT cygwin DOT com
Subject: Re: Removing ^X in paths
Date: Thu, 3 Feb 2022 06:02:22 +0100
Message-ID: <48d29b55-98d1-d1dd-44ef-af466429d2d7@towo.net>
In-Reply-To: <214212b2-270b-ad62-837b-fb34697a2f33@ucar.edu>

On 03.02.2022 at 05:12, Dennis Heimbigner wrote:
> I am using 64-bit.
> And it has nothing to do with misreading characters.
>
> The ^X is described in this document:
> https://www.cygwin.com/cygwin-ug-net/using-specialnames.html
>
> There you will see this text:
>
> "If you don't want or can't use UTF-8 as character set for
> whatever reason, you will nevertheless be able to access the
> file. How does that work? When Cygwin converts the filename from
> UTF-16 to your character set, it recognizes characters which
> can't be converted. If that occurs, Cygwin replaces the
> non-convertible character with a special character sequence. The
> sequence starts with an ASCII CAN character (hex code 0x18,
> equivalent Control-X), followed by the UTF-8 representation of
> the character. The result is a filename containing some ugly
> looking characters. While it doesn't look nice, it is nice,
> because Cygwin knows how to convert this filename back to
> UTF-16. The filename will be converted using your usual
> character set. However, when Cygwin recognizes an ASCII CAN
> character, it skips over the ASCII CAN and handles the following
> bytes as a UTF-8 character.
> Thus, the filename is symmetrically
> converted back to UTF-16 and you can access the file."

This mechanism supports a non-UTF-8 locale on the Cygwin side, e.g. when
you run mintty with LC_ALL=de_DE and a file name contains a Chinese
character.

> There is no obvious good reason to continue this convention.

See above: there is a good reason to keep it and no reason to drop it.

Thomas

>
> On 2/2/2022 7:23 PM, L A Walsh wrote:
>> On 2022/02/02 12:40, Dennis Heimbigner wrote:
>>> It appears that windows now supports the UTF-8 codepage.
>> It has since the early 2000s.
>>> In light of this, it seems time to change cygwin so it no longer
>>> adds those control-x (^X) characters in e.g. path names.
>> ^x is ASCII.  Cygwin doesn't insert ^X characters in paths.
>>
>> Perhaps you are thinking of '\' which looks like ¥ (a capital 'Y'
>> with 2 horizontal lines, Fullwidth Yen Sign U+FFE5)... if that's the
>> case, some 8-bit font displayed that sign instead of a backslash in
>> non-unicode locales.
>>
>> Are you using a 32-bit or 64-bit version of Cygwin?  On what version
>> of windows?
>>
>> If you still use a 32-bit version, you might need to move to a 64-bit
>> version.
>> I know the 32-bit version sometimes had the problem because it
>> supported fewer fonts and fewer characters at the same time.
>>
>> You might check out your locale (if in english, try setting:
>> LC_CTYPE="en_US.UTF-8"
>> in your shell) and also check that your used font has a backslash in
>> the 0x7f position.
>>
>> But in shell, ^x is usually a character to erase the whole line, so
>> it really wouldn't do to have it in a PATH.
>>
>> Hope this helps, and sorry if this is completely off base.
>>
>> Linda
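
For illustration, the escape convention described in the User's Guide passage quoted in this thread can be sketched in a few lines of Python. This is only a sketch of the idea, not Cygwin's actual code: the helper names `encode_filename`, `decode_filename` and `utf8_sequence_length` are made up for this example, a Python `str` stands in for the UTF-16 file name, and the locale charset is assumed to be a single-byte one (ISO-8859-1 here, as a `de_DE` locale might use).

```python
CAN = 0x18  # ASCII CAN (Control-X), the escape marker named in the quoted text


def utf8_sequence_length(lead: int) -> int:
    """Return the byte length of a UTF-8 sequence, judged by its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#x}")


def encode_filename(name: str, charset: str = "iso-8859-1") -> bytes:
    """Convert a file name to the locale charset (hypothetical helper).

    A character the charset cannot represent is written as CAN followed
    by the character's UTF-8 bytes.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            out.append(CAN)
            out += ch.encode("utf-8")
    return bytes(out)


def decode_filename(raw: bytes, charset: str = "iso-8859-1") -> str:
    """Reverse encode_filename (hypothetical helper).

    Assumes a single-byte charset and well-formed input, so every byte
    outside a CAN escape is exactly one character.
    """
    chars = []
    i = 0
    while i < len(raw):
        if raw[i] == CAN:
            # Skip the CAN and read the following bytes as one UTF-8 character.
            n = utf8_sequence_length(raw[i + 1])
            chars.append(raw[i + 1 : i + 1 + n].decode("utf-8"))
            i += 1 + n
        else:
            chars.append(raw[i : i + 1].decode(charset))
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    encoded = encode_filename("große_文.txt")
    print(encoded)
    print(decode_filename(encoded))
```

In this sketch the `ß` is stored as a single ISO-8859-1 byte, while the Chinese character, which ISO-8859-1 cannot represent, is stored as the CAN byte followed by its three UTF-8 bytes, so the name survives the round trip. A name that itself contains a literal U+0018 character would not round-trip here; the sketch ignores that case.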