From: Dennis Heimbigner
To: L A Walsh
Cc: cygwin AT cygwin DOT com
Date: Wed, 2 Feb 2022 21:12:56 -0700
Subject: Re: Removing ^X in paths
Message-ID: <214212b2-270b-ad62-837b-fb34697a2f33@ucar.edu>
In-Reply-To: <61FB3CA1.8000001@tlinx.org>

I am using the 64-bit version, and this has nothing to do with
misreading characters. The ^X is described in this document:
https://www.cygwin.com/cygwin-ug-net/using-specialnames.html
There you will see this text:

"If you don't want or can't use UTF-8 as character set for whatever
reason, you will nevertheless be able to access the file. How does that
work? When Cygwin converts the filename from UTF-16 to your character
set, it recognizes characters which can't be converted. If that occurs,
Cygwin replaces the non-convertible character with a special character
sequence. The sequence starts with an ASCII CAN character (hex code
0x18, equivalent Control-X), followed by the UTF-8 representation of
the character.
The result is a filename containing some ugly looking characters. While
it doesn't look nice, it is nice, because Cygwin knows how to convert
this filename back to UTF-16. The filename will be converted using your
usual character set. However, when Cygwin recognizes an ASCII CAN
character, it skips over the ASCII CAN and handles the following bytes
as a UTF-8 character. Thus, the filename is symmetrically converted
back to UTF-16 and you can access the file."

There is no obvious good reason to continue this convention.

On 2/2/2022 7:23 PM, L A Walsh wrote:
> On 2022/02/02 12:40, Dennis Heimbigner wrote:
>> It appears that windows now supports the UTF-8 codepage.
> It has since early 2000's.
>> In light of this, it seems time to change cygwin so it no longer
>> adds those control-x (^X) characters in e.g. path names.
> ^x is ASCII.  Cygwin doesn't insert ^X characters in paths.
>
> Perhaps you are thinking of '\' which looks like ¥ (a capital 'Y'
> with 2 horizontal lines, Fullwidth Yen Sign U+FFE5)...if that's the
> case, some 8-bit font displayed that sign instead of a backslash in
> non-unicode locales.
>
> Are you using a 32-bit or 64-bit version of Cygwin?  on what version
> of windows?
>
> If you still use a 32-bit version, you might need to move to a 64-bit
> version. I know the 32-bit version sometimes had the problem because
> it supported fewer fonts and fewer characters at the same time.
>
> You might check out your locale (if in English, try setting:
> LC_CTYPE="en_US.UTF-8"
> in your shell) and also check that your used font has a backslash in
> the 0x7f position.
>
> But in shell, ^x is usually a character to erase the whole line -- so
> it really wouldn't do to have it in a PATH.
>
> Hope this helps, and sorry if this is completely off base.
>
> Linda
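
For illustration, the round trip described in the user guide text
quoted at the top of this message can be sketched in a few lines of
Python. This is only a sketch of the documented behaviour, not Cygwin's
actual code: the function names are made up, it works on Unicode code
points rather than UTF-16 units, and it assumes a single-byte character
set such as ISO-8859-1.

```python
CAN = 0x18  # ASCII CAN, i.e. Control-X


def utf8_len(lead: int) -> int:
    """Length of a UTF-8 sequence, derived from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#x}")


def encode_filename(name: str, charset: str) -> bytes:
    """Convert name to charset (hypothetical helper).

    Characters that charset cannot represent are written as
    CAN followed by the character's UTF-8 bytes.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            out.append(CAN)
            out += ch.encode("utf-8")
    return bytes(out)


def decode_filename(raw: bytes, charset: str) -> str:
    """Reverse encode_filename (assumes a single-byte charset)."""
    chars = []
    i = 0
    while i < len(raw):
        if raw[i] == CAN and i + 1 < len(raw):
            # Skip the CAN and read the following bytes as UTF-8.
            n = utf8_len(raw[i + 1])
            chars.append(raw[i + 1:i + 1 + n].decode("utf-8"))
            i += 1 + n
        else:
            # Ordinary byte: decode with the usual character set.
            chars.append(raw[i:i + 1].decode(charset))
            i += 1
    return "".join(chars)
```

With these helpers, encode_filename("中.txt", "latin-1") yields
b"\x18\xe4\xb8\xad.txt", which is the ^X followed by the three UTF-8
bytes of U+4E2D, and decode_filename() turns that back into "中.txt".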