Date: Sat, 29 May 2010 16:54:26 +0100
Subject: Re: cygpath behaviour when input is not a path
From: Andy Koppe
To: cygwin AT cygwin DOT com

On 29 May 2010 16:20, Gary wrote:
> I wrote:
>> How should cygcheck
>
> cygpath, not cygcheck
>
> /me smacks head
>
>> behave when given a "PATH list" (e.g.,
>> '/bin:/usr/bin'), *without* the -p option?
>>
>> For example:
>> $ cygpath -a -p -C ANSI -w /bin:/usr/bin
>> C:\cygwin\bin;C:\cygwin\bin
>>
>> = okay. What I expect from RTFMP.
>>
>> $ cygpath -a -C ANSI -w /bin:/usr/bin
>> C:\cygwin\bin?\usr\bin
>>
>> Urk! I would have hoped it wouldn't try to convert this, or at the very
>> least not the last part, but I don't know if it's a bug, "by design", or
>> what.

A colon is a valid character in a POSIX filename, i.e. the path you're converting there starts with a directory called 'bin:'. Colons aren't allowed in Windows filenames though, which is why Cygwin maps a colon to a codepoint in the Unicode private use area: U+F03A. Without the '-C ANSI' option, and assuming you're using a UTF-8 locale, that would typically be displayed as a placeholder glyph, usually an empty box or a question mark in a box, depending on the font.
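
To make the mapping above concrete, here is a rough Python sketch of what happens to the name 'bin:'. It is not Cygwin's actual code: the helper map_colon and the constant PRIVATE_USE_OFFSET are made up for illustration, and the 0xF000 offset is simply inferred from ':' (0x3A) ending up at U+F03A.

```python
import unicodedata

# Assumption: offset inferred from ':' (0x3A) being mapped to U+F03A.
PRIVATE_USE_OFFSET = 0xF000


def map_colon(name: str) -> str:
    """Hypothetical helper: swap ':' for its private-use stand-in."""
    return name.replace(":", chr(PRIVATE_USE_OFFSET + ord(":")))


mapped = map_colon("bin:")

# The last character is now U+F03A rather than an ASCII colon.
print([hex(ord(c)) for c in mapped])   # ['0x62', '0x69', '0x6e', '0xf03a']

# 'Co' is the Unicode general category for private-use codepoints,
# which is why most fonts have no glyph for it.
print(unicodedata.category(mapped[-1]))  # Co

# In a UTF-8 locale the stand-in is emitted as three bytes.
print(mapped.encode("utf-8"))          # b'bin\xef\x80\xba'
```

So the Windows-side name has four characters, the last of which no ordinary font is likely to have a glyph for.
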
But with the '-C ANSI' option, the string is converted to your ANSI codepage, presumably CP1252, and codepoints such as U+F03A that can't be represented in that codepage are turned into question mark characters.

Have you got any particular reason for overriding the locale charset with the -C option? Doing that will probably cause breakage with any filenames containing non-ASCII characters.

Andy