From: Andy Koppe
To: cygwin AT cygwin DOT com
Date: Wed, 13 May 2009 15:54:40 +0100
Message-ID: <416096c60905130754s3ffaae9dl8d6df4c4184b95e6@mail.gmail.com>
Subject: Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

> - why do you need to touch the filename at all? I haven't read all of it. Is
> the UTF-16 on disk and we need to work around UTF-16 being intractable as C
> string?

Yes. If you simply treated each UTF-16 code unit as two chars, you'd get
unintended NULs and slashes. For starters, the high byte of every
ISO-8859-1 character is NUL in UTF-16. And even without that, the
resulting filenames would be completely unusable.

Andy
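
To illustrate the point about NULs and slashes, here is a minimal sketch in Python. It is not Cygwin code; the helper names `utf16_bytes` and `hazards` are made up for this example, and it assumes little-endian UTF-16 as used on Windows. It shows what a char-based C API would see if the UTF-16 code units of a filename were handed over as raw bytes.

```python
# Illustration only: view UTF-16LE code units as raw bytes, the way a
# char-based C API would see them. Helper names are hypothetical.

def utf16_bytes(name: str) -> bytes:
    """Return the little-endian UTF-16 bytes of a filename."""
    return name.encode("utf-16-le")


def hazards(name: str) -> dict:
    """Count the bytes that would break a NUL-terminated POSIX path."""
    raw = utf16_bytes(name)
    return {
        # Every ISO-8859-1 character contributes one 0x00 high byte.
        "nul_bytes": raw.count(0x00),
        # A 0x2F byte reads as '/' even when the character is not a slash.
        "slash_bytes": raw.count(0x2F),
        # What strlen()-style code would keep: everything before the first NUL.
        "c_string_view": raw.split(b"\x00", 1)[0],
    }


if __name__ == "__main__":
    # U+00E9 (e with acute) followed by ".txt": all five characters are Latin-1.
    print(hazards("\u00e9.txt"))
    # U+012F (i with ogonek) encodes as bytes 2F 01: a spurious slash.
    print(hazards("\u012f"))
```

In the first case the name is cut off after a single byte, because the high byte of the first character is already NUL. In the second case there is no NUL at all, yet the path would appear to contain a directory separator.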