From: "Chris January"
Subject: UTF8 support in Cygwin
Date: Wed, 3 Jul 2002 11:07:15 +0100

I am working on a patch that would add UTF-8 support to Cygwin, i.e. Unicode
filenames would be encoded as UTF-8 before being returned by, e.g., readdir,
and then converted back to Unicode before being passed to the Windows API.

This would solve Ville Herva's problem, where he/she wanted to back up a
filesystem containing Unicode filenames using Cygwin but found that the
Unicode characters were converted to question marks. Also, with an
appropriate terminal, it is actually possible to view the Unicode characters
(although at the moment it is not possible to input them correctly, AFAIK).

The code is currently guarded by a CYGWIN environment variable flag, 'utf8'.
An example of the way I'm doing this is:

    if (use_utf8)
      {
        /* Convert the UTF-8 path to wide characters for the W API. */
        WCHAR wbuf[MAX_PATH];
        if (MultiByteToWideChar (CP_UTF8, 0, get_win32_name (), -1,
                                 wbuf, MAX_PATH) == 0)
          {
            __seterrno ();
            goto done;
          }
        x = CreateFileW (wbuf, access, shared, &sa, creation_distribution,
                         file_attributes, 0);
      }
    else
      /* Flag not set: keep the existing ANSI code path. */
      x = CreateFileA (get_win32_name (), access, shared, &sa,
                       creation_distribution, file_attributes, 0);

My question is: does anyone have any objections to doing things this way,
and if so, can they suggest a better way? I don't want to patch the whole of
Cygwin and then have to rewrite everything at a later date.

Regards
Chris
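
For the opposite direction described above (wide-character names from Windows being encoded as UTF-8 before readdir returns them), here is a minimal, portable sketch of the conversion. The function name utf16_to_utf8 and its error convention are hypothetical and are not taken from the patch or from Cygwin. On Windows the same job would more likely be done with WideCharToMultiByte and CP_UTF8. The hand-written encoder only illustrates what that conversion involves, including surrogate pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Cygwin or the patch): encode a
   NUL-terminated UTF-16 string as NUL-terminated UTF-8.
   Returns the number of bytes written, excluding the terminating NUL,
   or (size_t) -1 if the input contains an unpaired surrogate or the
   output buffer is too small. */
size_t
utf16_to_utf8 (const uint16_t *src, char *dst, size_t dst_size)
{
  size_t n = 0;

  if (dst_size == 0)
    return (size_t) -1;

  while (*src)
    {
      uint32_t cp = *src++;

      if (cp >= 0xD800 && cp <= 0xDBFF)
        {
          /* High surrogate: must be followed by a low surrogate. */
          uint32_t lo = *src;
          if (lo < 0xDC00 || lo > 0xDFFF)
            return (size_t) -1;
          src++;
          cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        }
      else if (cp >= 0xDC00 && cp <= 0xDFFF)
        return (size_t) -1;     /* Low surrogate without a high one. */

      size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;

      /* Leave room for the terminating NUL. */
      if (n + len >= dst_size)
        return (size_t) -1;

      switch (len)
        {
        case 1:
          dst[n++] = (char) cp;
          break;
        case 2:
          dst[n++] = (char) (0xC0 | (cp >> 6));
          dst[n++] = (char) (0x80 | (cp & 0x3F));
          break;
        case 3:
          dst[n++] = (char) (0xE0 | (cp >> 12));
          dst[n++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          dst[n++] = (char) (0x80 | (cp & 0x3F));
          break;
        default:
          dst[n++] = (char) (0xF0 | (cp >> 18));
          dst[n++] = (char) (0x80 | ((cp >> 12) & 0x3F));
          dst[n++] = (char) (0x80 | ((cp >> 6) & 0x3F));
          dst[n++] = (char) (0x80 | (cp & 0x3F));
          break;
        }
    }

  dst[n] = '\0';
  return n;
}
```

A caller would pass the wide-character name obtained from the Windows API and a byte buffer sized for the worst case. A UTF-8 name can take up to three bytes per UTF-16 code unit, so a buffer of MAX_PATH bytes may be too small for a name of MAX_PATH wide characters.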