From: "Chris January"
Subject: Re: UTF8 support in Cygwin
Date: Wed, 3 Jul 2002 14:19:31 +0100
Message-ID: <01d501c22294$44a874b0$0100a8c0@advent02>
References: <008401c22279$68759a00$0100a8c0 AT advent02>

> > My question is, does anyone have any objections to doing things this way,
> > and if so, can they suggest a better way? I don't want to patch the whole of
> > Cygwin and then have to re-write everything at a later date.
>
> I'd like to propose supporting other codepages than UTF8 and
> making it connected with other portions than filenames.
>
> For example, in case of CYGWIN=codepage:20866, suppose
> the `parse_options' set current_codepage = other_cp and
> current_cpnum = (UINT)20866.
> Your example would become as follows.
>
>   if (current_codepage == other_cp)
>     {
>       WCHAR wbuf[MAX_PATH];
>       if (MultiByteToWideChar (current_cpnum, 0, get_win32_name(), -1,
>                                wbuf, MAX_PATH) == 0)
>         {
>           __seterrno ();
>           goto done;
>         }
>       x = CreateFileW (wbuf, access, shared, &sa, creation_distribution,
>                        file_attributes, 0);
>     }
>   else
>     x = CreateFileA (get_win32_name (), access, shared, &sa, creation_distribution,
>                      file_attributes, 0);
>
> Moreover, get_cp in miscfunc.cc would have to become as follows.
>
>   UINT
>   get_cp ()
>   {
>     switch (current_codepage)
>       {
>       case ansi_cp:
>         return GetACP();
>       case oem_cp:
>         return GetOEMCP();
>       case other_cp:
>         return current_cpnum;
>       }
>   }
>
> When we want to use UTF8, we set codepage:65001 or codepage:utf8.
> The latter case needs for the parser to accept "utf8" and
> translate it to CP_UTF8 (65001).
>
> How about this idea?

This sounds like a good idea - I will have a go at implementing this.

Chris
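
For reference, here is a rough sketch of how the parser side of the codepage:N / codepage:utf8 idea might look. It is only an illustration under stated assumptions, not the actual parse_options code: set_codepage is a hypothetical helper name, the "ansi" and "oem" keywords are assumed to be the existing option values, and the two globals are stand-ins for current_codepage and current_cpnum from the proposal. It deliberately avoids Win32 calls so it can be tried anywhere.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <cstring>

// Stand-ins for the state described in the proposal (assumed layout).
enum codepage_type { ansi_cp, oem_cp, other_cp };
static codepage_type current_codepage = ansi_cp;
static unsigned int current_cpnum = 0;

// Windows defines CP_UTF8 as 65001; spelled out here to stay portable.
static const unsigned int utf8_cpnum = 65001;

// Hypothetical helper: interpret the text after "codepage:" in CYGWIN.
// Returns true and updates the globals on success; on failure it
// returns false and leaves the current settings untouched.
static bool
set_codepage (const char *value)
{
  assert (value != NULL);

  if (strcmp (value, "ansi") == 0)
    {
      current_codepage = ansi_cp;
      return true;
    }
  if (strcmp (value, "oem") == 0)
    {
      current_codepage = oem_cp;
      return true;
    }
  if (strcmp (value, "utf8") == 0)
    {
      current_codepage = other_cp;
      current_cpnum = utf8_cpnum;
      return true;
    }

  // Otherwise expect a plain decimal code page number, e.g. "20866".
  // strtoul would accept a sign or leading blanks, so check the
  // first character ourselves.
  if (!isdigit ((unsigned char) value[0]))
    return false;

  errno = 0;
  char *end;
  unsigned long n = strtoul (value, &end, 10);
  // Reject overflow, trailing junk, and 0 (a design choice in this
  // sketch: the ANSI default is already reachable via "ansi").
  if (errno != 0 || *end != '\0' || n == 0 || n > UINT_MAX)
    return false;

  current_codepage = other_cp;
  current_cpnum = (unsigned int) n;
  return true;
}
```

With this, set_codepage ("utf8") and set_codepage ("65001") end up in the same state, which is what get_cp would then hand to MultiByteToWideChar. Whether an unknown number should be rejected at parse time or only when the conversion fails is left open here.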