Mail Archives: cygwin-developers/2002/07/03/09:19:38

Mailing-List: contact cygwin-developers-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-developers-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin-developers/>
List-Post: <mailto:cygwin-developers AT cygwin DOT com>
List-Help: <mailto:cygwin-developers-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-developers-owner AT cygwin DOT com
Delivered-To: mailing list cygwin-developers AT cygwin DOT com
X-WM-Posted-At: avacado.atomice.net; Wed, 3 Jul 02 14:19:32 +0100
Message-ID: <01d501c22294$44a874b0$0100a8c0@advent02>
From: "Chris January" <chris AT atomice DOT net>
To: <cygwin-developers AT cygwin DOT com>
References: <008401c22279$68759a00$0100a8c0 AT advent02> <s1ssn31nf4y DOT fsf AT jaist DOT ac DOT jp>
Subject: Re: UTF8 support in Cygwin
Date: Wed, 3 Jul 2002 14:19:31 +0100
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

> > My question is, does anyone have any objections to doing things this way,
> > and if so, can they suggest a better way? I don't want to patch the whole
> > of Cygwin and then have to re-write everything at a later date.
>
> I'd like to propose supporting codepages other than UTF8, and
> connecting the setting to parts of Cygwin other than filenames.
>
> For example, in the case of CYGWIN=codepage:20866, suppose that
> `parse_options' sets current_codepage = other_cp and
> current_cpnum = (UINT)20866.
> Your example would then become as follows.
>
>   if (current_codepage == other_cp)
>     {
>       WCHAR wbuf[MAX_PATH];
>       if (MultiByteToWideChar (current_cpnum, 0, get_win32_name(), -1,
>                                wbuf, MAX_PATH) == 0)
>         {
>           __seterrno ();
>           goto done;
>         }
>       x = CreateFileW (wbuf, access, shared, &sa, creation_distribution,
>                        file_attributes, 0);
>     }
>   else
>     x = CreateFileA (get_win32_name (), access, shared, &sa,
>                      creation_distribution, file_attributes, 0);
>
> Moreover, get_cp in miscfunc.cc would have to change as follows.
>
>     UINT
>     get_cp ()
>     {
>       switch (current_codepage)
>         {
>         case ansi_cp:
>           return GetACP();
>         case oem_cp:
>           return GetOEMCP();
>         case other_cp:
>           return current_cpnum;
>         }
>     }
>
> When we want to use UTF8, we set codepage:65001 or codepage:utf8.
> The latter requires the parser to accept "utf8" and
> translate it to CP_UTF8 (65001).
>
> How about this idea?
This sounds like a good idea - I will have a go at implementing this.
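
For reference, the option parsing described in the quoted proposal might
look roughly like the sketch below. It is only an illustration, not the
actual `parse_options' code: the names codepage_setting and parse_codepage
are hypothetical, the "ansi" and "oem" keywords are assumed, and the upper
bound of 65535 on numeric values is an assumption rather than anything
Windows guarantees. It uses no Win32 calls, so it can be tried outside
Cygwin.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <cstring>

// Same three cases as the quoted get_cp (); the enum name is hypothetical.
enum codepage_type { ansi_cp, oem_cp, other_cp };

// Hypothetical container for what the proposal calls current_codepage
// and current_cpnum.
struct codepage_setting
{
  codepage_type type;   // which branch get_cp () would take
  unsigned cpnum;       // only meaningful when type == other_cp
};

// Parse the VALUE part of CYGWIN=codepage:VALUE.
// Accepts "ansi", "oem", "utf8" or a decimal codepage number.
// Returns true and fills *out on success; returns false and leaves
// *out untouched on malformed input.
static bool
parse_codepage (const char *value, codepage_setting *out)
{
  if (std::strcmp (value, "ansi") == 0)
    {
      out->type = ansi_cp;
      out->cpnum = 0;
      return true;
    }
  if (std::strcmp (value, "oem") == 0)
    {
      out->type = oem_cp;
      out->cpnum = 0;
      return true;
    }
  if (std::strcmp (value, "utf8") == 0)
    {
      out->type = other_cp;
      out->cpnum = 65001;       // numeric value of CP_UTF8
      return true;
    }

  // Anything else must be a plain decimal number.  Checking the first
  // character rejects empty strings, signs and leading whitespace,
  // which strtoul would otherwise accept.
  if (*value < '0' || *value > '9')
    return false;

  errno = 0;
  char *end = 0;
  unsigned long n = std::strtoul (value, &end, 10);
  if (errno != 0 || *end != '\0' || n > 65535UL)  // 65535 limit is assumed
    return false;

  out->type = other_cp;
  out->cpnum = (unsigned) n;
  return true;
}
```

With this, "codepage:utf8" and "codepage:65001" both end up as other_cp
with cpnum 65001, and "codepage:20866" as other_cp with cpnum 20866, so
get_cp () only has to return the stored number in the other_cp case.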

Chris


