Mail Archives: cygwin/2004/06/15/10:18:09

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Tue, 15 Jun 2004 23:17:19 +0900
From: Jaeho Shin <netj AT sparcs DOT kaist DOT ac DOT kr>
To: "Pierre A. Humblet" <pierre DOT humblet AT ieee DOT org>
Cc: cygwin AT cygwin DOT com
Subject: Re: Unable to open files including Korean names
Message-ID: <20040615141718.GD5948@sab.mazic.org>
References: <20040612183000 DOT GA1628 AT sab DOT mazic DOT org> <20040612183000 DOT GA1628 AT sab DOT mazic DOT org> <3 DOT 0 DOT 5 DOT 32 DOT 20040613145523 DOT 00805ce0 AT incoming DOT verizon DOT net> <20040614111257 DOT GA3736 AT sab DOT mazic DOT org> <40CDAE70 DOT 86F50279 AT ieee DOT org> <40CE0845 DOT F6278F8F AT ieee DOT org> <20040615111128 DOT GA5948 AT sab DOT mazic DOT org> <40CEF62E DOT 1526816A AT ieee DOT org>
Mime-Version: 1.0
In-Reply-To: <40CEF62E.1526816A@ieee.org>
User-Agent: Mutt/1.4.1i
Organization: SPARCS, KAIST
X-IsSubscribed: yes

--xaMk4Io5JJdpkLEb
Content-Type: text/plain; charset=euc-kr
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, 2004-06-15 09:14:22 -0400, Pierre A. Humblet wrote:
> Thanks. Nothing conclusive.
> Could you compile and run the following one line program?
>
> #include <windows.h>
> #include <stdio.h>
>
> int main(void)
> {
>     /* AreFileApisANSI() is nonzero when the file APIs use the ANSI code page */
>     printf("AreFileApisANSI %d\n", AreFileApisANSI());
>     return 0;
> }
>
> Compile it with
> gcc -mno-cygwin try_ansi.c
>
> With the -mno-cygwin, the value of CYGWIN=codepage:oem
> shouldn't matter. When compiled without that switch
> codepage:oem or codepage:ansi should matter.
>
> Running on 1.5.9 is OK.

Here's the result:

$ gcc -mno-cygwin try_ansi.c
$ ./a.exe
AreFileApisANSI 1
$

>
> Also, the Korean directory name has numerical value
> ~> od -x xx.txt
> 0000000 d1c7 dbb1
>
> Do you know what encoding that is? Is it Unicode or UTF8?
> If it is UTF8, do you know what the Unicode values should be?

Well, that's EUC-KR, which in this case is the same as CP949.  CP949
defines some additional characters in the areas EUC-KR leaves empty.
The directory name I used, ``한글'', which is pronounced ``hangeul'' and
means the Korean written language in Korean, consists of two characters:
 U+D55C: Hangul syllable Hieuh A Nieun,
 U+AE00: Hangul syllable Kiyeok Eu Rieul.
(You may be able to find them in the Windows charmap.)
Neither character is in CP949's extension, so they have identical values
in both EUC-KR and CP949.

Yes, that is the same numerical value I get.
Running `echo -n 한글 | od -x -` tells me:
0000000 d1c7 dbb1

Now, `echo -n 한글 | iconv -f euc-kr -t utf-8 | od -x -` tells me:
0000000 95ed ea9c 80b8

So yes, it's in EUC-KR (or equivalently CP949, in this case).  I don't use
a Unicode environment yet.  Actually, I don't know how to change the
encoding in Windows; the Korean version of Windows just uses CP949 by default.

It looks like od -x prints each 16-bit word in little-endian order, so the
two bytes of every pair appear swapped.  Converting to UCS-2 identifies the
characters as U+D55C and U+AE00;
`echo -n 한글 | iconv -f euc-kr -t ucs-2 | od -x -` tells me:
0000000 5cd5 00ae


> Thanks for your help

My pleasure. :)


BTW, is there any reason you're not sending your messages to the cygwin ML?
If not, I'll just keep Cc'ing it.

-- 
신재호 | Jaeho Shin <netj AT sparcs DOT kaist DOT ac DOT kr> | http://netj.org/
System Programmers' Association for Researching Computer Systems
Division of Computer Science, Department of EECS, KAIST


--xaMk4Io5JJdpkLEb
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Cygwin)

iD8DBQFAzwTueGASkZ411HcRArzUAKCh4G54EQg3ZWLrqaJTas93RqJMwQCgvPID
eIzVYt3T+A2VBxUPhLivNs4=
=vHqi
-----END PGP SIGNATURE-----

--xaMk4Io5JJdpkLEb--
