Mail Archives: cygwin-developers/2000/03/07/15:23:32

Mailing-List: contact cygwin-developers-help AT sourceware DOT cygnus DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-developers-subscribe AT sourceware DOT cygnus DOT com>
List-Archive: <http://sourceware.cygnus.com/ml/cygwin-developers/>
List-Post: <mailto:cygwin-developers AT sourceware DOT cygnus DOT com>
List-Help: <mailto:cygwin-developers-help AT sourceware DOT cygnus DOT com>, <http://sourceware.cygnus.com/ml/#faqs>
Sender: cygwin-developers-owner AT sourceware DOT cygnus DOT com
Delivered-To: mailing list cygwin-developers AT sourceware DOT cygnus DOT com
Date: Tue, 7 Mar 2000 21:22:46 +0100 (MET)
From: Joerg Schilling <schilling AT fokus DOT gmd DOT de>
Message-Id: <200003072022.VAA25434@fokus.gmd.de>
To: cygwin-developers AT sourceware DOT cygnus DOT com
Subject: Re: Character sets in win32 and cygwin

>From: DJ Delorie <dj AT delorie DOT com>

>> While I do want cygwin to be as robust as possible, it is not likely that
>> I (or DJ, I assume) will have enough time to investigate something that
>> takes as much setup as this seems to entail.

>I'll trade.  I'll spend more time on random cygwin things, if someone
>else volunteers to paint my house.

Agreed, nice idea: so you would probably like to clean my rooms to give me
a few more minutes to work on open source software ;-) I spend all of my
free time on CD recording and have no extra time to debug other people's
software.

Some other notes: 

-	The problem may be observed with the German versions of Win95
	and WNT.

-	cdrecord -version prints on Solaris:

	Cdrecord-ProDVD 1.8 (sparc-sun-solaris2.4) Copyright (C) 1995-2000 Jörg Schilling

	The same source compiled on Cygwin prints:

	Cdrecord-ProDVD 1.8 (i586-pc-cygwin) Copyright (C) 1995-2000 J÷rg Schilling

From a German CD recording newsgroup, I got the information
that Unicode is converted to the "OEM" character set for DOS box
applications. If this is true for Cygwin too, Cygwin will not be usable with
character sets beyond 7-bit ASCII, as the code pages are different and not
visible from a POSIX application.
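
The symptom above is consistent with that explanation. As a minimal sketch
(not taken from cdrecord or Cygwin), assume the program emits ISO-8859-1
bytes and the console interprets them in OEM code page 850; the helper name
show_as_oem and both code page choices are assumptions made for illustration:

```python
# Sketch: one byte string, read under two different code pages.
# Assumptions: the program writes ISO-8859-1, the console decodes as cp850
# (cp437 maps this particular byte the same way).

def show_as_oem(text, source="iso-8859-1", oem="cp850"):
    """Encode text as the program would, then decode it as an OEM console would."""
    raw = text.encode(source)   # bytes the program actually writes
    return raw.decode(oem)      # characters the console displays

name = "J\u00f6rg"                       # "Joerg" with o-umlaut (U+00F6)
raw = name.encode("iso-8859-1")          # o-umlaut becomes the single byte 0xF6
shown = show_as_oem(name)                # 0xF6 is the division sign in cp850

print(raw)                               # b'J\xf6rg'
print(ascii(shown))                      # 'J\xf7rg' (U+00F7 is the division sign)
```

Under these assumptions the o-umlaut comes out as the division sign, which
matches the Cygwin output shown above.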

Here is what I got in addition:

The reference says:

Console Code Pages

"A code page is a mapping of 256 character codes to individual
characters. Different code pages include different special characters,
typically customized for a language or a group of languages. 

Associated with each console are two code pages: one for input and one
for output."



Console Application Issues

"The 8-bit console functions use the OEM code page. All other
functions use the ANSI code page by default. This means that strings
returned by the console functions may not be processed correctly by
the other functions and vice versa. For example, if FindFirstFileA
returns a string that contains certain extended ANSI characters,
WriteConsoleA will not display the string properly. 

The best long-term solution for a console application is to use
Unicode. Barring that solution, a console application should use the
SetFileApisToOEM function. That function changes relevant Win32 file
functions so that they produce OEM character set strings rather than
ANSI character set strings. "




And so the programmer gets to guess which functions are the 8-bit console
functions, what works with Unicode, and what can be switched over with
SetFileApisToOEM.
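
To illustrate the ANSI/OEM mismatch the quoted reference describes, here is a
small sketch of re-encoding a string from the ANSI code page to the OEM code
page before it is written to a console. It assumes cp1252 as the ANSI code
page and cp850 as the OEM code page, which is the usual pairing on a German
Windows installation; the function name ansi_to_oem is hypothetical. On Win32
itself, CharToOemA performs a comparable conversion.

```python
# Sketch: convert ANSI-encoded bytes to OEM-encoded bytes.
# Assumptions: ANSI = cp1252, OEM = cp850; real systems may differ.

def ansi_to_oem(data, ansi="cp1252", oem="cp850"):
    """Re-encode bytes from the ANSI code page to the OEM code page.

    Bytes or characters with no mapping are replaced rather than
    raising an error, so the result is always printable.
    """
    text = data.decode(ansi, errors="replace")
    return text.encode(oem, errors="replace")

ansi_bytes = b"J\xf6rg"                  # o-umlaut is 0xF6 in cp1252
oem_bytes = ansi_to_oem(ansi_bytes)      # o-umlaut is 0x94 in cp850

print(ansi_bytes)                        # b'J\xf6rg'
print(oem_bytes)                         # b'J\x94rg'
```

A console that decodes oem_bytes as cp850 would then show the o-umlaut
correctly, whereas the unconverted ansi_bytes would show a division sign.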

Jörg

 EMail:joerg AT schily DOT isdn DOT cs DOT tu-berlin DOT de (home) Jörg Schilling D-13353 Berlin
       js AT cs DOT tu-berlin DOT de		(uni)  If you don't have iso-8859-1
       schilling AT fokus DOT gmd DOT de		(work) chars I am J"org Schilling
 URL:  http://www.fokus.gmd.de/usr/schilling   ftp://ftp.fokus.gmd.de/pub/unix
