Mail Archives: cygwin/2011/03/21/10:47:42

Message-ID: <4D8764BD.4060108@cwilson.fastmail.fm>
Date: Mon, 21 Mar 2011 10:46:21 -0400
From: Charles Wilson <cygwin AT cwilson DOT fastmail DOT fm>
Reply-To: Charles Wilson <cygwin AT cwilson DOT fastmail DOT fm>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: cygwin + GetConsoleOutputCP
References: <4D8651F2 DOT 3000200 AT cwilson DOT fastmail DOT fm> <AANLkTi=2pKTTo0+nUFa9Qaad7FxJwhhbQ5wJqtqtCpaw AT mail DOT gmail DOT com> <20110321111746 DOT GP31220 AT calimero DOT vinschen DOT de> <AANLkTimBFu3=4UCkKL=jraDLX00-MwhYpujm-vsRYsuc AT mail DOT gmail DOT com> <4D8756ED DOT 7010800 AT xs4all DOT nl>
In-Reply-To: <4D8756ED.7010800@xs4all.nl>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On 3/21/2011 9:47 AM, Erwin Waterlander wrote:
> I was doubting between GetACP and GetConsoleOutputCP on Windows. On a
> Western regional Windows the format of a text file is most likely CP1252.
> But the tool is named *dos*2unix, not *windows*2unix, therefore my
> choice was GetConsoleOutputCP.

Thanks for jumping in.

> Note that this tool is used by people who use the command-line. If you
> use 'edit' (in cmd.exe) to create a text file, it will use the console's
> DOS code page. Windows GUI users most probably don't even know what the
> Command Prompt is, so they will not use dos2unix at all.

Not true.  I've seen GUI IDE build rules that invoke d2u... Now, that
means the guy who WROTE the rule knows about the command line, but the
other team members who merely use the IDE to build /whatever/ probably
don't have a clue.

OTOH, the guy who wrote the rule probably won't want to make any
assumptions about "default code page" and would explicitly specify
anyway.  If he used -iso at all.

> I forgot what the standard code page was under Cygwin 1.5.
> Under cygwin 1.7 this functionality is not really needed. Perhaps it's
> handy to have in special cases, if you know what you are doing.
> 
> 
>>> In theory the option is not useful and should just go away. If you
>>> have to keep it for backward compatibility, stick to the current
>>> behaviour and outlaw its use, perhaps by printing a nagging warning
>>> to stderr.
>> ... and pointing them at iconv (which, to be fair, the -iso
>> description already does).

Ack.

> That is what the dos2unix manual page does, point to iconv. The
> functionality is only there for backward compatibility and to be
> compatible with SunOS dos2unix, after which the utility was modelled.
> The original author only implemented cp437 vs iso8859-1 conversion. I
> finished his job and also added the missing ones, and I added 1252 for
> ease of use. I'm aware that on Cygwin and modern Linuxes these
> conversions do not make much sense. This version of dos2unix is also used
> on DOS (16 and 32 bit), old Windows versions and even OS/2 Warp. The DOS
> and Windows versions are quite popular.

Hmm.  So for the cygwin version we need to pick the default DOS code
page for when the user specifies -iso without explicitly naming a code
page.  The options:
 1) follow the unix behavior: simply default to -437.
 2) follow the dos behavior: use GetConsoleOutputCP() as currently
    coded.
 3) Do something completely different, and use GetACP().
 4) Do something completely different, and use nl_langinfo or setlocale.
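
To make the four options concrete, here is a rough sketch of how the
choice could look in C.  This is NOT the actual dos2unix code: the enum
cp_policy and the functions default_dos_codepage and
codepage_from_codeset are names made up for illustration, and the
"CP<number>" parsing of the codeset string is an assumption about what
nl_langinfo returns.  Only GetConsoleOutputCP(), GetACP() and
nl_langinfo(CODESET) are real APIs.

```c
#include <ctype.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

#ifdef __CYGWIN__
#include <windows.h>   /* GetConsoleOutputCP(), GetACP() */
#endif

/* Hypothetical names: none of these identifiers come from dos2unix. */
enum cp_policy {
    CP_POLICY_UNIX,     /* option 1: always 437 */
    CP_POLICY_CONSOLE,  /* option 2: console output code page */
    CP_POLICY_ANSI,     /* option 3: Windows ANSI code page */
    CP_POLICY_LOCALE    /* option 4: derive from the locale's codeset */
};

/* Map a codeset name such as "CP850" to 850.  Anything that is not of
 * the form "CP<digits>" (e.g. "UTF-8") falls back to 437.  The naming
 * convention is an assumption, not something guaranteed by POSIX. */
unsigned int codepage_from_codeset(const char *codeset)
{
    if (codeset != NULL && strncmp(codeset, "CP", 2) == 0
        && isdigit((unsigned char)codeset[2]))
        return (unsigned int)strtoul(codeset + 2, NULL, 10);
    return 437u;
}

/* Pick the default DOS code page for -iso when none was given. */
unsigned int default_dos_codepage(enum cp_policy policy)
{
    switch (policy) {
#ifdef __CYGWIN__
    case CP_POLICY_CONSOLE: {
        /* GetConsoleOutputCP() returns 0 on failure. */
        unsigned int cp = GetConsoleOutputCP();
        return cp != 0 ? cp : 437u;
    }
    case CP_POLICY_ANSI:
        return GetACP();
#endif
    case CP_POLICY_LOCALE:
        /* Caller is expected to have called setlocale(LC_CTYPE, "")
         * already; otherwise this reports the "C" locale's codeset. */
        return codepage_from_codeset(nl_langinfo(CODESET));
    case CP_POLICY_UNIX:
    default:
        /* Outside Cygwin, options 2 and 3 also end up here. */
        return 437u;
    }
}
```

With something like this, default_dos_codepage(CP_POLICY_UNIX) always
yields 437 regardless of the console or locale, while the other three
policies give results that depend on the machine the tool runs on.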

I think #3 is a bad idea, since it creates a new semantic just /because/.

#4 would make sense if we were designing from scratch -- but then, we'd
really just be re-implementing iconv(1).  I don't think that's a
valuable exercise, and besides:

> There is no intention to add other conversions. And I don't plan to
> remove the options in the near future.

I think I lean towards #1 for this specific corner case.  In general,
cygwin ports should act like their unix counterparts, even if there is a
"native" win32 port with different semantics, IMO.

--
Chuck
