Mail Archives: cygwin/2010/01/23/10:51:42

In-Reply-To: <20100123150703.GY2402@calimero.vinschen.de>
References: <e22ab97b1001222149r3c217decmb0da069d7049c896 AT mail DOT gmail DOT com> <20100123135020 DOT GW2402 AT calimero DOT vinschen DOT de> <20100123150703 DOT GY2402 AT calimero DOT vinschen DOT de>
Date: Sat, 23 Jan 2010 15:51:24 +0000
Message-ID: <416096c61001230751m308ac854x4f026b1f83b966d0@mail.gmail.com>
Subject: Re: Please support CP932. (I have problem using subversion with SJIS)
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: cygwin AT cygwin DOT com

On 23 January 2010 15:07, Corinna Vinschen:
> Ouch.  I understand now.  Standard SJIS is *really* different from
> Microsoft CP932 in two code points:
>
>  CP932 0x5c == U+005C
>  SJIS  0x5c == U+00A5
>
>  CP932 0x7e == U+007E
>  SJIS  0x7e == U+203E

Aargh! I wonder what that would do to DOS paths and stuff like ~username.
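
To make the concern concrete, here is a minimal sketch of what a strict SJIS decoding of the two code points would do to a DOS path and to ~username. It hard-codes only the mappings quoted above; the names SJIS_OVERRIDES and decode_7bit are hypothetical, and nothing here is Cygwin's actual conversion code. It also ignores multi-byte sequences entirely.

```python
# Hypothetical illustration: only the two differing code points from the
# table quoted above are modelled. This is NOT Cygwin's conversion code.
SJIS_OVERRIDES = {
    0x5C: "\u00A5",  # strict SJIS: YEN SIGN instead of backslash
    0x7E: "\u203E",  # strict SJIS: OVERLINE instead of tilde
}


def decode_7bit(data: bytes, strict_sjis: bool) -> str:
    """Decode 7-bit bytes, either CP932-style (plain ASCII) or strict SJIS."""
    out = []
    for b in data:
        if b >= 0x80:
            # Lead/trail bytes of double-byte characters are out of scope.
            raise ValueError("this sketch only handles bytes below 0x80")
        if strict_sjis and b in SJIS_OVERRIDES:
            out.append(SJIS_OVERRIDES[b])
        else:
            out.append(chr(b))
    return "".join(out)


# A DOS path and a tilde expansion, as raw bytes.
dos_path = b"C:\\cygwin\\home"
tilde = b"~andy"

print(decode_7bit(dos_path, strict_sjis=False))  # backslashes survive
print(decode_7bit(dos_path, strict_sjis=True))   # separators become yen signs
print(decode_7bit(tilde, strict_sjis=True))      # tilde becomes an overline
```

With strict_sjis=True, neither string contains a backslash or a tilde any more after conversion to Unicode, so anything that looks for those characters would stop recognising them.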

> Would it be a valid help for your case if Cygwin's SJIS conversion would
> convert 0x5c to U+00A5 and 0x7e to U+203E, so that the SJIS conversion
> would be really correct *and* bijective?

I think that's the correct thing to do, but it'll likely break other
stuff. Seems SJIS really isn't suited for Unix command line use. All
the more reason to make EUC-JP the default for "ja_JP" I guess.

>  To me this sounds like the
> better solution than adding a CP932 charset identifier.

I agree. Simply aliasing CP932 to SJIS is wrong, because they are
quite different character sets. Supporting CP932 as a charset in its
own right might be worth considering though, especially as that's the
standard charset on Japanese Cygwin 1.5.

Andy
