Mail Archives: cygwin/2010/01/24/04:55:53

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Sun, 24 Jan 2010 10:53:52 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Please support CP932. (I have problem using subversion with SJIS)
Message-ID: <20100124095352.GC2402@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <e22ab97b1001222149r3c217decmb0da069d7049c896 AT mail DOT gmail DOT com> <20100123135020 DOT GW2402 AT calimero DOT vinschen DOT de> <20100123150703 DOT GY2402 AT calimero DOT vinschen DOT de> <416096c61001230751m308ac854x4f026b1f83b966d0 AT mail DOT gmail DOT com> <20100123164546 DOT GZ2402 AT calimero DOT vinschen DOT de> <416096c61001231431u7e67cd37r2e741d0cb48c732f AT mail DOT gmail DOT com> <20100124093750 DOT GA2402 AT calimero DOT vinschen DOT de>
MIME-Version: 1.0
In-Reply-To: <20100124093750.GA2402@calimero.vinschen.de>
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Jan 24 10:37, Corinna Vinschen wrote:
> On Jan 23 22:31, Andy Koppe wrote:
> > Corinna Vinschen:
> > > I applied a patch which handles the characters 0x5c and 0x7e differently
> > > if the charset is set to "SJIS"
> > 
> > Something's going seriously wrong with this, and I'd suspect it's to
> > do with turning backslashes into yen symbols.
> 
> Right.  It occurred to me tonight that this will not work from a
> filesystem point-of-view.  The people who decided to overload backslash
> and tilde in the ASCII range with different symbols in SJIS still need
> some serious knock on their heads.  No wonder the Microsoft guys kept
> the binary values of characters intact, especially due to the backslash
> problem.
> 
> > Not sure what could be done about it. Remove SJIS support in favour of CP932?
> 
> In theory, we should be able to keep SJIS support in.  The
> Cygwin-internal function converting multibyte strings to Unicode
> filenames would have to use CP932.  Only on the application level the
> conversion would use SJIS.
> 
> There's no system API which takes wchar_t strings, so all strings are
> exchanged between application and system using multibyte strings.  Since
> the multibyte strings are the same, that should give a round-trip which
> still works for Win32 filenames:
> 
> Input string:  "\x5c\x7e"
> 
> Application:     mbstowcs ("\x5c\x7e")      ==> L"\x00a5\x203e"
>                  wcstombs (L"\x00a5\x203e") ==> "\x5c\x7e"
> 
> Cygwin:      sys_mbstowcs ("\x5c\x7e")      ==> L"\x005c\x007e"
>              sys_wcstombs (L"\x005c\x007e") ==> "\x5c\x7e"

...and, if we implement it that way, do we really still need support
for a "CP932" charset?
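
To make the two-level idea concrete, here is a minimal C sketch.  It is
not Cygwin code: the names view, decode_byte, encode_wc, toy_mbstowcs,
toy_wcstombs and roundtrip_same are made up for illustration, only the
7-bit range is handled, and it assumes the two overloaded bytes are 0x5c
and 0x7e.  The SJIS view maps them to U+00A5 and U+203E, the CP932 view
keeps the ASCII code points, and both views hand back the same bytes.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Which interpretation of the two overloaded bytes to use. */
enum view { VIEW_SJIS, VIEW_CP932 };

/* Decode one 7-bit byte.  Hypothetical helper, not a Cygwin function. */
static wchar_t decode_byte(enum view v, unsigned char b)
{
    if (v == VIEW_SJIS) {
        if (b == 0x5c) return (wchar_t) 0x00a5;  /* YEN SIGN */
        if (b == 0x7e) return (wchar_t) 0x203e;  /* OVERLINE */
    }
    return (wchar_t) b;  /* CP932 view keeps the ASCII code point */
}

/* Encode one wide character.  Lenient: anything outside the 7-bit
   range that is not one of the two special symbols becomes '?'. */
static unsigned char encode_wc(enum view v, wchar_t wc)
{
    if (v == VIEW_SJIS) {
        if (wc == 0x00a5) return 0x5c;
        if (wc == 0x203e) return 0x7e;
    }
    return (wc >= 0 && wc < 0x80) ? (unsigned char) wc : (unsigned char) '?';
}

/* Convert at most n-1 characters and always terminate dst. */
static size_t toy_mbstowcs(enum view v, wchar_t *dst, const char *src, size_t n)
{
    size_t i = 0;
    if (n == 0) return 0;
    for (; src[i] != '\0' && i + 1 < n; i++)
        dst[i] = decode_byte(v, (unsigned char) src[i]);
    dst[i] = L'\0';
    return i;
}

static size_t toy_wcstombs(enum view v, char *dst, const wchar_t *src, size_t n)
{
    size_t i = 0;
    if (n == 0) return 0;
    for (; src[i] != L'\0' && i + 1 < n; i++)
        dst[i] = (char) encode_wc(v, src[i]);
    dst[i] = '\0';
    return i;
}

/* Returns 1 if bytes -> wide -> bytes gives back the input unchanged. */
static int roundtrip_same(enum view v, const char *in)
{
    wchar_t wide[64];
    char back[64];
    toy_mbstowcs(v, wide, in, 64);
    toy_wcstombs(v, back, wide, 64);
    return strcmp(in, back) == 0;
}
```

Calling roundtrip_same (VIEW_SJIS, "\x5c\x7e") and roundtrip_same
(VIEW_CP932, "\x5c\x7e") should both succeed, although the wide strings
in between differ.  That is the property the scheme relies on: the
application and the system never exchange wide strings, only the bytes.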

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
