Date: Thu, 24 Feb 2011 12:14:47 +0100
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: Mg3a - a version of Mg2a developed on Cygwin
Message-ID: <20110224111447.GS9392@calimero.vinschen.de>

On Feb 24 11:56, Bengt Larsson wrote:
> Corinna Vinschen wrote:
> >Just a hint:
> >
> >When on Cygwin, you might better use Cygwin's(*) wcwidth function. It's
> >based on the same code from Markus Kuhn, but it interacts with the
> >setlocale function to make sure that the width returned for the CJK
> >ambiguous width characters makes sense in the given locale. Plus, it
> >supports a Cygwin-specific locale modifier '@cjknarrow' which allows the
> >user to modify this behaviour. When using your own wcwidth, you're
> >giving up on this feature.
> >
> >Better yet, convert wide chars to wide strings and use the wcswidth
> >function. In contrast to wcwidth, it can also handle surrogate pairs.
>
> I don't use surrogates. I only use UTF-8 and UTF-32. But using cygwin's
> wcwidth may be worth thinking about. I suppose it will be consistent
> with mintty that way; otherwise not?

Yes, I think Andy uses the system functions as well.

As for the wide char representation, wchar_t is UTF-16 on Cygwin, as on
the underlying Windows, so surrogate pairs are always possible. You can't
use the libc wide char functions if you assume that wchar_t holds UTF-32.

> Using wcswidth isn't very useful in the editor because it has special
> requirements, like showing control characters with ^C.

Well, it's not really such a big problem to special-case wide char
control values and just call wcswidth otherwise...

> That's one of the
> reasons I mostly wrote my own, all the special requirements. I always
> iterate over bytes, which are converted in a mode-dependent way to ints
> (UTF-32). Do you have a case-insensitive compare? Because I limited that
> to ASCII.

wcscasecmp and wcsncasecmp are both available. But obviously they
operate on UTF-16, since wchar_t is UTF-16.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
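
For illustration, the suggestion above (special-case control values, call
wcswidth for everything else) could be sketched roughly as follows. This is
only a sketch under assumptions, not code from Mg3a or Cygwin: the helper
names cp_to_wchars and editor_cell_width are hypothetical, the two-cell
"^C" caret convention is taken from the quoted description, and returning -1
for invalid or non-printable input is an arbitrary choice. The editor's
UTF-32 code point is converted into wchar_t units first, so that a surrogate
pair is produced where wchar_t is 16 bits wide, as on Cygwin.

```c
#define _XOPEN_SOURCE 700   /* ask for the XSI declaration of wcswidth() */
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/*
 * Hypothetical helper: store one UTF-32 code point as a NUL-terminated
 * wchar_t string. Returns the number of wchar_t units written (1 or 2),
 * or 0 if cp is not a valid Unicode scalar value.
 */
static size_t cp_to_wchars(uint32_t cp, wchar_t out[3])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
#if WCHAR_MAX <= 0xFFFF
    /* 16-bit wchar_t (UTF-16, e.g. Cygwin): split into a surrogate pair. */
    if (cp >= 0x10000) {
        cp -= 0x10000;
        out[0] = (wchar_t)(0xD800 + (cp >> 10));
        out[1] = (wchar_t)(0xDC00 + (cp & 0x3FF));
        out[2] = L'\0';
        return 2;
    }
#endif
    out[0] = (wchar_t)cp;
    out[1] = L'\0';
    return 1;
}

/*
 * Hypothetical helper: number of screen cells the editor uses for cp.
 * ASCII control characters are shown in caret notation ("^C"), so they
 * take two cells; everything else is left to the C library. Returns -1
 * for invalid code points and for characters wcswidth() reports as
 * non-printable in the current locale.
 */
static int editor_cell_width(uint32_t cp)
{
    wchar_t buf[3];
    size_t n;

    if (cp < 0x20 || cp == 0x7F)
        return 2;

    n = cp_to_wchars(cp, buf);
    if (n == 0)
        return -1;

    return wcswidth(buf, n);
}
```

For the locale-dependent behaviour described earlier in the thread (CJK
ambiguous widths, '@cjknarrow') to take effect, the program would presumably
need to call setlocale(LC_ALL, "") once at startup before using such a
helper.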