Message-ID: <40C27FC5.1080700@athensgroup.com>
Date: Sat, 05 Jun 2004 21:21:57 -0500
From: James Garrison
Organization: Athens Group, Inc.
To: cygwin AT cygwin DOT com
Subject: Re: rxvt, ssh and utf8 - partial success
In-Reply-To: <40C2620F.9010009@athensgroup.com>

James Garrison wrote:
> Baurjan Ismagulov wrote:
>
>> On Fri, Jun 04, 2004 at 10:19:01AM -0700, Brian Dessent wrote:
>>
>>> I'd love to know why one or the other terminal setting can't just work
>>> for everything.
>>
>> This happens due to the following difference in the terminfo entries:
>>
>> - acsc=``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
>> + acsc=+\257\,\256-\^0\333`\004a\261f\370g\361h\260j\331k\277l\332m\300n\305o~p\304q\304r\304s_t\303u\264v\301w\302x\263y\363z\362{\343|\330}\234~\376,
>>
>> Handling this in rxvt should solve your problem (I wonder if there are
>> any reasons not to do that).
>
> Not sure what you mean by 'handling'.  Are you saying rxvt needs to be
> modified, or just terminfo?

I uploaded the rxvt-cygwin terminfo file from Cygwin onto the Linux
system (into ~/.terminfo/r/rxvt-cygwin).
Setting TERM=rxvt-cygwin now allows the curses-based program to draw
boxes correctly, but apparently some substitution is going on because
it's using plain old hyphens and vertical bars for lines and plus signs
for corners.

Terminfo doesn't seem to know anything about Unicode as far as I can
tell (or does it?).  That leads to the question of how putting the
terminfo file on the Linux system caused the curses-based program to
output single ASCII characters where previously it was sending Unicode
sequences... something understood how to interpret the Unicode
box-drawing characters and replaced them with the nearest ASCII
matches "+", "|" and "-".

However, this IS NOT happening with Unicode quote characters.  Here's
a snippet from the man page for terminfo itself, as displayed:

> Entries in terminfo consist of a sequence of ‘,’ separated fields
> (embedded commas may be escaped with a backslash or notated as \054).
> White space after the ‘,’ separator is ignored.  The first entry for
> each terminal gives the names which are known for the terminal, sepa-
> rated by ‘|’ characters.  The first name given is the most common

Those "’" sequences turn out to be \xE2\x80\x99, which is the UTF-8
encoding of the Unicode character "Right Single Quotation Mark"
(U+2019).
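The byte sequence reported above can be checked independently of any terminal. The short Python sketch below is not part of the original message; it only assumes a Python 3 interpreter. It decodes the three bytes as UTF-8 and, for comparison, prints the UTF-8 encodings of both single quotation marks, which differ only in the last byte.

```python
import unicodedata

# The three bytes reported in the message.
observed = b"\xe2\x80\x99"

# Decode them as UTF-8 and look up the resulting code point and its name.
ch = observed.decode("utf-8")
print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# For comparison: the UTF-8 encodings of the left and right single quotes.
for cp in (0x2018, 0x2019):
    encoded = chr(cp).encode("utf-8")
    print(f"U+{cp:04X}", " ".join(f"{b:02X}" for b in encoded))
```

If a terminal shows these bytes as three separate glyphs, that would suggest it is not decoding its input as UTF-8, though that is only an inference from the byte values.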
Here's the full terminfo entry (decompiled with infocmp):

> # Reconstructed via infocmp from file: /home/jhg/.terminfo/r/rxvt-cygwin
> rxvt-cygwin|rxvt terminal emulator (X Window System) on cygwin,
> am, bce, xenl, eo, km, mir, msgr, xon,
> cols#80, it#8, lines#24, colors#8, pairs#64,
> acsc=+\257\,\256-\^0\333`\004a\261f\370g\361h\260j\331k\277l\332m\300n\305o~p\304q\304r\304s_t\303u\264v\301w\302x\263y\363z\362{\343|\330}\234~\376,
> bel=^G, cr=^M, csr=\E[%i%p1%d;%p2%dr, tbc=\E[3g,
> clear=\E[H\E[2J, el1=\E[1K, el=\E[K, ed=\E[J,
> hpa=\E[%i%p1%dG, cup=\E[%i%p1%d;%p2%dH, cud1=^J,
> home=\E[H, civis=\E[?25l, cub1=^H, cnorm=\E[?25h,
> cuf1=\E[C, cuu1=\E[A, cvvis=\E[?25h, dch1=\E[P, dl1=\E[M,
> enacs=\E(B\E)0, smacs=^N, blink=\E[5m, bold=\E[1m,
> smcup=\E7\E[?47h, smir=\E[4h, rev=\E[7m, smso=\E[7m,
> smul=\E[4m, rmacs=^O, sgr0=\E[m\017,
> rmcup=\E[2J\E[?47l\E8, rmir=\E[4l, rmso=\E[27m,
> rmul=\E[24m, flash=\E[?5h\E[?5l, is1=\E[?47l\E=\E[?1l,
> is2=\E[r\E[m\E[2J\E[H\E[?7h\E[?1;3;4;6l\E[4l,
> ich1=\E[@, il1=\E[L, ka1=\EOw, ka3=\EOy, kb2=\EOu, kbs=^H,
> kcbt=\E[Z, kc1=\EOq, kc3=\EOs, kdch1=\E[3~, kcud1=\E[B,
> kend=\E[8~, kent=\EOM, kel=\E[8\^, kf0=\E[21~, kf1=\E[11~,
> kf10=\E[21~, kf11=\E[23~, kf12=\E[24~, kf13=\E[25~,
> kf14=\E[26~, kf15=\E[28~, kf16=\E[29~, kf17=\E[31~,
> kf18=\E[32~, kf19=\E[33~, kf2=\E[12~, kf20=\E[34~,
> kf3=\E[13~, kf4=\E[14~, kf5=\E[15~, kf6=\E[17~, kf7=\E[18~,
> kf8=\E[19~, kf9=\E[20~, kfnd=\E[1~, khome=\E[7~,
> kich1=\E[2~, kcub1=\E[D, kmous=\E[M, knp=\E[6~, kpp=\E[5~,
> kcuf1=\E[C, kDC=\E[3$, kslt=\E[4~, kEND=\E[8$, kHOM=\E[7$,
> kLFT=\E[d, kNXT=\E[6$, kPRV=\E[5$, kRIT=\E[c, kcuu1=\E[A,
> rmkx=\E>, smkx=\E=, op=\E[39;49m, dch=\E[%p1%dP,
> dl=\E[%p1%dM, cud=\E[%p1%dB, ich=\E[%p1%d@, il=\E[%p1%dL,
> cub=\E[%p1%dD, cuf=\E[%p1%dC, cuu=\E[%p1%dA,
> rs1=\E>\E[1;3;4;5;6l\E[?7h\E[m\E[r\E[2J\E[H,
> rs2=\E[r\E[m\E[2J\E[H\E[?7h\E[?1;3;4;6l\E[4l\E>,
> rc=\E8, vpa=\E[%i%p1%dd, sc=\E7, ind=^J, ri=\EM, s0ds=\E(B,
> s1ds=\E(0, setab=\E[4%p1%dm,
> setaf=\E[3%p1%dm, hts=\EH,
> ht=^I,

Any insight into what's going on and how to make it work correctly
would be greatly appreciated.

-- 
James Garrison                          Athens Group, Inc.
mailto:jhg AT athensgroup DOT com       5608 Parkcrest Dr
http://www.athensgroup.com              Austin, TX 78731
PGP: RSA=0x92E90A3B DH/DSS=0x498D331C   (512) 345-0600 x150
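
For readers following the acsc difference quoted at the top of this message: the capability is a flat list of character pairs, each a VT100 line-drawing name (q for a horizontal line, x for a vertical line, l, k, m and j for the corners) followed by the character the terminal should be sent for it. The Python sketch below is not part of the original post; it unpacks both strings from the quoted diff so they can be compared. The helper names unescape and acs_map are made up for illustration, and the unescaper handles only the backslash escapes that appear in these two strings, not the full terminfo source syntax.

```python
# The two acsc values from the quoted diff, as terminfo source text.
ACSC_MINUS = r"``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~"
ACSC_RXVT_CYGWIN = (
    r"+\257\,\256-\^0\333`\004a\261f\370g\361h\260j\331k\277l\332m\300"
    r"n\305o~p\304q\304r\304s_t\303u\264v\301w\302x\263y\363z\362"
    r"{\343|\330}\234~\376"
)


def unescape(src: str) -> bytes:
    """Decode a subset of terminfo source escapes: \\ooo octal, \\, \\^ and \\\\.

    Hypothetical helper: anything else (for example \\E or ^X control
    notation) raises ValueError instead of being guessed at.
    """
    out = bytearray()
    i = 0
    while i < len(src):
        ch = src[i]
        if ch == "^":
            raise ValueError("bare ^X control notation is not handled here")
        if ch != "\\":
            out.append(ord(ch))
            i += 1
            continue
        rest = src[i + 1:i + 4]
        digits = ""
        for d in rest:  # collect up to three octal digits
            if d not in "01234567":
                break
            digits += d
        if digits:
            out.append(int(digits, 8))
            i += 1 + len(digits)
        elif rest[:1] in (",", "^", "\\"):
            out.append(ord(rest[0]))  # escaped literal character
            i += 2
        else:
            raise ValueError(f"unsupported escape at offset {i}")
    return bytes(out)


def acs_map(acsc_src: str) -> dict:
    """Split an acsc string into {line-drawing name: byte value to send}."""
    raw = unescape(acsc_src)
    if len(raw) % 2:
        raise ValueError("acsc must contain an even number of characters")
    return {chr(raw[i]): raw[i + 1] for i in range(0, len(raw), 2)}


minus = acs_map(ACSC_MINUS)
cygwin = acs_map(ACSC_RXVT_CYGWIN)

# Compare the corner and line entries side by side (values shown in octal).
for name in "lkmjqx":
    print(name, oct(minus[name]), oct(cygwin[name]))
```

In the "-" entry every name maps to itself, which is the VT100 convention; in the rxvt-cygwin entry the same names map to high-bit bytes such as \304 and \263, so what appears on screen depends on the font and character set the terminal is actually using.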