Mail Archives: cygwin/2010/04/20/13:13:33

Date: Tue, 20 Apr 2010 10:12:00 -0700 (PDT)
From: "Peter A. Castro" <doctor AT fruitbat DOT org>
To: Yutaka Amanai <yasai-itame1942 AT jade DOT plala DOT or DOT jp>
cc: cygwin AT cygwin DOT com
Subject: Re: zsh 4.3.9-1: text-mode stdin problem (breaking base64)
In-Reply-To: <4BC597F1.20104@jade.plala.or.jp>
Message-ID: <alpine.LNX.2.00.1004191651550.3743@gremlin.fruitbat.org>
References: <4BC597F1 DOT 20104 AT jade DOT plala DOT or DOT jp>
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
MIME-Version: 1.0

On Wed, 14 Apr 2010, Yutaka Amanai wrote:

Greetings, Yutaka,

> On Cygwin, zsh forces stdin to be text-mode. By this, some commands
> don't work correctly on zsh. For example, when you encode stdin with
> base64 on zsh, there is a possibility that base64 produces an incorrect
> result.

The text-mode "hack" was created to solve a basic problem that zsh has
with running scripts, in general, on Windows.  Much of the code assumes
that scripts have a single-character line terminator (e.g. LF).  So do
many text-based programs and filters.  Windows "native" line termination
is (still) CRLF and zsh code does not deal well with this.

Cygwin's text-mode strips the CR from the input stream, leaving the LF,
which works well in 99% of use cases.  Without it, Zsh treats the CR as
part of the input line and tries to parse it as such, leading to
"Bad Things"(tm) happening.  The same thing would be true of data read
via the shell and passed to other programs as stdin.  There are also
some size calculations that only work with a single-character line
terminator (at least in zsh code).
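
To illustrate the CR-as-part-of-the-line point, here is a minimal sketch
that should run on any POSIX system.  It uses plain sh as a stand-in for
zsh, and the function names are made up for the example.  The first
function feeds a one-line script with a CRLF terminator to the shell
unchanged; the second deletes the CR first, which is roughly what a
text-mode read would do.

```shell
#!/bin/sh
# Hypothetical helper: run a one-line script that ends in CRLF, unchanged.
# The CR is not a line terminator to the shell, so it becomes part of the
# last word on the line.
run_crlf_raw() {
    printf 'echo hello\r\n' | sh
}

# Hypothetical helper: same script, with CRs deleted before the shell
# sees it (a rough approximation of text-mode translation).
run_crlf_stripped() {
    printf 'echo hello\r\n' | tr -d '\r' | sh
}

# Dump both outputs byte by byte so a stray CR is visible.
run_crlf_raw | od -c
run_crlf_stripped | od -c
```

In the first dump the CR shows up after "hello" because the shell passed
it to echo as part of the argument.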

A while back I looked at making changes to somehow accommodate CRLF, but
there are many places in the code that would require some heavy changes
(some of which I'm still not certain would be correct) and would make it
difficult to maintain.  I doubt that upstream Zsh would accept such
changes either, as they would be an intrusive hack for Windows-only
support.  By contrast, the premain hack was elegant and global.

I could have simply told people that they had to run scripts from a
non-text-mode mount, that their /tmp also had to be on a non-text-mode
mount, that all data the scripts explicitly read also had to be on a
non-text-mode mount, AND that all scripts (and input data) must be
non-CRLF.  Think that would fly?  Me neither.  That was the basis for
this "fix" in the first place.

> I wrote a test case. Save the script below as 'test.sh':
> ----
> printf '\x0D\x0A' > crlf-src
> base64 < crlf-src | base64 -d > crlf-dst
> if cmp crlf-src crlf-dst >/dev/null; then
>  echo "OK"
> else
>  echo "NG"
> fi
> ----
> And try this:
> $ /bin/bash test.sh
> OK
> $ /bin/zsh test.sh
> NG

How about this instead:

   base64 crlf-src | base64 -d > crlf-dst

Or maybe:

   cat crlf-src | base64 | base64 -d > crlf-dst
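
Folded back into the test case, the first variant might look like the
sketch below.  The function name and the temporary directory are
additions for the example, and the octal escapes replace '\x0D\x0A'
only because not every printf understands \x.  base64 opens the file
itself here, so it does not inherit stdin from the shell.

```shell
#!/bin/sh
# Hypothetical wrapper around the test case: base64 is given the file
# name as an argument instead of reading a redirected stdin.
roundtrip_by_name() {
    dir=$(mktemp -d) || return 1

    # Two bytes: CR (octal 015) followed by LF (octal 012).
    printf '\015\012' > "$dir/crlf-src"

    # Encode by file name, then decode back from the pipe.
    base64 "$dir/crlf-src" | base64 -d > "$dir/crlf-dst"

    # cmp -s is silent and only reports through its exit status.
    if cmp -s "$dir/crlf-src" "$dir/crlf-dst"; then
        echo "OK"
    else
        echo "NG"
    fi

    rm -rf "$dir"
}

roundtrip_by_name
```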

> Many commands, such as cat and gzip, explicitly call freopen() or
> setmode() to set stdin as binary-mode. So, such commands work well even
> on zsh. But, base64 doesn't take such measures and doesn't work well on
> zsh. Some other commands might have the same problem.

And how is base64's deficiency a zsh problem?  Stdin/Stdout are "text"
handles, which implies possible data manipulation along those lines.
There's no guarantee that they would pass binary data.

I believe that programs reading from stdin are supposed to assume
text-mode semantics for the handles and behave accordingly.  You've
mentioned "cat" and "gzip" doing that very thing.  Think there might be a
reason for that?
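
What a text-mode stdin does to binary data can be approximated away from
Cygwin with the sketch below.  text_mode_read is a made-up stand-in that
simply deletes every CR; real text-mode translation is done by the
Cygwin DLL and is not identical to this filter.  The point is only that
the encoder sees different bytes, so it produces a different encoding.

```shell
#!/bin/sh
# Hypothetical stand-in for a text-mode read: drop every CR.
# (An approximation; Cygwin does its translation inside the DLL.)
text_mode_read() {
    tr -d '\r'
}

# The bytes CR LF reach base64 untouched.
encode_binary() {
    printf '\r\n' | base64
}

# The CR is removed before base64 reads its stdin.
encode_text() {
    printf '\r\n' | text_mode_read | base64
}

echo "binary stdin: $(encode_binary)"   # DQo=  (CR LF)
echo "text stdin:   $(encode_text)"     # Cg==  (LF only)
```

Decoding the second result gives back a lone LF, which is why the cmp in
the test case fails.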

> To fix the problem, it is the easiest way to simply erase
> cygwin_premain0() in main.c of zsh, and recompile zsh. If you don't
> mount any filesystem as text-mode, there will be no problem. But it
> seems that cygwin_premain0() is introduced for text-mode users. So, the
> solution I mentioned might not work well for text-mode users.

In fact, it's not a solution at all for that very reason.  Your "fix"
undoes the very thing that people complained about in the first place.
The fact that it works for you and your specific environmental setup does
not mean it will work for 99% of the rest of the users of Zsh in Cygwin.

As long as there are different mount types in Cygwin (and Windows uses
CRLF), this will continue to be a problem with no good solution.

As always, you are welcome to take a crack at re-working zsh code to work
better on Windows (I even mention this as a future direction in the
main.c comments), but this "fix" of yours is not it.

-- 
Peter A. Castro <doctor AT fruitbat DOT org> or <Peter DOT Castro AT oracle DOT com>
 	"Cats are just autistic Dogs" -- Dr. Tony Attwood

