Mail Archives: cygwin/2010/04/21/05:04:41

X-Recipient: archive-cygwin AT delorie DOT com
X-SWARE-Spam-Status: No, hits=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Message-ID: <4BCEBF96.3030201@jade.plala.or.jp>
Date: Wed, 21 Apr 2010 18:04:22 +0900
From: Yutaka Amanai <yasai-itame1942 AT jade DOT plala DOT or DOT jp>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ja; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: zsh 4.3.9-1: text-mode stdin problem (breaking base64)
X-VirusScan: Outbound; msa03b; Wed, 21 Apr 2010 18:04:26 +0900
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

2010/04/21 2:12 Peter A. Castro wrote:
> Greetings, Yutaka,

Greetings, Peter. Thank you for your reply.

> The text-mode "hack" was created to solve a basic problem that zsh has
> with running scripts, in general, on Windows. Much of the code assumes
> that scripts have a single-character line terminator (eg: LF). So do
> many text-based programs and filters. Windows "native" line termination
> is (still) CRLF and zsh code does not deal well with this.
>
> Cygwin's text-mode munches the CR from the stream input leaving the LF
> which works well in 99% of the usage cases. Without it, Zsh treats the
> CR as part of the input line and tries to parse it as such leading to
> "Bad Things"(tm) happening. The same thing would be true of data read
> via the shell and passed to other programs as stdin. There's also some
> size calculations that only work with a single-character line terminator
> (at least in zsh code).

Could you give me a simple test case that fails without
cygwin_premain0()? I mounted my filesystems in text mode and tried to
find such a case, but I couldn't.

I thought about two cases:
* If you don't use CRLF scripts at all and mount all your filesystems as
  binary-mode, there should be no problem (without premain hack).
* If you use CRLF scripts and mount all your filesystems as text-mode,
  there should be no problem (without premain hack).
Is that right?
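
To make the two cases concrete, here is a minimal Python sketch of the
failure mode Peter describes. It is only a simulation under stated
assumptions: split_lines_lf_only() and text_mode_read() are hypothetical
helpers, not zsh or Cygwin code, and text mode is modelled simply as
"every CRLF pair becomes LF on read".

```python
def split_lines_lf_only(data: bytes) -> list[bytes]:
    """Split on LF only, as code assuming a one-character line terminator would."""
    return data.split(b"\n")


def text_mode_read(data: bytes) -> bytes:
    """Simulated text-mode read (assumption): collapse each CRLF pair to LF."""
    return data.replace(b"\r\n", b"\n")


# A two-line script saved with Windows (CRLF) line endings.
crlf_script = b"echo one\r\necho two\r\n"

# Binary-mode read: the CR stays attached to the end of every line.
binary_view = split_lines_lf_only(crlf_script)

# Text-mode read: the CR is stripped before the parser ever sees it.
text_view = split_lines_lf_only(text_mode_read(crlf_script))

print(binary_view)
print(text_view)
```

In the binary view each command ends with a stray CR, which is the
input a shell would then try to parse. In the text view the lines are
clean, which matches the second case above.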

> A while back I looked at making changes to somehow accommodate CRLF, but
> there are many places in the code that would require some heavy changes
> (some of which I'm still not certain would be correct) and would make it
> difficult to maintain. I doubt that Zsh base would accept such changes
> either as they would be an intrusive hack for Windows only support. By
> contrast the premain hack was elegant and global.
>
> I could have simply told people that they had to run scripts from a
> non-text-mode mount, that their /tmp had to also be on a non-text-mode
> mount and all data the scripts explicitly read from were also on a
> non-text-mode mount AND all scripts (and input data) must be non-CRLF.
> Think that would fly? Me neither. That was the basis for this "fix" in
> the first place.

I don't know the zsh code well, but I agree it would be hard to handle
CRLF without the cygwin_premain0() hack, as you said. But what about
bash? bash doesn't seem to have such a hack, yet it seems to work well.
I also find it confusing that bash and zsh treat stdin in different
modes.

> And how is base64's deficiency a zsh problem? Stdin/Stdout are "text"
> handles, which implies possible data manipulation along those lines.
> There's no guarantee that they would pass binary data.
>
> I believe that programs reading from stdin are supposed to assume the
> text-mode semantic for the handles and behave accordingly. You've
> mentioned "cat" and "gzip" doing that very thing. Think there might be a
> reason for that?

Indeed, in theory any program that performs binary I/O should set
stdin/stdout to binary mode for portability. In practice, though, it
would be a lot of work to check that every program on a system follows
that rule, and I don't think such a check can ever be complete. I'd
rather keep all my scripts LF-only than have my data broken by programs
like base64, so I will continue to use my customized zsh.
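
The kind of breakage I mean can be sketched in Python. This is a
simulation, not the actual Cygwin or coreutils code path:
text_mode_read() is a hypothetical helper that models a text-mode stdin
as "every CRLF pair becomes LF", and the payload bytes are made up for
illustration.

```python
import base64


def text_mode_read(data: bytes) -> bytes:
    """Simulated text-mode read (assumption): collapse each CRLF pair to LF."""
    return data.replace(b"\r\n", b"\n")


# Arbitrary binary payload that happens to contain a CR LF byte pair.
payload = bytes([0x89, 0x50, 0x0D, 0x0A, 0x1A, 0x0A])

# Encoder reading its input in binary mode: the data survives a round trip.
encoded_binary = base64.b64encode(payload)

# Encoder reading its input through a text-mode handle: one byte is lost
# before encoding even starts, so decoding cannot restore the original.
encoded_text = base64.b64encode(text_mode_read(payload))
roundtrip = base64.b64decode(encoded_text)

print(len(payload), len(roundtrip))
print(roundtrip == payload)
```

The encoder itself does nothing wrong here. The damage happens in the
read, which is why a program that handles binary data has to switch its
handles to binary mode explicitly.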

PS: As for base64, I will report the problem to the bug-coreutils list
later.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
