Date: Tue, 20 Apr 2010 10:12:00 -0700 (PDT)
From: "Peter A. Castro"
To: Yutaka Amanai
cc: cygwin AT cygwin DOT com
Subject: Re: zsh 4.3.9-1: text-mode stdin problem (breaking base64)
In-Reply-To: <4BC597F1.20104@jade.plala.or.jp>

On Wed, 14 Apr 2010, Yutaka Amanai wrote:

Greetings, Yutaka,

> On Cygwin, zsh forces stdin to be text-mode. By this, some commands
> don't work correctly on zsh. For example, when you encode stdin with
> base64 on zsh, there is a possibility that base64 produces an incorrect
> result.

The text-mode "hack" was created to solve a basic problem that zsh has
with running scripts, in general, on Windows. Much of the code assumes
that scripts have a single-character line terminator (e.g. LF). So do
many text-based programs and filters. Windows "native" line termination
is (still) CRLF, and zsh code does not deal well with this. Cygwin's
text-mode munches the CR from the input stream, leaving the LF, which
works well in 99% of use cases. Without it, Zsh treats the CR as part of
the input line and tries to parse it as such, leading to "Bad Things"(tm)
happening. The same thing would be true of data read via the shell and
passed to other programs as stdin. There are also some size calculations
that only work with a single-character line terminator (at least in zsh
code).
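
To make the CR-munching effect concrete, here is a minimal sketch that
can be run on any system with base64 and tr. It assumes that
`tr -d '\r'` is an acceptable stand-in for a text-mode stdin that drops
CR; that is only a simulation, not how Cygwin implements text mode, and
the variable names are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: tr -d '\r' stands in for a text-mode stdin that drops CR.
# This simulates the effect; it is not Cygwin's actual mechanism.

# Encode the two bytes CR LF exactly as they are (binary-mode view).
binary_b64=$(printf '\r\n' | base64)

# Encode the same bytes after the CR has been stripped (text-mode view).
text_b64=$(printf '\r\n' | tr -d '\r' | base64)

echo "binary-mode stdin: $binary_b64"   # DQo=
echo "text-mode stdin:   $text_b64"     # Cg==

# Decode the text-mode encoding again and count the bytes that come back.
text_len=$(printf '%s' "$text_b64" | base64 -d | wc -c | tr -d ' ')
echo "bytes after round trip: $text_len (the source had 2)"
```

The two encodings differ because the encoder never sees the CR, so a
decode of the second one yields one byte instead of the original two.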
A while back I looked at making changes to somehow accommodate CRLF, but
there are many places in the code that would require some heavy changes
(some of which I'm still not certain would be correct) and would make it
difficult to maintain. I doubt that Zsh base would accept such changes
either, as they would be an intrusive hack for Windows-only support. By
contrast, the premain hack was elegant and global.

I could have simply told people that they had to run scripts from a
non-text-mode mount, that their /tmp had to also be on a non-text-mode
mount and all data the scripts explicitly read from were also on a
non-text-mode mount AND all scripts (and input data) must be non-CRLF.
Think that would fly? Me neither. That was the basis for this "fix" in
the first place.

> I wrote a test case. Save the script below as 'test.sh':
> ----
> printf '\x0D\x0A' > crlf-src
> base64 < crlf-src | base64 -d > crlf-dst
> if cmp crlf-src crlf-dst >/dev/null; then
>   echo "OK"
> else
>   echo "NG"
> fi
> ----
> And try this:
> $ /bin/bash test.sh
> OK
> $ /bin/zsh test.sh
> NG

How about this instead:

  base64 crlf-src | base64 -d > crlf-dst

Or maybe:

  cat crlf-src | base64 | base64 -d > crlf-dst

> Many commands, such as cat and gzip, explicitly call freopen() or
> setmode() to set stdin as binary-mode. So, such commands work well even
> on zsh. But, base64 doesn't take such measures and doesn't work well on
> zsh. Some other commands might have the same problem.

And how is base64's deficiency a zsh problem? Stdin/Stdout are "text"
handles, which implies possible data manipulation along those lines.
There's no guarantee that they would pass binary data. I believe that
programs reading from stdin are supposed to assume the text-mode semantic
for the handles and behave accordingly. You've mentioned "cat" and "gzip"
doing that very thing. Think there might be a reason for that?

> To fix the problem, it is the easiest way to simply erase
> cygwin_premain0() in main.c of zsh, and recompile zsh.
> If you don't mount any filesystem as text-mode, there will be no
> problem. But it seems that cygwin_premain0() is introduced for
> text-mode users. So, the solution I mentioned might not work well for
> text-mode users.

In fact, it's not a solution at all for that very reason. Your "fix"
undoes the very thing that people complained about in the first place.
The fact that it works for you and your specific environmental setup does
not mean it will work for 99% of the rest of the users of Zsh in Cygwin.
As long as there are different mount types in Cygwin (and Windows uses
CRLF), this will continue to be a problem with no good solution.

As always, you are welcome to take a crack at re-working zsh code to work
better on Windows (I even mention this as a future direction in the
main.c comments), but this "fix" of yours is not it.

-- 
Peter A. Castro
	"Cats are just autistic Dogs" -- Dr. Tony Attwood