Message-ID: <4BD35527.8020101@jade.plala.or.jp>
Date: Sun, 25 Apr 2010 05:31:35 +0900
From: Yutaka Amanai
To: cygwin AT cygwin DOT com
Subject: Re: zsh 4.3.9-1: text-mode stdin problem (breaking base64)
References: <4BCEBF96 DOT 3030201 AT jade DOT plala DOT or DOT jp>

2010/04/24 10:03 Peter A. Castro wrote:
>> Could you give me a simple test case that fails without
>> cygwin_premain0()? I set my filesystems as text-mode and tried to find
>> such cases, but I couldn't.
>
> It's been a while since I've looked at this, but the problem was mostly
> with binary-mode mounts, not text-mode mounts. The problem was that,
> say, you had your root mounted as text-mode, but your /tmp mounted as
> binary-mode. Zsh (and other utilities) create temp files fairly often
> and feed those as input to itself or other programs. Or, reverse the
> case (root mounted binary and /tmp mounted text).
>
> {f}open() in Cygwin is context sensitive to the filesystem mount mode.
> This leads to such situations as calling fopen("/tmp/foo","r") and
> expecting it to read "text" lines, but "/tmp" is mounted binary and file
> "foo" contains CRLF's because it was created by a Windows program or
> editor.
> So, when you read the lines you will get the CR as well as the
> LF, when you really only want the LF. Whereas if "/tmp" was mounted
> text, the CR would be stripped off as part of text processing.

Thank you. Indeed, even someone who mounts root (or some other
filesystems) as text-mode might still mount /tmp as binary-mode. So I
see that we need to handle such cases.

>> I thought about two cases:
>> * If you don't use CRLF scripts at all and mount all your filesystems as
>> binary-mode, there should be no problem (without premain hack).
>
> In a pure Cygwin eco-system that might work. However, many Cygwin users
> have to interact with non-Cygwin created data and files. If you ask the
> good users on this mailing list you might find that people have any
> combination of file systems mounted for their particular needs.
>
>> * If you use CRLF scripts and mount all your filesystem as text-mode,
>> there should be no problem (without premain hack).
>
> But, now, you won't get binary data from the files using a naked "open()"
> as so many typically coded apps do.
>
>> Is it right?
>
> If you could keep things strictly black-and-white like that, yes, in
> theory these could work. Well, the first one would be preferable as
> opposed to the second one. But the problem is that most Cygwin users
> don't operate in such a strict environment.

I may have been shortsighted. In particular, I had not really considered
setups that use text-mode and binary-mode mounts at the same time.

>> I don't know well about zsh code, but I think it will be hard to do the
>> hack without cygwin_premain0(), as you said. But, how about bash? bash
>> seems not to have such hacks, but it seems to work well. And I think
>> it's confusing that bash and zsh treat stdin as different mode.
>
> Have a look at Bash code some time. I recall seeing some O_TEXT options
> being set in the various {f}open()'s that it does.
> Again, I looked at
> doing the same in Zsh code, but after some initial experiments it proved
> that there were too many dependencies and assumptions about the
> carriage-control of "text" files to make it work quickly.

I took a look at the Bash code and found that it sometimes opens file
handles in text-mode, although I didn't read it in detail. In any case,
Bash is apparently not perfect either (for example, it can't read CRLF
scripts on binary-mode filesystems), so I see that we can't say which
one is right.

>> Indeed, it's theoretically right that any programs which perform binary
>> I/O should set stdin/stdout as binary mode for portability. But
>> practically, it will be a heavy work to check that all programs on our
>> system follow the rule, and I think the check can't be perfect. I'd
>
> Regardless of how much work it might be, it's a matter of "due
> diligence". When you find something that doesn't behave appropriately,
> report it to the maintainers.
>
> And, in that vein, yes, I acknowledge there are issues with Zsh in this
> area. The premain is one "solution" that works for most cases. You
> appear to have found one case that doesn't work as expected
> (congratulations!). But, as I said, that particular case appears to be
> more a matter of that the Stdin handle should be treated as and work
> appropriately.
>
> This problem is still under consideration. Having more than one type of
> filesystem mode is part of the equation and attempting to treat that
> correctly is somewhat difficult in Zsh.

Yes, this is not as simple a problem as I thought. For now, if I find
another problem like the base64 one, I will report it.

>> rather keep all my scripts as LF than break my data by some programs
>> like base64, so I will continue to use the customized zsh.
>
> If that works for you, great. That's why the source is available.
> I do hope to get back to this issue at some point.
> Thanks for pointing it out.

My pleasure.
I understand that the public package needs the premain hack for now.
Thank you for considering my post, and thank you for maintaining the
Cygwin Zsh package.
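P.S. For anyone who wants to see the CRLF issue from this thread in code, here is a minimal sketch in C. It shows two things mentioned above: opening a file with an explicit "rb" so that (as far as I know) the mount mode cannot change the data, and dropping the CR of each CRLF pair by hand, which is roughly what a text-mode read would have done. The names read_raw and strip_cr_before_lf are hypothetical helpers made up for this illustration; they do not come from Zsh, Bash, or Cygwin, and this is not what the premain hack does.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: read up to cap bytes using an explicit "rb",
   so the bytes arrive unchanged whether the mount is text or binary.
   Returns the number of bytes read, or 0 if the file cannot be opened. */
size_t read_raw(const char *path, char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    size_t n = fread(buf, 1, cap, fp);
    fclose(fp);
    return n;
}

/* Hypothetical helper: remove every CR that is directly followed by LF.
   A lone CR is kept. Works in place and returns the new length. */
size_t strip_cr_before_lf(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;           /* skip the CR of a CRLF pair */
        buf[out++] = buf[i];
    }
    return out;
}
```

A caller that wants "text" lines from a file on a binary-mode mount could call read_raw() and then strip_cr_before_lf() on the result. A caller that handles binary data (as base64 does) would use read_raw() alone and leave the bytes untouched.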