To: cygwin AT cygwin DOT com
From: mwoehlke <mwoehlke AT tibco DOT com>
Subject: Re: bash-3.1-7 BUG
Date: Wed, 13 Sep 2006 17:08:00 -0500
Message-ID: <ee9vg0$qeb$1@sea.gmane.org>
In-Reply-To: <loom.20060913T234039-426@post.gmane.org>

Eric Blake wrote:
> mwoehlke <mwoehlke <at> tibco.com> writes:
>
>>> Would it be possible to do this dynamically (instead of keying off of
>>> mounts, etc.): if the first line of the file read by bash has a \r\n,
>>> use text-mode (1-char-at-a-time) semantics, else use binary semantics
>>> (lseek)?
>> I hate to say this, but... if bash goes this route, could it be a shopt?
>> I would rather know that my scripts are broken (DOS-format).
>>
>
> Thanks for the ideas; here's what I'll try.
> Bash does indeed already scan the
> first line (I'm not sure if it is the first line or the first 80 characters
> or what it is exactly, but I do know it scans) to see if it detects any NUL
> bytes, at which point it complains the file is binary and not a script. So I
> can probably hack that scan to also look for \r. So first I will open the
> file according to the mount point rules. If the file is text mode, perform
> the scan in binary mode, and if any \r is seen, revert to text mode and no
> lseeks. If the scan in binary mode succeeds, then leave the file in binary
> mode, assuming that the file is unix format even though it is on a text
> mount, and that lseeks will work. If the file starts life in binary mode
> (i.e. was on a binary mount), skip the check for \r in the scan (under the
> assumption that on a binary mount, \r is intentional and not a line ending
> to be collapsed), and use lseeks. No guarantees on whether this will pan
> out, or be bigger than I thought, but hopefully you will see a bash 3.1-8
> with these semantics soon.

Sounds good! That will satisfy my request to not silently work on files
that should be broken. :-)

Alternatively (and I kind of hate to say this :-)), now that I think of
it, you might want to talk to Rodney over at the Interix forums. I recall
hearing that the Interix version of bash actually handles files with a
mix of DOS and UNIX line endings (which may not be the best thing to do,
but might be worth investigating). I would imagine that version is always
reading in binary (I don't think Interix, like UNIX but unlike Cygwin,
ever had a 'text mode' concept). There might even be an official patch
for this that just needs to be flipped on for Cygwin (or maybe the two of
you can petition to make it an official patch).

-- 
Matthew
41% of all statistics are made up on the spot.
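
For reference, the decision rule Eric describes in the quoted text could be sketched roughly as below. This is only an illustration in Python, not bash's actual C code: the function name choose_read_mode, the three return labels, and the 80-byte SCAN_LIMIT are all assumptions made for the example (the thread itself is unsure how much of the file bash scans).

```python
# Hypothetical sketch of the proposed scan; none of these names come from bash.
SCAN_LIMIT = 80  # assumption: the thread does not say how many bytes are scanned


def choose_read_mode(prefix: bytes, on_text_mount: bool) -> str:
    """Classify a script from its leading bytes, read in binary mode.

    Returns "not-a-script" if a NUL byte is seen, "text" if the file should
    be read with text-mode (no lseek) semantics, and "binary" if lseek-based
    reading is assumed to be safe.
    """
    # Look only at the first line, capped at SCAN_LIMIT bytes.
    first_line = prefix[:SCAN_LIMIT].split(b"\n", 1)[0]

    # Existing behaviour described in the thread: NUL means a binary file.
    if b"\0" in first_line:
        return "not-a-script"

    # On a text mount, a \r in the scanned region means DOS line endings,
    # so fall back to text mode. On a binary mount the \r check is skipped.
    if on_text_mount and b"\r" in first_line:
        return "text"

    return "binary"
```

Under these assumptions, a DOS-format script on a text mount is classified as "text", while the same bytes on a binary mount stay "binary", so the stray \r is left in place and the script fails visibly instead of silently working.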