Message-ID: <45098A3B.4060105@scytek.de>
Date: Thu, 14 Sep 2006 12:58:36 -0400
From: Volker Quetschke
To: cygwin AT cygwin DOT com
Subject: Re: bash-3.1-7 BUG
In-Reply-To: <45093972.7080606@byu.net>

Hi!

Eric Blake wrote:
> According to Christopher Faylor on 9/13/2006 8:07 PM:
>>> I doubt that Eric will want to deal with the fallout of having bash not
>>> understand \r\n line endings but, if he does, it would be his decision
>>> and, again, I would support it 100%. I am very eager to see things like
>>> configure scripts work faster and if we have to drop a few scared or
>>> lazy people along the way to accomplish that goal, that's fine with me.
>>> I have no problem at all with being a part of a smaller community which
>>> doesn't need to use notepad to edit their bash scripts.

Hey, don't shoot at me, I'm only voicing my opinion and am perfectly
fine with your decision.

Maybe I'm lacking some coffee, but this ...

> Here's the difference between 3.1-7 and 3.1-8:
>
(snip)
> +#ifdef __CYGWIN__
> +  /* lseek'ing on text files is problematic; lseek reports the true
> +     file offset, but read collapses \r\n and returns a character
> +     count. We cannot reliably seek backwards if nr is smaller than
> +     the seek offset encountered during the read, and must instead
> +     treat the stream as unbuffered. */
> +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
part of the patch looks suspicious to me. You probably just want
to test if the LHS expression is true.

  Volker

> +    {
> +      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
> +      nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
> +      if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
> +        {
> +          lseek (bp->b_fd, offset, SEEK_SET);
> +          bp->b_flag |= B_UNBUFF;
> +          nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
> +        }
> +    }
> +  else
> +#endif
>    nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);
>    if (nr <= 0)
>      {
> @@ -454,15 +477,6 @@
>        return (EOF);
>      }
>
> -#if defined (__CYGWIN__)
> -  /* If on cygwin, translate \r\n to \n. */
> -  if (nr >= 2 && bp->b_buffer[nr - 2] == '\r' && bp->b_buffer[nr - 1] == '\n')
> -    {
> -      bp->b_buffer[nr - 2] = '\n';
> -      nr--;
> -    }
> -#endif
> -
>    bp->b_used = nr;
>    bp->b_inputp = 0;
>    return (bp->b_buffer[bp->b_inputp++] & 0xFF);
> only in patch2:
> unchanged:
> --- bash-3.1-orig/input.h 2002-01-30 07:11:47.000000000 -0700
> +++ bash-3.1/input.h 2006-09-14 03:29:05.484375000 -0600
> @@ -47,6 +47,7 @@
>  #define B_ERROR 0x02
>  #define B_UNBUFF 0x04
>  #define B_WASBASHINPUT 0x08
> +#define B_TEXT 0x10 /* Text stream, when O_BINARY is nonzero */
>
>  /* A buffered stream. Like a FILE *, but with our own buffering and
>     synchronization. Look in input.c for the implementation.
>     */
>
>
> My thoughts on the matter are that if you use binary mounts (and I highly
> recommend them), then every character in your file is important. Since
> bash on Linux does not ignore \r, and POSIX does not allow bash to ignore
> \r by default (although you can set IFS to include \r as a whitespace
> character to ignore), then neither should bash on a binary cygwin file.
> If you use text mounts, then this patch is smart enough to buffer data up
> until the point that an \r\n pair is converted by the text mode file into
> a single character, at which point the lseek optimization breaks down and
> the text mode file is subsequently processed a byte at a time. If you
> need DOS line endings, use a text mount. If you need speed, use UNIX line
> endings on a binary mount, although even UNIX line endings on a text mount
> will be faster than DOS line endings. Case closed, since I'm the
> maintainer, and I really don't want to bother with anything larger than
> the above patch (and also plan on submitting the above patch upstream,
> where it is less likely to be accepted if it is larger).
>
> --
> Life is short - so eat dessert first!
>
> Eric Blake ebb9 AT byu DOT net

-- 
= http://wiki.services.openoffice.org/wiki/Debug_Build_Problems =
PGP/GPG key (ID: 0x9F8A785D) available from wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913 9E53 3D35 C9BA 9F8A 785D
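
For anyone following the flag test discussed above, here is a minimal C sketch (not part of the patch) that compares the condition as written in the quoted hunk with a plain truth test of the masked value. B_TEXT and B_UNBUFF use the values from the quoted input.h. The helper names text_and_buffered, either_flag_set and check_flag_tests are hypothetical and exist only for this illustration. Under these assumptions the two tests give different answers once B_UNBUFF is set, which is the state the quoted hunk enters through "bp->b_flag |= B_UNBUFF".

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the quoted input.h hunk. */
#define B_UNBUFF 0x04
#define B_TEXT   0x10

/* Condition as written in the quoted patch: true only when B_TEXT is
   set and B_UNBUFF is clear.  (Hypothetical helper name.)  */
static bool
text_and_buffered (int b_flag)
{
  return (b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT;
}

/* Plain truth test of the masked value: true when either flag is set.
   (Hypothetical helper name.)  */
static bool
either_flag_set (int b_flag)
{
  return (b_flag & (B_TEXT | B_UNBUFF)) != 0;
}

/* Walk the four combinations of the two flags and assert what each
   test yields.  (Hypothetical helper name.)  */
static void
check_flag_tests (void)
{
  /* Neither flag set: both tests are false. */
  assert (!text_and_buffered (0));
  assert (!either_flag_set (0));

  /* Text stream that is still buffered: both tests are true. */
  assert (text_and_buffered (B_TEXT));
  assert (either_flag_set (B_TEXT));

  /* Unbuffered non-text stream: only the plain truth test is true. */
  assert (!text_and_buffered (B_UNBUFF));
  assert (either_flag_set (B_UNBUFF));

  /* Text stream already marked unbuffered: the tests disagree again. */
  assert (!text_and_buffered (B_TEXT | B_UNBUFF));
  assert (either_flag_set (B_TEXT | B_UNBUFF));
}
```

Calling check_flag_tests () from a main function, in a build without NDEBUG, should complete without tripping an assertion. Whether the stricter comparison is what the patch intends is for the maintainer to say; the sketch only shows what each form evaluates to.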