To: cygwin AT cygwin DOT com
From: Eric Blake
Subject: Re: bash-3.1-7 BUG
Date: Thu, 14 Sep 2006 17:25:58 +0000 (UTC)

Volker Quetschke writes:

> > (snip)
> > +#ifdef __CYGWIN__
> > +      /* lseek'ing on text files is problematic; lseek reports the true
> > +         file offset, but read collapses \r\n and returns a character
> > +         count.  We cannot reliably seek backwards if nr is smaller than
> > +         the seek offset encountered during the read, and must instead
> > +         treat the stream as unbuffered.  */
> > +      if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
> ------------------------^^^^^^^^^^^^^^^^^ ^^^^^^
> part of the patch looks suspicious to me. You probably just want to test
> if the LHS expression is true.

That part is correct as presented - I really did mean to check with bitwise
AND if we are dealing with a text file which has not previously been marked
unbuffered...
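To make the difference between the two tests concrete, here is a minimal sketch. The flag values are assumptions chosen for illustration (bash's real definitions may differ), and the helper names needs_seek_check and either_bit_set are hypothetical, not part of the patch.

```c
#include <stdbool.h>

/* Assumed flag values, for illustration only. */
#define B_UNBUFF 0x04
#define B_TEXT   0x10

/* The test used in the patch: true only when B_TEXT is set
   and B_UNBUFF is clear. */
static bool
needs_seek_check (int b_flag)
{
  return (b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT;
}

/* Testing only whether the masked value is nonzero: true when
   either bit is set, so it would also fire for a text file that
   has already been marked unbuffered. */
static bool
either_bit_set (int b_flag)
{
  return (b_flag & (B_TEXT | B_UNBUFF)) != 0;
}
```

With these definitions, a stream carrying B_TEXT | B_UNBUFF passes either_bit_set but not needs_seek_check, which is the case the comparison against B_TEXT is there to exclude.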
>
>   Volker
>
> > +        {
> > +          off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
> > +          nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

...as the condition to perform extra lseeks and make sure that lseek and the
text-mode read are still consistent; if not...

> > +          if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
> > +            {
> > +              lseek (bp->b_fd, offset, SEEK_SET);
> > +              bp->b_flag |= B_UNBUFF;

... we change the flags to mark the stream unbuffered, and never fall into this
if-block again for the rest of the life of the file.

> > +              nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
> > +            }
> > +        }
> > +      else
> > +#endif
> >        nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

And the else-block works equally well whether a file is non-text (reading
multiple bytes) or unbuffered (reading just one byte). The other thing to
remember is that a file will be marked unbuffered if you cannot seek on it (as
in a pipe), or if it is a text file that failed the lseek consistency check
above. It does mean that on a text mount, even a file with \n line endings pays
a price: it is read in the same number of buffers as on the corresponding
binary mount, but with 2 additional lseeks per buffer. That is still a smaller
penalty than doing one-byte reads.

> >    if (nr <= 0)
> >      {

-- 
Eric Blake
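For anyone who wants to see the consistency check in isolation, here is a rough sketch that models a text-mode descriptor in memory. Everything in it is an assumption for illustration: struct text_stream, text_read and read_is_consistent are hypothetical names, pos stands in for what lseek would report, and text_read only imitates the \r\n collapsing described in the patch comment. It is not the bash or Cygwin code.

```c
#include <stddef.h>

/* Hypothetical in-memory stand-in for a text-mode file descriptor. */
struct text_stream
{
  const char *raw;   /* bytes as stored on disk */
  size_t len;        /* number of raw bytes */
  size_t pos;        /* true file offset, as lseek would report it */
};

/* Imitate a text-mode read: store up to SIZE characters in BUF,
   collapsing each \r\n pair to \n.  Returns the character count,
   while s->pos advances by the number of raw bytes consumed. */
static size_t
text_read (struct text_stream *s, char *buf, size_t size)
{
  size_t n = 0;
  while (n < size && s->pos < s->len)
    {
      char c = s->raw[s->pos++];
      if (c == '\r' && s->pos < s->len && s->raw[s->pos] == '\n')
        c = s->raw[s->pos++];
      buf[n++] = c;
    }
  return n;
}

/* Same shape as the check in the patch: compare the count returned
   by the read with the distance the offset moved.  On a mismatch,
   rewind to the starting offset and return 0 so the caller can fall
   back to one-byte reads; otherwise return 1. */
static int
read_is_consistent (struct text_stream *s, char *buf, size_t size,
                    size_t *nr)
{
  size_t offset = s->pos;
  *nr = text_read (s, buf, size);
  if (*nr > 0 && *nr < s->pos - offset)
    {
      s->pos = offset;   /* plays the role of lseek (fd, offset, SEEK_SET) */
      return 0;
    }
  return 1;
}
```

In this model a stream holding "a\r\nb\r\n" yields 4 characters while the offset moves by 6, so the check fails and the position is rewound; a stream holding "a\nb\n" yields 4 characters for 4 bytes and stays on the buffered path.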