To: cygwin@cygwin.com
From: Eric Blake <ebb9@byu.net>
Subject:  Re: bash-3.1-7 BUG
Date: Thu, 14 Sep 2006 17:25:58 +0000 (UTC)
Message-ID:  <loom.20060914T190937-598@post.gmane.org>

Volker Quetschke <quetschke <at> scytek.de> writes:

> > 
> (snip)
> > +#ifdef __CYGWIN__
> > +  /* lseek'ing on text files is problematic; lseek reports the true
> > +     file offset, but read collapses \r\n and returns a character
> > +     count.  We cannot reliably seek backwards if nr is smaller than
> > +     the seek offset encountered during the read, and must instead
> > +     treat the stream as unbuffered.  */
> > +  if ((bp->b_flag & (B_TEXT | B_UNBUFF)) == B_TEXT)
> ------------------------^^^^^^^^^^^^^^^^^      ^^^^^^
> part of the patch looks suspicious to me. You probably just want to test
> if the LHS expression is true.

That part is correct as presented - I really did mean to mask both flags with a
bitwise AND and compare the result against B_TEXT, which checks that we are
dealing with a text file that has not previously been marked unbuffered...

> 
>   Volker
> 
> > +    {
> > +      off_t offset = lseek (bp->b_fd, 0, SEEK_CUR);
> > +      nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

...as the condition for performing the extra lseeks, which verify that the
offset reported by lseek still agrees with the character count returned by
read; if not...

> > +      if (nr > 0 && nr < lseek (bp->b_fd, 0, SEEK_CUR) - offset)
> > +       {
> > +         lseek (bp->b_fd, offset, SEEK_SET);
> > +         bp->b_flag |= B_UNBUFF;

... we change the flags to mark the stream unbuffered, and never fall into this 
if-block again for the rest of the life of the file.
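
As a minimal sketch of the flag test discussed above: the DEMO_* names and
their values are placeholders chosen for illustration, not bash's real B_*
definitions. It shows why comparing the masked value against the text bit is
not the same as testing the masked value for truth.

```c
#include <stdbool.h>

/* Placeholder flag values for illustration only; bash defines its own
   B_UNBUFF and B_TEXT constants elsewhere. */
enum { DEMO_B_UNBUFF = 0x01, DEMO_B_TEXT = 0x02 };

/* True only when the text bit is set AND the unbuffered bit is clear. */
static bool needs_seek_check(int flags)
{
    return (flags & (DEMO_B_TEXT | DEMO_B_UNBUFF)) == DEMO_B_TEXT;
}

/* A plain truth test of the masked value is true when either bit is set,
   so it would also accept streams that are already unbuffered. */
static bool truthy_only(int flags)
{
    return (flags & (DEMO_B_TEXT | DEMO_B_UNBUFF)) != 0;
}
```

Once the unbuffered bit has been set, needs_seek_check() stays false for that
stream, which matches the "never fall into this if-block again" behaviour.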

> > +         nr = zread (bp->b_fd, bp->b_buffer, bp->b_size = 1);
> > +       }
> > +    }
> > +  else
> > +#endif
> >    nr = zread (bp->b_fd, bp->b_buffer, bp->b_size);

And the else-block works equally well whether a file is non-text (reading
multiple bytes) or unbuffered (reading just one byte).  The other thing to
remember is that a file is marked unbuffered if you cannot seek on it (as in a
pipe), or if it is a text file that failed the lseek consistency check above.
It also means that even with \n line endings on a text mount, where the file
is read in the same number of buffers as on the corresponding binary mount,
the text mount is penalized with 2 additional lseeks per buffer; that is still
a smaller penalty than doing one-byte reads.
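
The consistency check can be sketched outside of bash with an in-memory model.
Everything here is an assumption made for illustration: struct sim_file,
sim_text_read() and read_and_check() are hypothetical names, and the model
only imitates a text-mode read that drops the \r of each \r\n pair while the
file offset advances by the raw byte count.

```c
#include <stddef.h>

/* Hypothetical stand-in for a file descriptor opened in text mode. */
struct sim_file {
    const char *raw;   /* bytes as stored on disk */
    size_t len;        /* total number of raw bytes */
    size_t pos;        /* true file offset, as lseek would report it */
};

/* Model of a text-mode read: deliver up to cap characters, dropping the
   \r of every \r\n pair.  pos advances by the raw bytes consumed. */
static size_t sim_text_read(struct sim_file *f, char *buf, size_t cap)
{
    size_t out = 0;
    while (out < cap && f->pos < f->len) {
        char c = f->raw[f->pos++];
        if (c == '\r' && f->pos < f->len && f->raw[f->pos] == '\n')
            continue;                  /* collapse \r\n to \n */
        buf[out++] = c;
    }
    return out;
}

/* Returns 1 if the stream has to fall back to one-byte reads, else 0.
   On a mismatch the offset is rewound and a single character is read,
   mirroring the SEEK_SET plus b_size = 1 step in the patch. */
static int read_and_check(struct sim_file *f, char *buf, size_t cap,
                          size_t *nr)
{
    size_t before = f->pos;
    *nr = sim_text_read(f, buf, cap);
    if (*nr > 0 && *nr < f->pos - before) {
        f->pos = before;
        *nr = sim_text_read(f, buf, 1);
        return 1;
    }
    return 0;
}
```

In this model a file with plain \n endings passes the check on every buffer
and only pays for the two offset queries, while the first buffer containing
\r\n trips the check and switches the stream to one-byte reads.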

> >    if (nr <= 0)
> >      {

-- 
Eric Blake





