Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT sources DOT redhat DOT com Delivered-To: mailing list cygwin AT sources DOT redhat DOT com Date: Tue, 23 Oct 2001 12:50:49 -0400 From: Christopher Faylor To: cygwin AT cygwin DOT com Cc: newlib AT sources DOT redhat DOT com Subject: Re: 1.3.3-2: fseek fails on multiples of 1024 (binary mode) Message-ID: <20011023125049.A22558@redhat.com> Reply-To: newlib AT sources DOT redhat DOT com Mail-Followup-To: cygwin AT cygwin DOT com, newlib AT sources DOT redhat DOT com References: <20011021225629 DOT 17478 DOT qmail AT foolabs DOT com> <20011022150648 DOT B10383 AT redhat DOT com> <3BD556FE DOT 46350849 AT syntrex DOT com> <20011023141012 DOT G31068 AT cygbert DOT vinschen DOT de> <3BD59914 DOT 780FD423 AT syntrex DOT com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3BD59914.780FD423@syntrex.com> User-Agent: Mutt/1.3.21i Again, these observations should go to the mailing list which is responsible for maintaining the code that you've analyzed: newlib AT sources DOT redhat DOT com . I've redirected this discussion there. Thanks for your in-depth analysis of the problem. cgf On Tue, Oct 23, 2001 at 06:21:40PM +0200, Pavel Tsekov wrote: >Upon further investigation this problem seems to be much bigger than >I've initially thought and there is no easy patch to it (at least I >think so) so I'll describe what've found about it. I started a patch >though and will would like to know if you like my approach or will >suggest another one. > >So here is what I've found: > >1. The described problem appears only when you have buffering enabled. > >2. It basically has to do with a intended optimisation which involves >not reading the same buffered region twice, if the fseek call requests >a offset which is in the buffered region. This optiomisation is >performed only if the file is opened in READONLY mode. > >3. The problem is to happen only if you fseek backwards from the >current file position and the difference between the current file >position and the new position is not greater than the buffer size used >to buffer reads. A requirement for the problem to appear is to have >two or more fseeks between no other operation between them. > >4. This problem is likely to affect all BSD flavour OSes - the newlib >stdio code is derived from BSD. I looked at the fseek code in OpenBSD >and FreeBSD and I think it has the error. > >Now let me point you to the code location which introduces the problem: > >--> Target is where do we wont to go > > if (whence == SEEK_SET) > target = offset; > else > { > if (_fstat_r (ptr, fp->_file, &st)) > goto dumb; > target = st.st_size + offset; > } > >--> What is actually done here up to the next --> mark is that fseek > tries ot determine the position where the last read occured. fp->_r is > number if bytes left in the buffer to be read, n = fp->_p - > fp->_bf._base is the bytes from the buffer already read. n + fp->_r is > the size of the buffer. By decrementing the current file positon with > n + fp->_r the code expects to find the position where the last read > occured. > > if (!havepos) > { > if (fp->_flags & __SOFF) > curoff = fp->_offset; > else > { > curoff = (*seekfn) (fp->_cookie, 0L, SEEK_CUR); > if (curoff == POS_ERR) > goto dumb; > } > curoff -= fp->_r; > if (HASUB (fp)) > curoff -= fp->_ur; > } > > /* > * Compute the number of bytes in the input buffer (pretending > * that any ungetc() input has been discarded). Adjust current > * offset backwards by this count so that it represents the > * file offset for the first byte in the current input buffer. > */ > > if (HASUB (fp)) > { > curoff += fp->_r; /* kill off ungetc */ > n = fp->_up - fp->_bf._base; > curoff -= n; > n += fp->_ur; > } > else > { > n = fp->_p - fp->_bf._base; > curoff -= n; > n += fp->_r; > } > >--> Here we try to determine if the position we request > is in the buffered region. If we determnine erronosly > that the requested position is in this buffer we got the > error which started this thread. > > /* > * If the target offset is within the current buffer, > * simply adjust the pointers, clear EOF, undo ungetc(), > * and return. (If the buffer was modified, we have to > * skip this; see fgetline.c.) > */ > > if ((fp->_flags & __SMOD) == 0 && > target >= curoff && target < curoff + n) > { > register int o = target - curoff; > > fp->_p = fp->_bf._base + o; > fp->_r = n - o; > if (HASUB (fp)) > FREEUB (fp); > fp->_flags &= ~__SEOF; > return 0; > } > >Let me show you with some digits whats going on: > >We have 2048 bytes file. We have bufsize 1024. We fread 1024 bytes - >this fills exactly the internal buffer thus 'n' will become 1024, >fp->_r will become 0. We request then file position 0 from the end of >file. ftell reports 2048. Now we want to go back to 1024 and read 8 >bytes. fseek tries to optiomize since we are in read only mode ... it >does this > >curoff = 2048 (current position) > >curoff -= r; 2048 - 0 = 2048 >curoff -= n; 2048 - 1024 = 1024 Determines where last read occured > >n + r = 1024 + 0; Determines the upper limit of the internal buffer > >now the final check > >target >= curoff && target < curoff + n >1024 >= 1024 && 1024 < 1024 + 1024 is TRUE > >we read from the old buffer which is error. > >I think to fix the situation so this optimisation really works the best >thing is to add a new member to FILE which is the last known read >position and replace curoff in the last check with this member. >However this requires much more work than just a simple patch - So do >you think there is another (better) way to do this. Or perhaps just >remove this optimisation at all ? Which is easy and will require >chabges just to fseek. > >Corinna Vinschen wrote: >> >>On Tue, Oct 23, 2001 at 01:39:42PM +0200, Pavel Tsekov wrote: >>>I think I found the problem - is there anyone who can test a patch ? >> >>Just send it to the cygwin-patches mailing list, please. -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Bug reporting: http://cygwin.com/bugs.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/