Mail Archives: djgpp/1996/10/01/19:20:50

From: STENNS AT vw DOT tci DOT uni-hannover DOT de (Michael Stenns)
Newsgroups: comp.os.msdos.djgpp
Subject: ftell bug with text files ?
Date: 1 Oct 1996 17:02:14 GMT
Organization: Institut fuer Technische Chemie d Univ Hannover
Lines: 96
Message-ID: <52riqm$c8m@newsserver.rrzn.uni-hannover.de>
Reply-To: stenns AT vw DOT tci DOT uni-hannover DOT de
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Hello,
in my program I scan a text file and remember some positions
with ftell so that I can revisit them later. Sometimes the value
returned by ftell is one byte too large. Adding or removing one byte
at (nearly) any position in the text file makes the problem disappear.
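
The usage pattern described above can be checked with a small round-trip test. This is only a sketch: the function name count_ftell_mismatches and the limits MAX_MARKS and MAX_LINE are made up for illustration and are not part of the original program. It records ftell before each line, seeks back to every recorded position, and counts the lines that do not read back identically.

```c
#include <stdio.h>
#include <string.h>

#define MAX_MARKS 64   /* hypothetical limit: number of positions remembered */
#define MAX_LINE  128  /* hypothetical limit: longest line chunk compared */

/* Returns the number of remembered positions that do not lead back to
   the same text, or -1 if the file cannot be opened. */
int count_ftell_mismatches(const char *path)
{
    static long pos[MAX_MARKS];
    static char text[MAX_MARKS][MAX_LINE];
    char line[MAX_LINE];
    int n = 0, bad = 0, i;
    FILE *f = fopen(path, "r");   /* text mode, as in the report */

    if (!f)
        return -1;

    /* First pass: remember the position in front of every line read. */
    while (n < MAX_MARKS) {
        long p = ftell(f);
        if (p < 0 || !fgets(text[n], MAX_LINE, f))
            break;
        pos[n++] = p;
    }

    /* Second pass: revisit each position and compare what is read. */
    for (i = n - 1; i >= 0; i--) {
        if (fseek(f, pos[i], SEEK_SET) != 0 ||
            !fgets(line, MAX_LINE, f) ||
            strcmp(line, text[i]) != 0)
            bad++;
    }

    fclose(f);
    return bad;
}
```

With a correct ftell the function should return 0 for any readable text file; a nonzero result points at positions that are off.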

It seems that the failure occurs when a line break falls on a
multiple of 16384 bytes (the value of BUFSIZ in stdio.h).
Line breaks in MSDOS are <cr><lf> pairs, but in the internal buffer
a line break is only <lf>. Ftell scans the unread part of the buffer
and subtracts one byte per <lf> to account for stripped <cr>'s, but
if the buffer ends with a <cr> (its <lf> being in the next buffer),
this <cr> is currently not seen by ftell.
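
The arithmetic can be illustrated with a small model. This is a sketch under assumptions, not the library code: strip_cr and model_ftell are hypothetical helpers, the raw chunk stands in for one buffer fill, and it is assumed that every <cr> in the chunk is stripped. The position is computed as the offset of the end of the chunk, minus the unread characters, minus one per unread <lf>; the last argument switches on the extra check for a <cr> at the end of the chunk.

```c
#include <stddef.h>

/* Hypothetical model of text-mode reading: copy raw[0..raw_len) to out
   without the CR characters and return the number of characters kept. */
static size_t strip_cr(const char *raw, size_t raw_len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < raw_len; i++)
        if (raw[i] != '\r')
            out[n++] = raw[i];
    return n;
}

/* Model of the position computation for one buffer fill.
   raw/chunk_len : raw bytes of the chunk, starting at file offset 0
   consumed      : translated characters already read by the program
   check_tail_cr : nonzero to also count a CR at the end of the chunk
   Returns the modelled file position, or -1 on bad arguments. */
long model_ftell(const char *raw, size_t chunk_len,
                 size_t consumed, int check_tail_cr)
{
    char buf[256];
    size_t n, i;
    long pos;

    if (chunk_len > sizeof buf)
        return -1;
    n = strip_cr(raw, chunk_len, buf);
    if (consumed > n)
        return -1;

    pos = (long)chunk_len;          /* position at the end of the chunk */
    pos -= (long)(n - consumed);    /* step back over unread characters */
    for (i = consumed; i < n; i++)
        if (buf[i] == '\n')         /* every unread LF had a CR */
            pos--;

    /* Extra step: a CR at the end of the chunk has no LF in this buffer. */
    if (check_tail_cr && consumed < n && chunk_len > 0 &&
        raw[chunk_len - 1] == '\r')
        pos--;

    return pos;
}
```

For the raw data "ab\r\ncd\r\nef\r\n" with a chunk of 7 bytes (ending in <cr>) and 3 characters consumed, the model without the extra check gives 5, although the third line break-free position is at offset 4; with the check it gives 4. With a chunk of 8 bytes (ending in <lf>) both variants give 4.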

Here is my suggestion for a corrected ftell.c; changed lines are
marked with a '|':

  /* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
  #include <libc/stubs.h>
  #include <stdio.h>
  #include <unistd.h>
  #include <libc/file.h>
  #include <fcntl.h>
  #include <libc/dosio.h>

  long
  ftell(FILE *f)
  {
|  long tres=0;
   int adjust=0;
   int idx;

    if (f->_cnt < 0)
      f->_cnt = 0;
    if (f->_flag&_IOREAD)
    {
      /* When reading files, the file position known by `lseek' is
       at the end of the buffered portion of the file.  So `adjust'
       is negative (current buf position is BEFORE the one returned
       by `lseek') and, for TEXT files, it gets decremented (larger
       in absolute value) for every NL from current pos to the end
       of the buffer, to account for stripped CR characters.  */
      adjust = - f->_cnt;

      if (__file_handle_modes[f->_file] & O_TEXT) /* if a text file */
      {
        if (f->_cnt)
        {
|         char *cp = f->_base + BUFSIZ - 1;
|
|         /* Check for suppressed CR at end of buffer.
|            As long as the position returned by lseek is a
|            multiple of the buffer size, the end of the file
|            is not reached and `adjust' should be decremented. */
|         if (*cp == '\r')
|         {
|           tres = lseek(fileno(f), 0L, 1);
|           if (!(tres % BUFSIZ)) adjust--;
|         }

          /* For every char in buf AFTER current pos... */
          for (cp=f->_ptr + f->_cnt - 1; cp >= f->_ptr; cp--)
            if (*cp == '\n')      /* ...if it's LF... */
              adjust--;           /* ...there was a CR also */
        }
      }
    }
    else if (f->_flag&(_IOWRT|_IORW))
    {
      /* When writing a file, the current file position known by `lseek'
         is at the beginning of the buffered portion of the file.  We
         have to adjust it by our offset from the beginning of the buffer,
         and account for the CR characters which will be added by `write'.  */
      if (f->_flag&_IOWRT && f->_base && (f->_flag&_IONBF)==0)
      {
        int lastidx = adjust = f->_ptr - f->_base;

        if (__file_handle_modes[f->_file] & O_TEXT)
          for (idx=0; idx < lastidx; idx++)
            if (f->_base[idx] == '\n')
              adjust++;
      }
    }
    else
      return -1;
|   if (!tres) tres = lseek(fileno(f), 0L, 1);
    if (tres<0)
      return tres;
    tres += adjust;
    return tres;
  }
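
To try the change, a test file with a line break exactly on the buffer boundary is needed. The following is a sketch of one way to produce it; write_boundary_file is a hypothetical helper, and the file is written in binary mode so that the <cr><lf> bytes land where intended on any system. The <cr> is placed at offset boundary-1 and the <lf> at offset boundary.

```c
#include <stdio.h>

/* Hypothetical helper: write filler bytes so that a CR is the last byte
   before `boundary` and its LF is the first byte after it.
   Returns 0 on success, -1 on error. */
int write_boundary_file(const char *path, long boundary)
{
    FILE *f;
    long i;

    if (boundary < 2)
        return -1;
    f = fopen(path, "wb");          /* binary: no CR/LF translation */
    if (!f)
        return -1;

    /* Filler up to, but not including, offset boundary-1. */
    for (i = 0; i < boundary - 1; i++) {
        if (fputc('x', f) == EOF) {
            fclose(f);
            return -1;
        }
    }

    /* CR at boundary-1, LF at boundary, then one more complete line. */
    if (fputs("\r\nnext line\r\n", f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling write_boundary_file with 16384 as the boundary gives a file of the kind described above; reading it in text mode and calling ftell after the first few characters should then show whether the position is off by one.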

--
Michael Stenns,
Email:  stenns AT vw DOT tci DOT uni-hannover DOT de
WWW: http://www.tci.uni-hannover.de/extrakt/extrakt.htm
