Mail Archives: djgpp/1997/10/07/12:46:15

From: "A. Sinan Unur" <asu1 AT cornell DOT edu>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: How to get file size?
Date: Tue, 30 Sep 1997 20:09:23 -0400
Organization: Cornell University http://www.cornell.edu
Lines: 50
Sender: asu1 AT cornell DOT edu (Verified)
Message-ID: <343194B3.780D4B7C@cornell.edu>
References: <01bccc5b$8138ba00$0200a8c0 AT ingo> <34302025 DOT 7E59 AT cornell DOT edu> <01bccde3$d710bd40$0200a8c0 AT ingo>
Reply-To: asu1 AT cornell DOT edu
NNTP-Posting-Host: cu-dialup-0042.cit.cornell.edu
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Ingo Ruhnke wrote:
> 
> A. Sinan Unur <sinan DOT unur AT cornell DOT edu> schrieb
> > Ingo Ruhnke wrote:
> 
> > > This returns me the number of bytes of the file, but the number of
> > > chars is a little bit smaller, because of the <LF><CR>.
> > >
> 
> > i wouldn't think so... at least not without reading in the whole 
> > file first. what is the problem with reserving space for a few extra 
> > bytes?
> 
> Another question, what is the return value of fread(), I think it is 
> the size of the part of the file, but with the <CR> or without them? 
> Maybe this could be the solution for my problem.

i do not know how ansi it is but with djgpp, fread(buffer, 1, n, fin),
where fin is a stream opened in text mode, does get rid of the <CR>s. for
example, if i have a dos text file into which i type 5 "enter"s, the
resulting file contains:
<CR><LF><CR><LF><CR><LF><CR><LF><CR><LF>
whereas the call above will store
<LF><LF><LF><LF><LF>
in buffer.
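
a minimal sketch of that experiment, in case someone wants to try it. the
helper names and the file name passed in are made up for illustration. on
djgpp i would expect the text-mode read to report 5 characters; on a
system where text and binary streams are the same thing (most unix-likes)
it should report all 10 bytes.

```c
#include <stdio.h>

/* hypothetical helper: write `count` <CR><LF> pairs with no translation */
int write_crlf_pairs(const char *path, int count)
{
    FILE *fout = fopen(path, "wb");   /* binary mode: bytes go out as given */
    int i;

    if (fout == NULL)
        return -1;
    for (i = 0; i < count; i++)
        fputs("\r\n", fout);
    return fclose(fout);
}

/* hypothetical helper: read up to n chars through a text-mode stream */
size_t read_in_text_mode(const char *path, char *buffer, size_t n)
{
    FILE *fin = fopen(path, "r");     /* text mode */
    size_t got;

    if (fin == NULL)
        return 0;
    got = fread(buffer, 1, n, fin);   /* number of chars actually stored */
    fclose(fin);
    return got;
}
```

calling write_crlf_pairs with a count of 5 and then read_in_text_mode on
the same file gives the count to compare against the 10 bytes on disk.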

however, this is of no help to you because you wanted to be able to
allocate buffer to hold exactly the number of characters in the file
after getting rid of the <CR>s. buffer in the example will have to be
allocated to hold the number of bytes reported by a call to fstat, which
will be greater than or equal to the number of characters, and n will
have to be that number, too.
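
roughly, that approach looks like the sketch below. slurp_text is a
made-up name, and the code assumes fileno and fstat are available (they
are posix rather than ansi) and that the file is a regular file. the
buffer is sized from st_size plus one byte for a terminating NUL, and the
return value of fread says how many characters actually ended up in it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

/* hypothetical helper: read a whole text file into a malloc'ed buffer.
   the caller frees the result; *out_len receives the character count. */
char *slurp_text(const char *path, size_t *out_len)
{
    struct stat st;
    char *buffer;
    size_t n;
    FILE *fin = fopen(path, "r");            /* text mode */

    if (fin == NULL)
        return NULL;
    if (fstat(fileno(fin), &st) != 0 || st.st_size < 0) {
        fclose(fin);
        return NULL;
    }
    n = (size_t)st.st_size;                  /* bytes on disk: upper bound */
    buffer = malloc(n + 1);                  /* +1 for the NUL terminator */
    if (buffer == NULL) {
        fclose(fin);
        return NULL;
    }
    *out_len = fread(buffer, 1, n, fin);     /* may be smaller than n */
    buffer[*out_len] = '\0';
    fclose(fin);
    return buffer;
}
```

the unused tail of the buffer is at most one byte per line, which is why
i would not worry about it for ordinary text files.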

one way to avoid allocating a larger than necessary buffer is to
allocate a small one, say 1K, and do incremental fgets calls, resizing
the buffer whenever it fills up. i think the performance penalty incurred
by doing this would outweigh the memory savings, but i have no time for a
proper comparison. of course, if one expects to encounter files consisting
mostly of <CR><LF> pairs routinely, then it might justify a more
involved approach. even then, i would start with using the number of
bytes reported with fstat and deal with the issue if it presents
problems.
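
for completeness, a sketch of the incremental version. read_growing is a
made-up name, doubling the buffer is my own choice of growth policy, and
the code assumes the file contains no NUL bytes (strlen is used to find
out how much each fgets call stored).

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: read the rest of fin into a buffer that starts
   at 1K and doubles whenever it runs out of room. caller frees it. */
char *read_growing(FILE *fin, size_t *out_len)
{
    size_t cap = 1024, len = 0, room;
    char *buffer = malloc(cap);
    char *tmp;

    if (buffer == NULL)
        return NULL;
    buffer[0] = '\0';
    for (;;) {
        if (cap - len < 2) {                 /* fgets needs room for 1 char + NUL */
            tmp = realloc(buffer, cap * 2);
            if (tmp == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            cap *= 2;
        }
        room = cap - len;
        if (room > INT_MAX)                  /* fgets takes an int size */
            room = INT_MAX;
        if (fgets(buffer + len, (int)room, fin) == NULL)
            break;                           /* end of file or read error */
        len += strlen(buffer + len);         /* advance past what was stored */
    }
    buffer[len] = '\0';
    *out_len = len;
    return buffer;
}
```

a final realloc down to len + 1 bytes would trim the slack, at the cost
of one more call.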
-- 
----------------------------------------------------------------------
A. Sinan Unur
Department of Policy Analysis and Management, College of Human Ecology,
Cornell University, Ithaca, NY 14853, USA

mailto:sinan DOT unur AT cornell DOT edu
http://www.people.cornell.edu/pages/asu1/
