Sender: rich AT phekda DOT freeserve DOT co DOT uk
Message-ID: <3E2FC531.F37C6D24@phekda.freeserve.co.uk>
Date: Thu, 23 Jan 2003 10:34:25 +0000
From: Richard Dawe <rich AT phekda DOT freeserve DOT co DOT uk>
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.2.23 i586)
X-Accept-Language: de,fr
MIME-Version: 1.0
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
CC: djgpp-workers AT delorie DOT com
Subject: Re: readv, writev [PATCH]
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1030123082200 DOT 15630D AT is>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com

Hello.

Eli Zaretskii wrote:
> 
> On Wed, 22 Jan 2003, Richard Dawe wrote:
> 
> > The calls allocate a buffer big enough to store all the data
> > for the I/O vector, so that a single read or write call can be used.
> > This seemed like the simplest and most robust way of doing it, since
> > there are no readv or writev system calls.
> 
> This design worries me a bit.  The buffer you allocate can be quite
> large, in which case we are talking some major heap exercising.  The
> function could even unjustly fail (for lack of enough memory) in some
> borderline cases, like if the file is much smaller than what you tell
> the function about the buffers' length.
> 
> What exactly are the problems in doing this straightforwardly, i.e. read
> into the buffers directly one by one?  What am I missing?

For writev:

Say you write the first part of the data, but then the write for the next part
fails. What do you return? The call has failed, but you have written some
data. It seems to me that it would be hard for a program to recover gracefully
from this, since it doesn't know what has been written to the file.

If write can fail after partially writing some data, then I guess write
suffers from the problem I was trying to avoid with writev. In that case,
writev could be rewritten to not use the buffer and just write out each iovec.

In fact, looking at the description of the write function in SUSv2, I see no
guarantee that the file will not be touched on failure.

If writev were to write iov-by-iov and one call failed, it would need to seek
back to the position it was at before writing. This is to ensure:

1. that a retry will overwrite the data;
2. that we read from the position where the writes started.

If we seeked to the position after the last successful write, the second
condition would not be true.
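
A minimal sketch of that iov-by-iov approach, for a POSIX-like system. This
is not the code from the patch: the names writev_unbuffered and
writev_unbuffered_demo and the scratch file name are invented for
illustration, and treating a short write as "stop and report the bytes
written" is an assumption on my part.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write each iovec with its own write() call.
   On failure, seek back to the offset the call started from, so that
   a retry overwrites any partially written data. */
ssize_t writev_unbuffered(int fd, const struct iovec *iov, int iovcnt)
{
  off_t start = lseek(fd, 0, SEEK_CUR);
  ssize_t total = 0;
  int i;

  if (start == (off_t) -1)
    return -1;              /* Not seekable: no rollback possible. */

  for (i = 0; i < iovcnt; i++) {
    ssize_t n;

    if (iov[i].iov_len == 0)
      continue;

    n = write(fd, iov[i].iov_base, iov[i].iov_len);
    if (n < 0) {
      int saved_errno = errno;

      lseek(fd, start, SEEK_SET);   /* Back to where we started. */
      errno = saved_errno;
      return -1;
    }

    total += n;
    if ((size_t) n < iov[i].iov_len)
      break;                /* Short write (assumption): stop here. */
  }

  return total;
}

/* Usage check: returns 0 if both cases behave as described. */
int writev_unbuffered_demo(void)
{
  const char *path = "writev_unbuffered_demo.tmp";   /* scratch file */
  char a[] = "abc", b[] = "defg";
  struct iovec iov[2];
  int fd, ok;

  iov[0].iov_base = a; iov[0].iov_len = 3;
  iov[1].iov_base = b; iov[1].iov_len = 4;

  fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
  assert(fd >= 0);
  ok = writev_unbuffered(fd, iov, 2) == 7
       && lseek(fd, 0, SEEK_CUR) == 7;
  close(fd);

  /* A read-only descriptor makes write() fail; the offset must end
     up back at the position the call started from. */
  fd = open(path, O_RDONLY);
  assert(fd >= 0);
  lseek(fd, 2, SEEK_SET);
  ok = ok && writev_unbuffered(fd, iov, 2) == -1
       && errno == EBADF
       && lseek(fd, 0, SEEK_CUR) == 2;
  close(fd);
  unlink(path);

  return ok ? 0 : 1;
}
```

Note that the rollback only restores the file position. It does not undo
data that earlier write calls already put in the file.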

For readv:

Nothing really, apart from symmetry with writev. I can't think of anything
that would prevent us from doing without a buffer in readv. It would be
fiddly, but that's not a good reason.
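
A rough sketch of readv without the intermediate buffer, reading straight
into each iovec. The names readv_unbuffered and readv_unbuffered_demo are
invented for illustration. Returning the bytes already read when a later
read fails is an assumption (the data has been consumed, so it is reported
rather than dropped), and text-mode conversion is not considered here.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: fill each iovec with its own read() call. */
ssize_t readv_unbuffered(int fd, const struct iovec *iov, int iovcnt)
{
  ssize_t total = 0;
  int i;

  for (i = 0; i < iovcnt; i++) {
    ssize_t n;

    if (iov[i].iov_len == 0)
      continue;

    n = read(fd, iov[i].iov_base, iov[i].iov_len);
    if (n < 0)
      /* Assumption: report data already read; fail only if none. */
      return (total > 0) ? total : -1;

    total += n;
    if ((size_t) n < iov[i].iov_len)
      break;                /* EOF or short read: stop here. */
  }

  return total;
}

/* Usage check with a pipe: returns 0 if the data lands as expected. */
int readv_unbuffered_demo(void)
{
  char first[5], second[10];
  struct iovec iov[2];
  int fds[2], rc, ok;

  iov[0].iov_base = first;  iov[0].iov_len = sizeof(first);
  iov[1].iov_base = second; iov[1].iov_len = sizeof(second);

  rc = pipe(fds);
  assert(rc == 0);
  ok = write(fds[1], "hello world", 11) == 11;
  close(fds[1]);

  /* 5 bytes go to the first iovec, the remaining 6 to the second;
     the next call sees end-of-file and returns 0. */
  ok = ok && readv_unbuffered(fds[0], iov, 2) == 11
       && memcmp(first, "hello", 5) == 0
       && memcmp(second, " world", 6) == 0
       && readv_unbuffered(fds[0], iov, 2) == 0;
  close(fds[0]);

  return ok ? 0 : 1;
}
```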

Other issues:

* FSEXT support?

* It should cope with non-blocking write calls. New POSIX (draft 7) and SUSv2
don't describe what should happen for writev if the file descriptor is in
non-blocking mode. If we're writing iov-by-iov, I guess it should return the
number of bytes successfully written, when it hits the first write call that
fails with -1 and errno == EAGAIN. Non-blocking applications should be written
to retry the write, until all the data is written, so this shouldn't break
anything (in the non-blocking case).
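
A sketch of the non-blocking behaviour described above, for a POSIX-like
system. The names writev_nonblock and writev_nonblock_demo are invented for
illustration. Returning -1 with EAGAIN when nothing at all could be written
(mirroring write) is an assumption, since the standards mentioned above
don't cover this case.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write iov-by-iov; on the first EAGAIN, return
   the number of bytes written so far. */
ssize_t writev_nonblock(int fd, const struct iovec *iov, int iovcnt)
{
  ssize_t total = 0;
  int i;

  for (i = 0; i < iovcnt; i++) {
    ssize_t n;

    if (iov[i].iov_len == 0)
      continue;

    n = write(fd, iov[i].iov_base, iov[i].iov_len);
    if (n < 0) {
      if (errno == EAGAIN && total > 0)
        return total;       /* Partial success: caller retries the rest. */
      return -1;            /* Nothing written (assumption), or real error. */
    }

    total += n;
    if ((size_t) n < iov[i].iov_len)
      break;                /* Short write: stop here. */
  }

  return total;
}

/* Usage check: fill a non-blocking pipe, then confirm that the byte
   counts returned match what can be read back. Returns 0 on success. */
int writev_nonblock_demo(void)
{
  static char chunk[1024];
  char sink[4096];
  struct iovec iov[2];
  int fds[2], rc, ok;
  ssize_t n;
  long written = 0, drained = 0;

  rc = pipe(fds);
  assert(rc == 0);
  fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
  fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);

  iov[0].iov_base = chunk;
  iov[0].iov_len = sizeof(chunk);
  iov[1] = iov[0];

  /* Retry until the pipe is full and no byte can be written. */
  while ((n = writev_nonblock(fds[1], iov, 2)) > 0)
    written += n;
  ok = (n == -1 && errno == EAGAIN && written > 0);

  /* Drain the pipe; the total must equal the reported byte count. */
  while ((n = read(fds[0], sink, sizeof(sink))) > 0)
    drained += n;
  ok = ok && drained == written;

  close(fds[0]);
  close(fds[1]);
  return ok ? 0 : 1;
}
```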

> > +   /* Read in the data. */
> > +   ret = read(fd, buf, maxbytes);
> 
> Can't you use _read instead?  Is readv supposed to handle text files and
> do CRLF->NL conversions?
> 
> The same holds for _write in writev.

I assumed that readv, writev were just vector-input versions of read, write
and so should have the same CRLF->NL conversion characteristics.

> > + @subheading Syntax
> > +
> > + @example
> > + #include <sys/uio.h>
> > +
> > + ssize_t readv(int fd, const struct iovec *iov, int iovcnt);
> > + @end example
> [...]
> > + @code{struct iovec} is defined as follows:
> 
> Since we are about to add index entries to the manual, how about having
> @findex at the beginning of the node and @tindex for `struct iovec'?

Yes.
 
> > + Otherwise, a value of -1 is returned and @var{errno} is set appropriately.
> 
> `errno' is the actual name of a variable, it does not stand for something
> else.  So it should be in @code, not in @var.

OK.
 
> And please don't forget an entry in wc204.txi.

Something like this:

@findex readv
@findex writev
The @code{readv} and @code{writev} functions were added.
 
> Last but not least, thanks for working on this.

You're welcome!

Thanks, bye, Rich =]

-- 
Richard Dawe [ http://www.phekda.freeserve.co.uk/richdawe/ ]