Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <35165AB3.5061@rug.ac.be>
Date: Mon, 23 Mar 1998 13:50:59 +0100
From: Vik Heyndrickx <Vik DOT Heyndrickx AT rug DOT ac DOT be>
Mime-Version: 1.0
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
Cc: DJ Delorie <dj AT delorie DOT com>, djgpp-workers AT delorie DOT com
Subject: Re: ^Z in text-mode output to the screen
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980315164044 DOT 10360D-100000 AT is>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Eli Zaretskii wrote:
> 
> There's a problem in the current libc which I'd like to solve in
> v2.02.  One of its consequences is the following lossage:
> 
>         C:\DJGPP\BIN> grep foobar *
>         grep: writing output: No space left on device (ENOSPC)
> 
> This happens if one of the files has the ^Z character embedded in it.
>
> This is caused by the assumption in the low-level libc functions which
> write data, that if you instruct DOS to write a buffer and DOS writes
> only part thereof, the reason is that the disk is full, and so they
> set errno to ENOSPC.

Isn't the REAL problem here that a ^Z should never get returned by a
read operation from a character device/file?  IMO the way text data is
stored should be entirely transparent to the user program (AFAIK POSIX
requires this).  This means that the read functions should do the CR/LF
to NL and ^Z to EOF translations.  AFAIK this is enough to ensure that
^Z never gets passed to the write functions.  A write function can
optionally append a ^Z at close time.
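
A rough sketch in C of the read-side translation I mean.  The name
text_translate_in is made up for illustration and is not an existing
libc function.  It works on one buffer at a time, so a CR/LF pair
split across two reads is not handled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libc: translate a raw DOS buffer in
   place the way a text-mode read could.  A CR that is directly followed
   by LF is dropped, and the first ^Z (0x1A) ends the data.  Returns the
   translated length and sets *hit_eof to 1 when a ^Z was seen. */
static size_t text_translate_in(char *buf, size_t len, int *hit_eof)
{
    size_t in, out = 0;

    assert(buf != NULL && hit_eof != NULL);
    *hit_eof = 0;
    for (in = 0; in < len; in++) {
        char c = buf[in];

        if (c == 0x1A) {            /* ^Z: treat as end of file */
            *hit_eof = 1;
            break;
        }
        if (c == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;               /* drop the CR of a CR/LF pair */
        buf[out++] = c;
    }
    return out;
}
```

With this in the read path, the caller never sees the ^Z itself, only
a shorter buffer and an EOF flag.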

> This assumption breaks if the data is written to the console which
> hasn't been switched to raw mode (i.e., the I/O is done in text
> mode).  DOS stops writing and returns prematurely when asked to write
> data with embedded ^Z characters to a character device in cooked
> mode.
> 
> Here's the problem: DOS doesn't give any indication that would allow
> to distinguish between the disk full case and the ^Z case.  The only
> thing we can do is to assume that if the handle points to a character
> device that's in cooked mode, we have the ^Z case.  We can certainly
> do so in the case of the console device (it obviously cannot become
> ``full'').  I'm not certain about other devices, though.  For example,
> what if somebody writes to COM1, and the other side breaks the
> connection?

IMO, there are only two cases: text files/devices and binary
files/devices. I don't see any use for making a distinction between
cooked-mode devices and files (I almost wrote cooked devices :-) )

> Assuming we can distinguish between these two cases, the next question
> is what to do about that.  It seems that the best alternative is to
> filter ^Z out of the data, as if it were not there.  Any other
> alternative would mean trouble in some cases.  For example, if the
> buffer begins with ^Z, if we don't write it and return 0, many
> programs will take that as an error and print an error message.  (Btw,
> an attempt to have that error message make some sense was the original
> motivation for assigning ENOSPC to errno in these cases.)

Assuming that ^Z still needs correct handling on output:
IMO, a ^Z (at any place in the output data) should put the file into
EOF mode, and write and family should ignore any further output to that
file (until the EOF indicator gets reset).
Returning 0 is in fact an error condition in this case, since a ^Z
should never have been read in the first place.
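
A minimal sketch in C of the EOF-mode idea.  struct text_out and
text_out_accept are hypothetical names; where a real libc would keep
such a flag is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-file state for text-mode output. */
struct text_out {
    int eof_seen;   /* set once a ^Z has appeared in the output data */
};

/* Returns how many bytes of buf should be handed to the real write:
   everything before the first ^Z, or 0 once the file is in EOF mode. */
static size_t text_out_accept(struct text_out *f, const char *buf, size_t len)
{
    const char *z;

    assert(f != NULL && buf != NULL);
    if (f->eof_seen)
        return 0;                   /* already in EOF mode: ignore output */
    z = (const char *)memchr(buf, 0x1A, len);
    if (z == NULL)
        return len;                 /* no ^Z: pass the whole buffer on */
    f->eof_seen = 1;                /* ^Z found: enter EOF mode */
    return (size_t)(z - buf);
}
```

Resetting the EOF indicator would simply clear eof_seen again.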

> Filtering ^Z is easily done in functions which examine each character
> in the buffer (e.g., `write').  In other cases, I suggest only to
> handle the case where ^Z is the first character in the buffer.  If ^Z
> is somewhere in the middle, the caller will get a smaller return value
> than the size of buffer it wanted to write, and will typically try to
> write the rest of the buffer beginning with the next unwritten
> character, which is ^Z.

Why would a user assume that, after a partial write, the remainder
will get written successfully?
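
For reference, the kind of retry loop in question, sketched in C.
write_all, writer_fn and stop_at_ctrl_z are made-up names;
stop_at_ctrl_z only imitates the cooked-mode behaviour described
above (stop at the first ^Z) and does no real I/O.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical writer type: returns the number of bytes written,
   0 if nothing could be written, or -1 on error. */
typedef long (*writer_fn)(void *ctx, const char *buf, size_t len);

/* Stand-in for a cooked-mode device: "writes" up to the first ^Z. */
static long stop_at_ctrl_z(void *ctx, const char *buf, size_t len)
{
    size_t i;

    (void)ctx;
    for (i = 0; i < len && buf[i] != 0x1A; i++)
        ;
    return (long)i;
}

/* Retry after partial writes, but give up when no progress is made
   instead of assuming the remainder will eventually go through. */
static long write_all(writer_fn wr, void *ctx, const char *buf, size_t len)
{
    size_t done = 0;

    assert(wr != NULL && buf != NULL);
    while (done < len) {
        long n = wr(ctx, buf + done, len - done);

        if (n < 0)
            return -1;              /* hard error */
        if (n == 0)
            break;                  /* stuck, e.g. on a leading ^Z */
        done += (size_t)n;
    }
    return (long)done;
}
```

A caller that retries blindly would get 0 from the second call, since
the remaining data starts with the ^Z.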

>   1) Is it okay to assume that premature end of output to the console
>      device in cooked mode means ^Z?  How about other character
>      devices?
>   2) Does anybody see any problems with filtering ^Z out
>      of data when writing to the console device in cooked mode?  How
>      about other character devices?
>   3) Is the filtering method outlined above good enough?  Does anybody
>      see a better way?

I expressed my opinion about these above.

-- 
 \ Vik /-_-_-_-_-_-_/   
  \___/ Heyndrickx /          
   \ /-_-_-_-_-_-_/