Mail Archives: cygwin/2006/06/30/07:16:29
Lev Bishop wrote:
> On 6/28/06, Darryl Miles wrote:
> See how-to-debug-cygwin.txt
> http://cygwin.com/cgi-bin/cvsweb.cgi/src/winsup/cygwin/how-to-debug-cygwin.txt?rev=1.12&content-type=text/x-cvsweb-markup&cvsroot=src
Thanks for the pointers. Everything I need to get started is
already covered in how-to-debug-cygwin.txt.
>> indications from select(2) interface. But if no worker thread is busy
>> working on that fd then you get writability back ?
>
> Yes, but it is very hard to get the precise unix semantics. For
> example, the application issues a write() which spawns off a thread
> that then blocks. Then the application exit()s, causing the thread to
> also terminate before completing its write, and the write never
> completes.
This is a very valid point, but not one that is a problem in the
situations I'm looking at. The situation I am looking at is much more
chronic.
How does overlapped I/O get around this, since you have sent the data
into the kernel layer and are now waiting on a completion notification
or event signalling? If the application holding the handle exits from
under it, does the Win32 kernel abort the I/O in this circumstance?
What if this was worked around via a fork(), not at every I/O but
only if we exit while an incomplete I/O operation is still in
progress? Can we:
* fork()
* reacquire handle, as per dup()
* CloseHandle() from dying process
* receive IO completion callback with indication of failure, handle
was closed!
* hand data over to the child (of fork()) for it to take up the mission.
Maybe there is a resident part of cygwin that could take up the mission,
since a named pipe can be obtained by any process on the system. This
resident part is a process outside of the lifecycle of the emulated
POSIX processes.
It still would not be perfect, but I can't think of any situation that
would use a single write call (as two writes would be allowed to cause
blocking) where the data must reliably make it to the reader, yet the
writer exits as soon as it has written. Pretty rare if you ask me. Even
when the data is queued into a POSIX kernel there is no guarantee the
reader will read it; it might just sit in the buffer. Applications that
need that guarantee would round trip the other end of the pipe to be sure.
At least we should be able to _DETECT_ that incomplete pipe writing I/O
is still in progress when a process exits. So maybe we can log a
warning and pick up any real problem from there, rather than thinking
too deeply about that rare case.
> There is also the issue of what return value to give the application
> doing the write() on the pipe. You'll have to be careful to deal with
> error conditions, SIGPIPE, etc, etc.
As cgf put it:
| If I understand the plan correctly, in the scenario where select says
| it's ok to write but it really isn't, the write would return as if it
| succeeded and a writer thread would be created which sits around
| trying to empty the pipe.
This is _EXACTLY_ the problem as I see it. We have to deal with those
rules if the OS can't tell us in a reliable way that a write() will work.
"The writer thread sits around trying to fill the pipe" would be more
correct.
There may be other ways to deal with that write(), but as far as I
understand the NT kernel does not provide a true non-blocking mechanism
to work from with pipes. By that I mean one where you offer the data to
the kernel and, if the buffers are full, the kernel rejects the data
without blocking, leaving the application holding it. Overlapped I/O as
I understand it does not work like this.
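For reference, here is a minimal sketch of the "true non-blocking" behaviour meant above, as seen on a POSIX system. It assumes a native POSIX kernel (not the Cygwin emulation), and the helper name fill_pipe_nonblocking() is made up for illustration. With nobody reading, the pipe eventually fills and write() fails with EAGAIN instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: fill a non-blocking pipe until the kernel refuses
 * more data. Returns the number of bytes accepted, or -1 if the pipe
 * could not be set up. *last_errno receives the errno of the write()
 * that was refused (0 if write() returned 0 instead of failing). */
static long fill_pipe_nonblocking(int *last_errno)
{
    int fds[2];
    char chunk[4096];
    long total = 0;

    assert(last_errno != NULL);
    if (pipe(fds) != 0)
        return -1;

    /* Put only the write end into non-blocking mode. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        /* The read end is never drained, so the buffer fills up and the
         * kernel rejects the data without blocking the caller. */
        *last_errno = (n < 0) ? errno : 0;
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Calling fill_pipe_nonblocking(&e) should hand back the pipe's capacity in bytes with e set to EAGAIN; the caller is left holding whatever data was refused.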
I have read the overlapped I/O model as documented, but my (limited)
understanding of overlapped I/O is that the call to
WriteFile()/WriteFileEx() can still block (and it probably will under
the pipelined conditions of rsync+ssh) when the kernel can't queue new
requests.
I have not read this anywhere, but surely everyone can appreciate that an
application can't keep doing continuous overlapped I/O into the kernel
and expect to get back ERROR_IO_PENDING every time without it ever
blocking the application's call. Something has to block, or the kernel
has to give back another error equivalent to POSIX's EAGAIN. As I
can't see any EAGAIN equivalent, I presume it must block when the data
rate of the writer is faster than that of the reader end of the pipe.
This is not true non-blocking I/O as I see it. So there is actually no
non-blocking API unless you use PIPE_NOWAIT, which carries a big fat
warning not to use it. Nature did not intend PIPE_NOWAIT to exist.
As cgf writes:
| The idea of using threads for pipe writing has been bounced around for
| a long time. It doesn't solve the select problem if there are
| multiple processes writing to one pipe. That is not a completely
| unusual event, unfortunately.
I don't see the problem here; each writing process will have its own
worker thread taking the block.
But to pick up on the point here.
The problem is between the select/poll/read/write event notification
system within the same application. We need to ensure that when we signal
writability on a pipe via the select/poll event mechanism, some work
appears to be getting done at the next write() call. Maybe we can
return 0? Then at least we didn't block, and the application already has
to deal with partial writes when in O_NONBLOCK anyway. A real POSIX
system would never return 0 and would always accept at least PIPE_BUF,
but this may still be less than the 64KB chunk the application was
trying to write in the first place.
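To illustrate what "the application has to already deal with partial writes" looks like from the application side, here is a rough sketch of the usual O_NONBLOCK write loop. The names set_nonblocking() and write_all_nonblocking() are invented for this example, and it assumes fd is below FD_SETSIZE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to O_NONBLOCK. Returns 0 or -1. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: write all len bytes to a non-blocking fd.
 * Partial writes advance the buffer; EAGAIN parks the caller in
 * select() until the fd is reported writable again.
 * Returns 0 on success, -1 on any other error. */
static int write_all_nonblocking(int fd, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n > 0) {
            /* Possibly a partial write: keep going with the rest. */
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /* Would block (or wrote nothing): wait for writability. */
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        if (select(fd + 1, NULL, &wfds, NULL, NULL) < 0 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

An application written this way keeps making progress as long as select() only reports writability when the following write() will not block, which is exactly the guarantee under discussion.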
When in blocking mode we can return EINTR (a fictitious signal
notification), but then we run into a problem where the application has
blocked signals. But what about signals outside the scope of POSIX, like
Linux RT signals? What I'm saying by this is that there may be some
signals that cannot be blocked anyway, so EINTR may still be valid. But
there is probably lots of application code which does not expect EINTR
when it has already blocked all the signals it can think of.
Ah ha! Eureka moment....
What if all pipe write operations used overlapped I/O and were
FIFO serialized within cygwin? I believe WriteFile() on an overlapped
handle can return TRUE when the I/O went through first time, and FALSE
with ERROR_IO_PENDING when it is going to signal completion later. This
sticks with always making a private copy of the POSIX application's data
buffer, as in plan A, so that's a double-buffered throughput loss for
every write. Ah well.
If we get TRUE back there is no problem, business as usual next time. If
we get FALSE back with ERROR_IO_PENDING, we consider that I/O to be an
outstanding write on a pipe and we revoke the writability status in select.
We then call WaitForSingleObject() for the I/O completion (or we have a
completion function do that work). When we get I/O completion we allow
the next I/O from the FIFO through the gate. If there is no more I/O
in the FIFO we set write_ready=true and wake up any select()s.
This model does not rely on over-writing to find the call that would
block in order to revoke writability. It just uses the I/O completion
mechanism of overlapped I/O, which is how nature intended.
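The gate logic can be sketched as a small state machine. This is only a model under stated assumptions: struct pipe_gate, gate_submit(), gate_complete() and fake_start_io() are invented names, the start_io callback stands in for an overlapped write (true for "completed immediately", false for "pending"), and the private copy of the caller's buffer is left out, so the FIFO stores bare pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GATE_QUEUE_MAX 8

/* Stand-in for an overlapped write: true = completed at once,
 * false = pending, completion will be signalled later. */
typedef bool (*start_io_fn)(const void *buf, size_t len, void *ctx);

struct pipe_gate {
    start_io_fn start_io;
    void *ctx;
    bool in_flight;    /* one I/O is outstanding */
    bool write_ready;  /* what select() would report */
    struct { const void *buf; size_t len; } fifo[GATE_QUEUE_MAX];
    size_t head, count;
};

static void gate_init(struct pipe_gate *g, start_io_fn fn, void *ctx)
{
    g->start_io = fn;
    g->ctx = ctx;
    g->in_flight = false;
    g->write_ready = true;
    g->head = g->count = 0;
}

/* Submit a write. Returns false only if the FIFO is full. */
static bool gate_submit(struct pipe_gate *g, const void *buf, size_t len)
{
    if (g->in_flight) {
        /* Something is outstanding: queue behind it in FIFO order. */
        if (g->count == GATE_QUEUE_MAX)
            return false;
        size_t tail = (g->head + g->count) % GATE_QUEUE_MAX;
        g->fifo[tail].buf = buf;
        g->fifo[tail].len = len;
        g->count++;
        return true;
    }
    if (!g->start_io(buf, len, g->ctx)) {
        g->in_flight = true;
        g->write_ready = false;  /* revoke writability in select */
    }
    return true;
}

/* Call when completion of the outstanding I/O has been signalled. */
static void gate_complete(struct pipe_gate *g)
{
    assert(g->in_flight);
    g->in_flight = false;
    while (g->count > 0) {
        const void *buf = g->fifo[g->head].buf;
        size_t len = g->fifo[g->head].len;
        g->head = (g->head + 1) % GATE_QUEUE_MAX;
        g->count--;
        if (!g->start_io(buf, len, g->ctx)) {
            g->in_flight = true;  /* next I/O is now the pending one */
            return;
        }
    }
    g->write_ready = true;  /* FIFO drained: wake up select() */
}

/* Fake kernel for experiments: completes immediately while the
 * int pointed to by ctx is positive, then reports pending. */
static bool fake_start_io(const void *buf, size_t len, void *ctx)
{
    int *budget = ctx;
    (void)buf;
    (void)len;
    if (*budget > 0) {
        (*budget)--;
        return true;
    }
    return false;
}
```

With a budget of 1, the first gate_submit() leaves write_ready set, the second goes pending and clears it, and further submits queue up until gate_complete() lets them through.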
If throughput becomes a problem it may be possible to apply heuristics
with a guesstimate of the amount of overlapped I/O the kernel can buffer
before blocking. Then instead of only one I/O per fd per process we
could account for the number of outstanding bytes and revoke writability
based on that threshold figure. This way multiple overlapped I/Os can be
outstanding in the kernel before we throttle with select. But for now I
just want to get back to a working app.
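The byte-accounting heuristic might look roughly like this. It is a sketch only: struct write_budget and its functions are hypothetical names, and the threshold is a guess that would have to be tuned against what the kernel really buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-fd accounting of bytes handed to the kernel. */
struct write_budget {
    size_t outstanding;  /* bytes submitted but not yet completed */
    size_t threshold;    /* guessed amount the kernel can buffer */
};

/* select() would report writable only while under the threshold. */
static bool budget_writable(const struct write_budget *b)
{
    return b->outstanding < b->threshold;
}

/* Account for an overlapped write that has been started. */
static void budget_on_submit(struct write_budget *b, size_t len)
{
    b->outstanding += len;
}

/* Account for a write whose completion has been signalled. */
static void budget_on_complete(struct write_budget *b, size_t len)
{
    assert(len <= b->outstanding);
    b->outstanding -= len;
}
```

With a threshold of, say, 4096 bytes, several small writes can be in flight at once, and writability is only revoked once the outstanding total reaches the threshold.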
If the POSIX pipe is in blocking mode we _deliberately_ make it block
until it gets completion signalled. If it is in non-blocking mode and we
have already revoked writability we return EAGAIN.
Thanks for your replies.
I have started to write Win32 application code to help me completely
understand the various Windows I/O models and the named pipe implementation
in detail, so there is some solid ground for me to tweak the
proposal based on the rules in play with the NT kernel.
Darryl