Mail Archives: cygwin/2012/04/26/17:19:26

From: "James Johnston" <JamesJ AT motionview3d DOT com>
To: <cygwin AT cygwin DOT com>
Cc: <cygwin AT alanhowells DOT e4ward DOT com>
Subject: Cygwin passes through null writes to other software when redirecting standard input/output (i.e. piping)
Date: Thu, 26 Apr 2012 21:18:27 -0000
Message-ID: <020501cd23f2$20f07620$62d16260$@motionview3d.com>

I have run into an issue with Cygwin.  This is arguably not a bug in
Cygwin, but in other platform runtime libraries.  Nevertheless, the
symptoms occur only with Cygwin and not the Windows command prompt.  So,
from a practical standpoint, Cygwin is "broken."  This is almost certainly
the same issue mentioned in the "1.7.10->1.7.13 : output from .NET programs
does not get through pipeline to a visual c++ program" thread started by
cygwin AT alanhowells.e4ward.com last week.  It's also related to the issue
I raised over a month ago, titled "Can't reliably redirect standard output
from C# program in recent Cygwin".

To summarize: when piping output from one program to another (i.e. running
"A | B" command to pipe standard output from A to B's standard input),
Cygwin passes through ALL writes performed by a program to standard output
(and probably error), including null writes.  Cygwin passing through these
null writes is a big problem, because multiple runtimes - I tested .NET
Framework 3.5 and the Visual C++ 2008 runtime - cannot properly handle null
writes that they receive on standard input.  (Recall that a null write is
when a program calls the Win32 WriteFile API to write 0 bytes to standard
output or standard error.)  While a null write appears nonsensical, every
single .NET program that uses the Console class to write to standard
output/error will do a null write, as .NET does this to verify the stream is
OK.  Other software could easily decide to write zero bytes to standard
output as well (e.g. if outputting an empty string).

I think these are bugs in the runtimes that handle standard input, because
the documentation for ReadFile clearly states that if the handle is a pipe,
the call succeeds, and zero bytes are returned, then the other end performed
a null write on the pipe; this does NOT signify end-of-file.  Instead,
end-of-file is signified by ERROR_BROKEN_PIPE.  Yet, these runtimes erroneously handle the
null write as an end-of-file on standard input anyway, causing the software
using the runtime to malfunction.  For example, I used Reflector to
decompile the Read method in the Stream class that handles standard
input/output in the .NET Framework.  It's very obvious that it calls
ReadFile, and a successful ReadFile that returns zero bytes is treated as
end-of-file.
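
To make the distinction concrete, here is a minimal sketch in C of how a
reader could classify the result of one ReadFile call on a pipe.  The names
classify_pipe_read and read_outcome are hypothetical and not taken from any
runtime; the function takes the values ReadFile and GetLastError would
report, rather than calling them, so it compiles without <windows.h>.

```c
#include <assert.h>

/* Win32 error code reported when the write end of a pipe has been closed.
   Defined here only so the sketch builds without <windows.h>. */
#ifndef ERROR_BROKEN_PIPE
#define ERROR_BROKEN_PIPE 109L
#endif

/* Hypothetical result type for this sketch. */
typedef enum { READ_DATA, READ_NULL_WRITE, READ_EOF, READ_ERROR } read_outcome;

/* ok:         return value of ReadFile (nonzero means success)
   bytes_read: value stored through lpNumberOfBytesRead
   last_error: GetLastError() value, only meaningful when ok == 0 */
read_outcome classify_pipe_read(int ok, unsigned long bytes_read, long last_error)
{
	if (ok)
		/* success with zero bytes is a null write, NOT end-of-file */
		return bytes_read > 0 ? READ_DATA : READ_NULL_WRITE;
	/* failure: a broken pipe is the real end-of-file signal */
	return last_error == ERROR_BROKEN_PIPE ? READ_EOF : READ_ERROR;
}

/* Usage: the three cases discussed above. */
void classify_pipe_read_examples(void)
{
	assert(classify_pipe_read(1, 13, 0) == READ_DATA);
	assert(classify_pipe_read(1, 0, 0) == READ_NULL_WRITE);
	assert(classify_pipe_read(0, 0, ERROR_BROKEN_PIPE) == READ_EOF);
}
```

A reader that gets READ_NULL_WRITE would simply call ReadFile again; the
runtimes discussed here behave as if that case were READ_EOF.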

An alternative explanation might be that all this behavior is by design, and
Cygwin is buggy.  But I haven't found anything in MSDN that would justify
that.  (For example, documentation stating that a redirected standard input
may never have a null write would mean that runtimes could safely assume
that standard input won't have null writes, and Cygwin would be in error for
doing a null write.  However, as an example, the STARTUPINFO structure
documentation imposes no such requirements when describing the hStdInput handle.)

These runtimes (.NET Framework and Visual C++ runtime) are in wide use.  I
don't think it is realistic to expect any fixes for them any time soon:
Microsoft would need to fix both of these runtimes, and then application
vendors would need to use them.  That will take years.  Therefore,
realistically I hope Cygwin can work around them.

I think a workable fix would be for Cygwin not to pass through null writes
it receives on an output/error pipe.  For example, somewhere in Cygwin I
assume there is a loop that calls ReadFile to read the redirected standard
output from the first program, and then calls WriteFile to send this output
to the second program's standard input.  If the call to WriteFile were
skipped whenever it would write zero bytes (i.e. so Cygwin doesn't do null
writes itself), I think it would fix the problem and work around all these
buggy runtimes.
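
As a rough illustration of that idea, here is a sketch in portable C of a
relay loop that drops zero-length reads instead of forwarding them.  This is
not Cygwin code, and I do not know whether Cygwin actually contains such a
loop.  The callback types read_fn and write_fn are hypothetical stand-ins
for ReadFile and WriteFile, and chunk_source / counting_sink are in-memory
helpers that exist only to exercise the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for ReadFile/WriteFile.  read_fn returns 1 on
   success (with *n == 0 meaning a null write) and 0 at end of stream. */
typedef int (*read_fn)(void *ctx, char *buf, size_t cap, size_t *n);
typedef int (*write_fn)(void *ctx, const char *buf, size_t n);

/* Copies everything from rd to wr, skipping null writes.
   Returns the number of writes forwarded, or -1 if a write fails. */
long relay_skipping_null_writes(read_fn rd, void *rctx, write_fn wr, void *wctx)
{
	char buf[4096];
	size_t n = 0;
	long forwarded = 0;

	assert(rd != NULL && wr != NULL);
	while (rd(rctx, buf, sizeof buf, &n)) {
		if (n == 0)
			continue; /* null write: do not pass it downstream */
		if (!wr(wctx, buf, n))
			return -1;
		forwarded++;
	}
	return forwarded;
}

/* In-memory source: each string is delivered as one "write". */
typedef struct { const char **chunks; size_t count, next; } chunk_source;

int source_read(void *ctx, char *buf, size_t cap, size_t *n)
{
	chunk_source *s = ctx;
	if (s->next >= s->count)
		return 0; /* end of stream */
	const char *c = s->chunks[s->next++];
	size_t len = strlen(c);
	if (len > cap)
		len = cap; /* sketch only: oversized chunks are truncated */
	memcpy(buf, c, len);
	*n = len;
	return 1;
}

/* In-memory sink: records how many writes arrived and how many were empty. */
typedef struct { size_t calls, null_calls, bytes; } counting_sink;

int sink_write(void *ctx, const char *buf, size_t n)
{
	counting_sink *k = ctx;
	(void)buf;
	k->calls++;
	if (n == 0)
		k->null_calls++;
	k->bytes += n;
	return 1;
}
```

Feeding the source the chunks "", "Hello world!\n", "" (a null write, the
message, another null write) makes the sink see a single 13-byte write and
no empty ones, so a downstream reader never gets a zero-byte ReadFile result.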

I am providing several sample programs that can be compiled in Visual C#
2008 / .NET Framework 3.5, or Visual C++ 2008.  They should demonstrate the
problem 100% of the time on any system, because appropriate Sleep() calls
are made.  This was reproduced on a Cygwin 1.7.14-2 system that I updated
this morning.  Sample programs are divided as follows:

 * A "Sender" program performs a null write on standard output, and then
writes a normal line of text.
 * A "Receiver" program attempts to write all received lines of text.

Source code is below; directions for testing and test results at end of
e-mail following code:

========== SenderCS.cs: Sender program in Visual C# 2008 ==========

class SenderCS {
	static void Main(string[] args) {
		/* Notice how this is such a basic program */

		/* wait for pipes to set up and for receiving app to block on
		   first ReadFile call */
		System.Threading.Thread.Sleep(1000);

		/* This will do a null write; it could be left out and the
		   problem would still occur because any WriteLine() call will
		   do it.  I include it so that the null write can be placed
		   between two Sleep() calls. */
		System.Console.Write("");

		/* wait for receiving end to get the null write */
		System.Threading.Thread.Sleep(1000);

		/* normal line of text */
		System.Console.WriteLine("Hello world!");
	}
}

========== SenderC.c: Sender program in Visual C++ 2008 ==========

#include <windows.h>
int main() {
	char * test = "Hello world!\n";
	DWORD written;
	/* Get standard output file handle */
	HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
	/* wait for pipes to set up and for receiving app to block on first
	   ReadFile call */
	Sleep(1000);
	/* do null write */
	WriteFile(h, test, 0, &written, NULL);
	/* wait for receiving app to get the null write */
	Sleep(1000);
	/* print hello message */
	WriteFile(h, test, lstrlenA(test), &written, NULL);
	return 0;
}

========== ReceiverCS.cs: Receiver program in Visual C# 2008 that demonstrates bug in .NET Framework 3.5 ==========

class Program {
	static void Main(string[] args) {
		/* use a retry loop, because we can't distinguish between a
		   null write and end-of-file, due to bugs in Console stream's
		   Read method. */
		for (int i = 0; i < 10; i++) {
			/* BUG:  docs for ReadLine() say a null signifies
			   end-of-file, but if it encounters a null write on
			   standard input pipe, it incorrectly thinks it is
			   end-of-file and will return null. */
			string nl = System.Console.ReadLine();
			if (nl == null) {
				System.Console.WriteLine("Got end-of-file");
			} else {
				System.Console.WriteLine("Got line {0}", nl);
			}
		}
	}
}

========== ReceiverCPP.cpp: Receiver program in Visual C++ 2008 that demonstrates bug in VC++ 2008 runtime / STL ==========

#include <iostream>
#include <string>
using namespace std;
int main() {
	/* you have to use a retry loop, for exact same reasons given for C#
	   receiver program: there is no way to tell difference between
	   end-of-file and null write. */
	for (int i = 0; i < 10; i++) {
		string str;
		/* BUG: cin will indicate end-of-file on a null write. */
		getline(cin, str);
		if (cin.eof()) {
			cout << "Got end-of-file" << endl;
		} else {
			cout << "Got line " << str << endl;
		}
		/* future getline calls will always immediately fail without
		   attempting another read unless we clear EOF/fail flags */
		cin.clear();
	}
	return 0;
}

========== Test results ==========

The test programs are designed so that they can be run in any combination
from the command prompt.  The output from a sender is piped to the input of
a receiver.  All four combinations below produce identical output:

 * ./SenderCS | ./ReceiverCS
 * ./SenderCS | ./ReceiverCPP
 * ./SenderC | ./ReceiverCS
 * ./SenderC | ./ReceiverCPP

Output from Cygwin will always be:

Got end-of-file
Got line Hello world!
Got end-of-file
Got end-of-file
<snip>

This is wrong, because the program received end-of-file before it was
actually at the end of the input stream, due to the bug in its runtime's
handling of return values from the ReadFile API.  I did not do any tests using
standard error, but I assume Cygwin redirects standard error in the same way
it redirects standard output, in which case it would have the same problem.

Note that we can do the same test from the Windows command prompt, where the
output is instead:

Got line Hello world!
Got end-of-file
Got end-of-file
<snip>

The Windows command prompt apparently strips the null writes instead of
passing them on to the receiver's standard input, so the results are correct.
The fact that Windows command prompt delivers correct results is probably a
reason why these bugs exist in the runtimes in the first place: probably
nobody really tested to make sure the runtime was compatible with other
redirected standard inputs that might have null writes.

Best regards,

James Johnston



