From: "James Johnston"
Subject: Cygwin passes through null writes to other software when redirecting standard input/output (i.e. piping)
Date: Thu, 26 Apr 2012 21:18:27 -0000

I have run into an issue with Cygwin. This is arguably not a bug in Cygwin, but in other platform runtime libraries. Nevertheless, the symptoms occur only with Cygwin and not the Windows command prompt. So, from a practical standpoint, Cygwin is "broken."

This is almost certainly the same issue mentioned in the "1.7.10->1.7.13 : output from .NET programs does not get through pipeline to a visual c++ program" thread started by cygwin AT alanhowells.e4ward.com last week. It's also related to the issue I raised over a month ago, titled "Can't reliably redirect standard output from C# program in recent Cygwin".

To summarize: when piping output from one program to another (i.e. running an "A | B" command to pipe standard output from A to B's standard input), Cygwin passes through ALL writes performed by a program to standard output (and probably standard error), including null writes. Cygwin passing through these null writes is a big problem, because multiple runtimes - I tested .NET Framework 3.5 and the Visual C++ 2008 runtime - cannot properly handle null writes that they receive on standard input.
(Recall that a null write is when a program calls the WriteFile Win32 API to write 0 bytes to standard output or standard error.) While a null write appears nonsensical, every single .NET program that uses the Console class to write to standard output/error will do a null write, as .NET does this to verify the stream is OK. Other software could easily decide to write zero bytes to standard output as well (e.g. if outputting an empty string).

I think these are bugs in the runtimes that handle standard input, because the documentation for ReadFile clearly states that if the handle is a pipe, the call succeeds, and zero bytes are returned, then it is a null write on the pipe and does NOT signify end-of-file. Instead, end-of-file is signified by ERROR_BROKEN_PIPE. Yet these runtimes erroneously handle the null write as an end-of-file on standard input anyway, causing the software using the runtime to malfunction. For example, I used Reflector to decompile the Read method in the Stream class that handles standard input/output in the .NET Framework. It's very obvious that it calls ReadFile, and that a successful ReadFile that returns zero bytes is treated as end-of-file.

An alternative explanation might be that all this behavior is by design, and Cygwin is buggy. But I haven't found anything in MSDN that would justify that. (For example, documentation stating that a redirected standard input may never have a null write would mean that runtimes could safely assume that standard input won't have null writes, and Cygwin would be in error for doing a null write. However, as an example, the STARTUPINFO structure documentation imposes no such requirement when describing the hStdInput handle.)

These runtimes (.NET Framework and Visual C++ runtime) are in wide use. I don't think it is realistic to expect any fixes for them any time soon: Microsoft would need to fix both of these runtimes, and then application vendors would need to adopt the fixed versions. That will take years.
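The ReadFile semantics described above can be sketched in C as follows. This is only an illustration of how a pipe reader could tell a null write apart from end-of-file; it is not code from either runtime or from Cygwin. The names pipe_read_status, classify_pipe_read and read_skipping_null_writes are hypothetical, and the non-Windows typedefs are stand-ins added so that the classification logic compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins (assumption) so the classification logic compiles off Windows. */
typedef int BOOL;
typedef unsigned long DWORD;
#define ERROR_BROKEN_PIPE 109L
#endif

/* Hypothetical result type; not part of any runtime. */
typedef enum {
    PIPE_READ_DATA,       /* ReadFile succeeded with at least one byte */
    PIPE_READ_NULL_WRITE, /* ReadFile succeeded with zero bytes: not EOF */
    PIPE_READ_EOF,        /* ReadFile failed with ERROR_BROKEN_PIPE */
    PIPE_READ_ERROR       /* ReadFile failed for any other reason */
} pipe_read_status;

/* Interpret the outcome of one ReadFile call on a pipe handle. */
static pipe_read_status classify_pipe_read(BOOL ok, DWORD bytes_read,
                                           DWORD last_error)
{
    if (ok)
        return bytes_read > 0 ? PIPE_READ_DATA : PIPE_READ_NULL_WRITE;
    return last_error == ERROR_BROKEN_PIPE ? PIPE_READ_EOF : PIPE_READ_ERROR;
}

#ifdef _WIN32
/* Read until real data, end-of-file, or an error; null writes are skipped. */
static pipe_read_status read_skipping_null_writes(HANDLE h, void *buf,
                                                  DWORD cap, DWORD *got)
{
    assert(buf != NULL && got != NULL);
    for (;;) {
        BOOL ok = ReadFile(h, buf, cap, got, NULL);
        pipe_read_status s =
            classify_pipe_read(ok, *got, ok ? 0 : GetLastError());
        if (s != PIPE_READ_NULL_WRITE)
            return s;
    }
}
#endif
```

A reader written this way would simply retry after a zero-byte success instead of reporting end-of-file, which is the behavior the runtimes discussed here appear to lack.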
Realistically, then, I hope Cygwin can work around these runtime bugs. I think a workable fix would be for Cygwin not to pass through null writes it receives on an output/error pipe. For example, somewhere in Cygwin I assume there is a loop that calls ReadFile to read the redirected standard output from the first program, and then calls WriteFile to send this output to the second program's standard input. If the call to WriteFile were skipped whenever it would write zero bytes (i.e. so Cygwin doesn't do null writes itself), I think it would fix the problem and work around all these buggy runtimes.

I am providing several sample programs that can be compiled in Visual C# 2008 / .NET Framework 3.5, or Visual C++ 2008. They should demonstrate the problem 100% of the time on any system, because appropriate Sleep() calls are made. This was reproduced on a Cygwin 1.7.14-2 system that I updated this morning. Sample programs are divided as follows:

* A "Sender" program performs a null write on standard output, and then writes a normal line of text.
* A "Receiver" program attempts to write all received lines of text.

Source code is below; directions for testing and test results are at the end of this e-mail, following the code:

========== SenderCS.cs: Sender program in Visual C# 2008 ==========

class SenderCS
{
    static void Main(string[] args)
    {
        /* Notice how this is such a basic program */

        /* wait for pipes to set up and for receiving app to block on
           first ReadFile call */
        System.Threading.Thread.Sleep(1000);

        /* This will do a null write; it could be left out and the problem
           would still occur because any WriteLine() call will do it. */
        /* I include it so that the null write can be placed between two
           Sleep() calls. */
        System.Console.Write("");

        /* wait for receiving end to get the null write */
        System.Threading.Thread.Sleep(1000);

        /* normal line of text */
        System.Console.WriteLine("Hello world!");
    }
}

========== SenderC.c: Sender program in Visual C++ 2008 ==========

#include <windows.h>

int main()
{
    const char *test = "Hello world!\n";
    DWORD written;

    /* Get standard output file handle */
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);

    /* wait for pipes to set up and for receiving app to block on
       first ReadFile call */
    Sleep(1000);

    /* do null write */
    WriteFile(h, test, 0, &written, NULL);

    /* wait for receiving app to get the null write */
    Sleep(1000);

    /* print hello message */
    WriteFile(h, test, lstrlenA(test), &written, NULL);
    return 0;
}

========== ReceiverCS.cs: Receiver program in Visual C# 2008 that demonstrates bug in .NET Framework 3.5 ==========

class Program
{
    static void Main(string[] args)
    {
        /* use a retry loop, because we can't distinguish between a null
           write and end-of-file, due to bugs in Console stream's Read
           method. */
        for (int i = 0; i < 10; i++)
        {
            /* BUG: docs for ReadLine() say a null signifies end-of-file,
               but if it encounters a null write on standard input pipe,
               it incorrectly thinks it is end-of-file and will return
               null. */
            string nl = System.Console.ReadLine();
            if (nl == null)
            {
                System.Console.WriteLine("Got end-of-file");
            }
            else
            {
                System.Console.WriteLine("Got line {0}", nl);
            }
        }
    }
}

========== ReceiverCPP.cpp: Receiver program in Visual C++ 2008 that demonstrates bug in VC++ 2008 runtime / STL ==========

#include <iostream>
#include <string>

using namespace std;

int main()
{
    /* you have to use a retry loop, for exact same reasons given for C#
       receiver program: there is no way to tell difference between
       end-of-file and null write. */
    for (int i = 0; i < 10; i++)
    {
        string str;

        /* BUG: cin will indicate end-of-file on a null write. */
        getline(cin, str);
        if (cin.eof())
        {
            cout << "Got end-of-file" << endl;
        }
        else
        {
            cout << "Got line " << str << endl;
        }

        /* future getline calls will always immediately fail without
           attempting another read unless we clear EOF/fail flags */
        cin.clear();
    }
    return 0;
}

========== Test results ==========

The test programs are designed so that they can be run in any combination from the command prompt. The output from a sender is piped to the input of a receiver. Each combination delivers identical output to the other combinations:

* ./SenderCS | ./ReceiverCS
* ./SenderCS | ./ReceiverCPP
* ./SenderC | ./ReceiverCS
* ./SenderC | ./ReceiverCPP

Output from Cygwin will always be:

Got end-of-file
Got line Hello world!
Got end-of-file
Got end-of-file

This is wrong, because the program received end-of-file before it was actually at the end of the input stream, due to the bug in its runtime's handling of return values from the ReadFile API. I did not do any tests using standard error, but I assume Cygwin redirects standard error in the same way it redirects standard output, in which case it would have the same problem.

Note that we can do the same test from the Windows command prompt:

Got line Hello world!
Got end-of-file
Got end-of-file

The Windows command prompt apparently strips the null writes before passing data to the receiver's standard input, so the results are now correct. The fact that the Windows command prompt delivers correct results is probably a reason why these bugs exist in the runtimes in the first place: probably nobody really tested to make sure the runtime was compatible with other redirected standard inputs that might have null writes.

Best regards,

James Johnston