Date: Thu, 2 Jul 2015 13:52:50 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: read() returns data but not the number of read bytes
Message-ID: <20150702115250.GR2918@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm

On Jul  1 16:02, Theodor DOT Kazhan AT gmx DOT de wrote:
> Hi all,
>
> The issue was observed in cygwin versions 1.7.30-1, 2.0.2-1 and
> 2.0.4-1 on Win7. Upon receiving 'HUGEFILE' from a debian system (scp
> user AT server:logDownload_m/TEST.txt .), scp in cygwin hangs from time
> to time...
>
> Based on the observed occurrence behaviour of the issue and my
> experience in hunting ghosts, I - based on the new aspect described
> below - without much doubt assume a race condition between the cygwin
> read()/write() function implementations. I started investigating the
> cygwin-newlib implementation and gdb-ing it but I'm not yet that much
> into the details of read()/write()-implementation to identify possible
> race conditions.
> So any contribution / co-working offers / investigation proposals are
> highly appreciated as I can only progress slowly.
>
>
> As a starting point I'd have, based on: "scp and its child process
> ssh share a pipe to receive the data of the downloaded file..." the
> following Q:
> 1) Is a child forked into a new, separate Win-process or is the fork
> simulated in cygwin, e.g. by threads?

Fork starts a new process.

> 2) Is the pipe implementation in cygwin in such a setup using
> Win-methods to forward the data or is the pipe-functionality simulated
> in cygwin?

Pipes use Windows named pipes with overlapped I/O.

> 3) Can the action of providing data for reading from a pipe by the
> consumer process (i.e. indicating data availability and the actual
> copying of the data in a read()-call) be interrupted/impacted by a
> write()-call from a pipe-data-producing process?

I don't understand the question.  Writing to the pipe on one end will
trigger an event handle on the other end, and the read call will check
if overlapped data is available in a call to GetOverlappedResult.
Writing to the pipe will not generate POSIX signals in the consumer
process, if that's what you mean.

> ---8<---
> size_t
> atomicio6(ssize_t (*f) (int, void *, size_t), int fd, void *_s, size_t n,
>     int (*cb)(void *, size_t), void *cb_arg)
> {
> [...]
> --->8---

A diff -up against the original code would be helpful for stuff like
that.

> $ cat -n scp_20150701_123824.txt
> . . .
> 50782 TK: atomicio.c: atomicio6: fd=8, event=1, res=49152, errno=011, s[0]=B014565-N000001-, s[Q1-16]=B014565-N001024-B014566-N000001-, s[Q2-16]=B014566-N001024-B014567-N000001-, s[Q3-16]=B014567-N001024-B014568-N000001-, s[Q4-16]=B014568-N001024-
> 50783 TK: scp.c: sink: amt=65536, j=65536, i=238616576, size=1553121293, count=65536, buf=B014565-N000001-
> 50784 TK: dispatch.c: ssh_dispatch_run: type=94
> 50785 TK: channels.c: channel_input_data: data_len=16384, buf=B014569-N000001-
> 50786 TK: channels.c: channel_handle_wfd(2): write: len=16384, buf=B014569-N000001-
> 50787 TK: atomicio.c: atomicio6: fd=8, event=1, res=-0001, errno=011, s[0]=B014569-N000001-, s[Q1-16]=B014569-N001024-B014566-N000001-, s[Q2-16]=, s[Q3-16]=, s[Q4-16]=

Hmm.  I inspected the read code for overlapped reads again and I don't
see anything wrong.  The question is why an EAGAIN is triggered.  The
only reason I can think of is GetOverlappedResult returning FALSE when
there actually is data in the buffer.  For the underlying functionality,
see

https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler.cc;h=6f024da3288053359de36baf41fa095819d252cc;hb=HEAD#l1944

I'm now trying to reproduce this again, but I'm really not sure where to
look and, as I wrote, last time I was unable to reproduce this despite
running the testcase for hours.

It might be interesting to see if a certain border case is possible:
GetOverlappedResult returning FALSE with bytes set to some non-0 value.
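
To illustrate why a read() that copies data but returns -1/EAGAIN is so harmful to the caller, here is a minimal sketch of a retrying read loop. It is only loosely modelled on the pattern a helper like atomicio follows and is not the OpenSSH code; the name read_fully is made up for this example. The loop trusts the return value alone, so bytes delivered together with an EAGAIN would never be counted and would be overwritten by the next successful read.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from OpenSSH or Cygwin): try to read exactly
   n bytes from fd.  Returns the number of bytes actually stored in buf;
   a short count means EOF or a hard error (errno is set by read). */
size_t
read_fully (int fd, void *buf, size_t n)
{
  char *p = buf;
  size_t pos = 0;

  while (pos < n)
    {
      ssize_t res = read (fd, p + pos, n - pos);

      if (res > 0)
        pos += (size_t) res;      /* only the return value advances pos */
      else if (res == 0)
        break;                    /* EOF */
      else if (errno == EINTR)
        continue;                 /* interrupted, just retry */
      else if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
          /* Nothing reported as read: wait until the fd is readable and
             retry at the same offset.  Any data the failed call left in
             the buffer is overwritten here. */
          struct pollfd pfd = { .fd = fd, .events = POLLIN };
          (void) poll (&pfd, 1, -1);
        }
      else
        break;                    /* hard error */
    }
  return pos;
}
```

With a well-behaved read() this returns the full count; the log above would correspond to the EAGAIN branch being taken although the buffer had already been filled.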
Locally I changed the debug output to:

diff --git a/winsup/cygwin/fhandler.cc b/winsup/cygwin/fhandler.cc
index 6f024da..556c240 100644
--- a/winsup/cygwin/fhandler.cc
+++ b/winsup/cygwin/fhandler.cc
@@ -1993,10 +1993,14 @@ fhandler_base_overlapped::wait_overlapped (bool inres, bool writing, DWORD *byte
          overridden by the return of GetOverlappedResult which could detect
          that I/O completion occurred.  */
       CancelIo (h);
+      *bytes = 0;
       wores = GetOverlappedResult (h, get_overlapped (), bytes, false);
       err = GetLastError ();
       ResetEvent (get_overlapped ()->hEvent);  /* Probably not needed but CYA */
-      debug_printf ("wfres %u, wores %d, bytes %u", wfres, wores, *bytes);
+      if (!wores && *bytes > 0)
+        system_printf ("wfres %u, wores %d, bytes %u", wfres, wores, *bytes);
+      else
+        debug_printf ("wfres %u, wores %d, bytes %u", wfres, wores, *bytes);
       if (wores)
         res = overlapped_success;       /* operation succeeded */
       else if (wfres == WAIT_OBJECT_0 + 1)

I'm running the test with this change now.  If you see anything else
wrong in that code, please yell.

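
As a side note on the added `*bytes = 0;` line, the following portable sketch shows why clearing the out-parameter first matters for such a check. It assumes, purely for illustration, a call that reports failure without touching its out-parameter; fake_get_result and border_case are made-up names and do not model the real GetOverlappedResult. Without the pre-zeroing, a stale count from an earlier operation would trip the "failed but bytes > 0" test even though the failing call never wrote anything.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an API that, on failure, leaves its
   out-parameter untouched.  This is an assumption for the sketch,
   not documented behaviour of any Windows function. */
static bool
fake_get_result (unsigned *bytes, bool succeed, unsigned transferred)
{
  if (!succeed)
    return false;
  *bytes = transferred;
  return true;
}

/* Returns true if the "call failed but bytes > 0" diagnostic would fire.
   stale is whatever value the variable held before the call; prezero
   selects whether it is cleared first, as the patch does. */
bool
border_case (unsigned stale, bool prezero)
{
  unsigned bytes = stale;

  if (prezero)
    bytes = 0;

  bool ok = fake_get_result (&bytes, false, 0);
  return !ok && bytes > 0;
}
```

Under that assumption, border_case (49152, false) fires on the stale value alone, while border_case (49152, true) stays quiet, so a hit with pre-zeroing in place means the failing call itself stored a non-zero count.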
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat