Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT sources DOT redhat DOT com Delivered-To: mailing list cygwin AT sources DOT redhat DOT com From: "Markus Hoenicka" MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <14895.64888.834000.461968@gargle.gargle.HOWL> Date: Thu, 7 Dec 2000 21:13:28 +0000 (CT0) To: cygwin Subject: Re: socket read()/write() problem in 1.1.6 In-Reply-To: <3A2ED44C.1A46F386@redhat.com> References: <14894 DOT 33300 DOT 706000 DOT 788630 AT gargle DOT gargle DOT HOWL> <3A2ED44C DOT 1A46F386 AT redhat DOT com> X-Mailer: VM 6.75 under Emacs 20.6.1 Hi, Corinna Vinschen writes: > Markus Hoenicka wrote: > > > > Hi, > > > > I stumbled over a problem with socket reading/writing after I upgraded > > from cygwin1.dll version 1.1.4 to 1.1.6 > > [...] > > Did you strace the server and/or client on Cygwin? I wonder what's > going on there. If this would be a 100% reproducable problem in > Cygwin, all my ssh connections all day would fail, so... > I'm aware that if read() or write() were plain buggy the list would be full of complaints. However I see the problem on two Cygwin installations and I *always* see it, so its not some random thing. I have had some dangling pointers and similar problems, but all these seem to be fixed now. They would crash the program immediately on both Linux and Cygwin. The current problem never crashes the app, it just makes it do the wrong thing consistently with a newer Cygwin DLL. This certainly does not mean that something with Cygwin read() or write() must be wrong. It could as well be some small memory corruption that happens to be without effect on Linux and with older Cygwin versions. I tested several Cygwin DLLs. 1.1.3 and 1.1.4 *never* show the problem. 1.1.5 and 1.1.6 *always* show the problem. I can use my Cygwin client running with DLL 1.1.6 and connect to a server running on Linux without problem. The other way round (client on Linux and server on Cygwin DLL 1.1.6) does have the problem, so it appears the problem is on the write() side of the socket. > A little bit of debugging would be gladly appreciated. > I'll do my best as I can't expect that someone else is debugging this for me. I was just asking because this might ring some bell. I looked at the strace output but I'm afraid that my knowledge is not sufficient to make much sense of it (my degree is in Biology after all, not CS). There are certainly some differences but I'd need someone to look at that who knows a bit more about socket communication in older and recent Cygwin DLLs. The relevant parts of the server straces are appended below (yes I know nobody is begging to read that, but anyway...). On the client side I basically see what my debug printf()s also reported: After the first successful read() there is a successful select() but the next read() fails with __set_winsock_errno: read:119 - 10054 (WSAECONNRESET) -> 104 > However, could you please provide the sources of two testapps (server, > client) which reproducable show the problem on your box? > Oh well. I'm looking at 700 kb of obfuscate sources which were not yet cleaned up for release. I could just make a tarball of this. The cleaner way would certainly be to cut it down to a minimal testcase, but this would take quite some time. I'll try to come up with something unless someone can tell from the strace what might be wrong. Thanks for your help Markus Cygwin dll 1.1.4 --------------------------------------------------------------------- 420 26972362 [main] refdbs 1018 select_stuff::wait: n 1, ms 15000 13042 26985404 [select_socket] refdbs 1018 thread_socket: stuff_start 0xA05DE94 346 26985750 [select_socket] refdbs 1018 thread_socket: Win32 select returned 1 210 26985960 [select_socket] refdbs 1018 thread_socket: s 0xA053FE0, testing fd 5 (/dev/tcp) 198 26986158 [select_socket] refdbs 1018 thread_socket: write_ready 618 26986776 [main] refdbs 1018 select_stuff::wait: woke up. wait_ret 1. verifying 213 26986989 [main] refdbs 1018 select_stuff::wait: gotone 1 880 26987869 [main] refdbs 1018 peek_socket: considering handle 0x3C 194 26988063 [main] refdbs 1018 peek_socket: adding write fd_set /dev/tcp, fd 5 194 26988257 [main] refdbs 1018 peek_socket: adding except fd_set /dev/tcp, fd 5 236 26988493 [main] refdbs 1018 peek_socket: WINSOCK_SELECT returned 1 201 26988694 [main] refdbs 1018 set_bits: me 0xA053FE0, testing fd 5 (/dev/tcp) 198 26988892 [main] refdbs 1018 set_bits: ready 1 191 26989083 [main] refdbs 1018 select_stuff::poll: returning 1 192 26989275 [main] refdbs 1018 select_stuff::~select_stuff: calling cleanup routines 319 26989594 [main] refdbs 1018 socket_cleanup: si 0xA05AE70 si->thread 0xC0 196 26989790 [main] refdbs 1018 socket_cleanup: connection to si->exitsock 0x48 1617 26991407 [main] refdbs 1018 socket_cleanup: returning 200 26991607 [main] refdbs 1018 select_stuff::~select_stuff: deleting select records 1014 26992621 [main] refdbs 1018 _write: write (5, 0x245E980, 34) 44258 27036879 [main] refdbs 1018 _write: 34 = write (5, 0x245E980, 34) 269 27037148 [main] refdbs 1018 cygwin_select: 6, 0x0, 0x245F938, 0x245F930, 0x245F928 1027 27038175 [main] refdbs 1018 hinfo::select_write: /dev/tcp fd 5 838 27039013 [main] refdbs 1018 hinfo::select_except: /dev/tcp fd 5 191 27039204 [main] refdbs 1018 cygwin_select: to->tv_sec 15, to->tv_usec 0, ms 15000 317 27039521 [main] refdbs 1018 cygwin_select: sel.always_ready 0 374 27039895 [main] refdbs 1018 start_thread_socket: Handle 0x3C 191 27040086 [main] refdbs 1018 start_thread_socket: Added to writefds 192 27040278 [main] refdbs 1018 start_thread_socket: Added to exceptfds 863 27041141 [main] refdbs 1018 start_thread_socket: exitsock 0x88 214 27041355 [main] refdbs 1018 start_thread_socket: stuff_start 0x245F820 424 27041779 [main] refdbs 1018 select_stuff::wait: n 1, ms 15000 41400 27083179 [select_socket] refdbs 1018 thread_socket: stuff_start 0xA05DE94 339 27083518 [select_socket] refdbs 1018 thread_socket: Win32 select returned 1 214 27083732 [select_socket] refdbs 1018 thread_socket: s 0xA051E60, testing fd 5 (/dev/tcp) 200 27083932 [select_socket] refdbs 1018 thread_socket: write_ready 554 27084486 [main] refdbs 1018 select_stuff::wait: woke up. wait_ret 1. verifying 210 27084696 [main] refdbs 1018 select_stuff::wait: gotone 1 849 27085545 [main] refdbs 1018 peek_socket: considering handle 0x3C 193 27085738 [main] refdbs 1018 peek_socket: adding write fd_set /dev/tcp, fd 5 193 27085931 [main] refdbs 1018 peek_socket: adding except fd_set /dev/tcp, fd 5 234 27086165 [main] refdbs 1018 peek_socket: WINSOCK_SELECT returned 1 202 27086367 [main] refdbs 1018 set_bits: me 0xA051E60, testing fd 5 (/dev/tcp) 195 27086562 [main] refdbs 1018 set_bits: ready 1 191 27086753 [main] refdbs 1018 select_stuff::poll: returning 1 192 27086945 [main] refdbs 1018 select_stuff::~select_stuff: calling cleanup routines 196 27087141 [main] refdbs 1018 socket_cleanup: si 0xA05AE70 si->thread 0xAC 193 27087334 [main] refdbs 1018 socket_cleanup: connection to si->exitsock 0x88 1647 27088981 [main] refdbs 1018 socket_cleanup: returning 198 27089179 [main] refdbs 1018 select_stuff::~select_stuff: deleting select records 1742 27090921 [main] refdbs 1018 _write: write (5, 0xA0516C0, 1) 228 27091149 [main] refdbs 1018 _write: 1 = write (5, 0xA0516C0, 1) 848 27091997 [main] refdbs 1018 _close: close (5) 78812 27170809 [main] refdbs 1018 _close: 0 = close (5) 264 27171073 [main] refdbs 1018 do_exit: do_exit (0) 211 27171284 [main] refdbs 1018 void: 0x1 = signal (20, 0x1) 197 27171481 [main] refdbs 1018 void: 0x0 = signal (1, 0x1) 192 27171673 [main] refdbs 1018 void: 0x0 = signal (2, 0x1) 194 27171867 [main] refdbs 1018 void: 0x0 = signal (3, 0x1) 836 27172703 [main] refdbs 1018 _close: close (0) 1210 27173913 [main] refdbs 1018 fhandler_base::set_io_handle: set handle to 0xFFFFFFFF 1648 27175561 [main] refdbs 1018 _close: 0 = close (0) 833 27176394 [main] refdbs 1018 _close: close (1) 1163 27177557 [main] refdbs 1018 fhandler_base::close: handle 0x2C 1653 27179210 [main] refdbs 1018 _close: 0 = close (1) 31997 27211207 [main] refdbs 1018 _close: close (2) 1219 27212426 [main] refdbs 1018 fhandler_base::set_io_handle: set handle to 0xFFFFFFFF 1649 27214075 [main] refdbs 1018 _close: 0 = close (2) 833 27214908 [main] refdbs 1018 _close: close (3) 1149 27216057 [main] refdbs 1018 fhandler_base::close: handle 0x70 1676 27217733 [main] refdbs 1018 _close: 0 = close (3) 832 27218565 [main] refdbs 1018 _close: close (4) 3050 27221615 [main] refdbs 1018 _close: 0 = close (4) 17524 27239139 [main] refdbs 1018 proc_terminate: nchildren 0, nzombies 0 197 27239336 [main] refdbs 1018 proc_terminate: leaving 192 27239528 [main] refdbs 1018 sigproc_terminate: entering 381 27239909 [main] refdbs 1018 sigproc_terminate: done 240 27240149 [main] refdbs 1018 __to_clock_t: dwHighDateTime 0, dwLowDateTime 701008 199 27240348 [main] refdbs 1018 __to_clock_t: total 00000000 00000046 198 27240546 [main] refdbs 1018 __to_clock_t: dwHighDateTime 0, dwLowDateTime 500720 195 27240741 [main] refdbs 1018 __to_clock_t: total 00000000 00000032 224 27240965 [main] refdbs 1018 my_parent_is_alive: parent still alive 205 27241170 [main] refdbs 1018 do_exit: calling ExitProcess 0 305881 27250913 [proc] refdbs 1014 proc_subproc: args: 3, 0 299 27251212 [proc] refdbs 1014 proc_subproc: pid 1018[0] terminated, handle 0x104, nchildren 1, nzombies 0 206 27251418 [proc] refdbs 1014 remove_child: removing [0], pid 1018, handle 0x104, nchildren 1 202 27251620 [proc] refdbs 1014 sig_send: pid 1014, signal 20, its_me 1 202 27251822 [proc] refdbs 1014 sig_send: ReleaseSemaphore succeeded 195 27252017 [proc] refdbs 1014 sig_send: Not waiting for sigcomplete. its_me 1 sig 20 197 27252214 [proc] refdbs 1014 sig_send: returning 0 from sending signal 20 196 27252410 [proc] refdbs 1014 proc_subproc: returning 0 193 27252603 [proc] refdbs 1014 wait_subproc: looping 216 27252819 [sig] refdbs 1014 wait_sig: awake 208 27253027 [sig] refdbs 1014 wait_sig: processing signal 20 195 27253222 [sig] refdbs 1014 wait_sig: Got signal 20 192 27253414 [sig] refdbs 1014 sig_handle: signal 20 192 27253606 [sig] refdbs 1014 sig_handle: signal 20 ignored 192 27253798 [sig] refdbs 1014 sig_handle: returning 0 195 27253993 [sig] refdbs 1014 proc_subproc: args: 4, 0 193 27254186 [sig] refdbs 1014 proc_subproc: looking for processes to reap 195 27254381 [sig] refdbs 1014 proc_subproc: finished processing terminated/stopped child 197 27254578 [sig] refdbs 1014 proc_subproc: returning 1 193 27254771 [sig] refdbs 1014 wait_sig: looping --------------------------------------------------------------------- Cygwin dll 1.1.6 --------------------------------------------------------------------- 432 114845 [main] refdbs 199 select_stuff::wait: m 2, ms 15000 26255 141100 [select_socket] refdbs 199 thread_socket: stuff_start 0xA01CBC4 2019 143119 [select_socket] refdbs 199 thread_socket: Win32 select returned 1 1994 145113 [select_socket] refdbs 199 thread_socket: s 0xA013D18, testing fd 5 (/dev/tcp) 658 145771 [select_socket] refdbs 199 thread_socket: write_ready 987 146758 [main] refdbs 199 select_stuff::wait: woke up. wait_ret 1. verifying 1817 148575 [main] refdbs 199 select_stuff::wait: gotone 1 215 148790 [main] refdbs 199 select_stuff::wait: returning 0 235 149025 [main] refdbs 199 peek_socket: considering handle 0x10 205 149230 [main] refdbs 199 peek_socket: adding write fd_set /dev/tcp, fd 5 198 149428 [main] refdbs 199 peek_socket: adding except fd_set /dev/tcp, fd 5 240 149668 [main] refdbs 199 peek_socket: WINSOCK_SELECT returned 1 284 149952 [main] refdbs 199 set_bits: me 0xA013D18, testing fd 5 (/dev/tcp) 201 150153 [main] refdbs 199 set_bits: ready 1 192 150345 [main] refdbs 199 select_stuff::poll: returning 1 195 150540 [main] refdbs 199 select_stuff::~select_stuff: calling cleanup routines 195 150735 [main] refdbs 199 socket_cleanup: si 0xA019BA0 si->thread 0x70 196 150931 [main] refdbs 199 socket_cleanup: connection to si->exitsock 0x60 1711 152642 [main] refdbs 199 socket_cleanup: returning 196 152838 [main] refdbs 199 select_stuff::~select_stuff: deleting select records 367 153205 [main] refdbs 199 _write: write (5, 0x245E984, 32) 50066 203271 [main] refdbs 199 _write: 32 = write (5, 0x245E984, 32) 254 203525 [main] refdbs 199 cygwin_select: 6, 0x0, 0x245F93C, 0x245F934, 0x245F92C 384 203909 [main] refdbs 199 dtable::select_write: /dev/tcp fd 5 195 204104 [main] refdbs 199 dtable::select_except: /dev/tcp fd 5 196 204300 [main] refdbs 199 cygwin_select: to->tv_sec 15, to->tv_usec 0, ms 15000 197 204497 [main] refdbs 199 cygwin_select: sel.always_ready 0 371 204868 [main] refdbs 199 start_thread_socket: Handle 0x10 193 205061 [main] refdbs 199 start_thread_socket: Added to writefds 194 205255 [main] refdbs 199 start_thread_socket: Added to exceptfds 874 206129 [main] refdbs 199 start_thread_socket: exitsock 0x110 218 206347 [main] refdbs 199 start_thread_socket: stuff_start 0x245F82C 417 206764 [main] refdbs 199 select_stuff::wait: m 2, ms 15000 32616 239380 [select_socket] refdbs 199 thread_socket: stuff_start 0xA01CBC4 1821 241201 [select_socket] refdbs 199 thread_socket: Win32 select returned 1 2132 243333 [select_socket] refdbs 199 thread_socket: s 0xA011B98, testing fd 5 (/dev/tcp) 661 243994 [select_socket] refdbs 199 thread_socket: write_ready 1912 245906 [main] refdbs 199 select_stuff::wait: woke up. wait_ret 1. verifying 1953 247859 [main] refdbs 199 select_stuff::wait: gotone 1 217 248076 [main] refdbs 199 select_stuff::wait: returning 0 204 248280 [main] refdbs 199 peek_socket: considering handle 0x10 197 248477 [main] refdbs 199 peek_socket: adding write fd_set /dev/tcp, fd 5 197 248674 [main] refdbs 199 peek_socket: adding except fd_set /dev/tcp, fd 5 239 248913 [main] refdbs 199 peek_socket: WINSOCK_SELECT returned 1 203 249116 [main] refdbs 199 set_bits: me 0xA011B98, testing fd 5 (/dev/tcp) 198 249314 [main] refdbs 199 set_bits: ready 1 192 249506 [main] refdbs 199 select_stuff::poll: returning 1 197 249703 [main] refdbs 199 select_stuff::~select_stuff: calling cleanup routines 272 249975 [main] refdbs 199 socket_cleanup: si 0xA019BA0 si->thread 0x38 202 250177 [main] refdbs 199 socket_cleanup: connection to si->exitsock 0x110 1660 251837 [main] refdbs 199 socket_cleanup: returning 197 252034 [main] refdbs 199 select_stuff::~select_stuff: deleting select records 368 252402 [main] refdbs 199 _write: write (5, 0xA017A18, 1) 230 252632 [main] refdbs 199 _write: 1 = write (5, 0xA017A18, 1) 884 253516 [main] refdbs 199 _close: close (5) 280 253796 [main] refdbs 199 _close: 0 = close (5) 237 254033 [main] refdbs 199 do_exit: do_exit (0) 200 254233 [main] refdbs 199 void: 0x1 = signal (20, 0x1) 196 254429 [main] refdbs 199 void: 0x0 = signal (1, 0x1) 193 254622 [main] refdbs 199 void: 0x0 = signal (2, 0x1) 195 254817 [main] refdbs 199 void: 0x0 = signal (3, 0x1) 276 255093 [main] refdbs 199 fhandler_base::close: handle 0x14 273 255366 [main] refdbs 199 fhandler_base::close: handle 0xB0 297 255663 [main] refdbs 199 proc_terminate: nchildren 0, nzombies 0 205 255868 [main] refdbs 199 proc_terminate: leaving 192 256060 [main] refdbs 199 sigproc_terminate: entering 199 256259 [main] refdbs 199 sigproc_terminate: done 414 256673 [main] refdbs 199 __to_clock_t: dwHighDateTime 0, dwLowDateTime 400576 199 256872 [main] refdbs 199 __to_clock_t: total 00000000 00000028 200 257072 [main] refdbs 199 __to_clock_t: dwHighDateTime 0, dwLowDateTime 100144 197 257269 [main] refdbs 199 __to_clock_t: total 00000000 0000000A 197 257466 [main] refdbs 199 _pinfo::exit: Calling ExitProcess 0 240241 25334193 [proc] refdbs 288 proc_subproc: args: 3, 0 269 25334462 [proc] refdbs 288 proc_subproc: pid 199[0] terminated, handle 0x130, nchildren 1, nzombies 0 267 25334729 [proc] refdbs 288 proc_subproc: removing [0], pid 199, handle 0x130, nchildren 1 213 25334942 [proc] refdbs 288 proc_subproc: returning 1 199 25335141 [proc] refdbs 288 sig_send: pid 288, signal 20, its_me 1 202 25335343 [proc] refdbs 288 sig_send: Not waiting for sigcomplete. its_me 1 sig 20 198 25335541 [proc] refdbs 288 sig_send: returning 0 from sending signal 20 195 25335736 [proc] refdbs 288 wait_subproc: looping 507858 25843594 [sig] refdbs 288 wait_sig: awake 611 25844205 [sig] refdbs 288 wait_sig: processing signal 20 556 25844761 [sig] refdbs 288 wait_sig: Got signal 20 884 25845645 [sig] refdbs 288 sig_handle: signal 20 551 25846196 [sig] refdbs 288 sig_handle: signal 20 ignored 557 25846753 [sig] refdbs 288 sig_handle: returning 0 233 25846986 [sig] refdbs 288 proc_subproc: args: 4, 0 196 25847182 [sig] refdbs 288 proc_subproc: looking for processes to reap 197 25847379 [sig] refdbs 288 proc_subproc: finished processing terminated/stopped child 199 25847578 [sig] refdbs 288 proc_subproc: returning 1 193 25847771 [sig] refdbs 288 wait_sig: looping 3722442 29570213 [unknown (0x10F)] refdbs 288 _kill: kill (-288, 2) 782 29570995 [unknown (0x10F)] refdbs 288 kill_pgrp: pid 288, sig 2 2183 29573178 [unknown (0x10F)] refdbs 288 pinfo::init: execed process windows pid 105, cygwin pid 299 855 29574033 [unknown (0x10F)] refdbs 288 kill_pgrp: killing pid 288, pgrp 288, p->ctty 1073741824, myself->ctty 1073741824 729 29574762 [unknown (0x10F)] refdbs 288 sig_send: pid 288, signal 2, its_me 1 697 29575459 [unknown (0x10F)] refdbs 288 sig_send: Waiting for thiscomplete 0x84 --------------------------------------------------------------------- -- Markus Hoenicka, PhD UT Houston Medical School Dept. of Integrative Biology and Pharmacology 6431 Fannin MSB4.114 Houston, TX 77030 (713) 500-6313, -7477 (713) 500-7444 (fax) Markus DOT Hoenicka AT uth DOT tmc DOT edu http://ourworld.compuserve.com/homepages/hoenicka_markus/ -- Want to unsubscribe from this list? Send a message to cygwin-unsubscribe AT sourceware DOT cygnus DOT com