Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com Message-ID: From: Claudio Natoli To: cygwin AT cygwin DOT com Subject: race condition causes psql to hang Date: Thu, 14 Aug 2003 19:58:47 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1" Hi all, I previously reported an apparent race condition with cygipc under cygwin here: http://archives.postgresql.org/pgsql-bugs/2003-03/msg00209.php (detailed bug report here) and http://archives.postgresql.org/pgsql-cygwin/2003-04/msg00015.php However, I've had a chance to dig a little bit deeper, and have found that it is NOT with cygipc at all, and appears to resemble a bug of the type reported here: http://sources.redhat.com/ml/cygwin-developers/2001-10/msg00175.html (google link for those unable to access cygwin-developers) http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=fix+cond_race+site%3A .redhat.com&btnG=Google+Search&meta= I'm now using cygwin-1.5.2-1 and postgresql (from cvs at time of posting), both compiled from sources with --enable-debug, with the same result. I can fairly reliably reproduce this bug (although increased logging diminishes the likelihood of error, damn Heisenbug... currently have to log to a ramdisk in order to get it to fail with the logging turned on!), and am happy working with any developer to get this nailed. Any takers? :-) Cheers, Claudio gdb backtrace follows: Thread 4 (thread 1388.0x580): #0 0x77fa144c in ntdll!DbgUiConnectToDbg () from /cygdrive/c/WINNT/system32/NTDLL.DLL #1 0x7c50dfdb in KERNEL32!DebugActiveProcess () from /cygdrive/c/WINNT/system32/KERNEL32.DLL #2 0x7c4e987c in SetThreadExecutionState () from /cygdrive/c/WINNT/system32/KERNEL32.DLL Thread 3 (thread 1388.0x6e4): #0 0x77f94091 in ntdll!ZwWaitForSingleObject () from /cygdrive/c/WINNT/system32/NTDLL.DLL #1 0x74fd1394 in _libkernel32_a_iname () from /cygdrive/c/WINNT/system32/msafd.dll #2 0x74fd3c59 in WSPSetSockOpt () from /cygdrive/c/WINNT/system32/msafd.dll #3 0x750312f5 in WSACreateEvent () from /cygdrive/c/WINNT/system32/ws2_32.dll #4 0x61070819 in thread_socket(void*) (arg=0x10049d70) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/select.cc:1220 #5 0x61002b17 in cygthread::stub(void*) (arg=0x610d5b20) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/cygthread.cc:69 #6 0x7c4e987c in SetThreadExecutionState () from /cygdrive/c/WINNT/system32/KERNEL32.DLL Thread 2 (thread 1388.0x498): #0 0x77f9323e in ntdll!ZwWaitForMultipleObjects () from /cygdrive/c/WINNT/system32/NTDLL.DLL #1 0x7c4ebdd7 in WaitForMultipleObjectsEx () from /cygdrive/c/WINNT/system32/KERNEL32.DLL #2 0x7c4fabfb in WaitForMultipleObjects () from /cygdrive/c/WINNT/system32/KERNEL32.DLL #3 0x61076a97 in wait_sig(void*) (self=0x610d5af8) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/sigproc.cc:1108 #4 0x61002b17 in cygthread::stub(void*) (arg=0x610d5af8) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/cygthread.cc:69 #5 0x7c4e987c in SetThreadExecutionState () from /cygdrive/c/WINNT/system32/KERNEL32.DLL Thread 1 (thread 1388.0x520): #0 0x77f9323e in ntdll!ZwWaitForMultipleObjects () from /cygdrive/c/WINNT/system32/NTDLL.DLL #1 0x7c4ebdd7 in WaitForMultipleObjectsEx () from /cygdrive/c/WINNT/system32/KERNEL32.DLL #2 0x7c4fabfb in WaitForMultipleObjects () from /cygdrive/c/WINNT/system32/KERNEL32.DLL #3 0x6106dd59 in select_stuff::wait(_types_fd_set*, _types_fd_set*, _types_fd_set*, unsigned long) (this=0x22fa20, readfds=0x22f9a4, writefds=0x22f994, exceptfds=0x22f984, ms=4294967295) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/select.cc:265 #4 0x6106d5a8 in cygwin_select (maxfds=4, readfds=0x22fb28, writefds=0x22fb18, exceptfds=0x22fb08, to=0x0) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/select.cc:155 #5 0x6105aebd in poll (fds=0x22fbc8, nfds=1, timeout=-1) at /usr/src/cygwin-1.5.2-1/winsup/cygwin/poll.cc:80 #6 0x67988b91 in pqSocketPoll (sock=3, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1023 #7 0x67988a8a in pqSocketCheck (conn=0x10040578, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:965 #8 0x67988969 in pqWaitTimed (forRead=1, forWrite=0, conn=0x10040578, finish_time=-1) at fe-misc.c:897 #9 0x67988927 in pqWait (forRead=1, forWrite=0, conn=0x10040578) at fe-misc.c:880 #10 0x67986359 in PQgetResult (conn=0x10040578) at fe-exec.c:979 #11 0x67986658 in PQexecFinish (conn=0x10040578) at fe-exec.c:1186 #12 0x00405a72 in SendQuery (query=0x100491a0 "\tdrop table dud_table;") at common.c:623 #13 0x0040eb8d in MainLoop (source=0x610cf288) at mainloop.c:386 #14 0x004105d1 in main (argc=3, argv=0x61621f5c) at startup.c:287 ================================== debugging log snippet at the failure point: CREATE TABLE 134849669 [main] psql 972 cygwin_select: 4, 0x22FB28, 0x22FB18, 0x22FB08, 0x0 134850068 [main] psql 972 cygwin_select: to NULL, ms FFFFFFFF 134850167 [main] psql 972 cygwin_select: sel.always_ready 0, sel.windows_used 0 134850344 [main] psql 972 start_thread_socket: Handle 0x2F8 134850386 [main] psql 972 start_thread_socket: Added to readfds 134850794 [main] psql 972 start_thread_socket: exitsock 0x2E0 134851026 [main] psql 972 start_thread_socket: stuff_start 0x22FA24 134851076 [main] psql 972 cygthread::cygthread: name select_socket, id 0x6E4 134851237 [select_socket] psql 972 thread_socket: stuff_start 0x1004CD94 134851258 [main] psql 972 cygthread::cygthread: activated thread_sync 0x2CC 134851599 [main] psql 972 select_stuff::wait: m 2, ms 4294967295 By comparison with the snippet below (from a successfully completed call), it is clear that the call to thread_socket never completes (specifically, the code called by the macro WINSOCK_SELECT never returns), in agreement with the backtrace for Thread 3 above. Logging output snippet from a completed call follows: CREATE TABLE 134721315 [main] psql 972 cygwin_select: 4, 0x22FB28, 0x22FB18, 0x22FB08, 0x0 134721456 [main] psql 972 cygwin_select: to NULL, ms FFFFFFFF 134721513 [main] psql 972 cygwin_select: sel.always_ready 0, sel.windows_used 0 134721682 [main] psql 972 start_thread_socket: Handle 0x2F8 134721752 [main] psql 972 start_thread_socket: Added to readfds 134723228 [main] psql 972 start_thread_socket: exitsock 0x2E0 134723361 [main] psql 972 start_thread_socket: stuff_start 0x22FA24 134723417 [main] psql 972 cygthread::cygthread: name select_socket, id 0x6E4 134723475 [select_socket] psql 972 thread_socket: stuff_start 0x1004CD94 134723567 [select_socket] psql 972 thread_socket: Win32 select returned 1 134723658 [select_socket] psql 972 thread_socket: s 0x100498E0, testing fd 3 (/dev/streamsocket) 134723713 [select_socket] psql 972 thread_socket: read_ready 134723538 [main] psql 972 cygthread::cygthread: activated thread_sync 0x2CC 134723818 [main] psql 972 select_stuff::wait: m 2, ms 4294967295 134723898 [main] psql 972 select_stuff::wait: woke up. wait_ret 1. verifying 134723951 [main] psql 972 select_stuff::wait: gotone 1 134724002 [main] psql 972 select_stuff::wait: returning 0 134724053 [main] psql 972 select_stuff::cleanup: calling cleanup routines 134724104 [main] psql 972 socket_cleanup: si 0x10049D70 si->thread 0x610D5B20 134724156 [main] psql 972 socket_cleanup: connection to si->exitsock 0x2E0 134724430 [main] psql 972 cygthread::detach: WFSO returns 0, id 0x6E4 134724653 [main] psql 972 socket_cleanup: returning 134724720 [main] psql 972 peek_socket: considering handle 0x2F8 134724773 [main] psql 972 set_bits: me 0x100498E0, testing fd 3 (/dev/streamsocket) 134724825 [main] psql 972 set_bits: ready 1 134724877 [main] psql 972 select_stuff::poll: returning 1 134724928 [main] psql 972 select_stuff::cleanup: calling cleanup routines 134724979 [main] psql 972 select_stuff::~select_stuff: deleting select records --- WE HAVE MOVED - PLEASE NOTE OUR NEW CONTACT DETAILS: THE BASEMENT, 33 EWELL STREET, BALMAIN NSW 2041 TEL: +61 2 9555 1544 FAX: +61 2 9555 6911 Certain disclaimers and policies apply to all email sent from Memetrics. For the full text of these disclaimers and policies see http://www.memetrics.com/emailpolicy.html -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/