Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT sources DOT redhat DOT com Delivered-To: mailing list cygwin AT sources DOT redhat DOT com Subject: Re: SIGILL with pthreads and sockets From: Robert Collins To: Wesel Cc: cygwin AT cygwin DOT com In-Reply-To: <3B6CA668.86B5E8C0@pacbell.net> References: <3B6CA668 DOT 86B5E8C0 AT pacbell DOT net> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <996978320.22039.93.camel@lifelesswks> Mime-Version: 1.0 X-Mailer: Evolution/0.11 (Beta Release) Date: 05 Aug 2001 12:28:36 +1000 X-OriginalArrivalTime: 05 Aug 2001 02:14:38.0715 (UTC) FILETIME=[6121D4B0:01C11D54] On 04 Aug 2001 18:50:32 -0700, Wesel wrote: > And bears, oh my. > > I've been trying to make some very simple proxy server software (just an > Advertisement filter) and not having much luck. I managed to track the problem > down to the function select() and those for resolving host names. When used in > child threads, these functions somehow render the system unstable so that later, > during an innocuous program statement, a SIGILL is raised. Ok, pragmatic solution: just use the precompiled squid proxy server or modify the squid source to do what you need. Onto the less pragmatic solution... > Anyway, here's the code that produces a SIGILL. By commenting out the "#define > THREADED" line, I made all errors go away (as well as any semblance of speed or > efficiency). Could somebody wiser in the ways of cygwin please tell me if I'm > running up against some unforseen or little-known compiler problem? I'm really at a > loss why it doesn't work. What version of cygwin are you using (uname -a)? > Some notes before the code. > 1) Most of the time my threads run synchronously, only one running at a time. Code > I put between the UNLOCK and the LOCK macros consists of blocking functions, select, > gethostname, and such. I repeat, most of the time the thread is LOCKed. Outside of > between the UNLOCK and LOCK macros, shared resources can not be accessed at the > same time. Good. > 2) My mysterious lock_function is a rather lame cludge that checks to make sure I'm > not calling pthread_mutex_lock twice in a row in a thread. If you need a kludge to prevent double locking, there is likely a design issue with your program. > 3) threadcounter is a local variable for each thread. It is initialized to a > constantly incrementing g_threadcounter, so every thread has a unique number > starting from 0 which is the main thread, and continuing 1 through TEST_SET which > are the child threads. I could have used pthread_self(), but where's the fun in > reading hexadecimal anyway? :) . > 4) The hosts string array is not intended to infringe upon any copyrights, being > that it's as many URLs as I could think up in 5 minutes. Please don't sue me, > Disney. > > File: test.cpp > --- > #include > #include > #include > #include > #include //for all our possible error message > > #include > #include > > #include > #include //for watching thread locks Whats in ? My ignorance is showing here. > #define THREADED > > #define TRUE 1 > > struct SocketInfo > { > SocketInfo() {} > > SocketInfo(const SocketInfo& sp) > { > address = sp.address; > socket = sp.socket; > } > sockaddr_in address; > int socket; > const char* host; > }; > > void Test4(void); > void* Test4Thread(void*); > > int main(int argc, char* argv[]) > { > try > { > Test4(); > } > catch(int error) > { > printf("Feep! %s\n",strerror(error)); > } > catch(...) > { > puts("Feeperific!"); > throw; > } > > return 0; > } > > #ifdef THREADED > > pthread_mutex_t popcorn = PTHREAD_MUTEX_INITIALIZER; > > void lock_function(int which, int inc) > { > static map lock_test; > > lock_test[which] += inc; > > if(lock_test[which]>1) > { > printf("Feep! %d thread locked itself twice!\n", which); > exit(0); > } > > if(lock_test[which]<0) > { > printf("Feep! %d thread unlocked itself twice!\n", which); > exit(0); > } > } > > #define LOCK { \ > pthread_mutex_lock(&popcorn); \ > lock_function(threadcounter, 1); \ > printf("%d> Lock\n",threadcounter); \ > } > #define UNLOCK { \ > printf("%d> Unlock\n",threadcounter);\ > lock_function(threadcounter, -1);\ > pthread_mutex_unlock(&popcorn); \ > } > > #else > > #define LOCK > #define UNLOCK > > #endif > > #define DESTPORT 80 > > int g_threadcounter = 0; > int sock_size = sizeof(sockaddr_in); > > > #define TEST_SET 15 > //Make 10 connections. > > SocketInfo dest[TEST_SET]; > pthread_t t_id[TEST_SET]; > char* hosts[TEST_SET] = { > "transform.to", > "integral.org", > "www.google.com", > "altavista.com", > "208.180.232.33", > "www.gamefaqs.com", > "204.71.200.74", > "www.pokemon.com", > "www.disney.com", > "216.218.194.6", > "www.ucdavis.edu", > "www.cnet.com", > "www.gnu.org", > "www.landfield.com", > "216.200.16.61"}; > > int yes = 1; > > void* Test4thread(void* arg) { > > int threadcounter = ++g_threadcounter; You have a race here. Every thread _can_ end up with the same threadcounter. > LOCK; > > SocketInfo& dest = *((SocketInfo*) arg); I may be off my rocker here, but shouldn't this be SocketInfo dest * = (SocketInfo *) arg; otherwise there is no point having a globally declared dest[] anyway. also relevant is the duplicate name that can lead to scope confusion BTCATK. > hostent* hp = NULL; > > try > { > > printf("Thread #%d starting!\n",threadcounter); > > > bzero((char*) dest.address.sin_zero, 8); > // zero the rest of the struct > dest.address.sin_family = AF_INET; > // host byte order > dest.address.sin_port = htons(DESTPORT); > // short, network byte order > > > UNLOCK; > printf("Thread #%d resolving!\n",threadcounter); > printf("Resolving %s...\n", dest.host); You have races here: printf to console is not guaranteed threadsafe. > dest.address.sin_addr.s_addr = inet_addr(dest.host); > if(dest.address.sin_addr.s_addr == (unsigned)-1) > { > //host is not an IP address. Attempt to resolve... > hp = gethostbyname(dest.host); > if (hp) > { > printf("%d> Host! %s\n", threadcounter, hp->h_name); > dest.address.sin_family = hp->h_addrtype; > bcopy(hp->h_addr, (caddr_t)&dest.address.sin_addr, hp->h_length); > } > else > { > printf("Unknown host %s\n", dest.host); > return arg; > } > } > printf("Thread #%d done resolving.\n",threadcounter); > LOCK; > > int right_fd = connect(dest.socket,(sockaddr*) &dest.address,sock_size); > > if(right_fd == -1) > { > puts("Destination would not connect!"); > printf("%s %d\n", _sys_errlist[errno], threadcounter); > return arg; Where's your UNLOCK? > } > > fd_set sockfd; > timeval timeout; > timeout.tv_sec = 1; > timeout.tv_usec = 0; > > > FD_ZERO(&sockfd); > FD_SET(dest.socket,&sockfd); > > printf("Thread #%d selecting!\n",threadcounter); > UNLOCK; > if(select(dest.socket+1, &sockfd, NULL, NULL, &timeout)<0) > { > printf("%d> Feep! %s", threadcounter, strerror(errno)); > return arg; > } > LOCK; > printf("Thread #%d done selecting!\n",threadcounter); > > } > catch(...) { puts("Feeperdeep"); } > > close(dest.socket); > > UNLOCK; > > return arg; > } > > void Test4(void) { > > setbuf(stdout,NULL); > > int threadcounter = g_threadcounter; if g_threadcounter is your global tool to id threads, then this is suspect: you need int *threadcounter = &g_threadcounter; > int i = 0; > > for(i = 0; i < TEST_SET; i++) > { > dest[i].host = hosts[i]; > > if ((dest[i].socket = socket(AF_INET, SOCK_STREAM, 0)) == -1) { > perror("socket"); > exit(1); > } > > if (setsockopt(dest[i].socket,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int)) == -1) { > perror("setsockopt"); > exit(1); > } > > } > > LOCK; //Wait for it... > for(i = 0; i < TEST_SET; i++) > { > t_id[i] = new pthread_t; This is wrong. pthreads is a C interface, not a C++ interface. If you want to initialize a pthread_t variable before pthread_create, use memset (&t_id[i], '\0', sizeof(pthread_t)); It shouldn't be causing your bug, but it will lead to memory leaks at the very least. > #ifdef THREADED > pthread_create (t_id + i, NULL, Test4thread, (void*) (dest + i)); This is not intuitive to read: even though it is functionally correct. I'd suggest pthread_create (&t_id[i], NULL, Test4thread, (void *) (&dest[i])); > #else > Test4thread((void*) (dest + i)); > #endif > } > UNLOCK; //Go! > > for(i = 0; i < TEST_SET; i++) > { > #ifdef THREADED > pthread_join(t_id[i],NULL); > #endif > LOCK; > printf("Thread %d joined.\n",i+1); > UNLOCK; > delete t_id[i]; again, this is wrong, new and delete aren't valid for pthread operations. I have _no idea_ what sideeffects you could be creating by doing this. You are trying to free an opaque pointer. Your compiler *should* be spitting the dummy at this. > } > > threadcounter--; > > puts("This is after threads."); > return; > } > --- > > Get the picture? Basically it creates a bunch of sockets, pairs them up with a > host name in my SocketInfo structure, then has 10 baby threads resolve the host > names, connect the sockets and wait for readable data. Since I'm connecting at > the HTTP port, there will never be readable data until I send "GET somefile.html" > or something. Therefore, the select functions all timeout, then the threads clean > up and exit harmlessly. > > Well maybe not so harmlessly. After thread #6 prints "Thread #6 Done selecting" > and UNLOCKS, a SIGILL happens. It's always thread #6 in GDB. Thread #9 when > not using GDB. I don't know why. It happens sometime while returning from the > lock_function function, according to GDB. > > I'd appreciate it if someone would tell me if this is a problem with cygwin > itself, or with the nut on the end of the keyboard. #0 tell us what version of cygwin this happens on. #1 build yourself a debug version of cygwin. #2 break point at the line before failing, and then breakpoint the fialing system call, that will get you debugging intop the cygwin1.dll itself. #3 I'd do the following, if the code where mine. 1) rather than arrays of data for per thread stuff use pthread_key 2) LOCK and UNLOCK sparingly rather than mostly LOCKED. Just lock around printf's and other global behaviour. Rob > > Wesel > -- > > Please do not feed me Twinkies > > -- > Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple > Bug reporting: http://cygwin.com/bugs.html > Documentation: http://cygwin.com/docs.html > FAQ: http://cygwin.com/faq/ > -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Bug reporting: http://cygwin.com/bugs.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/