delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin/2001/08/04/21:44:57

Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT sources DOT redhat DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT sources DOT redhat DOT com>
List-Help: <mailto:cygwin-help AT sources DOT redhat DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT sources DOT redhat DOT com
Delivered-To: mailing list cygwin AT sources DOT redhat DOT com
Date: Sat, 04 Aug 2001 18:50:32 -0700
From: Wesel <wassdamo AT pacbell DOT net>
Subject: SIGILL with pthreads and sockets
To: cygwin AT cygwin DOT com
Message-id: <3B6CA668.86B5E8C0@pacbell.net>
MIME-version: 1.0
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en,ja

And bears, oh my.

I've been trying to make some very simple proxy server software (just an
Advertisement filter) and not having much luck.  I managed to track the problem 
down to the function select() and those for resolving host names.  When used in
child threads, these functions somehow render the system unstable so that later,
during an innocuous program statement, a SIGILL is raised.

I looked up SIGILL and it said either it was caused by either a hardware problem or
a problem with the compiler itself.  I'm hoping desperately that it's not a hardware
error as the only thing that sucks worse than my modem is my financial standing. 
We're talking student level poor, folks.  :)  My only solace lies in a compiler error,
and the generous aid of you all.

Anyway, here's the code that produces a SIGILL.  By commenting out the "#define
THREADED" line, I made all errors go away (as well as any semblance of speed or
efficiency).  Could somebody wiser in the ways of cygwin please tell me if I'm
running up against some unforseen or little-known compiler problem?  I'm really at a
loss why it doesn't work.

Some notes before the code.
1)  Most of the time my threads run synchronously, only one running at a time.  Code
I put between the UNLOCK and the LOCK macros consists of blocking functions, select,
gethostname, and such.  I repeat, most of the time the thread is LOCKed.  Outside of
between the UNLOCK and LOCK macros, shared resources can not be accessed at the
same time.
2) My mysterious lock_function is a rather lame cludge that checks to make sure I'm
not calling pthread_mutex_lock twice in a row in a thread.
3) threadcounter is a local variable for each thread.  It is initialized to a
constantly incrementing g_threadcounter, so every thread has a unique number
starting from 0 which is the main thread, and continuing 1 through TEST_SET which
are the child threads.  I could have used pthread_self(), but where's the fun in
reading hexadecimal anyway?  :)
4) The hosts string array is not intended to infringe upon any copyrights, being
that it's as many URLs as I could think up in 5 minutes.  Please don't sue me,
Disney.

File: test.cpp
---
#include <arpa/inet.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>
#include <errno.h>//for all our possible error message

#include <stdio.h>
#include <stdlib.h>

#include <pthread.h>
#include <map>//for watching thread locks

#define THREADED

#define TRUE 1

struct SocketInfo
{
	SocketInfo() {}
	
	SocketInfo(const SocketInfo& sp)
	{
		address = sp.address;
		socket = sp.socket;
	}
	sockaddr_in address;
	int socket;
	const char* host;
};

void Test4(void);
void* Test4Thread(void*);

int main(int argc, char* argv[])
{
	try
	{
		Test4();
	}
	catch(int error)
	{
		printf("Feep! %s\n",strerror(error));
	}
	catch(...)
	{
		puts("Feeperific!");
		throw;
	}
	
	return 0;
}

#ifdef THREADED

pthread_mutex_t  popcorn = PTHREAD_MUTEX_INITIALIZER;

void lock_function(int which, int inc)
{
	static map<int,int> lock_test;
	
	lock_test[which] += inc;
	
	if(lock_test[which]>1)
	{
		printf("Feep!  %d thread locked itself twice!\n", which);
		exit(0);
	}
	
	if(lock_test[which]<0)
	{
		printf("Feep!  %d thread unlocked itself twice!\n", which);
		exit(0);
	}
}
	
#define LOCK { \
	pthread_mutex_lock(&popcorn); \
	lock_function(threadcounter, 1); \
	printf("%d> Lock\n",threadcounter); \
}
#define UNLOCK { \
	printf("%d> Unlock\n",threadcounter);\
	lock_function(threadcounter, -1);\
	pthread_mutex_unlock(&popcorn); \
}

#else

#define LOCK
#define UNLOCK

#endif

#define DESTPORT 80

int g_threadcounter = 0;
int sock_size = sizeof(sockaddr_in);


#define TEST_SET 15
//Make 10 connections.

SocketInfo dest[TEST_SET];
pthread_t t_id[TEST_SET];
char* hosts[TEST_SET] = { 
	"transform.to", 
	"integral.org",
	"www.google.com",
	"altavista.com",
	"208.180.232.33",
	"www.gamefaqs.com",
	"204.71.200.74",
	"www.pokemon.com",
	"www.disney.com",
	"216.218.194.6",
	"www.ucdavis.edu",
	"www.cnet.com",
	"www.gnu.org",
	"www.landfield.com",
	"216.200.16.61"};

int yes = 1;

void* Test4thread(void* arg) {
	
	int threadcounter = ++g_threadcounter;
	
	LOCK;
	
	SocketInfo& dest = *((SocketInfo*) arg);
	hostent* hp = NULL;
	
	try 
	{
	
	printf("Thread #%d starting!\n",threadcounter);

	
	bzero((char*) dest.address.sin_zero, 8); 
	// zero the rest of the struct
	dest.address.sin_family = AF_INET;	
	// host byte order
	dest.address.sin_port = htons(DESTPORT);
	// short, network byte order
	
	
	UNLOCK;
	printf("Thread #%d resolving!\n",threadcounter);
	printf("Resolving %s...\n", dest.host);
	dest.address.sin_addr.s_addr = inet_addr(dest.host);
	if(dest.address.sin_addr.s_addr == (unsigned)-1) 
	{
		//host is not an IP address.  Attempt to resolve...
		hp = gethostbyname(dest.host);
		if (hp)
		{
			printf("%d> Host! %s\n", threadcounter, hp->h_name);
			dest.address.sin_family = hp->h_addrtype;
			bcopy(hp->h_addr, (caddr_t)&dest.address.sin_addr, hp->h_length);
		}
		else
		{
				printf("Unknown host %s\n", dest.host);
				return arg;
		}
	}
	printf("Thread #%d done resolving.\n",threadcounter);
	LOCK;
		
	int right_fd = connect(dest.socket,(sockaddr*) &dest.address,sock_size);
	
	if(right_fd == -1)
	{
		puts("Destination would not connect!");
		printf("%s %d\n", _sys_errlist[errno], threadcounter);
		return arg;
	}
	
	fd_set sockfd;
	timeval timeout;
	timeout.tv_sec = 1;
	timeout.tv_usec = 0;
	
	
	FD_ZERO(&sockfd);
	FD_SET(dest.socket,&sockfd);
	
	printf("Thread #%d selecting!\n",threadcounter);
	UNLOCK;
	if(select(dest.socket+1, &sockfd, NULL, NULL, &timeout)<0)
	{
		printf("%d> Feep! %s", threadcounter, strerror(errno));
		return arg;
	}
	LOCK;
	printf("Thread #%d done selecting!\n",threadcounter);
	
	}
	catch(...) { puts("Feeperdeep"); }
	
	close(dest.socket);
	
	UNLOCK;
			
	return arg;
}

void Test4(void) {
	
	setbuf(stdout,NULL);
	
	int threadcounter = g_threadcounter;
	
	int i = 0;

	for(i = 0; i < TEST_SET; i++)
	{
		dest[i].host = hosts[i];
				
		if ((dest[i].socket = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
			perror("socket");
			exit(1);
		}
		
		if (setsockopt(dest[i].socket,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int)) == -1) {
			perror("setsockopt");
			exit(1);
		}
	
	}
	
	LOCK;  //Wait for it...
	for(i = 0; i < TEST_SET; i++)
	{
		t_id[i] = new pthread_t;
#ifdef THREADED
		pthread_create (t_id + i, NULL, Test4thread, (void*) (dest + i));
#else
		Test4thread((void*) (dest + i));
#endif
	}
	UNLOCK; //Go!
	
	for(i = 0; i < TEST_SET; i++)
	{
#ifdef THREADED
		pthread_join(t_id[i],NULL);
#endif
		LOCK;
		printf("Thread %d joined.\n",i+1);
		UNLOCK;
		delete t_id[i];
	}
	
	threadcounter--;
		
	puts("This is after threads.");
	return;
}
---

Get the picture?  Basically it creates a bunch of sockets, pairs them up with a
host name in my SocketInfo structure, then has 10 baby threads resolve the host
names, connect the sockets and wait for readable data.  Since I'm connecting at
the HTTP port, there will never be readable data until I send "GET somefile.html"
or something.  Therefore, the select functions all timeout, then the threads clean
up and exit harmlessly.

Well maybe not so harmlessly.  After thread #6 prints "Thread #6 Done selecting"
and UNLOCKS, a SIGILL happens.  It's always thread #6 in GDB. Thread #9 when
not using GDB.  I don't know why.  It happens sometime while returning from the
lock_function function, according to GDB.

I'd appreciate it if someone would tell me if this is a problem with cygwin
itself, or with the nut on the end of the keyboard.


Wesel
--

Please do not feed me Twinkies

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019