Mail Archives: cygwin/2005/04/01/09:36:47

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <424D5C64.5050706@smousseland.com>
Date: Fri, 01 Apr 2005 16:36:20 +0200
From: Vincent Dedun <kraken AT smousseland DOT com>
User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: ipc, sockets and windows sp2
References: <424D0232 DOT 5060305 AT smousseland DOT com> <20050401090414 DOT GD7415 AT cygbert DOT vinschen DOT de> <424D2B0B DOT 8000604 AT smousseland DOT com> <20050401121143 DOT GD1471 AT cygbert DOT vinschen DOT de>
In-Reply-To: <20050401121143.GD1471@cygbert.vinschen.de>

--------------030508050500010108040108
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Corinna Vinschen wrote:

>>So I hope you wouldn't mind I attached a short testing program you can 
>>easily compil with gcc to reproduce the bug.
>>    
>>
>
>Cool, that's exactly what I was asking for.  I was immediately able to
>reproduce the problem and it turned out, that on fork() the socket
>duplication from parent to child process for some reason occupied space
>in the child, which in the parent is occupied by the shared memory returned
>by shmat.
>
>Consequentially the duplication of the shared memory couldn't occupy the
>same address as in the parent.  That's a fatal error so the forked child
>terminated itself with error 487, which basically means "Invalid address".
>
>I've changed fork() so that the shared memory is duplicated before sockets
>are duplicated, which is ok because sockets don't have special requirements
>for memory addresses.  That works fine for me, but it would be good if you
>could test the next snapshot, which I just uploaded, nevertheless.
>
>It's just incredible that nobody found this problem before.
>  
>

Yes, I find this incredible too, since any Unix server that uses IPC
(instead of threads, for example) will want to support multiple
connections at a time and will therefore use these mechanisms.
I doubt that we're the only ones to use shared memory, sockets and
multiple processes!

Anyway, BIG THANKS for resolving the problem so quickly.
I recompiled from the Cygwin CVS, and it solved my problem; my master
now runs well.

However, there is still a problem, sorry ;)

This time it is with semaphores (also part of IPC). It's less important
for me, as the master can run without them, but it's better to have them.
So I updated the test case to see what happens.

I added semaphore lock/release functions that I call in the child
process, so each child has to take the lock before accepting a
connection and releases it when the connection is finished.

For one child it is OK, but starting with the second child, the
semaphore lock operation (semop() with sem_flg=SEM_UNDO and sem_op=-1)
makes cygserver hang!
Then I get "lost connection to cygserver" errors from my processes, plus
some "error getting signal_arrived to server(6)" errors from the
cygserver process.

So, instead of waiting for the semaphore to be released (semval going
back from 0 to 1), semop() returns even though the semaphore is locked,
and the program continues as if the semaphore were unlocked, although it
is still locked.

Moreover, the semaphore value is decremented at each semaphore_lock
call, so it gets the value -1 at the third call, whereas we want it to
be only 0 (locked) or 1 (unlocked). Then it stops there because
cygserver has hung, and nothing more is heard from the remaining
children (I set 10 children in the example).

Under OS X, for example, you see the first child lock the semaphore,
then all the other children wait for it to be released (semop() blocks
until the release), and the semaphore value goes from 1 to 0.
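
To make the expected semantics concrete, here is a minimal sketch that is
not part of the attached test case. It assumes a standard System V
semaphore implementation; the helper names (make_binary_sem,
try_lock_nowait, sem_value, remove_sem) and the union name semun_arg are
made up for illustration. It uses IPC_NOWAIT so that a second lock attempt
fails with EAGAIN instead of blocking, which lets the behaviour be checked
in a single process without fork().

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical local union for the variadic semctl() argument; named
   differently from "union semun" to avoid clashing with systems whose
   headers already define that type. */
union semun_arg {
	int              val;
	struct semid_ds *buf;
	unsigned short  *array;
};

/* Create a private one-element semaphore set initialised to 1 (unlocked).
   Returns the semaphore id, or -1 on error. */
int make_binary_sem(void)
{
	union semun_arg arg;
	int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

	if (semid == -1)
		return -1;
	arg.val = 1;
	if (semctl(semid, 0, SETVAL, arg) == -1) {
		semctl(semid, 0, IPC_RMID);
		return -1;
	}
	return semid;
}

/* Try to take the lock without blocking.
   Returns 0 on success; returns -1 with errno set to EAGAIN when the
   semaphore value is already 0, i.e. the lock is held. */
int try_lock_nowait(int semid)
{
	struct sembuf op;

	op.sem_num = 0;
	op.sem_op = -1;
	op.sem_flg = SEM_UNDO | IPC_NOWAIT;
	return semop(semid, &op, 1);
}

/* Current semaphore value, or -1 if semctl() itself fails. */
int sem_value(int semid)
{
	return semctl(semid, 0, GETVAL);
}

/* Remove the semaphore set from the system. */
int remove_sem(int semid)
{
	return semctl(semid, 0, IPC_RMID);
}
```

With these helpers, the first try_lock_nowait() should succeed and leave
the value at 0; a second call should fail with EAGAIN and the value should
stay at 0, never going negative. Note that sem_value() returning -1 means
the semctl() call failed, not that the semaphore holds a negative value.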

I hope this will help,
thank you again for your fix.

Vincent

PS: the same conditions as before apply to this test (Windows version;
the Cygwin DLL contains your fix_shm_after_fork update).

------------------------

--------------030508050500010108040108
Content-Type: text/plain;
 name="fork-ipc-sem.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="fork-ipc-sem.c"

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>

#define USE_IPC
#define USE_SEM
//define BIND_AFTER_FORK 

#define BUFFERLEN 256

struct	database
{
	int		shmid;
	int 	semid;
	int 	test1;
	int 	test2;
}
*wdb;

int			get_shared_memory(char *path_key)
{
	key_t 	key;
	int		shmid;
	int		shmflg;
	char	file[BUFFERLEN];

	snprintf(file, BUFFERLEN-1, "%s.exe", path_key);
	if ((key = ftok(file, 'Z')) == -1)
	{
		perror("Getting key for shared memory");
		exit(1);
	}
	shmflg = IPC_CREAT|0600;
	if ((shmid = shmget(key, sizeof(struct database), shmflg)) == -1)
	{
		perror ("Getting shared memory");
		exit(1);
	}
	fprintf(stderr,"shmid: %i\n", shmid);
	return (shmid);
}

int					get_semaphores(char *path_key)
{
	key_t			key;
	int				semid;
	struct sembuf	op;
	int				semflg;
	char			file[BUFFERLEN];

	snprintf(file, BUFFERLEN-1, "%s.exe", path_key);
	if ((key = ftok(file, 'Z')) == -1)
	{
		perror("Getting key for semaphores");
		exit(1);
	}
	semflg = IPC_CREAT|0600;
	if ((semid = semget(key, 1, semflg)) == -1)
	{
		perror("Getting semaphores");
		exit(1);
	}
	if (semctl(semid, 0, SETVAL, 1) == -1)
	{
		perror("semctl SETVAL -> 1");
		exit(1);
	}
	if (semctl(semid, 0, GETVAL) == 0)
	{
		op.sem_num = 0;
		op.sem_op = 1;
		op.sem_flg = 0;
		if (semop(semid, &op, 1) == -1)
		{
			perror("semaphore_release");
			exit(1);
		}
	}
	fprintf(stderr,"semval: %i semid: %i\n", semctl (semid, 0, GETVAL), semid);
	return (semid);
}

void		*attach_shared_memory(int shmid)
{
	void	*rv; // return value

	if ((rv = shmat(shmid, 0, 0)) == (void *) -1)
	{
		perror("shmat");
		return ((void *) -1);
	}

	return (rv);
}

int		detach_shared_memory(void *shmaddr)
{
	int	rv; // return value

	if ((rv = shmdt(shmaddr)) == -1)
	{
		perror("shmdt");
		return (-1);
	}

	return (rv);
}

void					set_signal_handlers (void)
{
	struct sigaction	ignore;

	ignore.sa_handler = SIG_IGN;
	sigemptyset(&ignore.sa_mask);
	ignore.sa_flags = 0;
	sigaction(SIGHUP, &ignore, NULL); // So we keep running as a daemon
}

int						get_socket(short port)
{
	int					sfd; //socket file descriptor
	struct sockaddr_in	addr;
	int					opt;

	opt = 1;
	sfd = socket(PF_INET, SOCK_STREAM, 0);
	if (sfd == -1)
	{
		perror("socket");
		exit(1);
	}
	else
	{
		if (setsockopt(sfd, SOL_SOCKET, SO_REUSEADDR, (int *) &opt, sizeof(opt)) == -1)
			perror ("setsockopt");
		addr.sin_family = AF_INET;
		addr.sin_port = htons(port);
		addr.sin_addr.s_addr = htonl(INADDR_ANY);
		if (bind(sfd, (struct sockaddr *) &addr, sizeof (addr)) == -1)
		{
			perror("bind");
			sfd = -1;
		} else {
			listen (sfd, 5);
		}
	}
	return (sfd);
}

int		accept_socket(int sfd, struct sockaddr_in *addr)
{
	int			fd;
	socklen_t	len = sizeof(struct sockaddr_in); // accept() expects a socklen_t *

	if ((fd = accept(sfd, (struct sockaddr *) addr, &len)) == -1)
	{
		perror("Accepting connection");
		exit(1);
	}
	return (fd);
}

void 			semaphore_lock(int semid)
{
  struct sembuf	op;

  op.sem_num = 0;
  op.sem_op = -1;
  op.sem_flg = SEM_UNDO;

  fprintf(stderr,"Locking... semval: %i semid: %i\n",semctl (semid,0,GETVAL),semid);
  if (semop(semid, &op, 1) == -1)
  {
    int err = errno; // save errno before perror() can overwrite it

    perror("semaphore_lock");
    printf("%i\n", err);
    exit(1); // report the failure to the parent
  }
  fprintf(stderr,"Locked !!! semval: %i semid: %i\n",semctl (semid,0,GETVAL),semid);
}

void			semaphore_release(int semid)
{
  struct sembuf	op;

  fprintf(stderr,"Unlocking... semval: %i semid: %i\n",semctl (semid,0,GETVAL),semid);
  op.sem_num = 0;
  op.sem_op = 1;
  op.sem_flg = SEM_UNDO;
  if (semop(semid, &op, 1) == -1)
  {
    int err = errno; // save errno before perror() can overwrite it

    perror("semaphore_release");
    printf("%i\n", err);
    exit(1); // report the failure to the parent
  }
  fprintf(stderr,"Unlocked !!! semval: %i semid: %i\n",semctl (semid,0,GETVAL),semid);
}

int						main(int argc, char *argv[])
{
	int					sfd; // socket file descriptor
	int					csfd; // child sfd, the socket once accepted
	int					shmid; // shared memory id
	int					semid; // semaphore id
	struct sockaddr_in	addr; // Address of the remote host
	pid_t				child;
	pid_t				child_wait;
	int					n_children;
	int					rc; // Return code
	int					i; // For loops

	n_children = 0;
	set_signal_handlers();
	
#ifdef USE_IPC
	shmid = get_shared_memory(argv[0]);
	semid = get_semaphores(argv[0]);
	if ((wdb = attach_shared_memory(shmid)) == (void *) -1)
		exit (1);
	wdb->shmid = shmid;
	wdb->semid = semid;
#endif

#ifndef BIND_AFTER_FORK
	if ((sfd = get_socket(1234)) == -1)
		exit(0);
#endif

	printf ("Waiting for connections...\n");
	while (1)
	{
		if (n_children < 10)
		{
			if ((child = fork()) == 0)
			{
#ifdef BIND_AFTER_FORK
				if ((sfd = get_socket(1234)) == -1)
					exit(0);
#endif
#ifdef USE_SEM
				semaphore_lock(wdb->semid);
#endif
				if ((csfd = accept_socket(sfd, &addr)) != -1)
				{
					close(sfd);
					// handle connection here
					close(csfd);
				}
				else
					perror("Accepting connection");
#ifdef USE_SEM
				semaphore_release(wdb->semid);
#endif
				exit(0);
			}
			else if (child != -1)
				n_children++;
			else
				perror("Forking");
		}
		else
		{
			if ((child_wait = wait (&rc)) != -1)
				n_children--;
		}
	}
	exit(0);
}


--------------030508050500010108040108
Content-Type: text/plain;
 name="fork-ipc-sem.out"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="fork-ipc-sem.out"

shmid: 65536
semval: 1 semid: 65536
Waiting for connections...
Locking... semval: 1 semid: 65536
Locked !!! semval: 0 semid: 65536
Locking... semval: 0 semid: 65536
     13 [main] a 2468 transport_layer_pipes::connect: lost connection to cygserver, error = 2
Locked !!! semval: -1 semid: 65536
     10 [main] a 4120 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      7 [main] a 1092 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4616 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      8 [main] a 4844 transport_layer_pipes::connect: lost connection to cygserver, error = 2
     11 [main] a 4024 transport_layer_pipes::connect: lost connection to cygserver, error = 2
     15 [main] a 4596 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      8 [main] a 4368 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4448 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 3800 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 2212 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5192 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 588 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5876 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4940 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      7 [main] a 2304 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      4 [main] a 6080 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 1488 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4076 transport_layer_pipes::connect: lost connection to cygserver, error = 2
     10 [main] a 2980 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4152 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      6 [main] a 1836 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      6 [main] a 3660 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      7 [main] a 5408 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4720 transport_layer_pipes::connect: lost connection to cygserver, error = 2
     10 [main] a 460 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5444 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 1752 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      4 [main] a 1944 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      8 [main] a 5796 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 2928 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5068 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 1096 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4156 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 3720 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5992 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      9 [main] a 5052 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 3424 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 364 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4360 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4440 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5548 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 3832 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 2756 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 5148 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      9 [main] a 3880 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      5 [main] a 4356 transport_layer_pipes::connect: lost connection to cygserver, error = 2
      8 [main] a 5836 transport_layer_pipes::connect: lost connection to cygserver, error = 2


--------------030508050500010108040108
Content-Type: text/plain; charset=us-ascii

--------------030508050500010108040108--
