Mail Archives: cygwin/2005/04/01/09:36:47
--------------030508050500010108040108
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Corinna Vinschen wrote:
>>So I hope you don't mind that I attached a short test program you can
>>easily compile with gcc to reproduce the bug.
>>
>>
>
>Cool, that's exactly what I was asking for. I was immediately able to
>reproduce the problem, and it turned out that on fork() the socket
>duplication from parent to child process for some reason occupied space
>in the child which in the parent is occupied by the shared memory returned
>by shmat.
>
>Consequently the duplication of the shared memory couldn't occupy the
>same address as in the parent. That's a fatal error, so the forked child
>terminated itself with error 487, which basically means "Invalid address".
>
>I've changed fork() so that the shared memory is duplicated before sockets
>are duplicated, which is ok because sockets don't have special requirements
>for memory addresses. That works fine for me, but it would be good if you
>could test the next snapshot, which I just uploaded, nevertheless.
>
>It's just incredible that nobody found this problem before.
>
>
Yes, I find this incredible too, as any Unix server that uses IPC (instead of
threads, for example) will want to support multiple connections at a
time, and so will use these mechanisms.
I doubt that we're the only ones using shared memory, sockets and
multiple processes!
Anyway, BIG THANKS for resolving the problem so quickly.
I recompiled from the Cygwin CVS and it solved my problem; my master
now runs well.
However, there is still a problem, sorry ;)
This time it is with semaphores (also part of IPC). It's less important for
me, as the master can run without them, but it's better to have them.
So I updated the test case to show what happens.
I added semaphore lock/release functions that I call in the child
process, so each child locks the semaphore before accepting a connection and
releases it when the connection is finished.
For one child it is OK, but when the second child starts, the semaphore lock
operation (semop() with sem_flg=SEM_UNDO and sem_op=-1) makes cygserver
hang!
Then I get "lost connection to cygserver" errors from my process, plus
some "error getting signal_arrived to server(6)" errors from the cygserver process.
So, instead of waiting for the semaphore to be released (semval going back
from 0 to 1), semop() returns even though the semaphore is locked, and the
program continues as if the semaphore were unlocked, although it is still locked.
Moreover, the semaphore value is decremented at each semaphore_lock() call, so
it reaches -1 at the third call, where we want it to be either 0 (locked)
or 1 (unlocked). Everything then stops, since cygserver has hung: there is no
further output from the remaining children (I set 10 children in the example).
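The expectation that the value never drops below 0 can be checked in isolation. The following is only a minimal sketch, not part of the attached test case: it assumes a private semaphore set (IPC_PRIVATE) and uses IPC_NOWAIT so that a second lock attempt fails with EAGAIN instead of blocking. The names try_lock_twice and demo_semun are made up for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical stand-in for union semun, which some systems leave to the caller. */
union demo_semun
{
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Try to lock a fresh binary semaphore twice without blocking.
 * Stores the errno of the second attempt and the final semaphore value.
 * Returns 0 if the first lock succeeded and the second one failed, -1 otherwise.
 */
int try_lock_twice(int *second_errno, int *final_semval)
{
    union demo_semun arg;
    struct sembuf op;
    int semid, first, second;

    if ((semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600)) == -1)
        return -1;

    arg.val = 1;                    /* 1 = unlocked */
    if (semctl(semid, 0, SETVAL, arg) == -1)
    {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    op.sem_num = 0;
    op.sem_op = -1;                 /* lock */
    op.sem_flg = IPC_NOWAIT;        /* fail instead of waiting */

    first = semop(semid, &op, 1);   /* semval 1 -> 0 */
    errno = 0;
    second = semop(semid, &op, 1);  /* semval is 0, so this must not succeed */
    *second_errno = errno;
    *final_semval = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);     /* remove the semaphore set again */
    return (first == 0 && second == -1) ? 0 : -1;
}
```

Called as try_lock_twice(&err, &val) from a small main(), a conforming implementation should return 0 and leave err == EAGAIN and val == 0; a value of -1, as in the attached output, should not be possible.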
Under OS X, for example, you see the first child lock the semaphore,
then all the other children wait for it to be released (semop() blocks until
the release), and the semaphore value is 1, then 0.
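The blocking behaviour described for OS X can be sketched as follows. This is an illustration under assumptions, not the attached test case: the parent holds the lock, a single child calls semop() with sem_op=-1 and SEM_UNDO, and the parent polls semctl(GETNCNT) to see whether the child is really waiting before it releases. The names sem_change, child_waits_for_release and demo_semun are hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for union semun, which some systems leave to the caller. */
union demo_semun
{
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Add delta (-1 = lock, +1 = release) to semaphore 0, blocking if necessary. */
static int sem_change(int semid, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = SEM_UNDO;
    return semop(semid, &op, 1);
}

/*
 * Returns 1 if the child was seen waiting while the parent held the lock
 * and then acquired it after the release, 0 if not, -1 on a setup error.
 */
int child_waits_for_release(void)
{
    union demo_semun arg;
    struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int semid, status = 0, waited = 0, i;
    pid_t pid;

    if ((semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600)) == -1)
        return -1;
    arg.val = 1;                                     /* 1 = unlocked */
    if (semctl(semid, 0, SETVAL, arg) == -1 || sem_change(semid, -1) == -1)
    {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    if ((pid = fork()) == 0)
    {
        if (sem_change(semid, -1) == -1)             /* should block here */
            _exit(2);
        _exit(semctl(semid, 0, GETVAL) == 0 ? 0 : 1); /* locked means 0 */
    }
    if (pid == -1)
    {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* Poll for up to about 2 seconds until one process waits on the semaphore. */
    for (i = 0; i < 200 && !waited; i++)
    {
        if (semctl(semid, 0, GETNCNT) == 1)
            waited = 1;
        else
            nanosleep(&tick, NULL);
    }

    sem_change(semid, 1);                            /* release: child may proceed */
    waitpid(pid, &status, 0);
    semctl(semid, 0, IPC_RMID);
    return waited && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

On a system with working semaphores, calling child_waits_for_release() from main() should return 1. With the behaviour reported here, the child would presumably not be seen waiting, or it would observe a semaphore value other than 0.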
I hope this helps,
and thank you again for your fix.
Vincent
PS: the same conditions as before apply to this test (Windows
version; the Cygwin DLL contains your fix_shm_after_fork update).
------------------------
--------------030508050500010108040108
Content-Type: text/plain;
name="fork-ipc-sem.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="fork-ipc-sem.c"
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#define USE_IPC
#define USE_SEM
//#define BIND_AFTER_FORK
#define BUFFERLEN 256
struct database
{
    int shmid;
    int semid;
    int test1;
    int test2;
}
*wdb;
int get_shared_memory(char *path_key)
{
    key_t key;
    int shmid;
    int shmflg;
    char file[BUFFERLEN];

    snprintf(file, BUFFERLEN-1, "%s.exe", path_key);
    if ((key = ftok(file, 'Z')) == -1)
    {
        perror("Getting key for shared memory");
        exit(1);
    }
    shmflg = IPC_CREAT|0600;
    if ((shmid = shmget(key, sizeof(struct database), shmflg)) == -1)
    {
        perror("Getting shared memory");
        exit(1);
    }
    fprintf(stderr, "shmid: %i\n", shmid);
    return (shmid);
}
int get_semaphores(char *path_key)
{
    key_t key;
    int semid;
    struct sembuf op;
    int semflg;
    char file[BUFFERLEN];

    snprintf(file, BUFFERLEN-1, "%s.exe", path_key);
    if ((key = ftok(file, 'Z')) == -1)
    {
        perror("Getting key for semaphores");
        exit(1);
    }
    semflg = IPC_CREAT|0600;
    if ((semid = semget(key, 1, semflg)) == -1)
    {
        perror("Getting semaphores");
        exit(1);
    }
    if (semctl(semid, 0, SETVAL, 1) == -1)
    {
        perror("semctl SETVAL -> 1");
        exit(1);
    }
    if (semctl(semid, 0, GETVAL) == 0)
    {
        op.sem_num = 0;
        op.sem_op = 1;
        op.sem_flg = 0;
        if (semop(semid, &op, 1) == -1)
        {
            perror("semaphore_release");
            exit(1);
        }
    }
    fprintf(stderr, "semval: %i semid: %i\n", semctl(semid, 0, GETVAL), semid);
    return (semid);
}
void *attach_shared_memory(int shmid)
{
    void *rv; // return value

    if ((rv = shmat(shmid, 0, 0)) == (void *) -1)
    {
        perror("shmat");
        return ((void *) -1);
    }
    return (rv);
}
int detach_shared_memory(void *shmaddr)
{
    int rv; // return value

    if ((rv = shmdt(shmaddr)) == -1)
    {
        perror("shmdt");
        return (-1);
    }
    return (rv);
}
void set_signal_handlers(void)
{
    struct sigaction ignore;

    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;
    sigaction(SIGHUP, &ignore, NULL); // So we keep running as a daemon
}
int get_socket(short port)
{
    int sfd; // socket file descriptor
    struct sockaddr_in addr;
    int opt;

    opt = 1;
    sfd = socket(PF_INET, SOCK_STREAM, 0);
    if (sfd == -1)
    {
        perror("socket");
        exit(1);
    }
    else
    {
        if (setsockopt(sfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) == -1)
            perror("setsockopt");
        memset(&addr, 0, sizeof(addr)); // clear unused fields such as sin_zero
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        if (bind(sfd, (struct sockaddr *) &addr, sizeof(addr)) == -1)
        {
            perror("bind");
            sfd = -1;
        }
        else
        {
            listen(sfd, 5);
        }
    }
    return (sfd);
}
int accept_socket(int sfd, struct sockaddr_in *addr)
{
    int fd;
    socklen_t len = sizeof(struct sockaddr_in); // accept() expects a socklen_t *

    if ((fd = accept(sfd, (struct sockaddr *) addr, &len)) == -1)
    {
        perror("Accepting connection");
        exit(1);
    }
    return (fd);
}
void semaphore_lock(int semid)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = -1;
    op.sem_flg = SEM_UNDO;
    fprintf(stderr, "Locking... semval: %i semid: %i\n", semctl(semid, 0, GETVAL), semid);
    if (semop(semid, &op, 1) == -1)
    {
        perror("semaphore_lock");
        printf("%i\n", errno);
        exit(0);
    }
    fprintf(stderr, "Locked !!! semval: %i semid: %i\n", semctl(semid, 0, GETVAL), semid);
}
void semaphore_release(int semid)
{
    struct sembuf op;

    fprintf(stderr, "Unlocking... semval: %i semid: %i\n", semctl(semid, 0, GETVAL), semid);
    op.sem_num = 0;
    op.sem_op = 1;
    op.sem_flg = SEM_UNDO;
    if (semop(semid, &op, 1) == -1)
    {
        perror("semaphore_release");
        printf("%i\n", errno);
        exit(0);
    }
    fprintf(stderr, "Unlocked !!! semval: %i semid: %i\n", semctl(semid, 0, GETVAL), semid);
}
int main(int argc, char *argv[])
{
    int sfd; // socket file descriptor
    int csfd; // child sfd, the socket once accepted
    int shmid; // shared memory id
    int semid; // semaphore id
    struct sockaddr_in addr; // Address of the remote host
    pid_t child;
    pid_t child_wait;
    int n_children;
    int rc; // Return code
    int i; // For loops

    n_children = 0;
    set_signal_handlers();
#ifdef USE_IPC
    shmid = get_shared_memory(argv[0]);
    semid = get_semaphores(argv[0]);
    if ((wdb = attach_shared_memory(shmid)) == (void *) -1)
        exit(1);
    wdb->shmid = shmid;
    wdb->semid = semid;
#endif
#ifndef BIND_AFTER_FORK
    if ((sfd = get_socket(1234)) == -1)
        exit(0);
#endif
    printf("Waiting for connections...\n");
    while (1)
    {
        if (n_children < 10)
        {
            if ((child = fork()) == 0)
            {
#ifdef BIND_AFTER_FORK
                if ((sfd = get_socket(1234)) == -1)
                    exit(0);
#endif
#ifdef USE_SEM
                semaphore_lock(wdb->semid);
#endif
                if ((csfd = accept_socket(sfd, &addr)) != -1)
                {
                    close(sfd);
                    // handle connection here
                    close(csfd);
                }
                else
                    perror("Accepting connection");
#ifdef USE_SEM
                semaphore_release(wdb->semid);
#endif
                exit(0);
            }
            else if (child != -1)
                n_children++;
            else
                perror("Forking");
        }
        else
        {
            // All 10 children are running: wait for one to exit
            if ((child_wait = wait(&rc)) != -1)
                n_children--;
        }
    }
    exit(0);
}
--------------030508050500010108040108
Content-Type: text/plain;
name="fork-ipc-sem.out"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="fork-ipc-sem.out"
shmid: 65536
semval: 1 semid: 65536
Waiting for connections...
Locking... semval: 1 semid: 65536
Locked !!! semval: 0 semid: 65536
Locking... semval: 0 semid: 65536
13 [main] a 2468 transport_layer_pipes::connect: lost connection to cygserver, error = 2
Locked !!! semval: -1 semid: 65536
10 [main] a 4120 transport_layer_pipes::connect: lost connection to cygserver, error = 2
7 [main] a 1092 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4616 transport_layer_pipes::connect: lost connection to cygserver, error = 2
8 [main] a 4844 transport_layer_pipes::connect: lost connection to cygserver, error = 2
11 [main] a 4024 transport_layer_pipes::connect: lost connection to cygserver, error = 2
15 [main] a 4596 transport_layer_pipes::connect: lost connection to cygserver, error = 2
8 [main] a 4368 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4448 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 3800 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 2212 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5192 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 588 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5876 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4940 transport_layer_pipes::connect: lost connection to cygserver, error = 2
7 [main] a 2304 transport_layer_pipes::connect: lost connection to cygserver, error = 2
4 [main] a 6080 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 1488 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4076 transport_layer_pipes::connect: lost connection to cygserver, error = 2
10 [main] a 2980 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4152 transport_layer_pipes::connect: lost connection to cygserver, error = 2
6 [main] a 1836 transport_layer_pipes::connect: lost connection to cygserver, error = 2
6 [main] a 3660 transport_layer_pipes::connect: lost connection to cygserver, error = 2
7 [main] a 5408 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4720 transport_layer_pipes::connect: lost connection to cygserver, error = 2
10 [main] a 460 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5444 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 1752 transport_layer_pipes::connect: lost connection to cygserver, error = 2
4 [main] a 1944 transport_layer_pipes::connect: lost connection to cygserver, error = 2
8 [main] a 5796 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 2928 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5068 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 1096 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4156 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 3720 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5992 transport_layer_pipes::connect: lost connection to cygserver, error = 2
9 [main] a 5052 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 3424 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 364 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4360 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4440 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5548 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 3832 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 2756 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 5148 transport_layer_pipes::connect: lost connection to cygserver, error = 2
9 [main] a 3880 transport_layer_pipes::connect: lost connection to cygserver, error = 2
5 [main] a 4356 transport_layer_pipes::connect: lost connection to cygserver, error = 2
8 [main] a 5836 transport_layer_pipes::connect: lost connection to cygserver, error = 2
--------------030508050500010108040108
Content-Type: text/plain; charset=us-ascii
--------------030508050500010108040108--