Mail Archives: cygwin/2004/01/07/12:54:33

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
From: "Dave Korn" <dk AT artimi DOT com>
To: <cygwin AT cygwin DOT com>
Subject: Oh dear, pthreads and stdio still not mt-safe :-(
Date: Wed, 7 Jan 2004 17:47:31 -0000
MIME-Version: 1.0
Message-ID: <NUTMEGOngNqOwUx9Etk000000db@NUTMEG.CAM.ARTIMI.COM>
X-OriginalArrivalTime: 07 Jan 2004 17:47:32.0031 (UTC) FILETIME=[5368CCF0:01C3D546]

------=_NextPart_000_0000_01C3D546.5355BA20
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit


    Hi everyone (and Arash in particular!),

  Re: my earlier message [at
http://sources.redhat.com/ml/cygwin/2004-01/msg00072.html ]

  Well, I thought the latest snapshot had solved my problem with stdio
getting messed up by threads, but there's still a bug in there somewhere.

  Recall my original testcase: main spawns two threads, each of which does
nothing apart from spit out single chars to stdout at regular intervals. The
foreground then waits for a crlf from stdin (using scanf) and sets a global
flag that causes the threads to exit.

  Well, with the latest cygwin dll snapshot, that testcase works fine - or
in any case, it stopped giving SEGVs.  But it turns out to be an accident of
the fixed and deterministic timing relationships between the output from the
various threads.  When I modified the testcase ever so slightly, to randomly
vary the delay between spitting out chars in the spawned threads, it breaks
again.  It doesn't crash, but the output gets mangled and corrupted.

  Here's an example: mttest1.exe is my original testcase, with intervals of
313ms and 257ms in the threads that print 'W' and 'R' respectively; in
mttest2.exe, both threads output chars at random intervals between 250ms and
506ms.  Now watch them run:

---snip---
dk AT mace /test/mt-test/test1> ./mttest1
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
RWRWRWRWRRW
****AFTER SCANF
RLeaving thread func2!
WLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest1
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
RWRWRWRWRRWRWRWRWRWRRWRWRWRWRRWRWRWRWRWRRWRWRWRWRWR
****AFTER SCANF
RLeaving thread func2!
WLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---

  OK, that's the non-random version working fine.  Now here's the one with
random delays.  Note that as the code doesn't seed the prng, the sequence is
the same every time.  Note also that rand_r isn't exported in cygwin.din, so
I had to snarf the code directly from  the cygwin/newlib source... :) I
believe this omission may be accidental, n'est-ce pas?
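
  A side note on the delay arithmetic, as a hedged sketch only: the testcase
computes each delay as 250 + rand_r (...) * 256 / RAND_MAX.  Assuming RAND_MAX
is 0x7fffffff and int is 32 bits, the multiplication by 256 can overflow
before the division happens, so the delays may not stay in the intended
250ms to 506ms range.  Widening before the multiply avoids that.  The function
name random_delay_ms is hypothetical and is not part of the attached testcase.

```c
#include <stdlib.h>

/* Map a rand_r()-style value r in [0, RAND_MAX] to a delay in ms.
   The multiply is done in long long so that r * 256 cannot overflow int. */
unsigned int random_delay_ms (int r)
{
    return 250u + (unsigned int) (((long long) r * 256) / RAND_MAX);
}
```

  In the thread functions this would be called as
thread_delay (random_delay_ms (rand_r (&randseed1))).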

---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....WRWRWRWRRRRWRWWRWRRW...
****AFTER SCANFs tf2...
RLeaving thread func2!WLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
WRWRWRWRRRRWRWWR
****AFTER SCANF
WRLeaving thread func2!
Leaving thread func2!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
WRWRWRWRRRRWRWWR
****AFTER SCANF
WRLeaving thread func2!
Leaving thread func2!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
WRWRWRWRRRRWRWWRWR
****AFTER SCANF
RLeaving thread func2!
WLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
WRWRWRWRRRRWRWWR
****AFTER SCANF
WRLeaving thread func2!
Leaving thread func2!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---
dk AT mace /test/mt-test/test1> ./mttest2.exe
Press return/enter to terminate.....Thread #1 enters tf1...
Thread #2 enters tf2...
WRWRWRWR
****AFTER SCANF
RLeaving thread func2!
LLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
---snip---

  As you can see, during the first run it outputs a spurious cursor-up
character and starts overwriting earlier output.  In the second run the
"Leaving thread func1" message gets dropped and the "Leaving thread func2"
message is duplicated - or perhaps just the '1' gets overwritten by a
spurious '2'.  In the third and fifth runs the same problem occurs as in the
second run; it's completely reproducible by pressing CR just as the WR
characters line up under the word 'enters' in the "Thread #2 enters..."
message.  Only the fourth run appears correct.  In the sixth run, there's a
duplicated L at the start of the "Leaving thread func1" message: probably
one of the 'W' chars got turned into an L.
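
  One way to narrow this down, offered only as a sketch: if the corruption
comes from the threads racing inside stdio's own locking, then serializing
every write behind an explicit pthread mutex should make it go away.  The
helper name locked_emit and the lock out_lock are made up for illustration
and are not part of the attached testcase.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical lock shared by every thread that writes output. */
static pthread_mutex_t out_lock = PTHREAD_MUTEX_INITIALIZER;

/* Write s to fp and flush it while holding out_lock.
   Returns 0 on success, -1 if locking, fputs or fflush failed. */
int locked_emit (FILE *fp, const char *s)
{
    int rc = 0;

    assert (fp != NULL && s != NULL);
    if (pthread_mutex_lock (&out_lock) != 0)
        return -1;
    if (fputs (s, fp) == EOF || fflush (fp) == EOF)
        rc = -1;
    pthread_mutex_unlock (&out_lock);
    return rc;
}
```

  Each fprintf/fflush pair in the testcase would become a call such as
locked_emit (stdout, "W").  If the output is still mangled with that in place,
the problem is presumably below stdio's buffering (for instance in the console
write path) rather than in the stdio locking itself.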

  And oh dear even more.  I've just seen my original testcase fail, so I
guess even that one wasn't completely fixed after all.  Notice the spurious
cursor-up again:

---snip---
dk AT mace /test/mt-test/test1> ./mttest1.exe
Press return/enter to terminate.....RWread #1 enters tf1...
****AFTER SCANFs tf2...
RLeaving thread func2!
WLeaving thread func1!
exit ok
exit 2 ok
thread exticodes $     0x1 $     0x2
dk AT mace /test/mt-test/test1> ./mttest1.exe
Press return/enter to terminate.....RWRWRWR#1 enters tf1...
****AFTER SCANFs tf2...
WLeaving thread func1!
exit ok
RLeaving thread func2!
exit 2 ok
thread exticodes $     0x1 $     0x2
dk AT mace /test/mt-test/test1>
---snip---

  This post is long enough as it is, so I've omitted my cygcheck output for
the moment, since it's the same as the one in my last post (url at top of
this post), apart from the fact I'm now using the snapshot
cygwin1-20040103.dll.

  Anyway, that all seems to show there's definitely still a bug in there.
I'm pretty sure it's not down to me having accidentally used any non-mt-safe
functions or otherwise not obeyed the C and POSIX specs, and I hope the
reproducibility of these results might help whoever's working in that area
track down the problem, but I'm pretty stumped myself.  Any clues / hints /
suggestions gratefully received.

   cheers,
     DaveK



------=_NextPart_000_0000_01C3D546.5355BA20
Content-Type: application/octet-stream;
	name="makefile"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="makefile"



# Configurable flags for compilation
CFLAGS?=-O0 -g -D_MT -D_REENTRANT
LFLAGS?=-lm -lpthread
ALLFLAGS?=-Wall


all: mttest1.exe mttest2.exe

mttest1.exe: mttest1.c
	gcc $(CFLAGS) $(LFLAGS) $(ALLFLAGS) mttest1.c -o mttest1.exe

mttest2.exe: mttest1.c
	gcc -DRANDOM $(CFLAGS) $(LFLAGS) $(ALLFLAGS) mttest1.c -o mttest2.exe



------=_NextPart_000_0000_01C3D546.5355BA20
Content-Type: text/plain;
	name="mttest1.c"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="mttest1.c"


#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/time.h>

static volatile int thread_exit = 0;

int thread_delay (unsigned int time_ms)
{
    usleep (time_ms * 1000);
    return 0;
}

int rand_r (unsigned int *seed)
{
        long k;
        long s = (long)(*seed);
        if (s == 0)
          s = 0x12345987;
        k = s / 127773;
        s = 16807 * (s - k * 127773) - 2836 * k;
        if (s < 0)
          s += 2147483647;
        (*seed) = (unsigned int)s;
        return (int)(s & RAND_MAX);
}

void * thread_func1 (void *arg)
{
int n;
#ifdef RANDOM
unsigned int randseed1 = 0xf00dface;
#endif

    n = (int)arg;
    fprintf (stdout, "Thread #%d enters tf1...\n", n);
    fflush (stdout);
    while (1)
    {
#ifdef RANDOM
        thread_delay (250 + (rand_r (&randseed1) * 256 / RAND_MAX));
#else
        thread_delay (313);
#endif
        fprintf (stdout, "W");
        fflush (stdout);
        if (thread_exit)
            break;
    }
    fprintf (stdout, "Leaving thread func1!\n");
    fflush (stdout);
    return (void *) 1UL;
}

void * thread_func2 (void *arg)
{
int n;
#ifdef RANDOM
unsigned int randseed2 = 0xf00dface;
#endif

    n = (int)arg;
    fprintf (stdout, "Thread #%d enters tf2...\n", n);
    while (1)
    {
#ifdef RANDOM
        thread_delay (250 + (rand_r (&randseed2) * 256 / RAND_MAX));
#else
        thread_delay (257);
#endif
        fprintf (stdout, "R");
        fflush (stdout);
        if (thread_exit)
            break;
    }
    fprintf (stdout, "Leaving thread func2!\n");
    fflush (stdout);
    return (void *) 2UL;
}


int main (int argc, const char **argv)
{
int rv;
pthread_t thr1, thr2;
void *tr1, *tr2;

    // spawn two threads.....
    // pthread_create returns the error number directly; it does not set errno.
    rv = pthread_create (&thr1, NULL, thread_func1, (void *)1UL);
    if (rv)
        fprintf (stderr, "err %d create thr1\n", rv);

    rv = pthread_create (&thr2, NULL, thread_func2, (void *)2UL);
    if (rv)
        fprintf (stderr, "err %d create thr2\n", rv);

    fflush (stderr);

    // Only actually run if both threads started ok!
    if (!rv) while (1)
    {
        // Control thread: wait for user input and execute it
        char dummy[8];
        fprintf (stdout, "Press return/enter to terminate.....");
        fflush (stdout);
//        fflush (stdin);
//        scanf ("%*[^\r\n]%1[\r\n]", &dummy[0]);
        scanf ("%1c", &dummy[0]);
        fprintf (stderr, "****AFTER SCANF\n");
        // never loop... this is just a test after all....
        break;
    }

    thread_exit = 1;
    // pthread_join also returns the error number rather than setting errno.
    rv = pthread_join (thr1, &tr1);
    if (rv)
        fprintf (stderr, "pthr join errno1 %d\n", rv);
    else
        fprintf (stderr, "exit ok\n");

    rv = pthread_join (thr2, &tr2);
    if (rv)
        fprintf (stderr, "pthr join errno2 %d\n", rv);
    else
        fprintf (stderr, "exit 2 ok\n");

    fprintf (stderr, "thread exticodes $%8p $%8p\n", tr1, tr2);
    return 0;

}




------=_NextPart_000_0000_01C3D546.5355BA20
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
------=_NextPart_000_0000_01C3D546.5355BA20--
