From: "Karman, Geert G OGNL-OGUF"
To: "'cygwin AT cygwin DOT com'"
Cc: "'ffc AT hetnet DOT nl'"
Subject: failing malloc in multithreaded program using cygwin 1.3.2
Date: Mon, 27 Aug 2001 10:42:33 +0200

Hello,

I'm trying to port an application that I built on Linux to Windows 95. It consists of a GUI displaying data that is produced by a technical simulation program. The main thread does the GUI processing; a second thread runs the simulation program as a background process. Once in a while, the simulation program produces data that I keep in a list, so that I can scan through it and display some of it in graphs. When new data has been added to the list, I signal an event to the main thread and the GUI then updates the displays. I tried to make an extract with the important ingredients in the following lines.

// ------------------------------------------------------------
// Event to communicate that new data was made by the process
Event new_data;

// Mutex to protect data
Mutex data_mutex;

// GUI class
class GUI
{
public:
  //...
  void update()
  {
    // this function sets the data and text labels
    // to be displayed
    data_mutex.Acquire();
    // scan data and extract some numbers
    // ...
    // get id number of the data
    int id = data.get_id();
    data_mutex.Release();

    // some text processing, for example
    std::ostrstream text;
    text << " data id: " << id << endl;    // SIGSEGV typically occurs in
    std::string temp_s = text.str();       // one of these statements
    const char * temp_c = temp_s.c_str();
    // ....
    strncpy(label_, temp_c, 100);
    // some more processing ...
  }
  // .....
private:
  char label_[100];
};

int main()
{
  // making and showing the gui
  GUI gui;
  gui.show();

  // making and starting the background process
  // produces new data, typically every 1/2 sec
  Process proc;
  proc.Start();

  while(proc.is_running())
  {
    if (new_data.Test())
    {
      // if new data were produced: update gui
      gui.update();
      gui.redisplay();
    }
    // wait a little while before trying to update
    microsec_sleep(200);
  }
}
// --------------------------------------------------------------------

The simulation program is actually some old C code (and even some FORTRAN routines) wrapped in a C++ class. The C is compiled with gcc, the FORTRAN with g77, and the rest is compiled and the whole thing linked with g++. The C code uses printf(), which I know is not thread safe, but as there is only one thread executing the C code, I don't expect any problems there, at least not of the kind that I now experience.

This program runs fine on Linux for more than a day (although there is some memory leak which I still have to find), but crashes on Windows 95 with a segmentation fault after about five minutes. For a single build, the point in the source code where this happens is more or less the same, but the point in time (the number of times that the GUI was updated) can vary. The SIGSEGV mostly occurs in the main thread, typically in an operation involving std::string. I have the feeling that it happens when allocating memory for strings, though I wasn't always able to trace it. It can also occur in the secondary thread (much less likely, one time out of ten I'd say), when allocating memory for the newly produced data. When I change the code, for example to avoid string processing, the error occurs somewhere else where memory is involved, like when creating a temporary std::vector.

In some older e-mails (1998), I read that linking with mmalloc might solve the problem.
I gave it a shot but got the following message from gdb:

-------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x004c2776 in mmalloc () at gui_process_port.cxx:78
Current language: auto; currently c++
(gdb) bt
#0  0x004c2776 in mmalloc () at gui_process_port.cxx:78
#1  0x004c29e8 in malloc () at gui_process_port.cxx:78
#2  0x004d0876 in __builtin_new (sz=100)
#3  0x004d0b13 in __builtin_vec_new (sz=100)
#4  0x004c573e in default_alloc () at gui_process_port.cxx:78
#5  0x004c94bd in _IO_str_overflow () at gui_process_port.cxx:78
#6  0x004c60f2 in strstreambuf::overflow () at gui_process_port.cxx:78
#7  0x004c86c1 in __overflow () at gui_process_port.cxx:78
#8  0x004c8ab2 in _IO_default_xsputn () at gui_process_port.cxx:78
#9  0x004c6eaa in streambuf::xsputn () at gui_process_port.cxx:78
#10 0x004c4b44 in ostream::write () at gui_process_port.cxx:78
#11 0x004e4d92 in ostream & operator<<, __default_alloc_template > (o=@0xcbf7b4, s=@0xcbf71c) at /usr/include/g++-3/std/bastring.cc:470
#12 0x0041c717 in TR_Axis::set_label (this=0x1a96cc8) at TR_Axis.cxx:204
#13 0x0041b011 in Graph_Plot_Area_Window::display (this=0x1a98f00, curve_list=@0x1a8d16c) at Graph_Plot_Area.cxx:232
#14 0x0040b7b2 in Graph_Window::display (this=0x1a8d280, curve_list=@0x1a8d16c) at Graph_Window.cxx:137
#15 0x00429b08 in TP_Curve_Manager::update_running_curves (this=0x1a8d100, running_case=0x15fae00) at TP_Curve_Manager.cxx:154
#16 0x004fbe33 in Graph_Window::update_running_curves (this=0x1a8d280, running_case=0x15fae00) at Graph_Window.h:45
#17 0x00409c0c in Process_Plot::update_running_case (this=0xcbfbcc, running_case=0x15fae00) at Process_Plot.cxx:146
#18 0x00421eba in TR_Running_Case::time_out (v=0x15fae00) at TR_Case.cxx:581
#19 0x0043800d in Fl::wait () at gui_process_port.cxx:78
#20 0x004380e8 in Fl::run () at gui_process_port.cxx:78
#21 0x004065d9 in main () at trfplot.cxx:10
#22 0x61003aea in _libwsock32_a_iname ()
#23 0x61003cbd in _libwsock32_a_iname ()
#24 0x61003cfc in _libwsock32_a_iname ()
#25 0x004cea93 in cygwin_crt0 () at gui_process_port.cxx:78
----------------------------------------------

There are still some tests that I could do before I give up (like using a pseudo process, and/or going back to single-thread processing), but I wondered whether this sounds familiar to anybody. Am I looking at a bug in my code (why then would operator new fail?) or am I using cygwin in the wrong way?

I recently upgraded from cygwin 1.1 to 1.3.2, but get exactly the same results. Different hardware (one box with 64 MB RAM, another with 256 MB) doesn't make a difference either.

I'm not subscribed to the mailing list, so please copy me in when replying.

Thanks in advance for any help.

Geert Karman
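
The locking pattern used in update() above (copy the id out while holding the mutex, then build the label text outside the lock) can be sketched in standard C++ as follows. This is only a minimal sketch under stated assumptions, not the original code: SharedData and format_label are hypothetical names, std::mutex stands in for the Mutex class of the extract, and snprintf replaces the ostrstream / std::string path so that this particular step performs no heap allocation. It does not address an allocator that is not thread safe; as described above, removing the string processing only moved the crash to another allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical stand-in for the shared data list of the extract.
struct SharedData {
    std::mutex mutex;  // plays the role of data_mutex
    int id = 0;        // plays the role of data.get_id()
};

// Copy the id out under the lock, then format it without holding the lock.
// Writes at most `size` bytes into `label`, including the terminating NUL,
// and allocates no heap memory.
inline void format_label(SharedData& data, char* label, std::size_t size) {
    int id;
    {
        std::lock_guard<std::mutex> lock(data.mutex);  // released at end of scope
        id = data.id;
    }
    std::snprintf(label, size, " data id: %d", id);
}
```

A caller would pass its fixed-size buffer, for example format_label(data, label_, sizeof label_). Unlike strncpy, snprintf always terminates the result when the size is non-zero, so a label that is too long gets truncated rather than left without a terminating NUL.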