From: Frank Seesink
To: cygwin AT cygwin DOT com
Subject: Problems running Jabberd v1.4.3 under Cygwin v1.5.7 (or latest snapshot), and heap allocation error caused by fork()
Date: Thu, 11 Mar 2004 19:03:13 -0500

QUESTION:

Is there an issue in Cygwin 1.5.7 (and still in the 20040306 snapshot) that might cause a program which was working back in November 2003 to suddenly stop doing so, specifically in its communication with a secondary executable which it forks/launches to do DNS resolution?

Second, is there a definitive guide to what it truly means when an application kicks up a Win32 error 487 "*** couldn't allocate cygwin heap..." type message?

[I am not looking for answers like "try the latest cygwin1.dll" or other "throw the wrench in the engine block and see if the noise stops" sort of solutions. I am looking for a deeper understanding of the error message. I like to 'know' what a program is doing to cause such an error, if that makes sense.]

DETAILS:

____________________________________________________________
ISSUE #1: APPLICATION NO LONGER WORKING UNDER LATEST CYGWIN

Please note I last wrote to this list regarding the same application (Jabberd v1.4.3) on 4 Dec 2003:

    http://article.gmane.org/gmane.os.cygwin/41362

for those looking for more background. That thread concerned an apparent issue (since fixed) where cygrunsrv did not send the proper TERMSIG to an entire process group.
(Kudos again to Brian Ford and Corinna Vinschen for resolving that. Cygrunsrv has been a champ ever since!)

Now I seem to be dealing with a new issue. Since the release of Jabberd v1.4.3 in Nov 2003, I have had it running under Cygwin without issue (with the exception of the above). However, starting around the beginning of February, Jabberd can apparently no longer do DNS resolution. Note all other features still work (basically any communications between a Jabber client program and the server itself). Only when attempts are made which involve an external server (like doing server-2-server communication) is there an issue.

And the only thing I can think that may have caused it--after talking to our Systems folks to verify no changes were made in our DNS servers or their configurations, or anything else which might have "broken" this simple server app--was an update to Cygwin to the latest and greatest at the time (v1.5.7, released on 31 Jan 2004).

I tried to roll back to v1.5.5 using setup.exe, but that met with some not-so-pleasant messages as I simply fired up BASH. [I think I need to roll back more than just the cygwin package, but I am not sure what all needs to be reverted. There is no dependency checking in setup.exe that I can tell.] I also tried the latest Cygwin snapshot (whole install, not just DLL), but alas that hasn't fixed things either.

To grasp where I think the issue lies, a quick background on Jabberd. Without delving into boring details, Jabberd v1.4.x is an open-source Jabber/XMPP server, details here:

    http://jabberd.jabberstudio.org/1.4

[For those inclined to ask, I am not working on/with Jabberd2 at this point due to the fact it is a complete rewrite and has its own set of issues under Cygwin, and 1.4.x is still in high use.]
The basic architecture of Jabberd 1.4.x is a main process which fires up, then loads dynamic library modules to handle various subtasks:

    jabberd.exe
      +---dialback.dll
      +---dnsrv.dll <---> jabadns.exe   *NOTE: See below
      +---jsm.dll
      +---pthsock.dll
      +---xdb_file.dll

[Substitute .so for .dll if you're under *nix.]

Note in the diagram above the module 'dnsrv.dll'. This is the one piece left in Jabberd which is completely different under Windows/Cygwin than it is under *nix. In *nix, dnsrv.so is built using libresolv and basically spawns new processes via fork() to do resolution. But between various fork() issues in Cygwin in the past and the lack of a libresolv, it appears the original authors (I cannot take credit for the work) chose to rewrite the asynchronous DNS resolver for Cygwin.

The Windows version (dnsrv.dll) basically fires up a second executable called jabadns.exe, which in turn does the DNS resolution via Windows calls and then returns the results. Noting the response when simply typing 'jabadns' at the BASH prompt:

    $ jabadns
    Syntax: jabadns

it appears that the dynamic library dnsrv.dll communicates with jabadns.exe via handles. That's about all I know.

The key thing is that all of this worked just fine for months, then suddenly stopped working around the beginning of Feb 2004. I looked all over, and eventually I noticed that Cygwin v1.5.7 was released on 31 Jan 2004, and that I had upgraded not long after that...right around the time this DNS resolution started failing.

Anyway, apologies for the long-winded explanation, but I am hoping something in here clicks for the core cygwin developers. Please note the Jabber server I am running is the same one I compiled back in November. On my testbed machine, I tried recompiling under the latest Cygwin DLLs and tools, etc., but to no avail. I'm afraid I'm stumped.

Thoughts/ideas?
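For anyone unfamiliar with the *nix-style pattern mentioned above (a parent that forks a child to do a blocking lookup and gets the answer back over a pipe), here is a minimal sketch of the general idea. It is written in Python rather than C purely for brevity, it is not Jabberd's actual dnsrv code, and the function name resolve_in_child is made up for illustration.

```python
import os
import socket


def resolve_in_child(hostname):
    """Fork a child that resolves `hostname`; return the IPv4 address or None.

    Hypothetical helper for illustration only (not part of Jabberd).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()

    if pid == 0:
        # Child: do the blocking lookup and report back through the pipe.
        os.close(read_fd)
        try:
            answer = socket.gethostbyname(hostname)
        except Exception:
            answer = ""  # an empty reply signals "lookup failed"
        try:
            os.write(write_fd, answer.encode("ascii"))
        finally:
            os.close(write_fd)
            os._exit(0)  # leave immediately, skipping the parent's cleanup

    # Parent: read until the child closes its end of the pipe.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie

    answer = b"".join(chunks).decode("ascii")
    return answer or None
```

A real server would not block on the read as this sketch does; it would presumably keep the pipe's read end in its event loop so that lookups stay asynchronous.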
____________________________________________________________
ISSUE #2: MAKING JABBERD BUILD LIKE UNDER *NIX (i.e., 'The Bigger Picture')

With this latest issue, I decided to retackle what I alluded to in my 4 Dec 2003 post; namely, ripping out the above Windows-centric version of the dnsrv module and using the original *nix-based version. This was possible thanks to the inclusion of the 'minires' package in Cygwin, and getting Jabberd to compile under Cygwin as if it were any other *nix was painless.

The only problem is, I am struggling with a perennial Win32 error 487 "couldn't allocate cygwin heap..." message, which I have definitely traced via gdb to a fork() call within the dnsrv code.

Please understand, I do not believe the issue to be the use of fork() itself per se. Jabberd's main process also calls fork(), as well as using GNU Pth 2.0.0 (which in turn calls various spawn() functions from what I remember). But the issue occurs when the one particular fork() call is executed in the dynamic library dnsrv.dll.

This made me wonder whether using fork() in a .DLL was a no-no. However, a quick test program confirmed that this was not the case. I wrote a simple main() which used dlopen() to open a simple library which in turn called fork() and ran different commands in the parent and child, and all worked well.

I Googled for information and found various mailing list threads where folks have had this Win32 error 487 with various software projects. However, the only advice ever given was of the sort "Oh, that version of Cygwin1.dll appears to have issues. Try another." This does not help me track down the root cause, unless in fact cygwin1.dll still has issues in this regard.

I again ran across references to Jason Tishler's rebase tool (and rebaseall script), so using the modified version of rebaseall which I hacked to rebase the Jabberd DLL files, I tried that. But to no avail. No matter what I try, I get this error message.
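To make the dlopen()/fork() test described above a bit more concrete, here is a rough analogue. It is not the original C test program: it is a Python sketch that loads the C library dynamically through ctypes (which uses dlopen() on POSIX systems) and calls fork() through that handle, with the parent and child taking different paths. The function name and exit code are invented for the example, and I have not verified how find_library behaves on Cygwin.

```python
import ctypes
import ctypes.util
import os

CHILD_EXIT_CODE = 42  # arbitrary marker so the parent can recognise its child


def fork_through_loaded_library():
    """Call fork() via a dynamically loaded C library; return the child's exit code.

    Hypothetical helper for illustration only.
    """
    # find_library("c") may return None; PyDLL(None) then falls back to the
    # handle of the running process, which also exposes fork() on typical
    # POSIX systems. PyDLL keeps the interpreter lock held across the call.
    libc = ctypes.PyDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.fork.restype = ctypes.c_int

    pid = libc.fork()
    if pid == 0:
        # Child path: exit at once with the marker code.
        os._exit(CHILD_EXIT_CODE)
    if pid < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # Parent path: wait for the child and report how it exited.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

If the call returns 42, fork() worked when reached through a dynamically loaded library, which matches what the small test showed: the mere fact that fork() lives in a shared library is not what triggers the error.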
I found indications that under Cygwin, gcc defaults to a set heap/stack size, with a default of 1MB if I read things correctly. So I tried passing arguments like

    -Wl,--heap,5000000,--stack,5000000

to the linker via the gcc line in the Makefile, in an attempt to make the default heap/stack size larger. Again, nothing changed.

At this point, I do not know if I am chasing my tail or not. When an application suffers this Win32 error 487 message, is it usually an indication of some glitch in cygwin1.dll, or is it, as the message seems to indicate, either some sort of issue of not enough stack/heap space, or worse, some kind of access violation where the program is attempting to access memory it should not?

Again, any and all guidance, advice, wisdom, pointers, etc., welcome.

Thanks in advance to anyone who read this far...and especially if you are still willing to help me. :-)