From: "Dill, Jens (END-CHI)"
To: cygwin AT cygwin DOT com
Subject: RE: cygheap base mismatch detected
Date: Sat, 18 Feb 2006 21:12:02 -0600
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm

We are finally zeroing in on the problem.

Mark Geisert writes:
> The code at /src/rebase-2.3.1/rebase.c:255 assumes the signature is at offset 0x80
> in the image. This was true in the early Windows days but has long since been
> generalized. The technique nowadays is to obtain the short integer value e_lfanew
> at offset 0x3C in the image, and use that as the offset to check for the signature.

I dumped a couple of DLLs in hex to see whether they supported Mark's
assertion. They do. I was also able to verify (see below) that every DLL
that is linked into my app and was not touched by rebase has something
other than 0x0080 at offset 0x3C of the image.

I also did some experimentation that makes me much more certain that a
repaired "rebase" will fix the problem. The rebase documentation says that
rebase exists to fix problems that arise when one process forks another
that has the same DLLs based at an incompatible address. This appears to be
what is happening to my app. I have verified that if any CygWin shell tries
to launch my app (either by fork/exec or just by exec), or if my app tries
to launch a CygWin shell, we get the identical "cygheap base mismatch"
problem. If the launch is indirect (via a Windows .bat file), there is no
problem.

So, I have a workaround of sorts. I can have my script launch my app by
writing the command line to a .bat file and executing it. Definitely not
something I can use to convince my management to go with CygWin.
(The latest word was that our VP "thinks that there is no enterprise app
out there that uses cygwin so he is skeptical too.")

I finally found where to get the rebase source, and verified that what Mark
noticed in 2.3.1 is still true in 2.4.2-1. I can easily make the obvious
fix and change the is_rebaseable function to read the pe_signature_offset
from position 0x3c in the image rather than assuming it is 0x80. But that
only affects the part of the code that decides whether a DLL is rebaseable.
I would need more time and knowledge to convince myself that the code that
actually does the rebasing is not making the same mistake.

It seems that there is indeed more to it. I did make the "obvious" change
and reran rebaseall. The message I got from the first Oracle DLL it
encountered was:

  ReBaseImage (/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orasql9.dll) failed with last error = 6

I can't do more without learning a lot more than I currently know about the
internals of DLLs and of rebase.

But let us assume for the moment that we have found the problem, that
someone can fix "rebase", and that we can use it to keep out of trouble.
How shall I represent this to my management? Can someone tell me how long
it might take for the fix to get into a "stable" CygWin release? My
management may be willing to use an uncertified release for a short while,
and may even be willing to own the responsibility for making the change to
"rebase", but they'll want to know how long they have to wait for it to be
"official."

And if "rebase" solves the problem, I presume we have to run it after we've
installed Oracle and before we run any of our apps. What happens if we or
one of our customers reinstalls Oracle? Do we have to make sure that
rerunning "rebaseall" is part of the drill? The doc for "rebase" says:

  Note it is *strongly* recommended that users only use rebaseall unless
  they *really* know what they are doing or are instructed by one of the
  Cygwin developers.
Not something we want to have to hand off to our customers, or even to our
installation techs if we can avoid it.

-- Jens

Below this line is the code I used to peek at our DLLs, and the results.
-----------------------------------------------------------------------
$ cat dllpeek.c
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Note: this loop runs argc times starting at argv[1], so its last
       pass sees the terminating NULL entry of argv.  That is where the
       final "(null)" line in the results comes from. */
    while (argc--) {
        int rc = 0;
        unsigned char at3C[2];
        char *fname = *++argv;
        FILE *fp = fopen(fname, "rb");

        if (!fp) {
            fprintf(stderr, "%s: could not open file\n", fname);
            continue;
        }
        rc = fseek(fp, 0x3c, SEEK_SET);
        if (rc) {
            fprintf(stderr, "%s: fseek returned %d\n", fname, rc);
            fclose(fp);
            continue;
        }
        rc = fread(at3C, 2, 1, fp);
        if (rc != 1) {
            fprintf(stderr, "%s: could not read at 0x3c\n", fname);
            fclose(fp);
            continue;
        }
        /* The two bytes are stored low byte first; print high byte first. */
        fprintf(stdout, "%s: %02x%02x\n", fname, at3C[1], at3C[0]);
        fclose(fp);
    }
    return 0;
}

$ ~/dllpeek.exe $(cygpath -u $(cygcheck ./acqjob.exe))
./acqjob.exe: 0080
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orasql9.dll: 00e0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oracore9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranls9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraunls9.dll: 00e8
/cygdrive/c/WINDOWS/system32/MSVCRT.dll: 00e8
/cygdrive/c/WINDOWS/system32/KERNEL32.dll: 00e8
/cygdrive/c/WINDOWS/system32/ntdll.dll: 00d0
/cygdrive/c/WINDOWS/system32/WSOCK32.dll: 00d8
/cygdrive/c/WINDOWS/system32/WS2_32.dll: 00e0
/cygdrive/c/WINDOWS/system32/ADVAPI32.dll: 00e0
/cygdrive/c/WINDOWS/system32/RPCRT4.dll: 00e0
/cygdrive/c/WINDOWS/system32/WS2HELP.dll: 00d8
/cygdrive/c/WINDOWS/system32/ole32.dll: 00f0
/cygdrive/c/WINDOWS/system32/GDI32.dll: 00e8
/cygdrive/c/WINDOWS/system32/USER32.dll: 00e0
/cygdrive/c/WINDOWS/system32/WINMM.dll: 00e0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraclient9.dll: 00f8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oravsn9.dll: 00d8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oracommon9.dll: 00f8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orageneric9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraxml9.dll: 0100
/cygdrive/c/WINDOWS/system32/MSVCIRT.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraxsd9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oran9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranl9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranldap9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orannzsbb9.dll: 00f8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraldapclnt9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orancrypt9.dll: 00e0
/cygdrive/c/WINDOWS/system32/OLEAUT32.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/ORATRACE9.dll: 00f8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranro9.dll: 00e0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranhost9.dll: 00e0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranoname9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orancds9.dll: 00d8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orantns9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranms.dll: 00e0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oranmsp.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orapls9.dll: 00f0
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/oraslax9.dll: 00e8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orasnls9.dll: 00d8
/cygdrive/d/oracle/app/oracle/product/9.2.0/bin/orawtc9.dll: 00e8
./cygxerces-c27.dll: 0080
/usr/bin/cygwin1.dll: 0080
./cygicuuc34.dll: 0080
./cygicudt34.dll: 0080
(null): could not open file
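
For anyone who wants to go one step past the peek above, here is a rough
sketch of the lookup Mark describes: read e_lfanew at 0x3C and check for the
"PE\0\0" signature at whatever offset it names, instead of assuming 0x80.
The helper name pe_signature_offset() is my own invention for illustration
and is not taken from the rebase source. The sketch also reads e_lfanew as a
4-byte little-endian value, which is how the PE format defines the field;
dllpeek only looked at the low two bytes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from rebase): return the file offset of the
 * PE signature, or -1 if the file cannot be read or does not look like
 * a PE image.
 */
long pe_signature_offset(const char *path)
{
    unsigned char buf[4];
    unsigned char sig[4];
    unsigned long off;
    long result = -1;
    FILE *fp = fopen(path, "rb");

    if (!fp)
        return -1;

    /* A DOS header starts with the two bytes "MZ". */
    if (fread(buf, 1, 2, fp) != 2 || buf[0] != 'M' || buf[1] != 'Z')
        goto done;

    /* e_lfanew: 32-bit little-endian value stored at offset 0x3C. */
    if (fseek(fp, 0x3c, SEEK_SET) != 0 || fread(buf, 1, 4, fp) != 4)
        goto done;
    off = (unsigned long)buf[0] |
          ((unsigned long)buf[1] << 8) |
          ((unsigned long)buf[2] << 16) |
          ((unsigned long)buf[3] << 24);

    /* Reject offsets that would not fit in a long before seeking. */
    if (off > 0x7fffffffUL)
        goto done;

    /* The signature "PE\0\0" should sit at that offset, wherever it is. */
    if (fseek(fp, (long)off, SEEK_SET) != 0 || fread(sig, 1, 4, fp) != 4)
        goto done;
    if (memcmp(sig, "PE\0\0", 4) == 0)
        result = (long)off;

done:
    fclose(fp);
    return result;
}
```

Going by the values printed above, a check along these lines would
presumably accept orasql9.dll at offset 0xe0 rather than rejecting it for
not having the signature at 0x80. It says nothing about whether the code
that does the actual rebasing makes the same assumption.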