Mail Archives: cygwin/2010/04/06/22:44:02

Message-ID: <4BBBF5C2.30102@gmail.com>
Date: Wed, 07 Apr 2010 04:02:26 +0100
From: Dave Korn <dave DOT korn DOT cygwin AT googlemail DOT com>
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: Bruno Haible <bruno AT clisp DOT org>
CC: Eric Blake <eblake AT redhat DOT com>,
Dave Korn <dave DOT korn DOT cygwin AT googlemail DOT com>, cygwin AT cygwin DOT com
Subject: Re: weak symbols on Cygwin
References: <4BBB31C6 DOT 7080703 AT redhat DOT com> <201004070141 DOT 36284 DOT bruno AT clisp DOT org>
In-Reply-To: <201004070141.36284.bruno@clisp.org>

On 07/04/2010 00:41, Bruno Haible wrote:
> Hi Dave,

  Hi Bruno,

> Dave Korn wrote:
>>>> These were all due to the fact that gcc 4.3.x on Cygwin 1.7.2
>>>> accepts "#pragma weak foo", but the symbol foo then evaluates
>>>> to the NULL address, even if foo is defined in libc.
>>> Dave, are weak symbols something that should work on cygwin with new
>>> enough binutils/gcc?  Or is this an indicator of a gcc bug, for silently
>>> accepting #pragma weak foo that it can't support?
>>   Weak symbols work on Cygwin, but the semantics of undefined weak symbols
>> aren't identical to ELF platforms: a weak reference won't pull in an archive
>> member that wouldn't otherwise be linked; the implications in relation to
>> import libraries should be fairly obvious.
> 
> I don't know what semantics is implemented by "#pragma weak" on Cygwin.

  Well, weak symbols on Cygwin are implemented by using the "weak external
symbols" defined in the PE specification.  To save you looking it up: a weak
external is a kind of undefined reference that doesn't cause a link failure if
no definition is supplied at final link time; instead, it resolves to a default
constant value, generally zero.

> On ELF platforms, I use "#pragma weak" in order to detect whether a symbol
> is defined in the libraries which are linked in with the executable
> (including libc). This does not work on Cygwin: this program
> 
> ======================================================
>    #include <stdio.h>
>    extern void gurky (void);
>    #pragma weak fputs
>    #pragma weak gurky
>    int main ()
>    {
>      printf ("fputs %s, gurky %s\n",
>              fputs != NULL ? "present" : "missing",
>              gurky != NULL ? "present" : "missing");
>      return 0;
>    }
> ======================================================
> 
> compiled and run with
>   $ gcc -o foo foo.c -Wall
>   $ ./foo
> 
> prints on glibc systems:
>   fputs present, gurky missing
> 
> but on Cygwin 1.7.2:
>   fputs missing, gurky missing
> 
> With this inability to distinguish present from missing libc symbols,
> "#pragma weak" is useless to me on Cygwin.

  Ah, but what you're relying on there is not merely the weak symbol aspect of
ELF, but another, far more problematic one to emulate on Windows: the fact
that on ELF platforms, you can leave undefined references in an executable, to
be filled in by ld.so at runtime according to what the loader finds present
once it's loaded libc.

  Both fputs and gurky are undefined symbols in the foo executable when you
compile it on Linux.  At runtime, ld.so fills in the value for fputs, which it
finds when it loads libc for other reasons, and it doesn't complain that gurky
is undefined because it's a weak symbol.

  On Windows, this technique can't work, because every undefined reference in
an executable has to be fully resolved at link time.  It's the static linker
that has to decide whether there's anything to resolve a weak symbol against,
and if not, to resolve it to the default value.  There's no run-time process of
scanning all available symbols from all loaded libraries against undefined
references in the executable.

  Here's where the subtle semantic difference between weak symbols on ELF and
PE comes in: PE weak symbols don't cause any archive members to be pulled out
of library files into the final link.  And that interacts with another
difference between PE and ELF: on Windows, all the library functions imported
from libc come in the form of import stubs from the (static) import library.
So weak references won't cause any of those symbols to be pulled in, even if
the library does provide them.  But if something else in the program causes
them to be pulled in, the weak references will resolve to the imported
function.  You can see that by adding a second object to your testcase, with a
strong undefined reference:

> $ cat bar.c
> 
> #include <stdio.h>
> 
>   void bar (const char *x)
>   {
>     fputs (x, stdout);
>   }
> 
> $ gcc -o foo  foo.c   -Wall --save-temps
> 
> $ ./foo
> fputs missing, gurky missing
> 
> $ gcc -o foo  foo.c bar.c  -Wall --save-temps
> 
> $ ./foo
> fputs present, gurky missing
> 
> $ 

  (We currently rely on this behaviour to make the libstdc++ malloc wrappers
work: we use weak references to find out what functions a program defines
overrides for, without pulling in unused functions from libstdc++ in their
place.  It's also used in libgcc, which uses a weak reference to tell if the
EH machinery is linked into the application, and call the frame registration
handlers if present, without causing the whole EH machinery to be pulled into
every link.)
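
  The guarded-call pattern described above can be sketched roughly like this.
It is only an illustration, not the actual libgcc or libstdc++ code: the names
optional_register_frames and call_hook_if_linked are hypothetical, and it
assumes a GCC-compatible compiler that accepts __attribute__((weak)).

```c
#include <stddef.h>

/* Hypothetical optional hook, declared weak: if nothing else in the link
   defines it, its address evaluates to NULL instead of the link failing.
   The weak reference by itself does not pull a definition out of an archive. */
extern void optional_register_frames(void) __attribute__((weak));

/* Calls the hook only if some other part of the link supplied a definition.
   Returns 1 if the hook was called, 0 if it was absent. */
int call_hook_if_linked(void)
{
    if (optional_register_frames != NULL) {
        optional_register_frames();
        return 1;
    }
    return 0;
}
```

  If no object or already-pulled-in archive member defines the hook,
call_hook_if_linked() returns 0 and nothing extra is dragged into the link.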

  So, it's not really the weakness of the symbols - which really just tells
the linker to shut up and not complain if no definition is provided at link
time - that differs here and causes the problem; it's the ELF practice of
leaving undefined references in executables for ld.so to resolve at runtime,
which your testcase relies on to determine the presence or absence of
functions in the library.  Back to your requirements:

> On ELF platforms, I use "#pragma weak" in order to detect whether a symbol
> is defined in the libraries which are linked in with the executable
> (including libc).

  This whole concept isn't going to translate directly to Windows platforms.
There's no interposing or providing missing definitions at runtime; if it's
not already there at final link time, it's not going to be there at runtime.
The only way to determine this sort of thing on Windows is by dlsym()ing
all the open libraries, I think.
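
  A minimal sketch of that dlsym() approach might look like the following.  It
assumes the platform's dlsym() accepts the RTLD_DEFAULT pseudo-handle to search
the modules already loaded into the process (glibc needs _GNU_SOURCE for it,
and older glibc needs -ldl at link time); the helper name symbol_present is
hypothetical.

```c
#define _GNU_SOURCE   /* glibc only exposes RTLD_DEFAULT with this defined */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` can be found in the modules
   already loaded into the process, 0 otherwise.
   Assumption: dlsym() supports RTLD_DEFAULT on the target platform. */
int symbol_present(const char *name)
{
    return dlsym(RTLD_DEFAULT, name) != NULL;
}
```

  The lookup happens at run time and uses names rather than link-time
references, so it does not depend on which import stubs the static linker
happened to pull in.  The testcase could then print symbol_present("fputs")
and symbol_present("gurky") instead of comparing weak addresses against NULL.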

  What's the larger-scale goal you're trying to support by this method?  Are
you after implementing some sort of plug-in architecture or something?

    cheers,
      DaveK


