Mail Archives: djgpp/2003/02/07/23:56:43

Date: Sat, 8 Feb 2003 04:56:37 GMT
From: abc AT anchorageinternet DOT org
Message-Id: <200302080456.h184ubLx009232@en26.groggy.anc.acsalaska.net>
X-Authentication-Warning: en26.groggy.anc.acsalaska.net: abc set sender to abc AT anchorageinternet DOT org using -f
Subject: Re: Report a Bug (#! bash bug)
X-Mailer: Umail v2.9.2
To: DJ Delorie <dj AT delorie DOT com>
To: <djgpp AT delorie DOT com>
Reply-To: djgpp AT delorie DOT com

> The functionality you reference is not handled by bash at all, so
> quoting the bash man page is irrelevant.  The relevant man page
> is the one for execve(), which states:
> 
> DESCRIPTION
>        execve() executes the program pointed to by filename.  filename
>        must be either a binary executable, or a script starting with a
>        line of the form "#! interpreter [arg]".  In the latter case,
>        the interpreter must be a valid pathname for an executable
>        which is not itself a script, which will be invoked as
>        interpreter [arg] filename.
> 
> Note it says [arg] and not [args]

Thank you for your informative reply.  I did take time to study it,
and make no claim to have knowledge in this area ...  But there must
be a reason it works on the several open unix systems and Bourne shells
i've tried, and not in DJGPP.

1.  note that all references you provided, and 20 others i've read, state:
    script starting with a line of the form "#! interpreter [arg]".

    emphasis on "starting" - it does not say that there must not (or
    should not) be anything more on the line before the end of the
    line.  it could've been written to say:

    "script containing a line of the form ..." or
    "script consisting of a line of the form ..." or
    "script with a line of the form ...", etc.

    but it wasn't written that way.  all man pages say "starting" ...
    i don't think the word was used frivolously, i think it was used
    intentionally, for a reason.

2. int execve(const char *path, char *const argv[], char *const envp[]);

    i believe the disagreement is essentially based on the method
    used to fill argv[], which is left ambiguous in the execve()
    man pages, other than to define argv[0] (interpreter basename),
    argv[1] (the interpreter's "[arg]", if any), and
    argv[2] (the original execve'd path, the script; this is
             argv[1] instead if no "[arg]" was given).

    90% of the man pages for execve(2) i've read do not specify a precise
    method of loading the script's options/arguments (most say the only
    "requirement" is that argv[0] contain the basename of the path),
    other than to say that they fill the rest of argv[]:

        execve(2POSIX)

        argv[]  Pointer to a null-terminated array of character pointers to
                null-terminated character strings. These strings construct the
                argument list to be made available to the new process. At
                least one argument must be present in the array; by custom,
                the first element should be the name of the executed program
                (for example, the last component of path).
--
    so it appears that it is left to the programmer to load argv[]
    with additional script arguments in whatever way is most
    intelligent, effective, and desirable.

    one would normally assume the rest of argv[] to be loaded with
    arguments from the command line only:

        #!/bin/sh scriptB               (in scriptA)
        $ scriptA -a -b -c

        argv[0]   argv[1]   argv[2] argv[3] argv[4] argv[5]
        sh        scriptB   scriptA -a      -b      -c

    however, this is not actually dictated or mandated by the
    execve() man pages, and is left wide open in POSIX specs.  it would
    not violate any aspect of execve(2) functionality (or requirement)
    to load argv[] beginning with the arguments given to a script
    (which is itself given to the interpreter as the "[arg]"):

        #!/bin/sh scriptB -xyz          (in scriptA)
        $ scriptA -a -b -c

        argv[0]   argv[1]   argv[2] argv[3] argv[4] argv[5] argv[6]
        sh        scriptB   -xyz    scriptA -a      -b      -c

    and it makes good sense to be able to do so in many situations.
    for example, i use this kind of setup as the 1st line in awk scripts,
    which causes a sh script (with its options) to parse #include's in
    the awk script before running the awk script with an appropriate
    command line.  it would be a real kludge to not have this
    functionality, and fortunately, i don't know of a free
    unix and/or Bourne shell that doesn't have it.
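
    as a rough illustration of the two ways of filling argv[] compared
    above, here is a minimal sh sketch.  build_argv is a hypothetical
    helper, not anything from DJGPP, bash, or a libc.  its "single" mode
    treats everything after the interpreter as one argument (the strict
    reading), and its "split" mode splits the rest of the "#!" line on
    whitespace (the behaviour argued for here).  it follows the tables
    above in using the interpreter's basename as argv[0], and it assumes
    the interpreter path is separated from the rest of the line by spaces.

```shell
#!/bin/sh
# Hypothetical helper for illustration only: prints, one element per line,
# the argv[] an interpreter would receive for a "#!" script.
#   $1 = "single" (rest of the "#!" line is ONE argument)
#        or "split" (rest of the "#!" line is split on whitespace)
#   $2 = the "#!" line, $3 = script path, remaining = command-line args
build_argv() {
    mode=$1 line=$2 script=$3
    shift 3

    rest=${line#'#!'}                  # drop the "#!" magic
    rest=${rest#"${rest%%[! ]*}"}      # drop blanks before the interpreter
    interp=${rest%% *}                 # first word: interpreter path
    case $rest in
        *' '*) arg=${rest#* } ;;       # everything after the first space
        *)     arg= ;;                 # no "[arg]" on the line
    esac
    arg=${arg#"${arg%%[! ]*}"}         # drop blanks before the argument

    printf '%s\n' "${interp##*/}"      # argv[0]: interpreter basename
    if [ -n "$arg" ]; then
        if [ "$mode" = split ]; then
            set -f                     # no globbing while splitting
            for word in $arg; do       # unquoted on purpose: field splitting
                printf '%s\n' "$word"
            done
            set +f
        else
            printf '%s\n' "$arg"       # one argument, blanks included
        fi
    fi
    printf '%s\n' "$script"            # the originally executed script
    for a in "$@"; do                  # then the command-line arguments
        printf '%s\n' "$a"
    done
}
```

    calling build_argv split '#!/bin/sh scriptB -xyz' scriptA -a -b -c
    prints the seven elements of the second table above, one per line;
    with "single" in place of "split", scriptB -xyz comes out as one
    element instead of two.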

3.  i see the System V description:

    "The remaining arguments to the interpreter are arg0 through argn
    of the originally executed file".

    which is pretty strict, and pretty clear, but it's not pretty
    functional.  it doesn't seem to be common to be that strict
    (after reading execve(2) man pages from 20 different systems),
    the strictness is for no good reason that i can see, and it is
    very contrary to the more POSIX-oriented definitions.

    so, i guess, in conclusion, i just ask "why not allow DJGPP
    this functionality?"  if the only answer is "because a SYSV
    definition is strict about it" (i saw a Solaris one too that
    was as well), i'd have to question the sanity of the argument,
    since most other systems are not so strict (18 out of 20 i'd say)
    and since POSIX is very loose on the same issue - and last but not
    least, functionality and functional compatibility matter,
    especially if they come at the cost of nothing else.

> The System V Interface Definition documents the exec*() functions
> as follows:
> 
> DESCRIPTION
> 
>   exec in all its forms overlays a new process image on an old
>   process. The new process image is constructed from an ordinary
>   executable file. This file is either an executable object file or a
>   file of data for an interpreter. There can be no return from a
>   successful exec because the calling process image is overlaid by
>   the new process image.
> 
>   An interpreter file begins with a line of the form
> 
>     #! pathname [arg]
> 
>   where pathname is the path of the interpreter, and arg is an
>   optional argument. When you exec an interpreter file, the system
>   execs the specified interpreter. The pathname specified in the
>   interpreter file is passed as arg0 to the interpreter. If arg was
>   specified in the interpreter file, it is passed as arg1 to the
>   interpreter. The remaining arguments to the interpreter are arg0
>   through argn of the originally executed file.
> 
> Note that it explicitly requires that it be *one* argument.
