Delivered-To: listarch-cygwin AT sourceware DOT cygnus DOT com
Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm
Sender: cygwin-owner AT sourceware DOT cygnus DOT com
Delivered-To: mailing list cygwin AT sourceware DOT cygnus DOT com
Date: Wed, 17 Feb 1999 19:09:28 GMT
Message-Id: <199902171909.TAA28859@gpo.cam.harlequin.co.uk>
From: Andrew Innes
To: F DOT J DOT Wright AT qmw DOT ac DOT uk
CC: mike DOT fabian AT it-mannesmann DOT de, Rolf DOT Sandau AT de DOT bosch DOT com, ntemacs-users AT cs DOT washington DOT edu, cygwin AT sourceware DOT cygnus DOT com
In-reply-to: <36C9ADFC.ABD3ACE4@Maths.QMW.ac.uk> (F.J.Wright@qmw.ac.uk)
Subject: Re: AW: how to use emacs in -batch mode from bash?
References: <5B9BE15FBECDD111A1820000F843B87C16C16F AT bkmail1 DOT bk DOT bosch DOT de> <199902161547 DOT HAA28357 AT june DOT cs DOT washington DOT edu> <36C9ADFC DOT ABD3ACE4 AT Maths DOT QMW DOT ac DOT uk>

[added cygwin AT sourceware DOT cygnus DOT com]

On Tue, 16 Feb 1999 17:42:20 +0000, "Dr Francis J. Wright" said:
>OK. Putting the pieces together, this works and appears to do what you
>want:
>
>bash-2.02$ hi=HO; emacs -batch --eval "(message \\\"$hi\\\")"
>HO
>
>But that leaves the question: why does it work?
>
>bash-2.02$ set -x
>bash-2.02$ emacs -batch --eval "(message \\\"hi\\\")"
>+ emacs -batch --eval '(message \"hi\")'
>hi
>
>Hence, this is equivalent to my previous suggestion after variable
>interpolation. But I agree with you, Mike, that so many \s should not
>be necessary.
>
>Could it be that NTEmacs is parsing its command line based on an
>assumption that is wrong when the shell is bash? It's probably using
>libraries that assume the shell is COMMAND or CMD, which have different
>quoting rules. Hence, when using bash it is necessary to quote in a way
>that makes no sense from a UNIX/bash perspective.

That's pretty much right on the nose (except that command.com/cmd.exe
don't really have quoting rules; they are too dumb for that).

This is the old "Microsoft vs Cygnus" quoting rules problem, but in
reverse this time.

The basic problem is that Windows applications normally rely on the C
library startup code to construct the argv[] array (list of command
line arguments) by parsing the command line. (On DOS/Windows, the
command line is passed as a single string and it is entirely up to the
application how it interprets that string. On Unix, applications
receive a list of argument strings exactly as provided by the parent.
The C libraries for Windows compilers provide startup code to
reconstruct the list of argument strings to emulate the Unix
environment.)

This technique of the startup code parsing the command line to
construct the argument list is perfectly reasonable, but Cygnus put a
fly in the ointment by using slightly incompatible rules for parsing
the command line.

The basic rule is the same for both: arguments are separated by white
space (which is discarded), so quotes must be put around arguments that
are intended to contain white space. The rules diverge when handling
the case where a quote character itself appears in an argument (an
embedded quote), and must be escaped so it isn't misconstrued as the
end of the argument.

Now Emacs was made aware of the two quoting rules back in 19.34.6 days,
to solve the problem of constructing the command line for subprocesses
started from Emacs, so that the subprocess will "see" the list of
arguments that Emacs intends even if there are embedded quotes.

(Aside: At the same time, I added some magic so that Emacs would detect
automatically which rules to use by looking at the application
executable, specifically to check whether it imports cygwin.dll. That
has worked well, except that the magic broke with newer releases of the
Cygnus library when the dll name changed. The next version of Emacs
will have better magic which works with all releases of the cygwin
library, and will hopefully continue to work with any future releases.)

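As a rough illustration of where the two conventions diverge, the Python
sketch below flattens the same argument list into a single command line
string in two ways: escaping embedded quotes with a backslash (the
Microsoft convention) and doubling them (the style attributed to Cygnus
in this message). It is not taken from either library's source; the
function names quote_backslash, quote_doubling and flatten are
hypothetical, and the doubling variant assumes nothing about the Cygnus
rules beyond "double the embedded quote".

```python
def quote_backslash(arg):
    """Quote one argument, backslash-escaping embedded quotes (Microsoft style)."""
    if arg and not any(c in ' \t"' for c in arg):
        return arg  # no white space or quotes: nothing to protect
    out = ['"']
    pending = 0  # backslashes seen since the last ordinary character
    for ch in arg:
        if ch == '\\':
            pending += 1
        elif ch == '"':
            # Backslashes directly before a quote are doubled, then the
            # quote itself is escaped with one more backslash.
            out.append('\\' * (2 * pending + 1) + '"')
            pending = 0
        else:
            out.append('\\' * pending + ch)
            pending = 0
    # Backslashes directly before the closing quote are doubled as well.
    out.append('\\' * (2 * pending) + '"')
    return ''.join(out)


def quote_doubling(arg):
    """Quote one argument, doubling embedded quotes (assumed Cygnus style)."""
    if arg and not any(c in ' \t"' for c in arg):
        return arg
    return '"' + arg.replace('"', '""') + '"'


def flatten(argv, quote):
    """Join an argument list into the single string a Windows process receives."""
    return ' '.join(quote(a) for a in argv)


if __name__ == "__main__":
    argv = ["emacs", "-batch", "--eval", '(message "hi")']
    print(flatten(argv, quote_backslash))
    # emacs -batch --eval "(message \"hi\")"
    print(flatten(argv, quote_doubling))
    # emacs -batch --eval "(message ""hi"")"
```

The two strings differ only at the embedded quotes, which is exactly the
case where a parent using one convention and a child parsing with the
other will disagree about the argument list.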
However, we are now seeing the same problem occurring, this time on the
Cygnus side. The Cygnus port of bash will be applying the normal shell
quoting rules to parse the command line typed by the user (or entered
in the shell script), to construct the list of arguments to pass to
Emacs. However, when bash invokes spawn() or exec() or some similar
library function to actually invoke Emacs, it has to flatten the
argument list into a single string. Clearly, the library function that
does that is assuming the subprocess will use the Cygnus quoting rules
to reconstruct the list of arguments. That fails when an argument
contains an embedded quote and the application doesn't use the Cygnus
rules, which is the situation here.

Note that this is a problem with bash that applies when it invokes any
application not compiled with the cygwin library, not just Emacs.

I see two possible solutions to this general problem:

1. Change the cygwin spawn/exec/whatever library functions to use the
   Microsoft rules for escaping embedded quotes when running non-cygwin
   applications (I believe they already detect when they are spawning
   non-cygwin applications; if not, the method Emacs uses could be
   reused for this).

2. Change the cygwin quoting rules to match the Microsoft ones. This
   would apply to spawn/exec and the startup code, and would cause some
   breakage when mixing with applications compiled with old versions of
   cygwin.

Since cygwin-compiled applications tend to be recompiled when new
releases of the library come out, option (2) might actually be viable,
and would be the preferred solution since it would maximise the
interoperability between applications. But even option (1) would be a
major improvement.

AndrewI

PS. There is a certain amount of irony in all this: the Microsoft
startup code looks like it was intended to support escaping embedded
quotes by doubling them (as Cygnus does), but the parsing code contains
a bug which prevents this from working.
If not for this bug, the problem with bash invoking non-cygwin applications wouldn't arise.