X-Spam-Check-By: sourceware.org To: cygwin AT cygwin DOT com From: Lewis Hyatt Subject: Re: igncr vs text mode mounts, performance vs compatibility Date: Wed, 25 Oct 2006 19:48:13 +0000 (UTC) Lines: 101 Message-ID: References: <1160655422743 DOT antti DOT nospam DOT 1605718 DOT wGO_WJ9D1NlId3tB-z6Qig AT luukku DOT com> <20061012123406 DOT GA30908 AT trixie DOT casa DOT cgf DOT cx> <452EA386 DOT 9010201 AT qualcomm DOT com> <20061012212011 DOT GA8535 AT trixie DOT casa DOT cgf DOT cx> <452EFDDB DOT 1010301 AT qualcomm DOT com> <452F8719 DOT 9060300 AT cygwin DOT com> <4536BC88 DOT 3030003 AT qualcomm DOT com> <4536C922 DOT 4090807 AT qualcomm DOT com> <45376CBF DOT 1030104 AT byu DOT net> <453EA9C1 DOT 8060402 AT byu DOT net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit User-Agent: Loom/3.14 (http://gmane.org/) X-IsSubscribed: yes Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com > Propose a patch, and I will consider it. In my opinion, it was much > easier to do igncr as an all or none option than it was to parse the first > line and discard \r on a per-file basis, not to mention that all-or-none > is easily configurable so that those of us who WANT literal \r as required > by POSIX can do so. To date, I have been the only person proposing > patches in the area of \r\n handling, rather than just ideas (which is > probably why I am the maintainer of the cygwin bash port). And I have no > personal interest in redoing this patch to a more complex algorithm at > this time. I can understand your point of view, I'm sure it is a huge pain to deal with this and then have to answer all the questions and complaints from people who don't know why their scripts broke. Given the number of support requests on the mailing list, it definitely seems worthwhile to have this issue handled more transparently. The most compelling argument for not translating \r\n, as I understand it, is that searching for the \r slows down processing of large scripts, such as configure, when it is not necessary. On the other hand, most user-created scripts that do contain \r\n line endings are not large enough for the overhead to be problematic. For that reason, I think it makes the most sense to enable or disable igncr on a per-file basis. The other obvious alternative would be to have igncr turned on by default, but then users would have to explicitly disable it to get the performance enhancement when running configure scripts, and most would simply not know they needed to do it. Below is a patch to shell.c which implements my suggestion (It is just the output of diff -u, is that OK? The patch is relative to the 3.2-4 version.) I just modified the open_shell_script() function; now after it reads in 80 characters to make sure the script isn't binary, it also scans that buffer for the first \n it finds and if it is preceded by \r, then it enables igncr. If necessary it reads more characters, 80 bytes at a time. In most cases the performance decrease for this check will be entirely negligible, particularly since the first line of the script is almost always going to just be the #!/bin/bash line anyway. The only downside I can see to this is that bash will behave differently depending whether it is reading from a file or from a pipe. But that is already true in other cases as well, for instance the check for binary files is not performed when reading from a pipe. It would be possible to extend the \r\n check to piped input as well, although it would be somewhat more annoying to implement. If you think this overall concept is a good idea, I am happy to take everyone's suggestions and implement them. It's a bit of a struggle for me to write C code instead of C++, but I can give it a shot. -Lewis --------- --- shell.c 2006-10-25 15:40:53.625000000 -0400 +++ shell_patched.c 2006-10-25 15:41:02.093750000 -0400 @@ -1335,6 +1335,7 @@ open_shell_script (script_name) char *script_name; { + extern int igncr; int fd, e, fd_is_tty; char *filename, *path_filename, *t; char sample[80]; @@ -1426,8 +1427,32 @@ internal_error ("%s: cannot execute binary file", filename); exit (EX_BINARY_FILE); } + + /* Now scan the first line of the file, if it ends with \r\n, then + enable the igncr shell option. */ + + while(sample_len>1) + { + int i; + for(i=0; i!=sample_len; ++i) if(sample[i]=='\n') + { + if(i>0 && sample[i-1]=='\r') igncr=1; + break; + } + if(i