Mail Archives: djgpp-workers/1999/01/26/09:46:51

Message-Id: <199901261445.OAA45486@out1.ibm.net>
From: "Mark E." <snowball3 AT usa DOT net>
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>, djgpp-workers AT delorie DOT com
Date: Tue, 26 Jan 1999 09:46:05 -0500
MIME-Version: 1.0
Subject: Re: new version of bash 2.02.1 uploaded
References: <199901260020 DOT AAA225886 AT out2 DOT ibm DOT net>
In-reply-to: <Pine.SUN.3.91.990126093847.16926H-100000@is>
X-mailer: Pegasus Mail for Win32 (v3.01d)
Reply-To: djgpp-workers AT delorie DOT com

> 
> On Mon, 25 Jan 1999, Mark E. wrote:
> 
> > This is to (hopefully) prevent mixed EOL styles that confuse Bash with
> > libtool, etc. generated files.
> 
> Isn't it a better idea to fix whatever reason causes Bash to become 
> confused with mixed EOL format?  It's usually best to solve the bug at the
> place where it happens, instead of looking for ways of working around it.

True enough, but the hack works for now.

> 
> Can you describe why does Bash barf on mixed DOS/Unix files? 

The problem arises in input.c. The function fill_buffer detects whether 
any CRs were skipped over in its call to read(). 
Later, in check_bash_input, it's necessary to sync up the file descriptor 
with what hasn't yet been read from the data buffer. This is done by a call 
to sync_buffered_stream_crlf.

The trouble is that this function assumes that, when the 
'CRs were detected' flag is set, every LF had a CR that wasn't 
placed in the buffer filled by read(). In mixed-style files this assumption 
is wrong, and the file pointer ends up pointing to the 
wrong place. This leads to parse errors like the ones you see with libtool 
generated files.
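
To make the mismatch concrete, here is a small sketch in C. It is not the 
code in input.c; both function names are made up for illustration, and it 
assumes a simplified model of text mode in which read() drops only the CR of 
each CR/LF pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: raw-file offset after `used` translated bytes have
   been consumed, ASSUMING every LF in the buffer originally had a CR. */
size_t assumed_raw_offset(const char *translated, size_t used)
{
    size_t off = used;
    assert(translated != NULL);
    for (size_t i = 0; i < used; i++)
        if (translated[i] == '\n')
            off++;                  /* count one stripped CR per LF */
    return off;
}

/* Hypothetical helper: the real raw-file offset, found by walking the raw
   bytes and skipping only the CRs that are actually followed by an LF. */
size_t actual_raw_offset(const char *raw, size_t raw_len, size_t used)
{
    size_t produced = 0, i = 0;
    assert(raw != NULL);
    while (i < raw_len && produced < used) {
        if (raw[i] == '\r' && i + 1 < raw_len && raw[i + 1] == '\n') {
            i++;                    /* CR of a CR/LF pair: not in the buffer */
            continue;
        }
        i++;                        /* this byte reached the buffer */
        produced++;
    }
    return i;
}
```

For the raw bytes "a\r\nb\nc\n" and 4 consumed buffer bytes ("a\nb\n"), the 
assumed offset is 6 but the real one is 5, so the file pointer lands one byte 
too far. For a pure CR/LF file the two agree.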

Some possible solutions I'm thinking over:

1) Add even more logic to sync_buffered_stream_crlf: after the two 
lseek() calls already there, add a read() call and compare the 
number of bytes it puts into the buffer with the number expected. Any 
difference is the result of lone LFs. Then do one more lseek() to put the 
file pointer back in the right place.

2) Have files handled by input.c be read in binary mode, and have any 
CRs read in be ignored. Then all the additional logic above doesn't 
need to be added and the current EOL logic can be removed.
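
A rough sketch of what the CR-dropping step in option 2 might look like. The 
function name is hypothetical and this is not taken from Bash; it assumes the 
buffer was filled by a binary-mode read(), so it still holds every raw byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: remove every CR from buf[0..n) in place and return
   the new length. The caller is assumed to have read buf in binary mode. */
size_t drop_crs(char *buf, size_t n)
{
    size_t out = 0;
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (buf[i] != '\r')
            buf[out++] = buf[i];    /* keep everything except CR */
    return out;
}
```

Since the read is binary, the number of raw bytes consumed is known exactly, 
so there should be no need to guess how many CRs were stripped when syncing 
the file descriptor.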

#2 seems like the better solution to me, so that will be the one I try first.

Mark

--- 
Mark Elbrecht snowball3 AT usa DOT net
http://members.xoom.com/snowball3/
