Mail Archives: djgpp-workers/2001/09/08/03:37:56

Date: Sat, 8 Sep 2001 10:21:26 +0300 (WET)
From: Andris Pavenis <pavenis AT lanet DOT lv>
X-Sender: pavenis AT ieva06
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>,
Juan Manuel Guerrero <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>
Cc: djgpp-workers AT delorie DOT com
Subject: Re: Problem with sed3028b.zip
In-Reply-To: <3B993126.31880.6454B5@localhost>
Message-ID: <Pine.A41.4.05.10109081013060.53626-100000@ieva06>
MIME-Version: 1.0
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com


On Fri, 7 Sep 2001 pavenis AT lanet DOT lv wrote:

> On 7 Sep 2001 at 17:41, Eli Zaretskii wrote:
> 
> > > From: "Juan Manuel Guerrero" <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>
> > > Date: Fri, 7 Sep 2001 14:43:28 +0200
> > > 
> > > Only FYI: I have used the following very naive test file:
> > > 
> > > bar\
> > > foo\
> > > bar
> > > 
> > > and run the sed command:
> > > 
> > > "s/bar/rab/;s/foo/oof/"
> > > 
> > > I have inspected the produced file with a hex-editor. The file
> > > is completely ok. Especially, all backslashes are followed *only*
> > > by CR LF. This is independent of the EOL used in the source file.
> > 
> > The code I wrote to remove CRs only if followed by an LF might have
> > some obscure bug, that only raises its ugly head in some rare
> > situation, like when the CR character is the last one in the buffer
> > (and Sed needs to look ahead to see if there's an LF after it).  The
> > logic there is pretty tricky.  I will take another look, but don't
> > hold your breath: I usually don't find obscure bugs by just looking at
> > the code.  So I agree a small test case that reproduces this would be
> > useful: there's nothing like GDB to find such bugs.
> 
> It looks very much like that. I rebuilt gcc yesterday after some small
> modifications (regenerated the DJGPP source archive) and was not able
> to reproduce that. The affected line was really only passed through
> sed without modifications that time.
> 

Below is a simple test example that shows this problem. I didn't try to
debug the problem in sed, though. Tests were done under DOSEMU-1.0.2
(Linux-2.4.10-pre2, ...) with a gcc-3.0.1 Linux-to-DJGPP cross-compiler
(hence the ^M's in this message, as I didn't edit the test files generated
under DOSEMU).

I really got a double CR in the output with sed-3.02.80.

Andris

------------------------  test example  ---------------------------------
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
   int i;
   FILE *output;

   printf ("Generating test data...\n");
   /* Binary mode, so that every line is written exactly as "a\r\n".  */
   output = fopen ("sedtest.in", "wb");
   if (output == NULL)
     {
       perror ("sedtest.in");
       return EXIT_FAILURE;
     }
   for (i = 0; i < 10000; i++)
     fputs ("a\r\n", output);
   fclose (output);
   system ("sed --version | grep version");
   printf ("Running sed ...\n");
   /* A substitution that never matches: there is no 'b' in the input file,
      so sed should pass every line through unchanged.  */
   system ("sed -e 's:b:a:g' <sedtest.in >sedtest.out");
   printf ("Testing result ...\n");
   if (system ("cmp sedtest.in sedtest.out") != 0)
     {
       printf ("Test failed\n");
       return EXIT_FAILURE;
     }
   printf ("Test passed\n");
   return 0;
}

-----------------------  output with sed-3.02  ---------------------------
Generating test data...^M
GNU sed version 3.02
Running sed ...^M
Testing result ...^M
Test passed^M
-----------------------  output with sed-3.02.80  ------------------------
Generating test data...^M
GNU sed version 3.02.80
Running sed ...^M
Testing result ...^M
sedtest.in sedtest.out differ: char 8193, line 2731^M
Test failed^M

 
