Message-ID: <4DD36619.1010401@hima.com>
Date: Wed, 18 May 2011 08:24:25 +0200
From: Sven Severus
To: cygwin AT cygwin DOT com
Subject: 1.7.9: Problem with line endings of Perl output redirected to a file with textmode mounting

Hello all,

let me report a strange behaviour with Cygwin Perl (I'm using cygwin1.dll 1.7.9-1, full installation 2 weeks ago).

File foo.h is an ordinary text file; all lines are terminated with DOS-style line endings (hex: 0d 0a). It is located in a directory mounted in textmode in Cygwin. One CR LF sequence in foo.h is split by a 4096-byte boundary within the file: "od -c -Ax foo.h" shows a CR (='\r') at byte offset 4095 (0xfff) and an LF (='\n') at offset 4096 (0x1000):

    ...
    000ff0   /   /   /   /   /   /  \r  \n   /   /   X   X   X   X   X  \r
    001000  \n   /   /  \r  \n   /   /  \r  \n
    001009

Now I issued the command

    perl -pe 's/12345/54321/' foo.h >foomod.h

to produce foomod.h, located in the same directory as foo.h and thus in a textmode mount too.
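For anyone wanting to reproduce the input layout described above, the following is a minimal sketch in Python 3. It is not part of the original report: the helper names `make_split_crlf_file` and `split_line_endings` and the filler content are hypothetical. It writes a CRLF file whose CR lands at offset 4095 and whose LF lands at offset 4096, then confirms that layout.

```python
import os
import tempfile

BOUNDARY = 4096  # block size at which the report sees the split


def make_split_crlf_file(path, boundary=BOUNDARY):
    """Write a CRLF file with a CR at boundary-1 and its LF at boundary."""
    line = b"// filler\r\n"  # hypothetical filler, 11 bytes per line
    data = bytearray()
    # Add whole lines, leaving room for one padded line before the boundary.
    while len(data) + len(line) <= boundary - len(line):
        data += line
    # Pad the last line so that its CR is the final byte of the first block.
    pad = boundary - 1 - len(data)
    data += b"X" * pad + b"\r\n"
    data += b"//\r\n//\r\n"
    with open(path, "wb") as f:  # binary mode: bytes are written untranslated
        f.write(data)
    return bytes(data)


def split_line_endings(data, boundary=BOUNDARY):
    """Return offsets of CRs that end a block and are followed by an LF."""
    return [off - 1
            for off in range(boundary, len(data), boundary)
            if data[off - 1:off + 1] == b"\r\n"]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "foo.h")
    content = make_split_crlf_file(target)
    print(split_line_endings(content))
```

The file is opened in binary mode so that the script itself does no newline translation. Whether a textmode mount alters the bytes on disk is a separate question and is not modelled here.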
When I examined the result, I noticed that foomod.h was one byte bigger than foo.h. I expected identical size, and "od -c -Ax foomod.h" reports:

    ...
    000ff0   /   /   /   /   /   /  \r  \n   /   /   X   X   X   X   X  \r
    001000  \r  \n   /   /  \r  \n   /   /  \r  \n
    00100a

Oops! The original CR LF sequence starting at offset 4095 (0xfff) became a three-character sequence CR CR LF! The CR is duplicated!

In other files created by Perl with output redirection, I observed this behaviour with every line ending that is split by a 4096-byte boundary (even multiple times in one output file). Line endings not split by a 4096-byte boundary do not show this behaviour.

The behaviour does not occur when the destination file is located in a directory mounted in binmode. It does not occur either when I use sed instead of Perl ("sed -e 's/12345/54321/' foo.h >foomod.h"), so I think the problem is specific to Cygwin Perl, not to Cygwin in general.

Is this a bug in the output buffering mechanism of Cygwin Perl? Or am I doing something wrong?

Any answer is highly appreciated. Thanks in advance.

Best regards
Sven

--
Kind regards

Dipl. Inform. Sven Severus
Software Development
----------------------------------------------------------
HIMA Paul Hildebrandt GmbH + CO KG
Dept: Software Development
Albert-Bassermann-Strasse 28
68782 Bruehl
Germany
Tel: +49 6202 709-289
Fax: +49 6202 709-299
E-Mail: s DOT severus AT hima DOT com
Internet: www.hima.de
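To check other output files for the symptom reported above, a small scan along the following lines may help. This is a sketch in Python 3 and not part of the original report; `find_doubled_cr` is a hypothetical helper name, and the 4096-byte block size is taken from the observation in this message.

```python
def find_doubled_cr(data, boundary=4096):
    """Return (offset, ends_block) for every CR CR LF sequence in data.

    ends_block is True when the first CR is the last byte of a
    boundary-sized block, which is where the report sees the duplication.
    """
    hits = []
    pos = data.find(b"\r\r\n")
    while pos != -1:
        hits.append((pos, (pos + 1) % boundary == 0))
        pos = data.find(b"\r\r\n", pos + 1)
    return hits


if __name__ == "__main__":
    import sys
    # Read in binary mode so the scan sees the bytes as stored.
    with open(sys.argv[1], "rb") as f:
        for offset, ends_block in find_doubled_cr(f.read()):
            print(hex(offset), "ends a 4096-byte block" if ends_block else "elsewhere")
```

Run it as `python scan.py foomod.h` (the script name is arbitrary). If every hit is reported as ending a block, that matches the pattern described in this message.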