Mail Archives: cygwin/2017/06/08/04:50:45

From: "Soegtrop, Michael" <michael DOT soegtrop AT intel DOT com>
To: Eric Blake <eblake AT redhat DOT com>, "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>
Subject: RE: CR-LF handling behavior of SED changed recently - this breaks a lot of MinGW cross build scripts
Date: Thu, 8 Jun 2017 08:50:23 +0000
Message-ID: <0F7D3B1B3C4B894D824F5B822E3E5A175B26CE47@IRSMSX102.ger.corp.intel.com>
References: <0F7D3B1B3C4B894D824F5B822E3E5A175B2636E4 AT IRSMSX103 DOT ger DOT corp DOT intel DOT com> <a53282b6-d00c-aad8-76a6-26b4089a9623 AT redhat DOT com>
In-Reply-To: <a53282b6-d00c-aad8-76a6-26b4089a9623@redhat.com>

Dear Eric,

> No, the documented behavior is that CR-LF is converted to LF only for text-
> mounted files; but pipelines are default binary-mounted.  If you want to strip
> CR from a pipeline, then make it explicit.
> 
> > var=$( prog | sed .)
> 
> Rewrite that to var=$( prog | tr -d '\r' | sed .)
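
A minimal sketch of the rewrite suggested above. The shell function `prog` is a hypothetical stand-in for a Windows-native tool that emits CR-LF line endings, and the substitution is only a placeholder for the elided sed script:

```shell
# Hypothetical stand-in for "prog": emits one CR-LF terminated line.
prog() { printf 'hello world\r\n'; }

# Without an explicit strip, the CR is still attached to the line when sed
# reads it from a binary-mode pipe, so it ends up in the variable.
raw=$(prog | sed 's/world/there/')

# With the explicit tr step, the CR is removed before sed sees the data.
clean=$(prog | tr -d '\r' | sed 's/world/there/')

# Dump both results byte by byte to make any trailing CR visible.
printf '%s' "$raw" | od -c
printf '%s' "$clean" | od -c
```

Command substitution only strips trailing newlines, which is why a leftover CR survives into `$raw` and can break later string comparisons in a configure script.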

I have two problems with this:

1.) I cross-build many (~50) Unix libraries and tools for MinGW on Cygwin from source, and this change breaks many of their configure and other scripts. Feeding the fixes back to the individual library and tool maintainers will take quite some time and will also lead to lengthy discussions about why they should care about crappy DOS artefacts at all. A compatibility option via an environment variable would have been nice.

2.) It is very hard to interpret the documentation in this way. I am citing from https://www.gnu.org/software/sed/manual/sed.html:

-b --binary  
This option is available on every platform, but is only effective where the operating system makes a distinction between text files and binary files. When such a distinction is made—as is the case for MS-DOS, Windows, Cygwin—text files are composed of lines separated by a carriage return and a line feed character, and sed does not see the ending CR. When this option is specified, sed will open input files in binary mode, thus not requesting this special processing and considering lines to end at a line feed.

This doesn't say what is treated as a text file and what is treated as a binary file, and one can reasonably assume that a text tool like sed opens everything not explicitly declared as binary as text, given that a documented option like -b exists.
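
To illustrate how a script can avoid depending on either interpretation, here is a small sketch that assumes GNU sed (both `-b` and the `\r` escape are GNU extensions). The `input` function is a hypothetical stand-in for real tool output:

```shell
# Hypothetical input: two CR-LF terminated lines.
input() { printf 'one\r\ntwo\r\n'; }

# -b asks sed to read its input in binary mode (it has no effect where the
# platform makes no text/binary distinction), and the script itself strips
# a trailing CR, so the result does not rely on any implicit conversion.
stripped=$(input | sed -b 's/\r$//')

printf '%s\n' "$stripped"
```

Making the CR removal part of the sed script keeps the behavior the same whether or not the platform would have converted the line endings on its own.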

This Cygwin sed behavior is documented in https://cygwin.com/cygwin-ug-net/using-textbinary.html but I wouldn't expect people using sed on Cygwin to find it.

In summary, I would say that the behavior of sed in Cygwin is documented in the Cygwin documentation, but it contradicts the documentation of sed itself, and possibly the intended function of sed as a text processing tool.

I must admit that cross-building Linux software for MinGW on Cygwin works substantially better than doing so on MSys/MSys2. The number of patches I need is small, so the decisions the Cygwin team took seem to be the right ones. But this change adds at least one order of magnitude to my "number of patches required" statistics.

Best regards,

Michael

