From: Brian Inglis
Subject: Re: gawk Regression: CR characters are not stripped on Windows
Reply-To: Brian DOT Inglis AT SystematicSw DOT ab DOT ca
To: cygwin AT
 cygwin DOT com
Date: Tue, 27 Feb 2018 08:03:10 -0700

On 2018-02-27 00:22, Orgad Shaneh wrote:
> Cross-posting per Eli Zaretskii's request.
> CR characters used to be automatically stripped on Windows (MSYS2 and
> Cygwin environments). This is broken in 4.2.0.

Cygwin binary mounts treat files as on Unix.
You missed all the discussions in early 2017 about gawk, grep, sed EOL
handling:
https://sourceware.org/ml/cygwin/2017-02/msg00152.html
https://sourceware.org/ml/cygwin/2017-02/msg00188.html
https://sourceware.org/ml/cygwin/2017-02/msg00189.html
following on from discussions about bash after ShellShock:
https://sourceware.org/ml/cygwin/2016-08/msg00097.html

> Minimal example:
> echo -en "foo\r\n\r\nbar\r\n" > foo.txt
> awk '/^$/ { print "found" }' foo.txt # This worked with 4.1.4 and
> doesn't work with 4.2.0
> awk '/^\r$/ { print "found" }' foo.txt # This works with 4.2.0 and
> doesn't work with 4.1.4

>> Under MS-Windows, 'gawk' (and many other text programs) silently
>> translates end-of-line '\r\n' to '\n' on input and '\n' to '\r\n' on
>> output.

Cygwin does not try to be an MS Windows environment.
Cygwin tries its best to be a POSIX/Unix/Linux environment.

> and on Feb 8 the following section was added:
>> Recent versions of Cygwin open all files in binary mode. This means
>> that you should use 'RS = "\r?\n"' in order to be able to handle
>> standard MS-Windows text files with carriage-return plus line-feed line
>> endings.

Use DOS files from a Cygwin text mount which does the conversion.
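
For scripts that must read CRLF files on a binary mount, the two usual
workarounds can be sketched as below. This is only an illustration built
on the quoted minimal example: it uses printf rather than echo -en for
portability, and it assumes foo.txt is a scratch file in the current
directory. The RS form is the one from the quoted gawk documentation; a
regex RS is an extension that gawk supports (and some other awks), so it
should not be assumed to work in every awk. The sub() form avoids the
extension by stripping a trailing CR from each record.

```shell
# Build the three-line CRLF sample from the quoted minimal example.
printf 'foo\r\n\r\nbar\r\n' > foo.txt

# Option 1: treat an optional CR plus LF as the record separator.
# A regex RS is an extension (gawk supports it); not every awk does.
awk 'BEGIN { RS = "\r?\n" } /^$/ { print "found" }' foo.txt

# Option 2: strip a trailing CR from each record before matching.
# This relies only on sub(), so it does not need a regex RS.
awk '{ sub(/\r$/, "") } /^$/ { print "found" }' foo.txt
```

With gawk, each command prints "found" once, for the empty middle line,
and the same script gives the same result on a file with plain LF line
endings.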

> This breaks compatibility between different gawk versions. What were
> the reasons for this change in cygwin, and why was it pushed upstream?

Compatibility with POSIX/Unix/Linux systems, except on a text mount: to
allow scripts which deal with binary data or embedded \r to work
correctly, and to require scripts which work correctly on Windows or
Unix text, as the application provides, prefers, or ignores, under
Unix/Cygwin/Msys/Mingw.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada