Date: Tue, 29 Aug 2023 15:03:45 +0200
From: Corinna Vinschen via Cygwin
To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen, Ed Morton
Subject: Re: gawk core dumped on too many input values

On Aug 28 12:20, Brian Inglis via Cygwin wrote:
> On 2023-08-28 05:47, Joshuah Hurst via Cygwin wrote:
> > On Mon, Aug 28, 2023 at 1:08 AM Jeremy Hetzler via Cygwin wrote:
> > >
> > > On Sun, Aug 27, 2023 at 2:25 PM Ed Morton via Cygwin wrote:
> > > >
> > > > This (original email below) turned out to
> > > > be a general cygwin issue, not a gawk issue:
> > > >
> > > > $ LC_ALL=C sed 's/x/y/' $(seq 1000000)
> > > > Segmentation fault (core dumped)
> > > >
> > > > $ LC_ALL=C grep 'foo' $(seq 1000000)
> > > > Segmentation fault (core dumped)

This is fixed in current git and can be tested with the next test
release cygwin-3.5.0-0.404.gca2a4ec24362, which is just being built and
uploaded in a few mins.

> > [...]
> > Is this limit?
> >
> > $ getconf -a | grep -E 'ARG_MAX'
> > _POSIX_ARG_MAX                      4096
> > ARG_MAX                             32000

This isn't the real limit.  ARG_MAX has been chosen at one point to be
32000, because that's a safe size for the Windows command line length.
Therefore this is a hard limit if you start non-Cygwin executables.

Cygwin executables don't have this limit.  In fact, the limit is defined
only by the amount of memory the parent process has available when
creating the argv and environment lists for the child.

We fixed that in git.  As a result, sysconf(_SC_ARG_MAX) will now return
-1.  I.e., ARG_MAX has an indeterminate limit:

  $ getconf -a | grep -E 'ARG_MAX'
  _POSIX_ARG_MAX                      4096
  ARG_MAX
  $ getconf ARG_MAX
  undefined

However!  limits.h still defines ARG_MAX as 32000, and we'll stick to
this, on account of it being a safe value.

This has a precedent on Linux, where getconf returns something big, but
ARG_MAX is still 131072:

  $ grep ARG_MAX /usr/include/linux/limits.h
  #define ARG_MAX       131072  /* # bytes of args + environ for exec() */
  $ getconf ARG_MAX
  2097152

The limits.h limit of 131072 is historical (32 pages for argv and envp).
The getconf value is a quarter of the stack which is reserved for
argv and envp.

I hope that explains things sufficiently.

The patches will be backported to 3.4.9.

Thanks,
Corinna
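
For illustration only: a minimal Python sketch (not taken from the Cygwin sources) of how a caller might cope with sysconf(_SC_ARG_MAX) returning -1 as described above. The names effective_arg_max, batch_args and FALLBACK_ARG_MAX are hypothetical, and using 32000 as the fallback is an assumption that simply mirrors the limits.h value mentioned in this message.

```python
import os

# Assumption: 32000 mirrors the ARG_MAX value that limits.h keeps
# defining according to the message above; it serves only as a
# conservative fallback here.
FALLBACK_ARG_MAX = 32000


def effective_arg_max(fallback=FALLBACK_ARG_MAX):
    """Return a usable byte budget for argv + environ."""
    try:
        limit = os.sysconf("SC_ARG_MAX")
    except (ValueError, OSError, AttributeError):
        # Name unknown on this platform, call failed, or no os.sysconf.
        return fallback
    # -1 means "indeterminate": the system reports no fixed limit.
    return limit if limit > 0 else fallback


def batch_args(args, budget):
    """Yield lists of args whose encoded size stays within budget bytes."""
    batch, used = [], 0
    for arg in args:
        cost = len(os.fsencode(arg)) + 1  # +1 for the trailing NUL
        if batch and used + cost > budget:
            yield batch
            batch, used = [], 0
        batch.append(arg)  # an oversized single arg still gets its own batch
        used += cost
    if batch:
        yield batch


if __name__ == "__main__":
    # Halving the budget to leave room for the environment is a rough
    # guess, not a documented rule.
    budget = effective_arg_max() // 2
    names = [str(n) for n in range(1, 1000001)]
    print(sum(1 for _ in batch_args(names, budget)), "batches")
```

Splitting a long argument list into batches is roughly what xargs does, and it keeps a script working whether the reported limit is a number or indeterminate.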