Mail Archives: cygwin/2024/06/18/10:52:28

Message-Id: <EE5DC946-7AA2-4D74-8919-8B176089550F@zaxiom.com>
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.500.171.1.1\))
Subject: Re: Cygwin outputting message to stderr on dofork EAGAIN failure even
when Python exception is caught and handled
Date: Tue, 18 Jun 2024 09:51:44 -0500
In-Reply-To: <PH0PR16MB4782ED8A31ADDFE5A3AC6ABAF5CD2@PH0PR16MB4782.namprd16.prod.outlook.com>
To: "Dale Lobb (Sys Admin)" <Dale DOT Lobb AT bryanhealth DOT org>
References: <DC419F64-5E24-493F-AFB7-9B31062A3FB6 AT zaxiom DOT com>
<1399464798 DOT 20240617105808 AT yandex DOT ru>
<PH0PR16MB4782ED8A31ADDFE5A3AC6ABAF5CD2 AT PH0PR16MB4782 DOT namprd16 DOT prod DOT outlook DOT com>
From: Nicholas Williams via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Nicholas Williams <nicholas DOT williams AT zaxiom DOT com>
Cc: "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>

Andrey and Dale,

> On Jun 17, 2024, at 11:03, Dale Lobb (Sys Admin) <Dale DOT Lobb AT bryanhealth DOT org> wrote:
> 
> Greetings, Nicholas;
> 
>> From: Cygwin <cygwin-bounces+dale.lobb=bryanhealth DOT org AT cygwin DOT com> On Behalf Of Andrey Repin via Cygwin
>> Sent: Monday, June 17, 2024 2:58 AM
>> To: Nicholas Williams <nicholas DOT williams AT zaxiom DOT com>; cygwin AT cygwin DOT com
>> Cc: Andrey Repin <anrdaemon AT yandex DOT ru>
>> Subject: EXTERNAL SENDER: Re: Cygwin outputting message to stderr on dofork EAGAIN failure even when Python exception is caught and handled
>> 
>> Greetings, Nicholas Williams!
>> 
>>> We have a Python (installed and run through Cygwin) process running on
>>> Windows Server 2022 that was very, very occasionally failing when subprocess.check_output was called:
>> 
>>> 0 [main] python3 28481 dofork: child -1 - forked process 16856 died
>>> unexpectedly, retry 0, exit code 0xC0000142, errno 11
>>> …
>>>    subprocess.check_output(["cygpath", "-w", directory.name], encoding="utf-8").strip()
>>> File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
>>>    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
>>> File "/usr/lib/python3.9/subprocess.py", line 505, in run
>>>    with Popen(*popenargs, **kwargs) as process:
>>> File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
>>>    self._execute_child(args, executable, preexec_fn, close_fds,
>>> File "/usr/lib/python3.9/subprocess.py", line 1754, in _execute_child
>>>    self.pid = _posixsubprocess.fork_exec(
>>> BlockingIOError: [Errno 11] Resource temporarily unavailable
>> 
>>> Setting aside for a minute the various reasons this might be happening
>>> occasionally, which we cannot solve for at this moment, the error number
>>> (EAGAIN) indicates that you should “try again.” So that’s exactly what we
>>> did. We added a try/except to the Python code to catch the BlockingIOError
>>> and, if and only if the error number is EAGAIN, we try up to two more times.
>>> This fixed the problem and caused the application to stop quitting. We
>>> output a warning to our log so that we don’t forget about the problem, but
>>> the warning only ever appears once, so retrying a single time seems to help.
>> 
>>> However … even though Python handles the dofork error, turns it into a
>>> Python exception, and our code catches the Python exception and handles it
>>> properly, Cygwin (not Python … Cygwin) still outputs a message to stderr
>>> right before our warning message. This Cygwin error message shows up as an error in our log tracking:
>> 
>>> 0 [main] python3 15042 dofork: child -1 - forked process 6780 died
>>> unexpectedly, retry 0, exit code 0xC0000142, errno 11
>>> 06/16 13:57:53. 87520: WARNING: Retrying command in 2 seconds due to EAGAIN: [the command we’re running]
>> 
>>> I’m sure there could be any number of things I might be missing, but IMO,
>>> if the process calling dofork properly handles the error raised by dofork,
>>> Cygwin should not be outputting an error message to stderr.
>> 
>>> Thoughts?
>> 
>> My inexperienced and uneducated thought would be that forking code is fragile
>> and some parts of it prone to misbehavior. When an unforeseen error is
>> detected, it's better to report it sooner, than to get bitten by it later.
>> 
>> Regarding your specific issue, if you create an STC[1] (a minimal
>> version of your code that, say, forks a process thousands of times and
>> reliably reproduces the issue) that somebody else could run to test the
>> cause, that would be wonderful.
>> 
>> (If, however, you could find and fix the cause, that would be even more wonderful!)
>> 
> 
>  I have seen this exact issue on every Windows  2019 or 2022
> server where I have installed new versions of Cygwin since fall of 2023.
> Admittedly, that has only been 3 or 4 machines, but it sure seems like
> a pattern.  I have resisted upgrading old Cygwin installations for fear
> that they also would start to exhibit this fork problem.
> 
> https://cygwin.com/pipermail/cygwin/2023-September/254417.html
> 
>  The only thing I have found that decreases the frequency of the
> errors is to increase the amount of RAM assigned to the machine.
> It does not eliminate the issue.  I've tried a ton of different
> options with re-basing the Cygwin executables, to no avail.
> 
> 
> Best Regards,
> 
> Dale

To be clear, the problem I’m reporting *IS NOT* the fork failure. Sure, there might be a bug there, or it might just be that we have a resource exhaustion problem that we haven’t been able to identify yet. We’ll figure that out eventually; for now, we have successfully worked around the problem by retrying.
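
For reference, here is a minimal sketch of that retry workaround. It is not our production code: the function name check_output_with_retry and the injectable run/sleep parameters are hypothetical, added only so the example is self-contained. The retry count (up to two more attempts) and the 2-second delay come from the description and log line quoted above.

```python
import errno
import logging
import subprocess
import time

logger = logging.getLogger(__name__)


def check_output_with_retry(cmd, retries=2, delay=2.0,
                            run=subprocess.check_output, sleep=time.sleep,
                            **kwargs):
    """Call run(cmd, **kwargs), retrying only when it fails with EAGAIN.

    Hypothetical helper: `run` and `sleep` are parameters purely so the
    behavior can be exercised without a real fork failure.
    """
    for attempt in range(retries + 1):
        try:
            return run(cmd, **kwargs)
        except BlockingIOError as exc:
            # Re-raise anything that is not EAGAIN, and give up once the
            # retry budget is spent.
            if exc.errno != errno.EAGAIN or attempt == retries:
                raise
            logger.warning("Retrying command in %s seconds due to EAGAIN: %s",
                           delay, cmd)
            sleep(delay)
```

With something like this in place, the failing call would become check_output_with_retry(["cygpath", "-w", directory.name], encoding="utf-8").strip(). The exception is handled entirely in Python, which is why the extra line on stderr is surprising.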

The problem I’m reporting is that Cygwin, for some reason, prints an error message to stderr whenever fork fails, instead of letting the application calling fork do its own error handling. This means that, even though we catch the Python exception and retry (successfully), an error message still gets written to stderr and ends up in our logs. This error message is coming from this Cygwin code:

https://github.com/cygwin/cygwin/blob/7e3c833592b282355a57dd34459b152e4e078d19/winsup/cygwin/fork.cc#L381-L382

In our opinion, a low-level system call like this shouldn’t write to stderr. That’s what errno is for (and this call does set it properly): it lets the application making the system call decide what to do about the error. The application can always choose to print an error; the system call should not. As far as we can tell, there is no way for us to disable this error printing.

Nick
