From: Nahor via Cygwin
Reply-To: Nahor
To: Takashi Yano
Cc: cygwin AT cygwin DOT com
Date: Fri, 14 Nov 2025 11:27:04 -0800
Subject: Re: flock/open random error
In-Reply-To: <20251112182412.ba3a65f36838b9b5fd7d3f9b@nifty.ne.jp>
List-Id: General Cygwin discussions and problem reports

If `flock()` were used on the same file descriptor, then this might
have been a valid point. However, each thread has its own file
descriptor in this case, so it would be very surprising if it wasn't
thread-safe.

Moreover, it's not just `flock()` failing, it's also (and mostly!)
`open()` that fails. And it's the `open()` of a completely different
file than the one being locked. So that would suggest that `open()` is
also not MT-safe.
And not safe when using different files. And not safe across multiple
different functions (flock+open).

And more generally, I've never heard of file operations not being
thread safe. Atomicity and ordering are common problems, but not
thread safety.

Nahor

On Wed, Nov 12, 2025 at 1:24 AM Takashi Yano wrote:
>
> On Tue, 21 Oct 2025 17:41:51 -0700
> Nahor wrote:
> > Hi,
> >
> > There is a test in the Fish shell (tests::history::test_history_races)
> > that systematically fails when I run it. The test simulates multiple
> > processes/threads trying to write to the shell history file at the
> > same time.
> > In my case, the test freezes/deadlocks with errors like "Bad Addr" and
> > "Is Directory".
> > When I add a sleep, the freeze/deadlocks disappear but the test
> > eventually fails because the fake history is not the right size.
> > See https://github.com/fish-shell/fish-shell/issues/11933 for more details.
> >
> > I wrote a test case in pure C (attached) that also triggers the issue
> > although it's not as systematic (30-50%).
> > To compile: gcc main.c -o test.exe
> > To run: ./test.exe
> >
> >
> > Most failures look like this:
> > ```
> > $ ./test.exe
> > tmp_dir: /tmp/flockc2Hz4c
> > open file error: 21 - Is a directory
> > /tmp/flockc2Hz4c/append_file
> > assertion "file_fd >= 0" failed: file "main.c", line 49, function:
> > thread_func
> > Aborted
> > ```
> > Occasionally (maybe 10%), it looks like that:
> > ```
> > $ ./test.exe
> > tmp_dir: /tmp/flock5Oly9J
> > lock error: 14 - Bad address
> > assertion "lock_res == 0" failed: file "main.c", line 38,
> > function: thread_func
> > Aborted
> > ```
> >
> > I believe the freeze/deadlock in the Fish test is because, unlike my
> > test, they don't assert/crash, and the next time they access the
> > history file, there is a bunch of deadlock in cygwin internals.
> > If that helps, this is a partial capture of the stack traces at one such time:
> > ```
> > Thread 9
> > #2 0x00000001800d487f in muto::acquire (this=0x1802c24c0
> > , ms=ms@entry=4294967295) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/sync.cc:84
> > #3 0x00000001800dd6e0 in dtable::lock (this=) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/dtable.h:77
> > #4 cygheap_fdnew::cygheap_fdnew (this=,
> > seed_fd=-1, lockit=true) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/cygheap.h:593
> > #5 open (unix_path=0xa0002b3b0
> > "[...]/fish-shell/target/fish-test-home", flags=262144) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/syscalls.cc:1576
> >
> > Thread 10
> > #2 0x00000001800d487f in muto::acquire (this=0x1802c24c0
> > , ms=ms@entry=4294967295) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/sync.cc:84
> > #3 0x00000001800dd6e0 in dtable::lock (this=) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/dtable.h:77
> > #4 cygheap_fdnew::cygheap_fdnew (this=,
> > seed_fd=-1, lockit=true) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/cygheap.h:593
> > #5 open (unix_path=0xa0002bfe0
> > "[...]/fish-shell/target/fish-test-home/race_test_history.FwyAgK",
> > flags=264706) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/syscalls.cc:1576
> >
> > Thread 11
> > #2 0x00000001800670bb in inode_t::LOCK (this=0x80000ba20) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/flock.cc:314
> > #3 inode_t::get (dev=1881899537, ino=ino@entry=10977524092162599,
> > create_if_missing=create_if_missing@entry=false, lock=lock@entry=true)
> > at /d/S/B/src/msys2-runtime/winsup/cygwin/flock.cc:504
> > #4 0x0000000180068eb1 in fhandler_base::del_my_locks
> > (this=0x80000b810, from=on_close) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/flock.cc:402
> > #5 0x000000018010d5bf in fhandler_base::close_with_arch
> > (this=0x80000b810, flag=flag@entry=-1) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/fhandler/base.cc:1306
> > #6 0x00000001800de36b in __close (fd=5, flag=-1) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/syscalls.cc:1710
> > #7 close (fd=5) at /d/S/B/src/msys2-runtime/winsup/cygwin/syscalls.cc:1722
> >
> > Thread 12
> > #2 0x00000001800d487f in muto::acquire (this=0x1802c24c0
> > , ms=ms@entry=4294967295) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/sync.cc:84
> > #3 0x00000001800dd6e0 in dtable::lock (this=) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/dtable.h:77
> > #4 cygheap_fdnew::cygheap_fdnew (this=,
> > seed_fd=-1, lockit=true) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/local_includes/cygheap.h:593
> > #5 open (unix_path=0x7ff10b488
> > "[...]/fish-shell/target/fish-test-home/race_test_history.pZO5DS",
> > flags=263169) at
> > /d/S/B/src/msys2-runtime/winsup/cygwin/syscalls.cc:1576
> > ```
> > The freeze/deadlock can be reproduced in my C code by calling
> > "continue" inside the "if (lock_res != 0) {" instead of triggering the
> > assert just after.
> >
> >
> > I haven't been able to reproduce the missing data in the history file
> > so it's unknown if it's an issue in Fish or flock not locking properly
> > at times. So far the test passes on Linux and MacOS.
>
> Thanks for the report.
>
> Do you have any evidence that flock() should be MT-safe?
> This issue seems to be caused by the fact that flock() in
> cygwin is not MT-safe.
>
> If flock is guarded by mutex in your test case, the issue
> does not happen.
>
> --
> Takashi Yano

--
Problem reports: https://cygwin.com/problems.html
FAQ: https://cygwin.com/faq/
Documentation: https://cygwin.com/docs.html
Unsubscribe info: https://cygwin.com/ml/#unsubscribe-simple
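
For readers without the attachment, the access pattern discussed in this
thread can be sketched roughly as below. This is NOT the original main.c
(which is not reproduced in the archive); it is a minimal sketch that assumes
each thread opens its own descriptor on a shared lock file, takes `flock()`
on it, and then `open()`s a different file to append to. The names
`run_race_test`, `one_round`, `race_cfg`, `lock_file` and the `use_mutex`
switch are hypothetical; the switch only illustrates the mutex workaround
mentioned in the quoted reply.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

#define MAX_THREADS 16

/* Hypothetical configuration shared (read-only) by all threads. */
struct race_cfg {
    const char *lock_path;   /* file that every thread flock()s */
    const char *append_path; /* a different file that every thread appends to */
    int iterations;
    int use_mutex;           /* 1 = serialize flock+open with a mutex */
};

static pthread_mutex_t guard = PTHREAD_MUTEX_INITIALIZER;

/* One lock/append/unlock cycle. Returns 0 on success, -1 on any failure. */
static int one_round(const struct race_cfg *cfg)
{
    int rc = -1;
    /* Each call opens its own descriptor: no fd is shared between threads. */
    int lock_fd = open(cfg->lock_path, O_RDWR | O_CREAT, 0600);
    if (lock_fd < 0) {
        fprintf(stderr, "open lock error: %d - %s\n", errno, strerror(errno));
        return -1;
    }
    if (flock(lock_fd, LOCK_EX) != 0) {
        fprintf(stderr, "lock error: %d - %s\n", errno, strerror(errno));
    } else {
        int file_fd = open(cfg->append_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
        if (file_fd < 0) {
            fprintf(stderr, "open file error: %d - %s\n", errno, strerror(errno));
        } else {
            if (write(file_fd, "x\n", 2) == 2)
                rc = 0;
            close(file_fd);
        }
        flock(lock_fd, LOCK_UN);
    }
    close(lock_fd);
    return rc;
}

/* Thread body: returns its failure count through the thread's exit value. */
static void *thread_func(void *arg)
{
    const struct race_cfg *cfg = arg;
    intptr_t errors = 0;
    for (int i = 0; i < cfg->iterations; i++) {
        if (cfg->use_mutex) pthread_mutex_lock(&guard);
        if (one_round(cfg) != 0) errors++;
        if (cfg->use_mutex) pthread_mutex_unlock(&guard);
    }
    return (void *)errors;
}

/* Runs the pattern in a fresh temp dir; returns total failures, or -1 on setup error. */
int run_race_test(int nthreads, int iterations, int use_mutex)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    char dir[] = "/tmp/flockXXXXXX";
    if (mkdtemp(dir) == NULL)
        return -1;
    char lock_path[64], append_path[64];
    snprintf(lock_path, sizeof lock_path, "%s/lock_file", dir);
    snprintf(append_path, sizeof append_path, "%s/append_file", dir);
    struct race_cfg cfg = { lock_path, append_path, iterations, use_mutex };

    pthread_t tids[MAX_THREADS];
    int started = 0, total = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, thread_func, &cfg) == 0)
            started++;
        else
            total++; /* count a thread that could not start as a failure */
    }
    for (int i = 0; i < started; i++) {
        void *res = NULL;
        pthread_join(tids[i], &res);
        total += (int)(intptr_t)res;
    }
    unlink(lock_path);
    unlink(append_path);
    rmdir(dir);
    return total;
}
```

Calling `run_race_test(8, 1000, 0)` from a `main()` exercises the unguarded
pattern, while passing 1 as the last argument applies the mutex guard. A
return value of 0 means no `open()`, `flock()` or `write()` call failed.
Depending on the toolchain, building may require `gcc -pthread`.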