Date: Mon, 25 Nov 2024 16:21:19 +0100
Subject: Re: [Ms-nfs41-client-devel] Corrupted file name in Cygwin - does Cygwin do a silly rename if a file is open?
To: cygwin AT cygwin DOT com, ms-nfs41-client-devel AT lists DOT sourceforge DOT net
From: Cedric Blancher via Cygwin
Reply-To: Cedric Blancher

On Mon, 25 Nov 2024 at 12:41, Roland Mainz wrote:
>
> On Sun, Nov 24, 2024 at 8:32 AM Cedric Blancher wrote:
> >
> > On Sat, 23 Nov 2024 at 17:47, Jeremy Drake wrote:
> > >
> > > On Sat, 23 Nov 2024, Cedric Blancher via Cygwin wrote:
> > >
> > > > Good afternoon!
> > > >
> > > > Does Cygwin do a silly rename if a Cygwin file is open but gets
> > > > /bin/rm at the same time?
> > >
> > > Yes! See function try_to_bin in winsup/cygwin/syscalls.cc:
> > > /* Create unique filename. Start with a dot, followed by "cyg"
> > > transposed into the Unicode low surrogate area (U+dc00) on file
> > > systems supporting Unicode (except Samba), followed by the inode
> > > number in hex, followed by a path hash in hex. The combination
> > > allows to remove multiple hardlinks to the same file. */
> >
> > That code is wrong.
> >
> > bash -c 'printf ".\udc63\udc79\udc67#\n"' | iconv -f UTF-8
> > .iconv: illegal input sequence at position 1
> >
> > RtlAppendUnicodeToString (&recycler,
> >     (pc.fs_flags () & FILE_UNICODE_ON_DISK
> >      && !pc.fs_is_samba ())
> >     ? L".\xdc63\xdc79\xdc67" : L".cyg");
> >
> > SAMBA is right to reject L".\xdc63\xdc79\xdc67", because it is not a
> > valid UTF-16 sequence. ReFS with validation, OpenZFS and so on will
> > all REJECT such file names, and NFSv4 cannot accept them either,
> > because file names must be valid Unicode (even if nfsd does not
> > validate, the filesystem being shared via nfsd will reject that).
> > So this can only work on ntfs, and only if it is not validating the
> > input UTF-16 sequence.
> >
> > AFAIK FILE_UNICODE_ON_DISK means that the wchar_t sequences must be
> > valid UTF-16, and not just be a random sequence of 16bit values.
> >
> > @Corinna Vinschen Could this sequence please be changed to a VALID
> > UTF-8 sequence, such as \u[fffc]\u[fffc]\u[fffc]? That might work with
> > SAMBA, ReFS, OpenZFS, NFSv4, ...
>
> That does not help with existing Cygwin installations and Cygwin
> 32bit, which is stuck at Cygwin 3.3.x ... ;-(
>
> I agree that the L".\xdc63\xdc79\xdc67" prefix will backfire on
> something like ReFS, OpenZFS etc (SAMBA uses the prefix for
> filesystems which do NOT have |FILE_UNICODE_ON_DISK| set), but for
> ms-nfs41-client I just stomp over the issues with this patch (wording
> still needs to be improved):
> ---- snip ----
> diff --git a/daemon/setattr.c b/daemon/setattr.c
> index 9eaafb5..6e9729e 100644
> --- a/daemon/setattr.c
> +++ b/daemon/setattr.c
> @@ -284,6 +284,46 @@ static int handle_nfs41_rename(void *daemon_context, setattr_upcall_args *args)
>
>      EASSERT((rename->FileNameLength%sizeof(WCHAR)) == 0);
>
> +#define CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE 1
> +
> +#ifdef CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE
> +    /*
> +     * Stomp Cygwin "silly rename" invalid Unicode sequence
> +     *
> +     * Cygwin has its own variation of "silly rename" (i.e. if
> +     * someone deletes a file while someone else still has
> +     * a valid fd to that file it first renames that file with a
> +     * special prefix, see
> +     * newlib-cygwin/winsup/cygwin/syscalls.cc, function
> +     * |try_to_bin()|).
> +     *
> +     * Unfortunately on filesystems supporting Unicode
> +     * (i.e. |FILE_UNICODE_ON_DISK|) Cygwin adds the prefix
> +     * L".\xdc63\xdc79\xdc67", which is NOT a valid UTF-16 sequence,
> +     * and will be rejected by a filesystem validating the
> +     * UTF-16 sequence (e.g. SAMBA, ReFS, OpenZFS, ...).
> +     * In our case the NFSv4.1 protocol requires valid UTF-8
> +     * sequences, and the NFS server will reject filenames if either
> +     * the server or the exported filesystem will validate the UTF-8
> +     * sequence.
> +     *
> +     * Since Cygwin only does a |rename()| and never a lookup by
> +     * that filename we just stomp the prefix with the prefix used
> +     * for non-|FILE_UNICODE_ON_DISK| filesystems.
> +     * We ignore the side-effects here, e.g. that Win32 will still
> +     * "remember" the original filename in the file name cache.
> +     */
> +    if ((rename->FileNameLength > (4*sizeof(wchar_t))) &&
> +        (!memcmp(rename->FileName,
> +            L".\xdc63\xdc79\xdc67", (4*sizeof(wchar_t))))) {
> +        DPRINTF(1, ("handle_nfs41_rename(args->path='%s'): "
> +            "Cygwin sillyrename prefix \".\\xdc63\\xdc79\\xdc67\" "
> +            "detected, squishing prefix to \".cyg\"\n",
> +            args->path));
> +        (void)memcpy(rename->FileName, L".cyg", 4*sizeof(wchar_t));
> +    }
> +#endif /* CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE */
> +
>      dst_path.len = (unsigned short)WideCharToMultiByte(CP_UTF8,
>          WC_ERR_INVALID_CHARS|WC_NO_BEST_FIT_CHARS,
>          rename->FileName, rename->FileNameLength/sizeof(WCHAR),
> ---- snip ----
>

LGTM, but msys2 has a copy of that code with a different prefix, again
with invalid UTF-16:
https://github.com/msys2/msys2-runtime/blob/msys2-3.5.4/winsup/cygwin/syscalls.cc#L350C8-L350C35

Ced
-- 
Cedric Blancher [https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur
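
For reference, the reason the prefix gets rejected can be sketched in a
few lines of C. This is only a minimal sketch, not code from Cygwin or
ms-nfs41-client: the helper name utf16_is_well_formed and the two arrays
are hypothetical, and uint16_t is used instead of wchar_t so the check
behaves the same on platforms where wchar_t is 32 bits wide. A UTF-16
string is well formed only if every high surrogate (0xD800-0xDBFF) is
immediately followed by a low surrogate (0xDC00-0xDFFF) and no low
surrogate appears on its own. The Cygwin prefix is a dot followed by
three unpaired low surrogates, so a validating filesystem or a strict
UTF-16 to UTF-8 conversion has nothing valid to map it to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Cygwin or ms-nfs41-client):
 * returns true if the |len| UTF-16 code units in |s| contain only
 * correctly paired surrogates.
 */
static bool utf16_is_well_formed(const uint16_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint16_t u = s[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: the next unit must be a low surrogate. */
            if (i + 1 >= len || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
                return false;
            i++; /* skip the low half of the pair */
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            return false;
        }
    }
    return true;
}

/* The two prefixes discussed in this thread, as UTF-16 code units. */
static const uint16_t cygwin_unicode_prefix[] = { '.', 0xDC63, 0xDC79, 0xDC67 };
static const uint16_t cygwin_plain_prefix[]   = { '.', 'c', 'y', 'g' };
```

Calling utf16_is_well_formed(cygwin_unicode_prefix, 4) returns false
because the unit 0xDC63 is a lone low surrogate, while
utf16_is_well_formed(cygwin_plain_prefix, 4) returns true, which is why
replacing the prefix with ".cyg" lets the name through.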