Mail Archives: cygwin/2024/12/16/08:34:00

Date: Mon, 16 Dec 2024 14:32:58 +0100
To: cygwin AT cygwin DOT com
Subject: Re: Atomic mmap replacement
Message-ID: <Z2AsCg7Oo4FyHFjG@calimero.vinschen.de>
Mail-Followup-To: cygwin AT cygwin DOT com
References: <66bf4f86-4618-b9a3-3e33-2c240b9204d0 AT cornell DOT edu>
<20180219090042 DOT GC3417 AT calimero DOT vinschen DOT de>
<e6b3bd42-f981-405e-b65b-529693598735 AT cornell DOT edu>
<d84f7f6c-5527-4f39-83a5-1aa16d8e451f AT cornell DOT edu>
MIME-Version: 1.0
In-Reply-To: <d84f7f6c-5527-4f39-83a5-1aa16d8e451f@cornell.edu>
X-BeenThere: cygwin AT cygwin DOT com
X-Mailman-Version: 2.1.30
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
From: Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>

Hi Ken,

On Dec 15 12:29, Ken Brown via Cygwin wrote:
> On 12/14/2024 7:00 PM, Ken Brown via Cygwin wrote:
> > Hi Corinna,
> > 
> > On 2/19/2018 4:00 AM, Corinna Vinschen wrote:
> > > On Feb 17 22:37, Ken Brown wrote:
> > > > Some code in emacs wants to reserve a chunk of address space with a big
> > > > PROT_NONE anonymous mapping, and then carve it up into separate mappings
> > > > associated to segments of a file.  This fails on Cygwin.
> > 
> > [...]
> > I'm returning to this very old thread because I've come up against another
> > application that wants to allocate a big block of memory and then
                              ^^^^^^^^
                              reserve?

> > allocate pieces of it later.  I've looked at the documentation of
> > VirtualAlloc, and it seems that this should be possible:
> > 
> >     VirtualAlloc cannot reserve a reserved page. It can commit a page
> >     that is already committed. This means you can commit a range of
> >     pages, regardless of whether they have already been committed, and
> >     the function will not fail.
> > 
> >     You can use VirtualAlloc to reserve a block of pages and then make
> >     additional calls to VirtualAlloc to commit individual pages from
> >     the reserved block. This enables a process to reserve a range of
> >     its virtual address space without consuming physical storage until
> >     it is needed.

While it looks like this is possible, it has limitations that the POSIX
definition of mmap() does not have.

> > [...]
> > If you think this is feasible, I would be willing to work on it.  But in
> > that case I would appreciate some suggestions on how to implement it,
> > since I'm not yet very familiar with the mmap code.
> It looks like a lot of the machinery for doing what I want is already
> present in mmap.cc.  If I want the initial allocation to reserve without
> committing [in the Windows sense of "reserve"], I just need to specify
> MAP_NORESERVE in the call to mmap [now we're using "noreserve" in the Linux
> sense].  Right?  Then future mmap calls to allocate memory within that first
> block could simply check for the noreserve flag and use MEM_COMMIT without
> MEM_RESERVE.  Obviously there are a lot of details that I haven't yet
> thought through, but I'm cautiously optimistic.

Right now, mmaping with PROT_NONE and then re-mmaping with PROT_WRITE
doesn't work.  Cygwin implements PROT_NONE not as MEM_RESERVE, but as
MEM_COMMIT with PAGE_NOACCESS.  mmap() doesn't check whether the
requested pages are already allocated with PAGE_NOACCESS and, if so,
succeed while in fact just changing the page protection.  That check
is basically what you want.  Right now, you'd have to call mprotect()
instead.
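
To illustrate the mprotect() route, here is a minimal sketch, assuming a
POSIX-like system that provides MAP_ANONYMOUS.  The helper name
mprotect_instead_of_remap() is made up for this example and is not part
of Cygwin.  It maps pages with PROT_NONE and then opens up the first
page with mprotect() rather than with a second mmap():

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `npages` anonymous pages with PROT_NONE, make
   the first page accessible with mprotect(), write to it and read it
   back.  Returns the byte read back, or -1 on failure. */
static int
mprotect_instead_of_remap (size_t npages)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  unsigned char *base;
  int value = -1;

  base = mmap (NULL, npages * pagesize, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return -1;

  /* Change only the protection; the mapping itself stays in place. */
  if (mprotect (base, pagesize, PROT_READ | PROT_WRITE) == 0)
    {
      memset (base, 0x5a, pagesize);
      value = base[0];
    }
  munmap (base, npages * pagesize);
  return value;
}
```

Calling mprotect_instead_of_remap (4) should return 0x5a if the
protection change succeeded.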

With anonymous mappings only, this is all just adding a bit of code to
mmap() to do what mprotect() does in this case.  The problems really
start when you add file-backed mappings to the picture.  In POSIX,
it is allowed to call

  void *addr, *addr2;
  int fd;

  fd = open ("my-file", O_RDWR);
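  /* Let the system pick the address of the initial mapping. */
  addr = mmap (NULL, 4 * PAGE_SIZE, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  /* Replace two pages in the middle with a file-backed mapping. */
  addr2 = mmap ((char *) addr + PAGE_SIZE, 2 * PAGE_SIZE, PROT_WRITE,
                MAP_FIXED | MAP_PRIVATE, fd, 0);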

The second mmap() simply replaces the old mapping on the pages
it specifies.  Windows does not support this at all.  Worse, even
if the file has a length of only 1 byte, the mapping doesn't take 1
page of 4K, but one allocation granularity block of 64K, i.e., 16
pages from the process's VM.  None of these pages can be used for
another mapping.
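
For comparison, this is roughly what the replacement looks like where it
is supported.  A minimal sketch, assuming a Linux-like system; the
function name overlay_file_on_reservation() and the use of tmpfile() are
choices made for this example only.  Going by the explanation above, the
MAP_FIXED call is the step that cannot be expected to work on Cygwin:

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS and fileno() on glibc */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if a file-backed MAP_FIXED mapping could
   replace the second page of an anonymous PROT_NONE mapping and the file
   content is visible there, -1 otherwise. */
static int
overlay_file_on_reservation (void)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  int result = -1;
  char *base;
  FILE *fp = tmpfile ();

  if (fp == NULL)
    return -1;
  /* A one-byte file is enough to back the first byte of the page. */
  if (fputc ('A', fp) == EOF || fflush (fp) != 0)
    {
      fclose (fp);
      return -1;
    }

  /* Reserve four pages of address space without any access rights. */
  base = mmap (NULL, 4 * pagesize, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base != MAP_FAILED)
    {
      /* Replace the second page with a mapping of the file. */
      char *file_page = mmap (base + pagesize, pagesize, PROT_READ,
                              MAP_FIXED | MAP_PRIVATE, fileno (fp), 0);
      if (file_page == base + pagesize && file_page[0] == 'A')
        result = 0;
      munmap (base, 4 * pagesize);
    }
  fclose (fp);
  return result;
}
```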

So only anonymous mappings would be possible, assuming we tweak mmap()
to check that the old mapping was anonymous as well and then allow it
to just change the page protection, as if mprotect() had been called.

And, funnily enough, something pretty similar already exists in mmap().
See mmap.cc, line 1051 and the mmap_list::try_map() method.  Right
now it only checks if an anonymous mapping has been partially unmapped
and can be recycled.  But it could be improved to recycle the
anonymous mapping either way, as long as the new mapping is also
anonymous and the SHARED/PRIVATE flags match.

MAP_NORESERVE looks like what you want, but it's not quite the same
thing.  Theoretically the idea here is that you just don't reserve swap
space.  That's pretty much what happens in Windows with MEM_RESERVE.

But then again, the idea of MAP_NORESERVE is that you don't ever call
mmap() again for this memory.  Rather, you just write to a page and the
system just reserves (aka "commits" in Windows terminology) the page in
physical memory.  If no physical memory exists for the request, SIGSEGV
is raised.  That's what mmap_is_attached_or_noreserve() is for.
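
The intended MAP_NORESERVE usage pattern described above can be
sketched as follows.  This assumes a system that defines MAP_NORESERVE
(Linux and Cygwin do; POSIX itself does not), and the helper name
touch_noreserve_page() is invented for the example.  There is one
mmap() call up front, after which pages are simply written to:

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS and MAP_NORESERVE */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `npages` pages with MAP_NORESERVE, write to
   page number `index` without any further mmap() call, and read the
   value back.  Returns the value read, or -1 on failure. */
static int
touch_noreserve_page (size_t npages, size_t index)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  unsigned char *base;
  int value;

  if (index >= npages)
    return -1;
  base = mmap (NULL, npages * pagesize, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (base == MAP_FAILED)
    return -1;

  /* The first write is the point at which the system has to come up
     with backing store for this page. */
  base[index * pagesize] = 42;
  value = base[index * pagesize];

  munmap (base, npages * pagesize);
  return value;
}
```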

As a sidenote, that's why Cygwin implements PROT_NONE as a committed
page with PAGE_NOACCESS protection: PROT_NONE does not mean that the
pages are not backed by swap space.  So we have to commit them and
not just reserve them, otherwise you might encounter SIGSEGVs, just
because you call mprotect() on a page.

IIRC the aforementioned try_map() takes MAP_NORESERVE into account and
allows these pages to be committed with MEM_COMMIT, but I'm not exactly
sure this works correctly; it may need testing or some patch...

> Ken
> 
> P.S. The conflicting meaning of "reserve" in Windows vs. Linux was very
> confusing to me at first.  There's probably nothing that can be done to make
> the code less confusing.

Yeah, it's not really funny, but I don't see a way around that.  Linux
talks about reserving swap space, which is the default and means
MEM_COMMIT in Windows, while Windows talks about just reserving virtual
memory, which is more or less equivalent to MAP_NORESERVE.

As twisted as it is, MAP_NORESERVE and the internally used noreserve()
method are basically (but not quite) the same as MEM_RESERVE.  When in
doubt, keep in mind we're trying to always look from the POSIX/Linux
perspective.
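
The terminology mapping can be condensed into a few lines of code.  This
is a conceptual sketch only, expressed in VirtualAlloc terms; it is not
how mmap.cc is written.  The function name sketch_alloc_type() and the
SKETCH_ constants are made up for this example; the constants carry the
values of the Windows MEM_COMMIT and MEM_RESERVE flags so that the
sketch builds without <windows.h>:

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS and MAP_NORESERVE */
#include <assert.h>
#include <sys/mman.h>

/* Values of the Windows MEM_COMMIT and MEM_RESERVE flags. */
enum
{
  SKETCH_MEM_COMMIT = 0x1000,
  SKETCH_MEM_RESERVE = 0x2000
};

/* Hypothetical translation from the POSIX/Linux view to the Windows
   view, following the description in this message. */
static unsigned
sketch_alloc_type (int prot, int map_flags)
{
  /* The protection does not matter here: even PROT_NONE pages are
     committed, because they are still backed by swap space. */
  (void) prot;

  /* Linux "noreserve" is roughly Windows "reserve only". */
  if (map_flags & MAP_NORESERVE)
    return SKETCH_MEM_RESERVE;

  /* Default Linux behaviour (swap is reserved) is Windows "commit". */
  return SKETCH_MEM_RESERVE | SKETCH_MEM_COMMIT;
}
```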


Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
