Mail Archives: cygwin/2009/11/10/04:08:09

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Tue, 10 Nov 2009 10:07:49 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Broken autoconf mmap test (was Re: 1.7] BUG - GREP slows to a crawl with large number of matches on a single file)
Message-ID: <20091110090749.GS26344@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <20091108141548 DOT GB26344 AT calimero DOT vinschen DOT de> <4AF716DB DOT 8060904 AT cwilson DOT fastmail DOT fm> <20091109115903 DOT GE26344 AT calimero DOT vinschen DOT de> <4AF80FFB DOT 4040701 AT byu DOT net> <20091109140529 DOT GJ26344 AT calimero DOT vinschen DOT de> <4AF8E8B3 DOT 5070003 AT byu DOT net>
MIME-Version: 1.0
In-Reply-To: <4AF8E8B3.5070003@byu.net>
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Nov  9 21:14, Eric Blake wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> According to Corinna Vinschen on 11/9/2009 7:05 AM:
> > This part of the testcase
> > 
> >   data2 = (char *) malloc (2 * pagesize);
> >   if (!data2)
> >     return 1;
> >   data2 += (pagesize - ((long int) data2 & (pagesize - 1))) & (pagesize - 1);
> >   if (data2 != mmap (data2, pagesize, PROT_READ | PROT_WRITE,
> >                        MAP_PRIVATE | MAP_FIXED, fd, 0L))
> >     return 1;
> > 
> > is bad.  The chance that the address of data2 is not usable for mmap on
> > Windows/Cygwin is 100%.
> 
> But in testing this further, I discovered that you CAN do:
> 
> data2 = mmap(...);
> munmap (data2,...);
> mmap (data2, ... MAP_FIXED)
> 
> and get success on cygwin.

Yes, but basically only if you unmap the entire mmap'ed region.  See
below.
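
A minimal sketch of the sequence quoted above (mmap, munmap of the whole
region, then mmap with MAP_FIXED at the same address), assuming a POSIX
system with a writable /tmp.  The function name and the temporary file
are made up for illustration; this is not the autoconf test itself.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, or the number of the
   step that failed. */
int remap_fixed_after_full_unmap(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return 1;
    size_t len = (size_t) pagesize;

    /* Scratch file of exactly one page (path is an assumption). */
    char path[] = "/tmp/mmap-fixed-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return 2;
    unlink(path);
    if (ftruncate(fd, (off_t) len) != 0) {
        close(fd);
        return 3;
    }

    /* Step 1: let the system choose an address it can actually map. */
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return 4;
    }

    /* Step 2: unmap the ENTIRE region, not just a part of it. */
    if (munmap(addr, len) != 0) {
        close(fd);
        return 5;
    }

    /* Step 3: request the same address again with MAP_FIXED. */
    void *again = mmap(addr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_FIXED, fd, 0);
    close(fd);
    if (again != addr)
        return 6;

    munmap(again, len);
    return 0;
}
```

Unlike the malloc-derived address in the original testcase, the address
used in step 3 is one the system itself handed out for a mapping, which
is why the request has a chance of succeeding.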

>   So I will be updating autoconf accordingly,
> based on the STD below.  Unfortunately, it looks like I also found a hole
> in cygwin.  Consider this (borrowing heavily from the autoconf test that I
> am fixing):
> [...]
> This test behaves differently on Linux than on cygwin; on Linux, both
> './foo' and './foo 1' give status 0, but on cygwin, './foo' gives status
> 6, and only './foo 1' succeeds.  In other words, the second mmap fails if
> there is no intermediate munmap.
> 
> POSIX apparently allows cygwin's behavior:
> 
> "If MAP_FIXED is set, mmap() may return MAP_FAILED and set errno to
> [EINVAL]. If a MAP_FIXED request is successful, the mapping established by
> mmap() replaces any previous mappings for the pages in the range
> [pa,pa+len) of the process."
> 
> However, since we already have to maintain a list of mappings in order to
> implement fork(), it seems like it would be easy to fix cygwin to
> implicitly munmap anything that would otherwise be in the way of a
> subsequent MAP_FIXED request, rather than blindly calling
> NtMapViewOfSection and failing because of the overlap, so that we could be
> even more like Linux behavior.

That's tricky and bound to fail.  The problem is that, on Windows,
you can't unmap only part of an mmap'ed region.  NtUnmapViewOfSection
only allows unmapping an entire section.  So, with the bookkeeping in
Cygwin you can reuse a partially unmapped region of anonymous
memory to map new anonymous memory, but you can't reuse a partially
unmapped region to mmap another file at this point in memory, nor
even the same file with just another offset.

The only way around this problem would be to map files and anonymous
memory always in single 64K chunks, so that every page of a map can be
actually unmapped at the OS level.  But in that case the process of
allocating memory is not atomic anymore, so we get the other potential
problem of not being able to fulfill a request because another thread
has called VirtualAlloc, directly or indirectly, in the meantime.

> > That's why I think we need at least two tests in autoconf, a generic
> > mmap test and a mmap test for the "mmap private/shared fixed at
> > somewhere already mapped" case, if an application actually insists on
> > using that.
> 
> In the case of the autoconf test, I think a single test is still
> sufficient, once it is fixed to be portable to what POSIX requires.

One problem is actually grep, which started the entire discussion.  It
really uses malloc/mmap(MAP_FIXED), along the lines of what the HAVE_MMAP
test checks.  Fortunately, grep doesn't fail if mmap returns an error, so
it doesn't hurt.  Of course it would be nice if grep used mmap in
a more portable way.
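
As an illustration of what "more portable" could look like, here is a
sketch that lets the system pick the address (NULL, no MAP_FIXED) and
falls back to malloc plus read if mmap fails.  This is not grep's actual
code; struct file_view, view_open and view_close are hypothetical names,
and empty files are simply rejected to keep the example short.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical read-only view of a file's contents. */
struct file_view {
    char  *data;
    size_t len;
    int    mapped;   /* 1: data came from mmap, 0: from malloc */
};

/* Returns 0 on success, -1 on failure. */
int view_open(const char *path, struct file_view *v)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    v->len = (size_t) st.st_size;

    /* Portable request: no address hint, no MAP_FIXED. */
    void *p = mmap(NULL, v->len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
        v->data = p;
        v->mapped = 1;
        close(fd);
        return 0;
    }

    /* mmap failed: fall back to an ordinary buffer filled by read. */
    v->mapped = 0;
    v->data = malloc(v->len);
    if (!v->data) {
        close(fd);
        return -1;
    }
    size_t off = 0;
    while (off < v->len) {
        ssize_t n = read(fd, v->data + off, v->len - off);
        if (n <= 0) {
            free(v->data);
            close(fd);
            return -1;
        }
        off += (size_t) n;
    }
    close(fd);
    return 0;
}

void view_close(struct file_view *v)
{
    if (v->mapped)
        munmap(v->data, v->len);
    else
        free(v->data);
}
```

A caller only sees data and len, so it does not matter which path was
taken, and no assumption is made about where the mapping ends up.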


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
