Subject: Re: malloc() returns pointer to already allocated memory
From: "J.W. Jagersma (jwjagersma AT gmail DOT com) [via djgpp AT delorie DOT com]"
To: djgpp AT delorie DOT com
Date: Thu, 20 Jun 2019 02:22:43 +0200

On 2019-06-19 03:43, Rod Pemberton wrote:
> On Mon, 17 Jun 2019 18:33:09 +0200
> "J.W. Jagersma (jwjagersma AT gmail DOT com) [via djgpp AT delorie DOT com]"
> wrote:
>
>> On 2019-06-17 09:27, Rod Pemberton wrote:
>>> The program seems to make a large number (on my machine) of small
>>> allocations around ~268M (~0x0ff5d000) before a "clobber". CWSDPR0
>>> aborts with an error about page tables. Could CWSDPMI be running
>>> out of page tables? ... Also, IIRC, PMODETSR doesn't use paging,
>>> and it completes without any "clobbers". The large number of
>>> allocation might also be why no one noticed, or maybe this was a
>>> known issue in the past but forgotten?
>>
>> The test program also works okay on hdpmi32, which doesn't use paging
>> either. In the program I'm working on, this clobbering does happen
>> with hdpmi, and almost immediately on startup too, after allocating
>> maybe 4MB or so. Hence why I'm not sure if this initial issue is
>> related. Still I've traced the clobbering pointer down and it does
>> seem to come straight from malloc.
>
> a) did you guys dismiss the out of page tables thought?

As Eli said, at this point it seems very unlikely that there's anything
wrong with cwsdpmi or libc, and I agree with that. Someone else would've
noticed long ago. The test program I posted is clearly detecting
something other than actual clobbering, and so far that "something"
appears to be harmless. It's also not related to the problem I initially
had (and still have) where malloc literally gives me the same pointer
twice.

> b) how did you notice this issue originally?
> c) is the "clobbering" actually causing corruption in your program?

It is causing very obvious corruption, that's how I discovered it.
Variables changing for no reason. Then a pointer or some offset changes
and the next access triggers a page or GP fault. Running in a debugger
throws things off just enough that it happens somewhere else on every
run. I eventually tried hard-coding watchpoints and it triggered in a
std::unordered_map node constructor.
Backtracking through the disassembled code, I can only conclude that the
offending pointer was returned directly from malloc. That's when I came
up with this (flawed) method of detecting memory overlaps.

Just now I had another watchpoint hit in mv2freelist() (at
src/libc/ansi/stdlib/nmalloc.c:524), which was called from malloc(). It
triggered on an access to 'm->prevfree', which means the pointer 'm'
overlaps with memory I previously allocated in a std::vector.

So either I'm looking at a bug in malloc which only happens under _very_
specific circumstances, or more likely, something in my code is trashing
malloc's internal data structures. I just fail to see what would cause
that. In any case it's proving very hard to debug, since the slightest
change to the code means some other memory area will be affected, and
triggering watchpoints is based on sheer luck.

> d) are you worried that the large array malloc() moves to a memory
> region where "clobbers" are not occurring or not being detected?

I'm not sure. If it did turn out to be an issue with malloc() itself it
might depend on some very specific sequence of allocations.

> For e), if I save the malloc'd pointer (twice) after the magic values
> (two unique), i.e., 4 value sequence, then I fetch the saved pointer
> upon a detected "clobber" matching both magic values, then write two
> new values at the saved pointer location, no change of (two) values
> occurs at (or anywhere around) the clobbered location. I.e., the
> original "clobber" would seems to be some type of mysterious
> intermittent copying of data occurring ... It doesn't seem to be
> re-use, dual-use, dual-mapping, or overlap of memory regions. A
> possible guess would be that existing pages are sometimes being
> re-used. So, this may just be stale data. It's possible that this
> might be a hardware issue, not software, e.g., maybe processor MMU
> instead of CWSDPMI.

See the attachment to my reply to Eli Zaretskii; I came to the same
conclusion.
I doubt the CPU does this, since that could be a security risk in
multi-tasking environments. More likely it's some quirk in cwsdpmi, and
right now I don't feel the need to investigate exactly how it works,
since it seems harmless and is unrelated to the problem I'm seeing.
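
For the "same pointer twice" problem, one way to check for overlaps without reading memory contents (and so without being fooled by stale page data) is to compare address ranges directly. The following is only a minimal sketch under assumptions: `overlap_tracker`, `on_alloc`, `on_free` and `live_blocks` are hypothetical names, not part of DJGPP or its libc, and nothing here is the method used by the test program posted earlier in the thread. It records each live block as a half-open range [begin, end) and rejects a new block that intersects one still in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical helper (not part of DJGPP or libc): tracks live allocations
// as half-open address ranges [begin, end) and reports overlaps.
class overlap_tracker
{
    std::map<std::uintptr_t, std::uintptr_t> live; // begin -> end

public:
    // Record a new block. Returns false if it overlaps a block that is
    // still live, i.e. the allocator handed out memory that is in use.
    bool on_alloc(const void* p, std::size_t size)
    {
        if (p == nullptr || size == 0) return true;
        const auto begin = reinterpret_cast<std::uintptr_t>(p);
        const auto end = begin + size;

        // First live block starting at or after 'begin'.
        auto next = live.lower_bound(begin);
        if (next != live.end() && next->first < end) return false;

        // The block just before it may still extend past 'begin'.
        if (next != live.begin() && std::prev(next)->second > begin)
            return false;

        live.emplace(begin, end);
        return true;
    }

    // Forget a block. Returns false if 'p' was never recorded or was
    // already released.
    bool on_free(const void* p)
    {
        if (p == nullptr) return true;
        return live.erase(reinterpret_cast<std::uintptr_t>(p)) == 1;
    }

    std::size_t live_blocks() const { return live.size(); }
};
```

Calling `on_alloc(ptr, size)` after each allocation and `on_free(ptr)` before each release would flag the first overlapping block at the moment it is handed out. Note that std::map allocates memory itself, so calling the tracker from inside a malloc wrapper would need a re-entrancy guard; and if something is trashing malloc's internal structures, the tracker only shows when the damage becomes visible, not what caused it.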