From: "Charles Sandmann" Newsgroups: comp.os.msdos.djgpp Subject: Re: Malloc bug in DJGPP V2.03 Date: Sat, 16 Sep 2000 11:01:49 Organization: Aspen Technology, Inc. Lines: 73 Message-ID: <39c3531d.sandmann@clio.rice.edu> References: <20000916140347 DOT B284 AT ajax DOT netspace DOT net DOT au> NNTP-Posting-Host: dcloan.hou.aspentech.com X-Trace: selma.aspentech.com 969124317 6876 10.32.115.107 (16 Sep 2000 17:11:57 GMT) X-Complaints-To: postmaster AT aspentech DOT com NNTP-Posting-Date: 16 Sep 2000 17:11:57 GMT X-NewsEditor: ED-1.5.8 To: djgpp AT delorie DOT com DJ-Gateway: from newsgroup comp.os.msdos.djgpp Reply-To: djgpp AT delorie DOT com > that DJGPP's malloc could be improved to cope with certain allocation > limitations in DPMIs (like CWSDPMI, as I mentioned before changing its > internal heap parameter affects how far the testmem program gets), > unfortunately I don't have the time to work on it. Here's the ugly problem with malloc, sbrk and CWSDPMI. Malloc algorithms are typically designed with either a speed or memory efficiency goal as the first priority. Since advanced operating systems handle memory efficiently (being able to sparsely allocate memory in the address space) they typically choose a malloc algorithm with speed in mind. The DJGPP malloc is a "speed" algorithm and doesn't mind wasting memory. For relatively small allocations, it keeps them in bins which are segregated by factors of 2 (ie 256byte, 512byte, 1Kbyte, etc). In the worst case you have a factor of 2 memory overhead - unless you realloc to a bigger bin and never use the small sized bin again. But malloc() is just a bookkeeper - it actually gets the memory via sbrk(). In DJGPP sbrk() also tries to be smart and do bookkeeping too, since it knows it's calling DPMI and wants to avoid performance and fragmentation issues. DJGPP's sbrk() tries to make sure each request is 64Kb in size or larger. In the days of 4Mb 386s this made lots of sense. Today, with 64Mb being an entry level machine, we have some problems. The problem comes in when you do lots of relatively small allocations - for example lets say you want to allocate 33Kb Kb structures. malloc asks sbrk() for 64Kb, sbrk() will ask CWSDPMI for 64Kb. CWSDPMI has to keep track of the memory, so it uses 14 bytes of DOS memory for each 64Kb request. CWSPDMI has to get that from it's internal heap, and it's a 16-bit application. If it has 1000 paragraphs to work with (16Kb is a big chunk of the 64Kb max data space) that will be 1170 memory zones - or around 75Mb of memory. The default internal stack is less and you only get around 20Mb of memory. I could increase the default internal CWSDPMI heap allocation which defers the problem (but this hurts small systems which need the memory the most). I could fix CWSDPMI to try decrease the storage pointer size. I could try and buffer more in DOS allocated far pointer memory. I could even try to move some structure storage to extended memory. Or I could do nothing - since I haven't used CWSDPMI myself on a real project in several years. Another fix would be to make sbrk() smarter. It internally stores a list of memory zones it allocated for some other libc features. That list is only 256 long if I remember correctly - which means some libc features break if you have lots of little memory zones. Since sbrk() is already broken, lets fix it. Instead of a static 64Kb chunk - lets make that a variable size, and increase it the more you ask for. So the first 32 are 64K, then they start to grow in 128K, 256K round off chunks. By the time you get to sbrk() chunk 224 or so, we ought to be up to a 8Mb minimum request per buffer. For example, this would allow you to allocate over 500Mb of memory in small chunks before the internal sbrk() bookkeeping overflowed 256 zones. For current well behaved programs that allocate a mixture of big and small chunks, there would be no change at all until they used more than 32 memory zones. I suspect performance would greatly improve for programs making more than 32 memory zone requests also... I prototyped something like this some years back, but it had bugs, and I don't know what happened to it. The crt0.s also changed, and there were merge issues. The fact that sbrk() is written in GAS is a real pain and has prevented it from getting fixed. Memory table example: Count End SizeK T Bin TotalK 32 32 64 2048 2048 32 64 128 4096 6144 32 96 256 8192 14336 32 128 512 16384 30720 32 160 1024 32768 63488 32 192 2048 65536 129024 32 224 4096 131072 260096 32 256 8192 262144 522240