Message-Id: <199710302300.KAA01562@mona.lexicon.net.au> Comments: Authenticated sender is From: "John Machin" To: "Salvador Eduardo Tropea (SET)" , djgpp AT delorie DOT com Date: Fri, 31 Oct 1997 09:59:15 +1000 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Subject: Re: malloc() In-reply-to: <199710300506.AAA27000@delorie.com> Precedence: bulk > Message-Id: > Comments: Authenticated sender is > From: "Salvador Eduardo Tropea (SET)" > To: djgpp AT delorie DOT com Date: Wed, 29 Oct 1997 16:19:37 +0000 > Subject: Re: malloc() > > From: "John Machin" > Subject: malloc() > I did a test with the Doug Lea malloc propossed by John Machin. I > found that dlmalloc uses 2^n chunks BUT joins various of different ^^^^^^^^^^ not so ... read the source (again?) ... chunks are multiples of 8 bytes, minimum chunk is 16 bytes (which might be a problem for programs with a very large number of very small requests e.g. a lot of two-pointer list nodes). > size > to satisfy a malloc request. In this way avoids some memory wastage. > My test was the following: > > My program uses 10000 pseudo-aleatory numbers, the numbers indicates > if we will delete or allocate a block, the index in a 1024 elements > array and the size of the block (0-128Kb). > > The result was that dlmalloc wastes less memory: > > BSD malloc (DJGPP) used 67305472 bytes for the heap. > dlmalloc used 51052544 bytes, that's 24% less. > > But the difference is speed is huge: > > BSD malloc/free 0.22 seconds. > dlmalloc: 1.54 seconds 7 times more!!! see below. > > I don't think is a good idea to replace the current one. Perhaps > is > possible to allow the option to use dlmalloc or put it in the FAQ. > Here is the program used (you'll need more than 64Mb free to run > it ;-). [ most of program snipped ] > > if (!del) > { > size=(val>>11) & 0x1FFFF; > if (array[index]!=0) > free(array[index]); > array[index]=malloc(size); Insert here: if (!array[index]) { printf("malloc failed after %d iterations\n", i); break; } You may well find a reason for the huge "speed" difference in your example, and that it is actually a time difference. Programmers often have to work hard to obey Jon Bentley's maxim "Do nothing gracefully"; it is much easier to "do nothing rapidly". You might also like to try making it slightly more realistic by using say memset() to touch the allocated memory; in the real world, only buggy programs allocate memory and then don't use it. Other problems are the fixed 128Kb maximum size of allocated block (such a uniform distribution is not real-world) and the fixed number (1024) of allocated blocks. I have modified SET's program to jump out when malloc fails, and allow command-line input for (1) max number of iterations [default=10000] (2) touch the memory [no] (3) max size of allocated block [128Kb]. I haven't (yet) fiddled with making the number of blocks a variable. I'm sending a copy to SET; if any one else wants a copy, e-mail me. Generally the results are: (a) When both versions access only real memory (i.e. no swapping), the Doug Lea version is SLIGHTLY slower than the BSD version (like e.g. 7.58 secs & 7.47 secs when max blocksize is 8192). (b) The DL's overhead is ranges from about 5% to 20%. The BSD's overhead is over 50%. Note: 50% overhead means for every 100 bytes that I've malloc'ed but not free'd, the package has grabbed 150 bytes, i.e 50 bytes wasted. (c) The BSD version runs out of real memory & starts swapping sooner. (d) The BSD version runs out of swap memory & fails sooner. In any case, simulations using pseudo-random numbers are not a very good idea; simulations using actual traces of real-world programs are much better, and more difficult :-( John Machin 1/27 Auburn Grove Hawthorn East, VIC 3123, Australia Phone: +61-3-98130561