From: "Tom Demmer"
Organization: Lehrstuhl Stroemungsmechanik, RUB
To: djgpp AT delorie DOT com
Date: Thu, 30 Oct 1997 11:21:16 GMT-1
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: malloc
Reply-to: Demmer AT LStM DOT Ruhr-Uni-Bochum DOT De
Message-ID: <783CCFA70C5@brain1.lstm.ruhr-uni-bochum.de>
Precedence: bulk

I performed a few tests on the libc version of malloc and the
Doug Lea version. As SET already pointed out, the libc version
of malloc is faster than dl's. But once you take into account
that you might actually access the freshly allocated memory,
things look different. I used these three routines to check
performance:

/*
**
** This one only checks the allocator/deallocator
** If a block is already allocated, free it,
** then alloc again. NOTE: No memory
** moves are required
**
*/
void Random_Test1(long repeat, int maxsize){
  long i;
  memset(foo,0,sizeof(foo));
  for(i=0; i< repeat; ++i){
    int k = random() % maxsize +1;
    int j = random() % 1024;
    if( foo[j] ){
      free(foo[j]);
    }
    foo[j] = malloc(k);
  }
  for(i=0; i< 1024; ++i){
    free(foo[i]);
    foo[i] = NULL;
  }
}

/*
**
** This one uses realloc to change the size
** of blocks. On shrinking, no memory moves
** req'd. On increase, it might happen. This
** has the consequence of actually allocating
** memory from the DPMI host and reducing the
** physically available memory
**
*/
void Random_Test2(long repeat, int maxsize){
  long i;
  memset(foo,0,sizeof(foo));
  for(i=0; i< repeat; ++i){
    int k = random() % maxsize+1;
    int j = random() % 1024;
    if( foo[j] )
      foo[j]=realloc(foo[j],k);
    else
      foo[j] = malloc(k);
  }
  for(i=0; i< 1024; ++i){
    free(foo[i]);
    foo[i] = NULL;
  }
}

/*
**
** This one callocs memory and fills it with
** a value. So we really get mem from the system
** and use it.
**
*/
void Random_Test3(long repeat, int maxsize){
  long i;
  memset(foo,0,sizeof(foo));
  for(i=0; i< repeat; ++i){
    int k = (random() % maxsize)+1;
    int j = random() % 1024;
    if( foo[j] ){
      free(foo[j]);
      foo[j]=NULL;
    }
    else{
      foo[j] = calloc(k,1);
      if( foo[j] )   /* calloc may fail; don't write through NULL */
        memset(foo[j],20,k);
    }
  }
  for(i=0; i< 1024; ++i){
    if(foo[i])
      free(foo[i]);
    foo[i] = NULL;
  }
}

The routines were each run with 100000 repeats and maxsize
varying from 1 to 15000. 100000 repeats is probably quite a
high number, found only in very long running programs.
Starting at a block size of 100, the dl version becomes
comparable to or even faster than the libc version. In Test 3
with maxsize 15000, the dl version outperforms the libc version
by about a factor of 2. The benchmark results can be found on
my ftp server in
ftp://ftp.lstm.ruhr-uni-bochum.de/pub/djgpp/mbench.zip
as they are too long to be posted here.
The benchmarks were performed on a 486DX4/100 with 16 MB RAM
and some 13 MB available as physical memory before the start.

As a preliminary conclusion I'd say that although dl's
version tends to be better, it is not worth the hassle. Maybe
its location could be added as a programming resource to the
FAQ or on SET's link page.

What would make it interesting is the possibility of giving
memory back to the operating system. This is not done by the
BSD version in libc, and it has no real effect under cwsdpmi,
because it does not change the amount of physically available
memory. I don't know if this is a limitation of the DPMI specs,
but really making the memory available to other processes
would be an advantage in a multitasking environment. Maybe
even if one program spawns another, but I am not sure.

Any ideas about which other criteria might be important for
comparing the two versions are welcome.

Ciao
Tom

******************************************************************
* Thomas Demmer                      * Phone : +49 234 700 6434  *
* Universitaetsstr. 150              * Fax   : +49 234 709 4162  *
* Lehrstuhl fuer Stroemungsmechanik  *                           *
* D-44780 Bochum                     *                           *
******************************************************************
* Email: demmer AT LStM DOT Ruhr-Uni-Bochum DOT De               *
* WWW:   http://www.lstm.ruhr-uni-bochum.de/~demmer              *
******************************************************************
Ablaza's Observation: Every machine will eventually fall apart.
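
For anyone wanting to rerun something like Test 1: the routines in
the post rely on a global table foo and on a timing driver, and
neither is shown. What follows is only a minimal sketch under stated
assumptions, not the harness that was actually used. The declaration
of foo, the names SLOTS, churn_test and time_churn, the use of
rand() in place of random(), and timing with clock() are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SLOTS 1024        /* assumed table size, inferred from "% 1024" in the post */

static void *foo[SLOTS];  /* assumed declaration; static storage starts out as NULL */

/* Same allocation pattern as Random_Test1, but with standard rand()
   instead of random() so it builds with any hosted C compiler. */
static void churn_test(long repeat, int maxsize)
{
    long i;

    assert(maxsize > 0);
    for (i = 0; i < repeat; ++i) {
        int k = rand() % maxsize + 1;   /* request size in 1..maxsize */
        int j = rand() % SLOTS;         /* slot to recycle */

        free(foo[j]);                   /* free(NULL) is a no-op */
        foo[j] = malloc((size_t)k);
    }
    for (i = 0; i < SLOTS; ++i) {       /* release everything, reset the table */
        free(foo[i]);
        foo[i] = NULL;
    }
}

/* Hypothetical helper: CPU seconds consumed by one run of churn_test. */
static double time_churn(long repeat, int maxsize, unsigned seed)
{
    clock_t start, stop;

    srand(seed);          /* same seed gives the same request sequence per allocator */
    start = clock();
    churn_test(repeat, maxsize);
    stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_churn(100000, maxsize, 1) for each maxsize of interest,
once in a binary linked against each malloc, should give numbers that
can be compared directly, since the fixed seed makes both allocators
see the same sequence of requests.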