Mail Archives: cygwin/2008/06/02/15:53:12
Brian Dessent wrote on 02 June 2008 19:43:
> Dave Korn wrote:
>
>> Regardless of how well (or poorly) the
>> hash function distributes DLLS into the various buckets, there are only
>> 1024 of them, and we have many DLLs, many of which will occupy multiple
>> buckets; collisions are inevitable.
>
> First of all, I don't see where this 1024 comes from. By my reading the
> hash distributes over the range 0x61300000 - 0x712C0000 in 64k
> increments, meaning 4092 buckets.
static unsigned long
compute_dll_image_base (const char *ofile)
{
unsigned long hash = strhash (ofile);
return 0x61300000 + ((hash << 16) & 0x0FFC0000);
}
#endif
Looking at the mask value, it has ten bits set. Hence 1024 possible
results. Looking at the 0 bits to the right of them, they are spaced in
1<<18 = 1<<(16+2) = 1<<16 * 1<<2 = 64k * 4 = 256kB units. I did this in my
head, so I may be having a brain fart for all I know.
> But what I really meant wasn't necessarily to improve the hashing
> function per se but to give it more buckets, a wider range.
Point, but short of the world moving to 64-bit address space, we're always
likely to run into collisions round the top end of the 2Gb space.
cheers,
DaveK
--
Can't think of a witty .sigline today....
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
- Raw text -