delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/2002/01/20/10:45:16

X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
Message-ID: <3C4ADB8E.552CAE40@yahoo.com>
From: CBFalconer <cbfalconer AT yahoo DOT com>
Organization: Ched Research
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.lang.c,comp.os.msdos.djgpp,comp.compilers.lcc
Subject: hash library - interface criticisms
Lines: 173
Date: Sun, 20 Jan 2002 15:37:05 GMT
NNTP-Posting-Host: 12.90.167.26
X-Complaints-To: abuse AT worldnet DOT att DOT net
X-Trace: bgtnsc05-news.ops.worldnet.att.net 1011541025 12.90.167.26 (Sun, 20 Jan 2002 15:37:05 GMT)
NNTP-Posting-Date: Sun, 20 Jan 2002 15:37:05 GMT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

This is a copy of a message I originally posted on comp.lang.c and
comp.programming.  The reason for the repost is that I have run
into an anomaly with both gcc on djgpp (2.953) and lccwin32.

I was expecting performance problems from allocating many small
items during insertion.  However, this turns out to be no
problem.  What does occur is a problem with de-allocation, which
shows up on both gcc and lcc.  It is due to the runtime packages,
as can be shown by the fact that compilation with VC6 does not
exhibit the slowdown.

On my 486/80 65M system with gcc or lcc insertion of 10000 items
or more has a serious pause during program exit, when hshkill is
called, which in turn calls the hshfreefn and thence free for each
item.  The pause becomes oppressive for larger amounts of
insertions, and is **much** longer than the time to insert.  The
items are freed in no particular order.

I suspect this has something to do with recombining free space
when free is called.

Under VC6 the problem goes away.  I can insert and free 1,000,000
different items on my machine in under 1 minute, and execution
time remains linear with item count, i.e. the algorithm is O(n). 
Under gcc or lcc I have never seen the timing for 50,000 items
freed - I abort it first.

Has anybody any ideas?  For a copy of the testbed email me at the
worldnet address (the reply-to field).  If the problem is built
into the runtime maybe that should be looked at.

------------ original message follows ------------

I am developing a library for hash table storage, and the
interface has reached a fairly stable point.  I would like
criticisms of it.  So far it appears able to do everything I
want of it, and is written in purely standard C.  The header
file follows, pasted as a quote to avoid line wrapping.  I
expect this file to more or less serve as the manual.

> /* -------------- File hashlib.h ------------------ */
> #ifndef hashlib_h
> #define hashlib_h
> 
> #ifdef __cplusplus
> extern "C"
> {
> #endif
> 
> /* This is an example of object oriented programming in C, in   */
> /* that it isolates the hashtable functioning from the objects  */
> /* it stores and retrieves.  It is expected to be useful in     */
> /* such areas as symbol tables and Markov chains.               */
> 
> /* This software is copyright (c) 2002 by C.B. Falconer, and is */
> /* provided under the GPL license.                              */
> 
> /* This library is intended to be totally re-entrant and thread */
> /* safe.  It can provide storage and lookup for arbitrary data  */
> /* items of arbitrary types.  The hshkill() function will       */
> /* release all associated storage, after which hshinit() is     */
> /* needed before using the database again.                      */
> 
> /* The pointers returned by hshinsert() and hshfind() may be    */
> /* used to modify the data items, PROVIDED THAT such does NOT   */
> /* affect the values returned from hshcmp() in any way.  Note   */
> /* that these returned pointers are normally NOT the same as    */
> /* the pointers passed in to those two functions.               */
> 
> /* opaque incomplete object */
> typedef struct hshtag hshtbl;
> 
> /* Possible error returns, powers of 2 */
> enum hsherr {hshOK = 0, hshNOMEM, hshTBLFULL, hshINTERR = 4};
> 
> /* NOTE: probes and misses aids evaluating hash functions       */
> typedef struct hshstats {
>    unsigned long probes, misses,  /* usage statistics */
>                  hentries;        /* current entries count */
>    enum hsherr   herror;          /* error status */
> } hshstats;
> 
> /* -------------- The auxiliary function types ---------------- */
> /* The actual storage for the various void * item pointers is   */
> /* known to the calling program, but not to this library. These */
> /* function pointers allow the library to adapt itself to       */
> /* arbitrary types of data.                                     */
> /*                                                              */
> /* When called from the hashlib system the 'item' parameter to  */
> /* these functions will never be NULL.  The application is thus */
> /* free to use such a value for any special purpose, such as    */
> /* re-initialization of static variables.                       */
> 
> /* a hashfn() returns some hashed value from the item           */
> /* Note that two functions of this form must be provided        */
> /* The quality of these functions strongly affects performance  */
> typedef unsigned long (*hshfn)(void *item);
> 
> /* A hshcmpfn() compares two items, and returns -ve, 0 (equal), +ve */
> /* corresponding to litem < ritem, litem == ritem, litem > ritem    */
> /* It need only return zero/non-zero if not to be used elsewhere    */
> typedef int  (*hshcmpfn)(void *litem, void *ritem);
> 
> /* A hshdupfn() duplicates the item into malloced space.  Further   */
> /* management of the duplicated space is the libraries duty.  It is */
> /* only used via hshinsert(), and must return NULL for failure.     */
> /* The application is free to modify fields of the allocated space, */
> /* provided such modification does NOT affect hshcmpfn              */
> typedef void *(*hshdupfn)(void *item);
> 
> /* A hshfreefn() reverses the action of a hshdupfn.  It is only     */
> /* called during execution of the hshkill() function.  This allows  */
> /* clean-up of memory malloced within the hshdupfn. After execution */
> /* of hshfree the item pointer should normally not be used.  See    */
> /* sort demo for an exception.                                      */
> typedef void (*hshfreefn)(void *item);
> 
> /* A hshexecfn() performs some operation on a data item.  It may be */
> /* passed additional data in datum.  It is only used in walking the */
> /* complete stored database. It returns 0 for success.              */
> /* xtra will normally be NULL, but may be used for debug purposes   */
> /* During a database walk, the item parameter will never be NULL    */
> typedef int (*hshexecfn)(void *item, void *datum, void *xtra);
> 
> /* ------------ END of auxiliary function types ------------- */
> 
> /* initialize and return a pointer to the data base */
> hshtbl *hshinit(hshfn hash, hshfn rehash,
>                 hshcmpfn cmp,
>                 hshdupfn dupe, hshfreefn undupe,
>                 int      hdebug);
> 
> /* 1------------------1 */
> 
> /* destroy the data base */
> void   hshkill(hshtbl *master);
> 
> /* 1------------------1 */
> 
> /* find an existing entry. NULL == notfound */
> void * hshfind(hshtbl *master, void *item);
> 
> /* 1------------------1 */
> 
> /* insert an entry.  NULL == failure, else item */
> void * hshinsert(hshtbl *master, void *item);
> 
> /* 1------------------1 */
> 
> /* apply exec to all entries in table. 0 = success */
> /* The order of application is arbitrary.  If exec */
> /* returns non-zero (error) the walk stops         */
> /* datum can provide a global area to exec         */
> int    hshwalk(hshtbl *master, hshexecfn exec, void *datum);
> 
> /* 1------------------1 */
> 
> /* return various statistics on use of this hshtbl */
> hshstats hshstatus(hshtbl *master);
> 
> #ifdef __cplusplus
> "}"
> #endif
> #endif
> /* -------------- File hashlib.h ------------------ */

-- 
Chuck F (cbfalconer AT yahoo DOT com) (cbfalconer AT XXXXworldnet DOT att DOT net)
   Available for consulting/temporary embedded and systems.
   (Remove "XXXX" from reply address. yahoo works unmodified)
   mailto:uce AT ftc DOT gov  (for spambots to harvest)


- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019