delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/2002/02/11/10:59:34

X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
Message-ID: <3C67DE6D.DEE8305C@yahoo.com>
From: CBFalconer <cbfalconer AT yahoo DOT com>
Organization: Ched Research
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Alignment problem
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1020211133619 DOT 25095H-100000 AT is>
Lines: 106
Date: Mon, 11 Feb 2002 15:10:03 GMT
NNTP-Posting-Host: 12.90.176.218
X-Complaints-To: abuse AT worldnet DOT att DOT net
X-Trace: bgtnsc05-news.ops.worldnet.att.net 1013440203 12.90.176.218 (Mon, 11 Feb 2002 15:10:03 GMT)
NNTP-Posting-Date: Mon, 11 Feb 2002 15:10:03 GMT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:
> 
> On Mon, 11 Feb 2002, CBFalconer wrote:
> 
> > gcc 2.953.
> 
> I'm not sure -a is supported.  In fact, I'm not sure -a is supported in
> _any_ DJGPP port of any GCC version.
> 
> > Lots of time, see below.  At least until I can get a
> > detailed look inside free.
> 
> Well, I'm not sure: the 1.17 second taken by `free' is only about 20
> clock ticks, so you are still pretty much on the clock granularity
> level.
> 
> > I suspect free is talking back to the dpmi memory manager.
> 
> You have the source, and it should show that `free' doesn't issue any
> system calls (because DJGPP programs never release any memory to the DPMI
> server).  So I think the calls to __dpmi_int are irrelevant to what
> `free' does, it's just that your program calls some system services
> elsewhere, and the profiling code catches some of these calls when it
> randomly probes the EIP at each clock tick.
> 
> I think if the main loop of your program, the one that allocates and
> frees objects, runs longer, the percent of time spent inside __dpmi_int
> will go down (assuming it's more-or-less constant overhead during program
> initialization and shut-down).
> 
> `free' calls 2 subroutines (`merge' and `b2bucket'), but they are both
> inlined.  I'd suggest to edit malloc.c, delete "inline" from these two
> functions' declarations, and then rebuild and rerun the test program
> again.  Perhaps that will show that one of them takes most of the time.

I made the makefile do a "-Dinline= " and the areas showed up. 
Now, with a more detailed look, I turned up the time and got:

Flat profile:

Each sample counts as 0.0555556 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ns/call  ns/call  name
 82.04      9.39     9.39    30001 312952.53 314921.80  merge
  4.37      9.89     0.50                             __dpmi_int
  2.43     10.17     0.28    30066  9238.93 10851.34  malloc
  2.43     10.44     0.28                             inserted
  1.46     10.61     0.17                             mcount
  0.97     10.72     0.11    53249  2086.63  2086.63  size2bucket
  0.97     10.83     0.11    30015  3701.85 320446.04  free
  0.97     10.94     0.11                             putintbl
  0.97     11.06     0.11                             t1cmp
  0.49     11.11     0.06    60016   925.68  1969.27  b2bucket
  0.49     11.17     0.06                             dotest4
  0.49     11.22     0.06                             getcounts
  0.49     11.28     0.06                             hshkill
  0.49     11.33     0.06                             randomMT
  0.49     11.39     0.06                             reorganize
  0.49     11.44     0.06                             t1undupe
  0.00     11.44     0.00     6834     0.00     0.00  split_block

which puts the problem (going by name) precisely where it was
expected.  Next I shall take a look at the code in merge, where
this seems to be the O(N*N) coding:

>        bpp = freelist+bu;
>        for (bp=freelist[bu]; bp; bpp=&(bp->next), bp=bp->next)
>        {
>      #if DEBUG
>          printf("  %08x", bp);
>      #endif
>          if (bp == c)
>          {
>      #if DEBUG
>            printf("\n  snipping %u/%08x from freelist[%d]\n", bp->size, bp, bu);
>      #endif
>            *bpp = bp->next;
>            break;
>          }
>        }
>        CHECK(c);

This could probably be obviated at the application level by
freeing in the order in which things are malloced, but that is
onerous.  I still don't know how you are organizing the free list,
but in principle the only checks needed for merging are against
the blocks immediately before and after the block being freed.  If
something identifies such blocks as being free or not, independant
of the free list, this could be made linear.  This implies that
all blocks contain fields for next, size, and one bit for flags. 
I think it is inconceivable that size will ever exceed 32 bits, so
the free flag could be packed into the size bit of flags.  If the
free list is maintained in order of memory location, there only
need be one walk of it, and the free flag is not needed.  Such
maintenance again requires a single walk of the free list for each
free.

Where does malloc, realloc get the memory to manage, or what
describes it to the package?

-- 
Chuck F (cbfalconer AT yahoo DOT com) (cbfalconer AT XXXXworldnet DOT att DOT net)
   Available for consulting/temporary embedded and systems.
   (Remove "XXXX" from reply address. yahoo works unmodified)
   mailto:uce AT ftc DOT gov  (for spambots to harvest)

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019