Message-ID: <68C4CF842BD2D411AC1600902740B6DA02CDC2E9@mcoexc02.mlm.maxtor.com>
From: "Dykstra, Sean" <Sean_Dykstra AT maxtor DOT com>
To: "'sandmann AT clio DOT rice DOT edu'" <sandmann AT clio DOT rice DOT edu>,
        "Dykstra, Sean"
	 <Sean_Dykstra AT maxtor DOT com>
Cc: djgpp AT delorie DOT com
Subject: RE: Help on physical memory (not again!)
Date: Wed, 9 May 2001 13:45:08 -0600 
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Reply-To: djgpp AT delorie DOT com

Hello again!

Actually, item 3 sounds awesome.  We have a well-defined abstraction called
the allocation manager which isolates physical memory allocations very well.
We had to do this when we made the tool work under the P word (Phar-286).

As for building scatter-gather lists, this is not a problem.  I would
absolutely love to dump himem.sys and emm386.

Any code you might have would be supremely appreciated.  What is the ftp
site where I would want to look for this?

Thanks again!


-----Original Message-----
From: sandmann AT clio DOT rice DOT edu [mailto:sandmann AT clio DOT rice DOT edu]
Sent: Wednesday, May 09, 2001 1:33 PM
To: Sean_Dykstra AT maxtor DOT com
Cc: djgpp AT delorie DOT com
Subject: Re: Help on physical memory (not again!)


> By allocating physical memory, I see a 3-10X performance hit over a
> standard malloc.  I believe this is because the XMS is "locked" into
> memory, and it is bypassing system cache.  You suggested I might be able
> to access some code that would allow me to access the page tables.  I am
> basically hoping that I will not have to "lock" memory until I actually
> need the physical buffer for the DMA transfer.

I believe the problem is actually in the use of the DPMI map call 0x508 (?)
which puts the memory into your page tables.  Since this function is
designed for mapping devices (such as video frame buffers, etc) into your
address space, the DPMI spec says to disable the cache on memory mapped
with this call.  The locking of memory via XMS calls doesn't cause a
problem.

> The other alternative that would work is that perhaps I would not need to
> "lock" the XMS until I need it, but I think I need the physical address in
> order for my code to reference it.  Is there a way to take the XMS handle
> and convert it to a pointer, or do I need the physical location?

You need to lock the XMS memory.  Once XMS memory is locked it has a
physical address you can use for DMA - but the memory probably won't be in
your page tables.  That's why you use the 0x508 call to map it - but it
sets the cache disable bit.
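
To make the cache disable bit concrete, here is a minimal sketch of the bit
manipulation involved.  The flag positions are the standard i386 page table
entry bits; the function names and the idea of working on an in-memory array
of entries are assumptions made for illustration only.  It does not show how
to reach the real page tables under CWSDPMI, and after changing live entries
the TLB would also have to be flushed (for example by reloading CR3).

```c
#include <assert.h>
#include <stdint.h>

/* i386 page table entry flag bits (4 KB pages). */
#define PTE_PRESENT 0x001u   /* bit 0: page is present            */
#define PTE_PWT     0x008u   /* bit 3: page-level write-through   */
#define PTE_PCD     0x010u   /* bit 4: page-level cache disable   */

/* Return the entry with caching re-enabled.
   Entries that are not present are returned unchanged. */
static uint32_t pte_enable_cache(uint32_t pte)
{
    if (!(pte & PTE_PRESENT))
        return pte;
    return pte & ~(PTE_PCD | PTE_PWT);
}

/* Hypothetical helper: clear the cache disable bits in entries
   first..last (inclusive) of one 1024-entry page table.
   Returns the number of entries that were actually changed. */
static unsigned fix_cache_bits(uint32_t *table, unsigned first, unsigned last)
{
    unsigned changed = 0;
    unsigned i;

    assert(first <= last && last < 1024);
    for (i = first; i <= last; i++) {
        uint32_t fixed = pte_enable_cache(table[i]);
        if (fixed != table[i]) {
            table[i] = fixed;
            changed++;
        }
    }
    return changed;
}
```

Only the two cache control bits are touched, so the frame address and the
protection bits set up by the mapping call are left as they were.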

> Any help you can provide would be tremendous.  Unfortunately, this
> performance hit makes the 32-bit DJGPP code slower than the DOS and
> Windows versions of our tool, and may make it all but unusable for me.
> With the standard malloc calls, the performance is awesome, but I cannot
> access it physically.

You've really got several options:
1) Custom CWSDPMI which doesn't cache disable maps (ugh)
2) Page table modification code, which will fix the cache disable bits.  Can
be done in GCC very easily.
3) Don't use XMS, use regular malloc/sbrk to grab memory and scan the page
tables to find the physical address(es) of the buffer.  Advantages: will 
run without himem.sys in config.sys, no duplicate memory space, full speed.
Technique has been implemented by others successfully.
Disadvantage: in some environments the buffer may occasionally be physically
non-contiguous, so you would need to check for this and/or use scatter/gather
DMA.
4) If you can be sure the system will be a Pentium or newer system, skip
the mapping step completely and create 4MB page maps in the page directory
pointing to the XMS buffer.  Similar to 2, but simpler.  Does require 4MB
alignment of the buffer being used.  Disadvantage: never been implemented.
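
As a rough illustration of the contiguity check in option 3, the sketch
below walks the page table entries that cover a buffer and merges physically
adjacent pages into scatter/gather entries.  The sg_entry type, the function
name and the assumption that the caller already has the entries for the
buffer's pages in an array are all hypothetical; how the entries are read
from the real page tables is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define PTE_PRESENT 0x001u        /* bit 0: page is present         */
#define PTE_FRAME   0xFFFFF000u   /* bits 12-31: physical frame     */

/* One physically contiguous piece of the buffer (hypothetical type). */
typedef struct {
    uint32_t phys;   /* physical start address */
    uint32_t len;    /* length in bytes        */
} sg_entry;

/* ptes[i] is the page table entry for the i-th page of the buffer;
   offset is the byte offset of the buffer start within its first page.
   Returns the number of entries written to out, or -1 if a page is not
   present or out is too small.  A result of 1 means the whole buffer is
   physically contiguous. */
static int build_sg_list(const uint32_t *ptes, uint32_t offset, uint32_t len,
                         sg_entry *out, int max_entries)
{
    int n = 0;
    size_t i;

    assert(offset < PAGE_SIZE);
    for (i = 0; len > 0; i++) {
        uint32_t chunk = PAGE_SIZE - offset;   /* bytes left in this page */
        uint32_t phys;

        if (!(ptes[i] & PTE_PRESENT))
            return -1;
        phys = (ptes[i] & PTE_FRAME) + offset;
        if (chunk > len)
            chunk = len;

        if (n > 0 && out[n - 1].phys + out[n - 1].len == phys) {
            out[n - 1].len += chunk;           /* adjacent: extend entry */
        } else {
            if (n == max_entries)
                return -1;
            out[n].phys = phys;
            out[n].len = chunk;
            n++;
        }
        len -= chunk;
        offset = 0;                            /* later pages start at 0 */
    }
    return n;
}
```

With a malloc'd buffer the pages would also have to stay put for the duration
of the DMA transfer, which this sketch does not address.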

I'd probably recommend 3, but 2 may be fewer changes for you.  Let me know
if you want code examples - I'll try to find some and upload them to my ftp
site.