Mail Archives: djgpp/2001/05/09/17:04:12

From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
Message-Id: <10105091932.AA20004@clio.rice.edu>
Subject: Re: Help on physical memory (not again!)
To: Sean_Dykstra AT maxtor DOT com (Dykstra, Sean)
Date: Wed, 9 May 2001 14:32:51 -0500 (CDT)
Cc: djgpp AT delorie DOT com
In-Reply-To: <68C4CF842BD2D411AC1600902740B6DA02CDC2DF@mcoexc02.mlm.maxtor.com> from "Dykstra, Sean" at May 09, 2001 09:03:32 AM
X-Mailer: ELM [version 2.5 PL2]
Mime-Version: 1.0
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> By allocating physical memory, I see a 3-10X performance hit over a standard
> malloc.  I believe this is because the XMS is "locked" into memory, and it
> is bypassing system cache.  You suggested I might be able to access some
> code that would allow me to access the page tables.  I am basically hoping
> that I will not have to "lock" memory until I actually need the physical
> buffer for the DMA transfer.  

I believe the problem is actually the DPMI call 0x508 (Map Device in Memory
Block), which puts the memory into your page tables.  Since this function is designed
for mapping devices (such as video frame buffers, etc) into your address
space, the DPMI spec says to disable the cache on memory mapped with this
call.  The locking of memory via XMS calls doesn't cause a problem.
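
To make the cache-disable point concrete, here is a minimal sketch of the
relevant flag bits in a 32-bit x86 page table entry.  The bit positions are
the standard 386/486 paging layout; the function names are hypothetical, and
the code works on a plain integer copy of an entry, not on live page tables.
In a real program you would also need access to the DPMI host's page tables
and a TLB flush after changing an entry.

```c
#include <stdint.h>

/* Flag bits of a 32-bit x86 page table entry. */
#define PTE_PRESENT 0x001u       /* bit 0: page is present */
#define PTE_PWT     0x008u       /* bit 3: page write-through */
#define PTE_PCD     0x010u       /* bit 4: page cache disable */
#define PTE_FRAME   0xFFFFF000u  /* bits 12-31: physical frame address */

/* Hypothetical helper: returns 1 if the entry maps a present page
   with caching disabled, otherwise 0. */
int pte_is_uncached(uint32_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_PCD);
}

/* Hypothetical helper: returns a copy of the entry with PCD and PWT
   cleared; the frame address and all other flags are unchanged. */
uint32_t pte_enable_cache(uint32_t pte)
{
    return pte & ~(PTE_PCD | PTE_PWT);
}
```

An entry such as 0x00400017 (present, writable, user, PCD set) would be
reported as uncached, and clearing PCD/PWT leaves the frame address intact.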

> The other alternative that would work is that perhaps I would not need to
> "lock" the XMS until I need it, but I think I need the physical address in
> order for my code to reference it.  Is there a way to take the XMS handle
> and convert it to a pointer, or do I need the physical location?

You need to lock the XMS memory.  Once XMS memory is locked it has a physical 
address you can use for DMA - but the memory probably won't be in your page
tables.  That's why you use the 0x508 call to map it - but it sets the 
cache disable bit. 

> Any help you can provide would be tremendous.  Unfortunately, this
> performance hit makes the 32-bit DJGPP code slower than the DOS and Windows
> versions of our tool, and may make it all but unusable for me.  With the
> standard malloc calls, the performance is awesome, but I cannot access it
> physically.  

You've really got several options:
1) Custom CWSDPMI which doesn't cache disable maps (ugh)
2) Page table modification code, which will fix the cache disable bits.  Can
be done in GCC very easily.
3) Don't use XMS, use regular malloc/sbrk to grab memory and scan the page
tables to find the physical address(es) of the buffer.  Advantages: will 
run without himem.sys in config.sys, no duplicate memory space, full speed.
Technique has been implemented by others successfully.
Disadvantage: in some environments the buffer may, on rare occasions, be
non-contiguous, so you would need to check for this and/or use scatter/gather DMA.
4) If you can be sure the system will be a Pentium or newer system, skip 
the mapping step completely and create 4MB page mappings in the page directory
pointing to the XMS buffer.  Similar to 2, but simpler.  Requires 4MB
alignment of the buffer being used.  Disadvantage: has never been implemented.
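
As a rough illustration of option 4, the sketch below builds a page directory
entry that maps one 4MB page.  It only computes the entry value: the names
are hypothetical, the flag choice (present, writable, user) is an assumption,
and actually installing such entries requires access to the page directory
and the PSE feature enabled in CR4, neither of which is shown here.

```c
#include <stdint.h>

/* Flag bits of a 32-bit x86 page directory entry. */
#define PDE_PRESENT 0x001u  /* bit 0: entry is present */
#define PDE_WRITE   0x002u  /* bit 1: writable */
#define PDE_USER    0x004u  /* bit 2: accessible from user mode */
#define PDE_PS      0x080u  /* bit 7: entry maps a 4MB page directly */
#define FOUR_MB     0x00400000u

/* Hypothetical helper: returns a PDE mapping the 4MB page at phys,
   or 0 if phys is not 4MB aligned. */
uint32_t make_4mb_pde(uint32_t phys)
{
    if (phys & (FOUR_MB - 1u))
        return 0;
    return phys | PDE_PS | PDE_USER | PDE_WRITE | PDE_PRESENT;
}

/* Hypothetical helper: number of directory entries needed to cover
   a buffer of the given size in bytes. */
uint32_t pde_count_for(uint32_t bytes)
{
    return bytes / FOUR_MB + (bytes % FOUR_MB != 0);
}
```

The alignment check is the reason for the 4MB alignment requirement: the low
22 bits of the entry are not part of the page address, so a misaligned buffer
cannot be described by a single entry.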

I'd probably recommend 3, but 2 may require fewer changes for you.  Let me know
if you want code examples - I'll try to find some and upload them to my ftp site.
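
For option 3, the contiguity check might look something like the sketch
below.  It assumes you have already obtained a copy of the page table entries
covering a page-aligned buffer (how to read them from the DPMI host is not
shown), and it merges physically adjacent pages into runs suitable for a
scatter/gather list.  The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PTE_FRAME 0xFFFFF000u  /* bits 12-31: physical frame address */

/* One physically contiguous run of the buffer. */
struct sg_run {
    uint32_t phys;   /* physical address of the first byte */
    uint32_t bytes;  /* length of the run in bytes */
};

/* Walks the entries for a page-aligned buffer and merges pages whose
   frames are consecutive.  Returns the number of runs written to out,
   or 0 if npages is 0 or more than max_runs would be needed. */
size_t build_sg_list(const uint32_t *ptes, size_t npages,
                     struct sg_run *out, size_t max_runs)
{
    size_t nruns = 0;
    for (size_t i = 0; i < npages; i++) {
        uint32_t phys = ptes[i] & PTE_FRAME;
        if (nruns > 0 &&
            out[nruns - 1].phys + out[nruns - 1].bytes == phys) {
            out[nruns - 1].bytes += PAGE_SIZE;  /* extend current run */
        } else {
            if (nruns == max_runs)
                return 0;                       /* list too small */
            out[nruns].phys = phys;
            out[nruns].bytes = PAGE_SIZE;
            nruns++;
        }
    }
    return nruns;
}
```

A return value of 1 means the whole buffer is physically contiguous and a
single DMA transfer can cover it; anything larger calls for scatter/gather.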

> Again, thank you for all of your help.
> 
> -----Original Message-----
> From: sandmann AT clio DOT rice DOT edu [mailto:sandmann AT clio DOT rice DOT edu]
> Sent: Friday, April 13, 2001 10:54 PM
> To: Sean_Dykstra AT maxtor DOT com
> Cc: eliz AT is DOT elta DOT co DOT il; djgpp AT delorie DOT com
> Subject: Re: Help on physical memory (not again!)
> 
> 
> > I think I have come up with the
> > equivalent with DJGPP, but I have a few concerns.  The primary reason for
> > needing a near pointer is because of portability issues with the various
> > platforms.  (The app is already written, and creating macros for all of the
> > buffer pointer deref's would be a royal pain).
> 
> You can use near pointers on DOS memory and the XMS memory you have allocated
> directly without having to map it - and that's more portable across other
> environments.  I'll also try to answer the concerns as written.
> 
> > I am currently using __dpmi_map_device_in_memory_block() with great
> > success.  I first allocate a buffer of 32MB using malloc, then I allocate
> > 32MB of PHYSICAL memory using XMS, then I use
> > __dpmi_map_device_in_memory_block().
> 
> In CWSDPMI, when you malloc the memory it doesn't actually put either the
> page directories or page tables in place until you actually use it (read
> or write).  When you use the DPMI call 0x508 to do the map in memory block
> it also frees the memory first.  So, what you are currently doing is a
> pretty efficient way of getting your 32MB of XMS memory into a localized
> memory window.  However, you must have memory (or disk space) to back up
> that memory request.  On a 40MB machine, with 32MB allocated to XMS memory,
> the 32MB malloc would fail unless you had 32MB of disk space available.
> Since you want to boot from floppy, I'll presume the disk may not be
> formatted so this is a problem.  Not mapping the memory, but using it
> directly as a near pointer is probably better.
> 
> > My Concerns:
> > 1)  This requires that I allocate two buffers (for a total of 64MB).  Is
> > there a way around this?  It basically means that in order to run from a
> > floppy I must have a 128MB system, when I really only need 64MB once the app
> > is running.  Can I delete the malloc'd buffer immediately and cheat using
> > the address returned, or will this cause problems with future mallocs?
> 
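> Better to not malloc the buffer at all, but to enable near pointers:
> #include <sys/nearptr.h>
> if (__djgpp_nearptr_enable()) {  /* returns 0 if near pointers are unavailable */
>     /* cast assumes XMSpointer is declared as a char pointer */
>     XMSpointer = (char *)(__djgpp_conventional_base + XMSphysicalAddress);
>     *XMSpointer = ...
> }
> (from memory, hope that's correct)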
> 
> Deleting the malloc'd buffer will just mess up the malloc chain - don't
> do this.
> 
> > 2)  Note from the DPMI spec for Map Device In Memory Block:
> > If the DPMI host is not virtualizing the device, it must disable any memory
> > caching on the mapped pages; in particular, on the 486 or later, the PCD
> > (page cache disable) bit must be set in the page table entries.
> > 
> > Is my only detriment on # 2 performance?  
> 
> Performance can be an issue; there is an #ifdef in CWSDPMI to avoid this
> problem.  But this requires a custom CWSDPMI, and for other reasons noted
> above I would avoid the 0x508 call.
> 
> > Are there any real dangers about doing this?  
> 
> Not really, but there are ways to make it faster and more flexible as
> above...
> 
> > I can live with the performance issue, as my benchmarks have
> > shown that in most other aspects DJGPP is 50% to 200% faster than the DOS
> > version I am using now. 
> 
> You shouldn't have to give up the performance.
> 
> > I have been truly grateful for all of the help in the archives so far.
> > Coming from the corporate world I am always skeptical of the "open source"
> > concept, but you guys have done a tremendous job on DJGPP and the CWSDPMI
> > host.  
> 
> I'm glad you have found it useful.
> 
> I have some additional comments that you might like to consider - using an
> XMS buffer requires that the CONFIG.SYS (and floppy) include HIMEM.SYS.
> There are ways to avoid this restriction by using malloc/sbrk to get the memory
> from inside DJGPP and then using some code I can provide which allows you to
> access the pagetables.  You can then get the physical address of each page
> in the buffer.  If you lock the memory buffer it will typically end up
> all contiguous (no need for scatter/gather) and be appropriate for DMA.
> This removes the need for page file allocation and for dos files/config.sys
> requirements if you don't have control over them.
> 
> If you bind the CWSDSTUB with appropriate CWSPARAM configuration to your
> image, you can provide a single executable which will run from almost
> any "raw" DOS environment from a floppy.  This has been done by one other
> large software company which will use this technique in a future release
> of one of their "boot from diskette" utilities.  It does DMA to the hard
> drive for speed purposes.
> 
> I hope this helps you out.  If you have any more detailed questions you can
> email me directly - but sometimes it takes me several days to respond.  Eli
> nicely forwarded this to me or I probably would not have seen it at all
> this week...
