Mail Archives: djgpp/2000/02/28/12:44:24

Date: Mon, 28 Feb 2000 17:17:56 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: Kalum Somaratna aka Grendel <kalum AT crosswinds DOT net>
cc: djgpp AT delorie DOT com
Subject: Re: Fastest bitblt?
In-Reply-To: <Pine.LNX.4.10.10002281742180.649-100000@darkstar.grendel.net>
Message-ID: <Pine.SUN.3.91.1000228170421.13170B-100000@is>
MIME-Version: 1.0
Reply-To: djgpp AT delorie DOT com
Errors-To: dj-admin AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Mon, 28 Feb 2000, Kalum Somaratna aka Grendel wrote:

> > All too easy in a buggy program on a single-HD laptop.
> 
> Come on, nearptrs would be used in the code that needs to access <1mb
> memory. So if you take a game, the code that depends on nearptrs being
> enabled is likely to be very small in relation to the main code
> (calculating player moves, stuff etc). So IMHO it should be reasonably
> possible to root out most of the bugs in the code that would need to access
> < 1mb of memory (typically blitting routines, sound card, DMA access etc)..

Unfortunately, the potential offenders aren't limited to code that 
accesses low memory.  Once you enable near pointers, *any* stray 
pointer *anywhere* in the program can do its damage.  Many programs 
enable near pointers only once and leave them on, since enabling them 
involves an expensive DPMI call, so the whole program runs unprotected.
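
To make the trade-off concrete, here is a minimal sketch of the 
alternative: enabling near pointers only around the code that needs 
them.  The helper names (fill_low_memory, low_byte) and the NEAR_* 
macros are made up for this illustration, and the #else branch is just 
a stand-in array so the sketch also compiles on a non-DJGPP host.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __DJGPP__
#include <sys/nearptr.h>
#define NEAR_ENABLE()   __djgpp_nearptr_enable()
#define NEAR_DISABLE()  __djgpp_nearptr_disable()
#define NEAR_PTR(lin)   ((unsigned char *)(__djgpp_conventional_base + (lin)))
#else
/* Host stand-in (assumption, non-DOS builds only): an ordinary array
   plays the role of the first megabyte. */
static unsigned char fake_low_mem[0x100000];
#define NEAR_ENABLE()   (1)
#define NEAR_DISABLE()  ((void)0)
#define NEAR_PTR(lin)   (fake_low_mem + (lin))
#endif

/* Hypothetical helper: fill `count` bytes at linear address `linear`.
   Near pointers are on only between ENABLE and DISABLE, but every call
   pays for the DPMI request again.  Returns 1 on success, 0 if near
   pointers could not be enabled. */
int fill_low_memory(unsigned long linear, unsigned char value, size_t count)
{
    unsigned char *p;
    size_t i;

    assert(linear + count <= 0x100000);
    if (!NEAR_ENABLE())
        return 0;
    p = NEAR_PTR(linear);
    for (i = 0; i < count; i++)
        p[i] = value;
    NEAR_DISABLE();
    return 1;
}

/* Hypothetical helper: read one byte back, or -1 on failure. */
int low_byte(unsigned long linear)
{
    int v;

    assert(linear < 0x100000);
    if (!NEAR_ENABLE())
        return -1;
    v = *NEAR_PTR(linear);
    NEAR_DISABLE();
    return v;
}
```

Even with this bracketing, a stray pointer in anything that runs 
between the enable and the disable is still unprotected.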

> The only insns that will be executed in this clearing routine will be
> the decl, jns and movb. See how fast and simple the assembly is. So
> getting back to your question of what could be better than one function
> call?... one that uses only 3 instructions :-) And note that the memory
> accessing part (the movb) is only one insn. Ideal for performance critical
> routines.

Use the _farpokeX functions, and you will see the same picture inside the 
loop.
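
As a rough sketch of what I mean (assuming a 320x200 256-color screen 
at linear address 0xA0000; clear_screen, peek_screen and the LOW_* 
macros are names invented for this example, and the #else branch is 
only a stand-in so the code compiles outside DJGPP):

```c
#include <assert.h>
#include <stddef.h>

#ifdef __DJGPP__
#include <go32.h>        /* _dos_ds */
#include <sys/farptr.h>  /* _farsetsel, _farnspokeb, _farnspeekb */
#define LOW_SELECT()       _farsetsel(_dos_ds)
#define LOW_POKE8(off, v)  _farnspokeb((off), (v))
#define LOW_PEEK8(off)     _farnspeekb(off)
#else
/* Host stand-in (assumption, non-DOS builds only): an ordinary array
   plays the role of the first megabyte. */
static unsigned char fake_low_mem[0x100000];
#define LOW_SELECT()       ((void)0)
#define LOW_POKE8(off, v)  (fake_low_mem[(off)] = (unsigned char)(v))
#define LOW_PEEK8(off)     (fake_low_mem[(off)])
#endif

enum { VGA_BASE = 0xA0000, VGA_BYTES = 320 * 200 };

/* Fill the first `count` bytes of the frame buffer with `color`. */
void clear_screen(unsigned char color, long count)
{
    long i;

    assert(count >= 0 && count <= VGA_BYTES);
    LOW_SELECT();                      /* load the selector once, outside the loop */
    for (i = count - 1; i >= 0; i--)   /* count down, like the quoted loop */
        LOW_POKE8(VGA_BASE + i, color);
}

/* Read one frame-buffer byte back. */
unsigned char peek_screen(long offset)
{
    assert(offset >= 0 && offset <= VGA_BYTES);
    LOW_SELECT();
    return LOW_PEEK8(VGA_BASE + offset);
}
```

Since the _farns* functions are inline, the loop body with optimization 
on should come down to a single store plus the decrement and branch, 
but check the compiler's assembly output rather than taking my word 
for it.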

> BTW I haven't access to the movedata sources but I bet it isn't as small or
> fast as the above assembly :-)

Perhaps you should look, then, because movedata uses REP MOVSL, so the 
loop body is only a single instruction.  There's some overhead of the 
loop setup, but if you know the transfer is an integral multiple of 32 
bits, you can use _movedatal which avoids most of that overhead.
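
For example, copying an off-screen buffer to the screen could look 
roughly like this.  It is a sketch under the same assumptions as a 
mode 13h screen at 0xA0000; blit_to_screen and screen_byte are 
hypothetical names, and the #else branch is a memcpy stand-in for 
non-DJGPP hosts.  Note that _movedatal takes its count in 32-bit 
units, not bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __DJGPP__
#include <go32.h>          /* _dos_ds */
#include <sys/farptr.h>    /* _farpeekb */
#include <sys/movedata.h>  /* movedata, _movedatal */
#include <sys/segments.h>  /* _my_ds */
#else
/* Host stand-in (assumption, non-DOS builds only): an ordinary array
   plays the role of the first megabyte. */
static unsigned char fake_low_mem[0x100000];
#endif

enum { VGA_BASE = 0xA0000, VGA_BYTES = 320 * 200 };

/* Copy `nbytes` from an ordinary buffer to the start of video memory. */
void blit_to_screen(const void *src, size_t nbytes)
{
    assert(nbytes <= VGA_BYTES);
#ifdef __DJGPP__
    if (nbytes % 4 == 0)   /* whole 32-bit words: skip the general setup */
        _movedatal(_my_ds(), (unsigned)src, _dos_ds, VGA_BASE, nbytes / 4);
    else
        movedata(_my_ds(), (unsigned)src, _dos_ds, VGA_BASE, nbytes);
#else
    memcpy(fake_low_mem + VGA_BASE, src, nbytes);
#endif
}

/* Read one video-memory byte back. */
unsigned char screen_byte(size_t offset)
{
    assert(offset < VGA_BYTES);
#ifdef __DJGPP__
    return _farpeekb(_dos_ds, VGA_BASE + offset);
#else
    return fake_low_mem[VGA_BASE + offset];
#endif
}
```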

> The point I'm trying to make is that when you use nearptrs gcc can
> optimize your code as it sees fit. Therefore you should get better
> performance than movedata or even the farptrs.

I don't see how you can say that.  farptr functions are written in inline 
assembly that doesn't prevent compiler optimizations, and movedata is 
already hand-optimized to death.

> Well IMHO the idea of a protected mode DOS is that you *should* be able to
> access < 1mb memory as easily as possible. IMHO if you can't, or it is too
> difficult to access <1mb, this might prove to be a hindrance to
> programmers who use that environment. 

I don't see anything difficult in using farptr and movedata.  The entire 
libc is written this way, for starters.
