Mail Archives: djgpp/2000/05/20/12:00:15

From: "Alexei A. Frounze" <alex DOT fru AT mtu-net DOT ru>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Inline ASM question...
Date: Sat, 20 May 2000 19:47:54 +0400
Organization: MTU-Intel ISP
Lines: 63
Message-ID: <3926B3AA.469EA5BA@mtu-net.ru>
References: <8empao$5k6$1 AT nnrp02 DOT primenet DOT com>
<390ef9f9$0$72098 AT SSP1NO17 DOT highway DOT telekom DOT at>
<8emvhq$7mn$1 AT nnrp03 DOT primenet DOT com>
<3 DOT 0 DOT 6 DOT 32 DOT 20000505015633 DOT 007b2210 AT pop DOT crosswinds DOT net>
<3 DOT 0 DOT 6 DOT 32 DOT 20000510204858 DOT 007b6e40 AT pop DOT crosswinds DOT net>
<3 DOT 0 DOT 6 DOT 32 DOT 20000511021045 DOT 007af4a0 AT pop DOT crosswinds DOT net>
<3 DOT 0 DOT 6 DOT 32 DOT 20000519211524 DOT 007c7290 AT pop DOT crosswinds DOT net>
NNTP-Posting-Host: ppp105-32.dialup.mtu-net.ru
Mime-Version: 1.0
X-Trace: gavrilo.mtu.ru 958837680 75805 212.188.105.32 (20 May 2000 15:48:00 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 20 May 2000 15:48:00 GMT
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: ru,en
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

I'm not sure you really need to use assembler. There was quite an
interesting thread (inefficiency of GCC output code & -O problem) some time
ago. I brought up some inline ASM problems there and mentioned that my
inline ASM was much faster than plain C. However, optimized C can be as fast
as ASM, or even a little faster, on some CPUs, because different CPUs need
different optimizations.

I don't know whether that thread can be found somewhere in the mail archive
(which I don't use) or at http://www.deja.com. Can anyone explain how this
works?

I don't think you can speed up your program as much as you expect by doing
the bit blitting in ASM. I think that if you optimize your algorithm and run
GCC with optimization switches, you can achieve a very good result.
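
As a rough sketch of the kind of plain C that GCC's optimization switches
(for example -O2) have something to work with, a buffer copy might look like
the code below. The name blit_rows and its parameters are made up for
illustration and do not come from the original program. The destination here
is ordinary memory, so on DJGPP the final copy into video memory would still
have to go through something like dosmemput() or the <sys/farptr.h>
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copy a block of
   width x height 8-bit pixels from src to dst. Each buffer has its own
   pitch, i.e. the number of bytes from the start of one row to the next. */
void blit_rows(unsigned char *dst, size_t dst_pitch,
               const unsigned char *src, size_t src_pitch,
               size_t width, size_t height)
{
    size_t row;

    assert(dst != NULL && src != NULL);
    assert(width <= dst_pitch && width <= src_pitch);

    if (width == dst_pitch && width == src_pitch) {
        /* Rows are contiguous in both buffers: one large copy is enough. */
        memcpy(dst, src, width * height);
        return;
    }

    /* Otherwise copy row by row, skipping the padding after each row. */
    for (row = 0; row < height; row++) {
        memcpy(dst + row * dst_pitch, src + row * src_pitch, width);
    }
}
```

Calling blit_rows(screen, 320, buffer, 320, 320, 200) would take the single
memcpy path (screen and buffer being hypothetical 320x200 arrays). Whether
this beats a hand-written rep movs loop depends on the compiler, the C
library and the CPU, so it is worth measuring both.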

My free dimensional texture mapper, written in plain C, is almost as fast as
the same implementation in inline ASM on my computer. Dieter Buerssner
achieved an even higher FPS rate with a C version than with the initial
version, which had lots of inline ASM. GCC and I do slightly different
optimizations, although both seem to be very efficient.

Good Luck
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru


"Thomas J. Hruska" wrote:
> 
> Hello, I am doing that inline ASM thing...again.  The situation is that I
> am trying to speed up screen dumps from a buffer to the video buffer using
> far pointers.  The idea here is to perform the buffer copy using only one
> far pointer reference and a rep.  So, I loaded esi, edi, and ecx with the
> appropriate values (I hope).  After clearing the direction flag, I followed
> <sys/farptr.h>'s example for moving data (hence, the .byte 0x64).  However,
> the problem comes in that rep movsl (or movsd, movs, movll, movb, movsb,
> mobsbb, etc.) does not assemble.  The objective is to get the framerate up
> from 48 fps to 60 fps (maybe 70 fps) with this code.  NOTE:  The current
> selector is _dos_ds when the inline ASM executes (also, assume that y =
> 0x10000, x = screen_width * screen_height, x2 = 0).
> 
>       __asm__ __volatile__ ("
>         pushl %%esi
>         pushl %%edi
>         movl %0, %%esi
>         movl %1, %%edi
>         movl %2, %%ecx
>         cld
>         .byte 0x64
>         rep movsl
>         popl %%edi
>         popl %%esi"
>         :
>         : "g" (&CurrMode.Buffer[x2]), "g" (0xB0000 - y), "g" ((x - x2) % 0x10000));
> 
> Thanks for any help in advance!
> 
>            Thomas J. Hruska -- shinelight AT crosswinds DOT net
> Shining Light Productions -- "Meeting the needs of fellow programmers"
>                   http://www.shininglightpro.com/
