Mail Archives: djgpp/1997/05/21/18:19:27

From: wmcgugan AT netcomuk DOT co DOT uk (William McGugan)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Problem with NASM 0.94
Date: Tue, 20 May 1997 17:58:11 GMT
Organization: None. (via NETCOM Internet Ltd. USENET service).
Lines: 64
Message-ID: <3381e21a.854028@nntp.netcomuk.co.uk>
References: <33819E84 DOT 4C80 AT iiic DOT ethz DOT ch>
NNTP-Posting-Host: dialup-17-61.netcomuk.co.uk
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

On Tue, 20 May 1997 14:52:20 +0200, Andrea Martino
<amartino AT iiic DOT ethz DOT ch> wrote:

>Hello,
>	I don`t know if this is the right newsgroup, but I don`t know where
>find an answer for my question :)
>
>I want to "unroll" a loop, but I have
>some problems with the macro-directives. I have to do a loop for
>a scan-line in the gouraud

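Unrolling a loop means repeating a section of code rather than having
a conditional jump. You cannot completely unroll a loop that is to be
repeated a variable number of times, but you can unroll it partially:
repeat the body a few times per pass and handle the leftover
iterations separately.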
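
As a rough illustration of partial unrolling, here is a C sketch of the
span loop. The function names and the unroll factor of 4 are my own
choices for the example, not anything from the original code; the
colour is the same 8:8 fixed-point value held in dx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference version (hypothetical name): one pixel per pass,
   like the quoted .Xloop. */
static void fill_span_simple(uint8_t *dst, size_t count,
                             uint16_t color, uint16_t step)
{
    assert(dst != NULL);
    while (count--) {
        *dst++ = (uint8_t)(color >> 8);    /* mov al,dh / stosb */
        color = (uint16_t)(color + step);  /* add dx,bx         */
    }
}

/* Partially unrolled version (hypothetical name): four pixels per
   pass, then a short loop for the 0..3 pixels left over. */
static void fill_span_unrolled(uint8_t *dst, size_t count,
                               uint16_t color, uint16_t step)
{
    size_t blocks = count / 4;  /* full groups of four pixels */
    size_t rest   = count % 4;  /* leftover pixels            */

    assert(dst != NULL);
    while (blocks--) {
        dst[0] = (uint8_t)(color >> 8); color = (uint16_t)(color + step);
        dst[1] = (uint8_t)(color >> 8); color = (uint16_t)(color + step);
        dst[2] = (uint8_t)(color >> 8); color = (uint16_t)(color + step);
        dst[3] = (uint8_t)(color >> 8); color = (uint16_t)(color + step);
        dst += 4;
    }
    while (rest--) {
        *dst++ = (uint8_t)(color >> 8);
        color = (uint16_t)(color + step);
    }
}
```

Both functions write the same bytes; the unrolled one just takes one
loop branch per four pixels instead of one per pixel.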

>
>In the NASM, version 0.94 it looks like this...
>
>        mov ecx,[yQ]
>        sub ecx,[yP]
>        inc ecx			// ecx = number of pixel (max = 320)
>.Xloop:
>        mov al,dh		// dx = color in 8:8 fixed point
>        add dx,bx
>        stosb
>        dec ecx
>        jnz .Xloop
>
I recently coded a gouraud triangle function, so I can give you a few
tips on optimization here. If you're using 32-bit code (I assume you
are), then 16-bit instructions need an operand-size prefix and will
take longer to execute. Replace the 'add dx, bx' with 'add edx, ebx';
it will be faster, and dh ends up the same either way. Another thing
to note is that 'stosb' is slower than a move and an increment (I
think).
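
To see why switching to the 32-bit add is safe for this loop, here is a
small C sketch. The helper names are made up for the example. The byte
that ends up in al (bits 8..15, i.e. dh) depends only on the low 16
bits of both operands, so whatever sits in the upper halves of edx and
ebx does not change the pixel value.

```c
#include <stdint.h>

/* 16-bit step (hypothetical helper): 'add dx,bx' then 'mov al,dh'. */
static uint8_t step16(uint16_t *dx, uint16_t bx)
{
    *dx = (uint16_t)(*dx + bx);
    return (uint8_t)(*dx >> 8);
}

/* 32-bit step (hypothetical helper): 'add edx,ebx' then 'mov al,dh'.
   Carries only travel upwards, so bits 8..15 match the 16-bit case. */
static uint8_t step32(uint32_t *edx, uint32_t ebx)
{
    *edx += ebx;
    return (uint8_t)((*edx >> 8) & 0xFFu);
}
```

This only holds as long as nothing else in the loop relies on the upper
16 bits of edx or ebx.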

>I have another two questions:
>1. what is the alignement? How does it work? What are the $$ and
>the $?
>
>times ($$ - $) & 3      nop	; Align the next instruction/data to
>                                ; a double-word boundary, assuming 
>                                ; segment is aligned to double-word
>
Alignment means padding the code so that it starts at an address that
is an exact multiple of some value, usually a power of two. You would
want to align the beginning of a loop to 4 on a 486, to speed up
caching. This is not important for a Pentium.

$$ is the start address of the current segment, and $ is the address
of the current instruction. '($$ - $) & 3' calculates the number of
NOPs needed to move to the next dword boundary.
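
The same arithmetic can be checked in C. This is only a sketch with a
made-up function name; it assumes, as the quoted comment does, that the
segment itself starts on a dword boundary.

```c
#include <stdint.h>

/* Hypothetical helper mirroring NASM's ($$ - $) & 3:
   section_start plays the role of $$, here plays the role of $.
   Returns how many one-byte NOPs are needed to reach the next
   multiple of 4 (0 if 'here' is already aligned). */
static uint32_t pad_to_dword(uint32_t section_start, uint32_t here)
{
    /* Unsigned subtraction wraps around, so masking with 3 gives
       the distance up to the next dword boundary. */
    return (section_start - here) & 3u;
}
```

For example, pad_to_dword(0x1000, 0x1001) gives 3, and
pad_to_dword(0x1000, 0x1004) gives 0.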

>2. How can I use the FPU of my PENTIUM with NASM 0.94. Where can I find
>a tutorial?

The FPU uses its own set of instructions (fld, fadd, fstp and so on)
that are inserted into your code as normal. There's bound to be a
tutorial somewhere on the web.


Hope that helped!

William McGugan
http://www.netcomuk.co.uk/~wmcgugan
