Sender: nate AT cartsys DOT com
Message-ID: <366879EB.6C322602@cartsys.com>
Date: Fri, 04 Dec 1998 16:10:19 -0800
From: Nate Eldredge
X-Mailer: Mozilla 4.05 [en] (X11; I; Linux 2.0.35 i486)
MIME-Version: 1.0
To: djgpp AT delorie DOT com
Subject: Re: Extended ASM (Was: misc questions)
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp AT delorie DOT com

Tal Lavi wrote:
>
> >If you have already studied those, you will
> >have to ask further questions here.
>
> Here's one.
> I need a FlushPage routine(macro?) for a 640x480x16bpp mode written in asm.
> Until now, I used a C(++) routine which uses _farsetsel & _farnspokel.
> All my experiments in writing such a routine with the extended ASM failed.
> Here's one that SEEM to be alright to me, the unexperienced ASM programer.
> Please, show me a better one(that works).
> And, PLEASE, mention what the not so obvious parts of your code does.
>
> void FlushPage(unsigned short C)
> {
>     __asm__ __volatile__ ("
>         movw %0,%%fs\n
>         .byte 0x64\n
>         movl $0,%%edi\n

You have an FS override prefix to an instruction that doesn't even
access memory.  This is meaningless.  What you probably meant to do was
to apply it to the `stosl'.  However, even this is not such a great
idea, as the prefix will cost an extra cycle on each store operation.
The best solution is to load your selector into the segment register
which is the default; for `stos', it's `%es'.  Remember, however, that
GCC expects ES to be preserved, so save and restore it.  (Adding it to
the clobber list won't work; GCC doesn't even know about the existence
of segment registers.)

        movl %%es, %%edx
        movl %0, %%es
        ...
        rep stosl
        movl %%edx, %%es

>         movl $153600, %%ecx\n
>         rep\n
>         stosl"
>         :
>         :"r"(LFBSelector[ScreenNum]),
>          "a"((long(C)<<16)+C)
>         :"ax","cx","di","memory"
>     );
> }
>
> I also have a couple of questions:
>
> 1) Should I use a rep stos, or maybe movl with a branch?

For simplicity, `rep stosl' is good.
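
As a rough illustration of the `rep stosl' approach, here is a minimal
sketch in GNU C.  It is not the routine from this thread: it assumes a
flat address space in which ES already covers the destination, so it
fills an ordinary buffer and never loads a selector.  The names
`SCREEN_BYTES', `SCREEN_DWORDS', `pack_pixel16' and `fill_dwords' are
made up for the example.  On anything other than GCC-style x86 it falls
back to a plain C loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Screen geometry taken from the question: 640x480 at 16 bits per pixel. */
enum {
    SCREEN_BYTES  = 640 * 480 * 2,      /* 614400 bytes in one page */
    SCREEN_DWORDS = SCREEN_BYTES / 4    /* 153600 stores of 4 bytes */
};

/* Duplicate one 16-bit pixel into both halves of a 32-bit word, so that
   each `stosl' writes two pixels.  (Hypothetical helper name.) */
static uint32_t pack_pixel16(uint16_t c)
{
    return ((uint32_t)c << 16) | c;
}

/* Store `count' copies of `value' starting at `dst'.  (Hypothetical
   helper.)  ASSUMPTIONS: flat address space, ES already valid for `dst',
   direction flag clear as the usual calling conventions promise.  It
   does NOT switch selectors the way a frame-buffer routine must. */
static void fill_dwords(uint32_t *dst, uint32_t value, size_t count)
{
    assert(dst != NULL || count == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* `rep stosl' advances the destination pointer and counts the
       counter down to zero, so both are read-write operands ("+D",
       "+c") rather than inputs or clobbers.  The fill value in EAX is
       only read. */
    __asm__ __volatile__ ("rep stosl"
                          : "+D" (dst), "+c" (count)
                          : "a" (value)
                          : "memory");
#else
    /* Plain C fallback for other compilers and CPUs. */
    while (count-- > 0)
        *dst++ = value;
#endif
}
```

For the mode in question the call would be along the lines of
`fill_dwords(page, pack_pixel16(C), SCREEN_DWORDS)', where `page' is an
ordinary in-memory buffer.  Writing to the LFB through its selector
still needs the ES save and restore described above.
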
Using `movl' will require the actual move, an add to update the index, a
subtract (or possibly compare) to decide whether you're done, and a
conditional jump which will almost always be taken.  This will be
significantly slower, especially since many processors must flush their
prefetch queue on a jump.

If you do the `mov's in bulk, it can lead to a speedup on some newer
processors (the string instructions have received less optimization as
processors evolved).  This would probably look something like:

        movl count, %ecx
        movl start, %edi
        movl fill, %eax
loop:
        movl %eax, (%edi)
        movl %eax, 4(%edi)
        movl %eax, 8(%edi)
        ...
        movl %eax, 28(%edi)
        addl $32, %edi
        subl $32, %ecx
        jnz loop

But this needs extra care to handle the leftover bytes, is more
complicated, is larger, and isn't universally better (and the speedup
is small, even when it is faster).  At least in the short run, stick
with `stos'.

> 2)
> > The first are constraints again, and ".byte 0x64" causes the assembler to
> > emit 0x64 into the binary code.  0x64 is the op-code for FS: prefix
> > override (meaning the next instruction uses offsets into the segment
> > whose selector is in the FS register).  sys/farptr.h uses a byte constant
> > because early versions of Gas didn't support prefixes (I'm not sure how
> > the things are with Binutils 2.8.1).
>
> Why should you write(and know) the opcode? Isn't there an ASM instruction for that purpose?
> (movl %something, %%fs(%something))?

Yes, but some people prefer to write their prefixes explicitly.  It does
give one a better sense of what is being generated.  So

        movl %reg, %fs:mem

is equivalent to

        fs movl %reg, mem

(Note that these really are prefixes, since they can be applied to most
instructions, not just `mov'.)

These would be the clearest way to write those instructions.  But since
GAS was buggy and would ignore the prefixes under certain, ill-defined
circumstances, many people would write the hex opcode with `.byte'
instead, to be safe.
This especially applied to the DJGPP headers, which might be used with
almost any GAS version.

--
Nate Eldredge
nate AT cartsys DOT com