Mail Archives: djgpp/1998/08/23/00:02:15

Comments: Authenticated sender is <mert0407 AT sable DOT ox DOT ac DOT uk>
From: "George Foot" <george DOT foot AT merton DOT oxford DOT ac DOT uk>
To: Endlisnis <s257m AT unb DOT ca>
Date: Sun, 23 Aug 1998 05:00:35 +0000
MIME-Version: 1.0
Subject: Re: VESA: hints, clue but no examples
Reply-to: george DOT foot AT merton DOT oxford DOT ac DOT uk
CC: djgpp AT delorie DOT com
Message-Id: <E0zARLr-00041V-00@sable.ox.ac.uk>

On 22 Aug 98 at 23:55, Endlisnis wrote:

> George Foot wrote:
> 
> > On 22 Aug 98 at 15:35, Endlisnis wrote:
> >
> > >     Yes.  I've made my own 'setdata' function to use a bunch of _farnspokel calls to quickly write a single
> > > value to a bunch of contiguous locations.
> >
> > For a reasonable number of locations, this would probably be faster
> > if you used an inline assembler routine; push ES, load it with the
> > selector, load EDI with the offset, EAX with the value and ECX with
> > the number of longs, then "rep ; stosl" and finally pop ES back
> > again.  Perhaps this:
> >
> >     inline void flmemset (int selector, int offset, int value, int num_longs)
> 
>     This version of your 'flmemset' function allows any arbitrary # of bytes to be written, (not necessarily
> multiple of 4).

The `l' in the name means that it fills a run of longs; I thought 
that was what you wanted. :)

> inline void setdata (int selector, int offset, int value, int num_bytes)
> {
>  asm (
>  "pushl %%es;"
>  "movw %%dx, %%es;"
>  "cld;"
>  "movb %%al, %%ah;"
>  "rorl $8, %%eax;"
>  "movb %%al, %%ah;"
>  "rorl $8, %%eax;"
>  "movb %%al, %%ah;"

I think it's quicker to do:

    movb %%al, %%ah
    movl %%eax, %%edx
    shll $16, %%eax
    orl  %%edx, %%eax

(assuming the upper 16 bits of EAX were zero initially)
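
For reference, the same byte-replication steps can be written out in plain C. This is only a sketch to show what each instruction contributes; the function name `replicate_byte` is made up for illustration, and the mask on the first line is an added assumption (the assembler sequence itself does not clear the upper bits).

```c
#include <stdint.h>

/* Copy the low byte of `value` into all four bytes of a 32-bit word,
   one C statement per instruction of the sequence above.
   The name and the initial mask are illustrative assumptions. */
static uint32_t replicate_byte(uint32_t value)
{
    uint32_t eax = value & 0xFFu;  /* keep only the low byte (not done in the asm) */
    uint32_t edx;

    eax |= eax << 8;               /* movb %al, %ah   : 0x000000BB -> 0x0000BBBB */
    edx = eax;                     /* movl %eax, %edx : save the 16-bit pattern  */
    eax <<= 16;                    /* shll $16, %eax  : 0x0000BBBB -> 0xBBBB0000 */
    eax |= edx;                    /* orl %edx, %eax  : 0xBBBB0000 -> 0xBBBBBBBB */
    return eax;
}
```

Calling `replicate_byte(0xAB)` yields 0xABABABAB, which is the value a following "rep ; stosl" would store.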

>  "shrl $1, %%ecx;"
>  "jnc NoByte;"
>  "stosb;"
>  "NoByte: ;"
>  "shrl $1, %%ecx;"
>  "jnc NoWord;"
>  "stosw;"
>  "NoWord: ;"
>  "rep; stosl;"
>  "popl %%es "

In fact I think it's better to get EDI aligned by doing a few stosbs 
at the start, then do as many stosls as necessary, then stosb the 
remainder.

    movl %%ecx, %%edx    /* or arrange to have the count in EDX, and use ECX above */
    movl %%edi, %%ecx
    negl %%ecx
    andl $3, %%ecx
    jz 1f
    subl %%ecx, %%edx
    rep
    stosb
 1: movl %%edx, %%ecx
    shrl $2, %%ecx
    rep
    stosl
    movl %%edx, %%ecx
    andl $3, %%ecx
    rep
    stosb

I'm not sure whether the `1f' labelling is safe there, but the
normal labelling (i.e. what you used) won't work if gcc inlines the
function.
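
The align-then-fill idea above (byte stores until the destination is 4-byte aligned, then longs, then the leftover bytes) can be sketched in portable C as follows. The name `fill_aligned` is hypothetical, it works on an ordinary pointer rather than a selector:offset pair, and it adds one thing the assembler sequence does not have: it clamps the head count so that a total shorter than the alignment gap cannot underflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `count` bytes at `dst` with `value`: head bytes up to a 4-byte
   boundary, then whole 32-bit words, then the remaining tail bytes.
   Hypothetical helper, flat memory only (no selector handling). */
static void fill_aligned(unsigned char *dst, unsigned char value, size_t count)
{
    uint32_t pattern = value * 0x01010101u;          /* byte copied into all 4 lanes */
    size_t head = (size_t)(-(uintptr_t)dst & 3u);    /* negl / andl $3               */
    size_t longs, tail;

    if (head > count)                                /* added guard, not in the asm  */
        head = count;
    count -= head;                                   /* subl %ecx, %edx              */
    while (head--)                                   /* rep ; stosb                  */
        *dst++ = value;

    longs = count >> 2;                              /* shrl $2, %ecx                */
    while (longs--) {                                /* rep ; stosl                  */
        memcpy(dst, &pattern, sizeof pattern);
        dst += sizeof pattern;
    }

    tail = count & 3u;                               /* andl $3, %ecx                */
    while (tail--)                                   /* rep ; stosb                  */
        *dst++ = value;
}
```

For example, `fill_aligned(buf + 1, 0x5A, 13)` writes exactly bytes 1 to 13 of `buf`, whatever the alignment of `buf` itself.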

>  : : "c" (num_bytes), "a" (value), "d" (selector), "D" (offset)
>  : "%ecx", "%edi" );
>  }
> 
> BTW, are any of those registers listed in the 'clobbered' list really necessary (in this case)?  Since they are
> listed as input registers, then aren't they automatically listed as 'clobbered'?

I think they are necessary, because we've changed their values.  I 
think gcc will assume that, unless we mark them clobbered, they still 
contain what it initially put into them.  Of course, your code also 
clobbers EAX, and my changes above make it clobber EDX too.
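
One more way to tell gcc that a register changes is to make it a read-write operand instead of an input plus a clobber. The current GCC documentation says a clobber may not overlap an input or output operand, so newer compilers may reject the list above. Below is a minimal sketch for flat memory only (no ES juggling); the name `fill_bytes` is made up, and the plain loop is a fallback assumed for non-x86 targets.

```c
#include <stddef.h>

/* Byte fill using "rep ; stosb".  EDI and ECX are declared as
   read-write operands ("+D", "+c") because the instruction changes them.
   Hypothetical helper; flat address space assumed. */
static void fill_bytes(void *dst, unsigned char value, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "cld\n\t"
        "rep; stosb"
        : "+D" (dst), "+c" (count)   /* both are modified by the string op */
        : "a" (value)                /* AL is only read                    */
        : "memory", "cc");           /* memory is written behind gcc's back */
#else
    unsigned char *p = dst;          /* portable fallback */
    while (count--)
        *p++ = value;
#endif
}
```

With this form nothing needs to be listed as clobbered apart from "memory" and "cc", since gcc already knows the two operands come back changed.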

-- 
george DOT foot AT merton DOT oxford DOT ac DOT uk
