Mail Archives: pgcc/2000/02/24/19:40:32
Martin Ockajak wrote:
>
> On Wed, Feb 23, 2000 at 10:27:59AM -0800, Linda Walsh wrote:
> > Using pgcc from 'Mandrake(70)', it seems to default to "-mpentium".
> > If I opt for 386, 486, or pentiumpro, it changes a
> > mov -14(%bp),%ax
> >
> > into a:
> > movzwl -14(%bp),%eax
> >
> > If I use the pgcc that comes with Suse(63), it defaults for 486, but
> > only in the case of "-mpentiumpro" does it do the above substitution.
>
> I don't know what (p)gcc versions these distros use but
> use of movz insn family depends on the setting in gcc/config/i386.c
> which in the case of the latest pgcc (2.95.3) says:
>
> const int x86_movx = m_386 | m_PPRO | m_K6;
>
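For anyone wondering how a mask like that ends up selecting instructions, here is a rough C sketch of the idea: one bit per tuning target, tested against the CPU being compiled for. The enum, the macro names and prefers_movx() are made up for illustration; the real definitions in the gcc sources may use different names and a different order.

```c
#include <assert.h>

/* Hypothetical processor list, for illustration only. */
enum processor { CPU_I386, CPU_I486, CPU_PENTIUM, CPU_PPRO, CPU_K6, CPU_COUNT };

/* One bit per tuning target. */
#define M_386     (1 << CPU_I386)
#define M_486     (1 << CPU_I486)
#define M_PENTIUM (1 << CPU_PENTIUM)
#define M_PPRO    (1 << CPU_PPRO)
#define M_K6      (1 << CPU_K6)

/* Same shape as the x86_movx line quoted above. */
static const int movx_mask = M_386 | M_PPRO | M_K6;

/* Nonzero when the chosen tuning target is in the mask,
   i.e. when movzbl/movzwl would be preferred. */
static int prefers_movx(enum processor cpu)
{
    assert((int)cpu >= 0 && (int)cpu < CPU_COUNT);
    return (movx_mask & (1 << cpu)) != 0;
}
```

With this mask, prefers_movx(CPU_PPRO) is nonzero and prefers_movx(CPU_PENTIUM) is zero, so a distro whose compiler carries a different mask would show different behaviour for the same -m option.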
> > So....I'm not familiar with the movzwl instruction. What does it
> > do and how does its timing compare with the 'mov'. It looks like
> > a "move word and zero top 16 bits".
>
> Exactly.
> The problem is whether to use
>
> xorl %reg1,%reg1
> movw disp(%reg0),%reg1
>
> or single
>
> movzwl disp(%reg0),%reg1
>
> > My guess is that this is the
> > cause for the slowdown?
>
> Surely not on Pentium.
> On other CPUs, this is questionable.
> AFAIK, on Athlon, probably on PPro and K6, movzxx are faster.
To be more specific, on some CPUs (at least on K6) movw requires a prefix,
so it cannot be decoded together with any other instruction in the same
cycle and has to be executed in the alux unit. As far as I remember, Pentium
Pro has a big penalty when only parts of a register are used. So on K6 the
sequence xorl, movw takes one and a half cycles, but movzwl takes only half
a cycle (half, because it can almost always be paired).
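Just to make the semantics concrete, here is a small C model of what each variant leaves in the 32-bit destination register. It only models the register contents, not the timing, and the emu_* names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* movw mem,%ax: only the low 16 bits are written,
   the upper half of the register keeps its old value. */
static uint32_t emu_movw(uint32_t reg, uint16_t mem)
{
    return (reg & 0xFFFF0000u) | mem;
}

/* xorl %eax,%eax ; movw mem,%ax */
static uint32_t emu_xorl_movw(uint32_t reg, uint16_t mem)
{
    reg ^= reg;                 /* clear the whole register first */
    return emu_movw(reg, mem);
}

/* movzwl mem,%eax: load 16 bits and zero the top 16 bits. */
static uint32_t emu_movzwl(uint16_t mem)
{
    return (uint32_t)mem;
}

/* The two-instruction sequence and movzwl must agree for any input. */
static void check_equivalent(uint32_t old_reg, uint16_t mem)
{
    assert(emu_xorl_movw(old_reg, mem) == emu_movzwl(mem));
}
```

A plain movw on its own is the odd one out: it leaves the old upper half in place, which is the partial register use mentioned above.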
>
> > Oddly, under SuSE, the 486 has the same alignment (.16) as the pentium
> > option does on Mandrake. Switching the two on the respective OS's,
> ^^^
> > both result in a .4 alignment. Of course this makes no sense.
>
> If I understand you correctly, you mean -malign-xxxx=2.
> For gcc on the i386 architecture, alignment is set as a power of 2,
> so this is correct.
>
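If the setting is read as an exponent, as described above, then a value of 2 means 4-byte alignment and a value of 4 means 16-byte alignment. A minimal C sketch of that arithmetic follows; align_bytes() and align_up() are names I made up, not anything from gcc.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a power-of-two exponent into a byte count. */
static uint32_t align_bytes(unsigned log2_align)
{
    assert(log2_align < 32);
    return UINT32_C(1) << log2_align;
}

/* Round addr up to the next multiple of 2^log2_align. */
static uint32_t align_up(uint32_t addr, unsigned log2_align)
{
    uint32_t mask = align_bytes(log2_align) - 1;
    return (addr + mask) & ~mask;
}
```

For example, align_up(0x1001, 4) gives 0x1010, while an address that is already a multiple of 16 is returned unchanged.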
> > RH seems to default to the 386 option. Their 486 give a .16
> > alignment, but -mpentium gives a .4 alignment, and the pentiumpro
> > option gives .4 alignment but with the 'movzwl' instructions.
> >
> > So exactly what *SHOULD* be the correct settings and should movzwl's
> > be faster than movw's on any arch?
>
> Code and data alignment is often a very non-trivial problem.
> I'm afraid I can't give definite answers, if anybody can.
> For more info, see aligning related discussions in the gcc and pgcc
> mailing lists archives.
>
> > Oh -- also, the 386 opt was the only one that used the "leave" instruction
> > to fixup the stack frame on exit. All others use the 2 mov instructions.
> > Did leave become slower on all subsequent x86's but it was faster on the
> > 386?
>
> "leave" is faster on i386, K6 and Athlon, but slower on the rest.
I wonder whether leave is really faster on K6; at least, it uses the long
decoder and therefore cannot be paired (I mean, decoded in the same cycle
with any other instruction).
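For reference, leave does the same work as the usual two-instruction epilogue, movl %ebp,%esp followed by popl %ebp, so the choice between them is purely a matter of decoding and timing. Here is a toy C model of both, assuming a tiny word-indexed stack; struct cpu and the emu_* functions are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine state: esp and ebp are word indexes into stack[],
   and popping moves esp towards higher indexes. */
struct cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t stack[STACK_WORDS];
};

/* leave, as a single step */
static void emu_leave(struct cpu *c)
{
    assert(c->ebp < STACK_WORDS);
    c->esp = c->ebp;              /* drop the locals */
    c->ebp = c->stack[c->esp];    /* restore the caller's frame pointer */
    c->esp += 1;
}

/* movl %ebp,%esp ; popl %ebp, as two separate steps */
static void emu_mov_pop(struct cpu *c)
{
    assert(c->ebp < STACK_WORDS);
    c->esp = c->ebp;              /* movl %ebp,%esp */
    c->ebp = c->stack[c->esp];    /* popl %ebp: load ... */
    c->esp += 1;                  /* ... then bump the stack pointer */
}
```

Both functions leave the machine in the same state, which is why the compiler is free to pick whichever form the target CPU handles better.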
>
> > Thanks...
> > -linda
>
> --
> Martin Ockajak a.k.a. Mandos <mandos AT hq DOT alert DOT sk> http://hq.alert.sk/~mandos
> "The goal of Computer Science is to build something that will last at
> least until we've finished building it."
>