Mail Archives: pgcc/1998/09/22/19:23:45
On Tue, Sep 22, 1998 at 12:24:11PM -0500, Steven Snyder wrote:
> I often see sequences like this in code generated by pgcc v1.1a:
>
> 115 00bf A100000000 movl ptrbase,%eax ; load pointer
> 117 00c9 B920000000 movl $32,%ecx ; load value to write
> 118 00ce 8888AC074000 movb %cl,4196268(%eax) ; write val to ptr addr
>
> Note that 0x00000020 is moved into reg ECX, then reg CL is actually used.
> Why is pgcc bloating up the code with those bytes (the high 24 bits of
> ECX) which will never be used?
While it's unnecessary in that example, these operations often clear the
higher part of a register, which can later be reused.
> My understanding is that you get the same AGI conditions for any register
There is no AGI (Address Generation Interlock) involved. The only problem is
mixed-size access to registers on the PPro, but the PPro has specially
optimized the above case (so it is fast).
> (e.g. ECX) or *partial* register (e.g. CL) used, so I see no benefit to
> using the 32-bit reg when the lower 8-bit reg will do.
On both Pentium and PPro, using the full 32-bit register is often faster
than using it only as a 16-bit register (and, to a lesser extent, as an
8-bit one). That's why pgcc tries to keep most operands as 32-bit values,
to avoid having to sign-extend them later.
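
As a rough illustration of the widening step mentioned above (this is a Python
model, not pgcc output), the sketch below shows what sign-extending or
zero-extending an 8-bit value to 32 bits amounts to. On x86 this is the work
done by instructions such as movsbl and movzbl. The function names are
hypothetical and chosen only for this example.

```python
MASK32 = 0xFFFFFFFF  # width of a full register such as %ecx


def sign_extend8(value):
    """Widen the low 8 bits of value to 32 bits, copying bit 7 upward."""
    low = value & 0xFF
    if low & 0x80:                 # negative when read as a signed byte
        return (low | 0xFFFFFF00) & MASK32
    return low


def zero_extend8(value):
    """Widen the low 8 bits of value to 32 bits, filling with zeros."""
    return value & 0xFF


if __name__ == "__main__":
    for v in (0x20, 0x80, 0xDEADBE20):
        print(hex(v), hex(sign_extend8(v)), hex(zero_extend8(v)))
```

A value that already lives in a full 32-bit register needs neither step before
being used in 32-bit arithmetic or as an index.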
> Using only CL in the code above would have reduced the total size of this
> pointer operation from 16 bytes to 13 bytes. This sounds like a Good
> Thing to me.
This is true. But have you measured the speed? Apart from cache effects,
movl is as fast as movb, but it has the additional effect of clearing the
upper part of ecx.
It certainly would be nice for the -Os option, though.
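
To make the trade-off concrete, here is a small Python sketch (a model, not
pgcc output) that adds up the instruction sizes and shows the register effect
of each load. The encodings for the pointer load, the movl and the store are
copied from the quoted listing. The two-byte encoding for movb $32,%cl (B1 20)
is an assumption based on the standard x86 "B0+reg, imm8" form, and the helper
names are hypothetical.

```python
# Encodings from the quoted listing, except MOVB_IMM (assumed, see above).
LOAD_PTR = bytes.fromhex("A100000000")    # movl ptrbase,%eax
MOVL_IMM = bytes.fromhex("B920000000")    # movl $32,%ecx
MOVB_IMM = bytes.fromhex("B120")          # movb $32,%cl  (assumed encoding)
STORE_CL = bytes.fromhex("8888AC074000")  # movb %cl,4196268(%eax)


def sequence_size(*instructions):
    """Total size in bytes of a sequence of encoded instructions."""
    return sum(len(insn) for insn in instructions)


def movl_imm(old_ecx, imm):
    """Model movl $imm,%ecx: the whole register is replaced."""
    return imm & 0xFFFFFFFF


def movb_imm(old_ecx, imm):
    """Model movb $imm,%cl: only the low byte changes, the rest is kept."""
    return (old_ecx & 0xFFFFFF00) | (imm & 0xFF)


if __name__ == "__main__":
    print(sequence_size(LOAD_PTR, MOVL_IMM, STORE_CL))  # 32-bit load variant
    print(sequence_size(LOAD_PTR, MOVB_IMM, STORE_CL))  # 8-bit load variant
    stale = 0xDEADBEEF                                  # leftover register contents
    print(hex(movl_imm(stale, 32)), hex(movb_imm(stale, 32)))
```

The three bytes saved by the 8-bit variant come with stale upper bits in %ecx,
so the register cannot be reused as a 32-bit value without clearing it first.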
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / pcg AT goof DOT com |e|
-=====/_/_//_/\_,_/ /_/\_\ --+
The choice of a GNU generation |
|