X-pop3-spooler: POP3MAIL 2.1.0 b 4 980420 -bs-
Message-ID: <19980922211637.43150@cerebro.laendle>
Date: Tue, 22 Sep 1998 21:16:37 +0200
From: Marc Lehmann <pcg AT goof DOT com>
To: pgcc mailing list <beastium-list AT Desk DOT nl>
Subject: Re: Why use 32-bit reg for 8-bit value?
Mail-Followup-To: pgcc mailing list <beastium-list AT Desk DOT nl>
References: <Pine DOT SUN DOT 3 DOT 96 DOT 980922120854 DOT 9424A-100000 AT indy1>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <Pine.SUN.3.96.980922120854.9424A-100000@indy1>; from Steven Snyder on Tue, Sep 22, 1998 at 12:24:11PM -0500
X-Operating-System: Linux version 2.1.122 (root AT cerebro) (gcc version pgcc-2.92.06 19980914 (gcc2 ss-980609 experimental)) 
Status: RO
Content-Length: 2123
Lines: 45

On Tue, Sep 22, 1998 at 12:24:11PM -0500, Steven Snyder wrote:
> I often see sequences like this in code generated by pgcc v1.1a:
> 
>  115 00bf A100000000    movl ptrbase,%eax       ; load pointer
>  117 00c9 B920000000    movl $32,%ecx           ; load value to write
>  118 00ce 8888AC074000  movb %cl,4196268(%eax)  ; write val to ptr addr
> 
> Note that 0x00000020 is moved into reg ECX, then reg CL is actually used.
> Why is pgcc bloating up the code with those bytes (the high 24 bits of
> ECX) which will never be used?  

While it's unnecessary in that example, these operations often clear the
upper part of a register, which can then be reused later.

> My understanding is that you get the same AGI conditions for any register

There is no AGI (Address Generation Interlock) involved. The only problem is
mixed-size access to registers on the PPro, but the PPro has specially
optimized the above case, so it is fast.

> (e.g. ECX) or *partial* register (e.g. CL) used, so I see no benefit to
> using the 32-bit reg when the lower 8-bit reg will do.  

On both the Pentium and the PPro, using the full 32-bit register is often
faster than using it only as a 16-bit register (and, to a lesser extent, as
an 8-bit one). That's why pgcc tries to keep most operands as 32-bit values,
to avoid having to sign-extend them later.

> Using only CL in the code above would have reduced the total size of this 
> pointer operation from 16 bytes to 13 bytes.  This sounds like a Good
> Thing to me.

This is true. But have you measured the speed? Apart from cache effects,
movl is as fast as movb, but it has the additional effect of clearing the
upper part of ecx.
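
To make the "clearing the upper part" point concrete, here is a rough C
model of the register contents. It is only a sketch: the function names are
made up for illustration, it says nothing about how pgcc works internally,
and it models values only, not timing.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit register such as %ecx.
 * These helpers are illustrative only; they are not part of pgcc. */

/* movl $imm,%ecx: replaces all 32 bits, so the upper 24 bits are known. */
static uint32_t model_movl(uint32_t ecx, uint32_t imm)
{
    (void)ecx;                          /* old contents are discarded */
    return imm;
}

/* movb $imm,%cl: replaces only the low 8 bits; the upper 24 stay stale. */
static uint32_t model_movb(uint32_t ecx, uint8_t imm)
{
    return (ecx & 0xFFFFFF00u) | imm;
}

/* movzbl %cl,%ecx: the extra zero-extension needed after a movb
 * before the full register can be used as a 32-bit value. */
static uint32_t model_movzbl(uint32_t ecx)
{
    return ecx & 0xFFu;
}
```

After model_movl() the whole register holds the value and can be reused as
a 32-bit operand right away. After model_movb() the low byte is correct, but
the rest is whatever was there before, so a later 32-bit use needs something
like model_movzbl() first.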

It certainly would be nice for the -Os option, though.

      -----==-                                              |
      ----==-- _                                            |
      ---==---(_)__  __ ____  __       Marc Lehmann       +--
      --==---/ / _ \/ // /\ \/ /       pcg AT goof DOT com       |e|
      -=====/_/_//_/\_,_/ /_/\_\                          --+
    The choice of a GNU generation                        |
                                                          |