Mail Archives: djgpp/1998/04/24/00:02:06

From: Bill Currie <bill AT taniwha DOT tssc DOT co DOT nz>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Question about optimization of DJGPP
Date: Thu, 23 Apr 1998 19:18:36 +1200
Organization: NetLink Wellington, New Zealand.
Lines: 82
Message-ID: <353EEB4C.C90047DE@taniwha.tssc.co.nz>
References: <199804130936 DOT LAA13390 AT euronet DOT nl> <01bd68a5$c8e3b260$151601bf AT cb001687> <6hko88$bea$1 AT star DOT cs DOT vu DOT nl>
NNTP-Posting-Host: nzlu02.tssc.co.nz
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Ruiter de M wrote:
> 
> > > I've made a little program and when I looked in the disassembler
> > > window while running the program, I saw movl %eax,%eax. Why is
> > > this instruction there? Did I forget to turn on optimization
> > > somewhere?
> 
> Be sure to use -O or -O2 on the command-line. But I don't think
> that's the problem. Could be because the compiler tries to align
> jump-labels at 4 (or 8) byte boundaries for speed. Maybe the movl
> %eax,%eax is faster than two nop's?

On a 386, both `movl %eax,%eax' and `nop' take two cycles (one for 486?)
and so, yes, `movl %eax,%eax' is faster than two `nop's.

> > Uhm.. It's because the compiler is like stupid and stuff..
> 
> No it's not. GCC is (at least one of) the best optimizing portable
> compiler around.

Actually, it isn't the best for Pentiums :(, but people are working on
that (egcs).  However, in this case it isn't GCC doing this optimisation
at all, it's the assembler (as).  When gas sees a `.align' directive
(gcc spits these out when it wants a certain alignment) in the `.text'
section, it tries to do the alignment in one instruction, using up to 16
bytes for the instruction (any more and the cpu would either GPF or ILL
OP).  
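
To make the padding arithmetic concrete, here is a minimal sketch of how many filler bytes an aligner needs at a given offset. This is not code from gas; the name `pad_bytes` is hypothetical and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of gas): number of filler bytes needed
   to advance `offset` to the next multiple of `alignment`.
   Assumes `alignment` is a non-zero power of two. */
static size_t pad_bytes(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Distance to the next boundary; the final mask turns a full
       `alignment` into 0 when the offset is already aligned. */
    return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}
```

For example, a label that would land at offset 0x1e with 4-byte alignment needs 2 bytes of filler, which is the gap a single two-byte `movl` can cover.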

Here is a small excerpt from tc-i386.c in the gas source tree
(binutils-2.8.1.0.15):

  /* Various efficient no-op patterns for aligning code labels.  */
  /* Note: Don't try to assemble the instructions in the comments. */
  /*       0L and 0w are not legal */
  static const char f32_1[] =
    {0x90};                                     /* nop                  */
  static const char f32_2[] =
    {0x89,0xf6};                                /* movl %esi,%esi       */
  static const char f32_3[] =
    {0x8d,0x76,0x00};                           /* leal 0(%esi),%esi    */
  static const char f32_4[] =
    {0x8d,0x74,0x26,0x00};                      /* leal 0(%esi,1),%esi  */
  static const char f32_5[] =
    {0x90,                                      /* nop                  */
     0x8d,0x74,0x26,0x00};                      /* leal 0(%esi,1),%esi  */
  static const char f32_6[] =
    {0x8d,0xb6,0x00,0x00,0x00,0x00};            /* leal 0L(%esi),%esi   */
  static const char f32_7[] =
    {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};       /* leal 0L(%esi,1),%esi */
  static const char f32_8[] =
    {0x90,                                      /* nop                  */
     0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};       /* leal 0L(%esi,1),%esi */
  static const char f32_9[] =
    {0x89,0xf6,                                 /* movl %esi,%esi       */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
  static const char f32_10[] =
    {0x8d,0x76,0x00,                            /* leal 0(%esi),%esi    */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
  static const char f32_11[] =
    {0x8d,0x74,0x26,0x00,                       /* leal 0(%esi,1),%esi  */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */

I hope this helps clear things up a little
Bill
-- 
Leave others their otherness.
