From: Bill Currie
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Question about optimization of DJGPP
Date: Thu, 23 Apr 1998 19:18:36 +1200
Organization: NetLink Wellington, New Zealand.
Lines: 82
Message-ID: <353EEB4C.C90047DE@taniwha.tssc.co.nz>
References: <199804130936 DOT LAA13390 AT euronet DOT nl> <01bd68a5$c8e3b260$151601bf AT cb001687> <6hko88$bea$1 AT star DOT cs DOT vu DOT nl>
NNTP-Posting-Host: nzlu02.tssc.co.nz
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

Ruiter de M wrote:
>
> > > I've made a little program and when I looked in the disassembler
> > > window while running the program, I saw movl %eax,%eax. Why is
> > > this instruction there? Did I forget to turn on optimization
> > > somewhere?
>
> Be sure to use -O or -O2 on the command-line. But I don't think
> that's the problem. Could be because the compiler tries to align
> jump-labels at 4 (or 8) byte boundaries for speed. Maybe the movl
> %eax,%eax is faster than two nop's?

On a 386, both `movl %eax,%eax' and `nop' take two cycles (one for
486?) and so, yes, a single two-byte `movl %eax,%eax' is faster than
two one-byte `nop's.

> > Uhm.. It's because the compiler is like stupid and stuff..
>
> No it's not. GCC is (at least one of) the best optimizing portable
> compiler around.

Actually, it isn't the best for Pentiums :(, but people are working on
that (egcs). However, in this case it isn't GCC doing this
optimisation at all, it's the assembler (gas). When gas sees a `.align'
directive in the `.text' section (gcc emits these whenever it wants a
certain alignment), it tries to do the alignment in one instruction,
using up to 16 bytes for the instruction (any more and the CPU would
raise either a GPF or an illegal opcode fault).

Here is a small excerpt from tc-i386.c in the gas source tree
(binutils-2.8.1.0.15):

/* Various efficient no-op patterns for aligning code labels.  */
/* Note: Don't try to assemble the instructions in the comments. */
/*       0L and 0w are not legal */
static const char f32_1[] =
  {0x90};                               /* nop                  */
static const char f32_2[] =
  {0x89,0xf6};                          /* movl %esi,%esi       */
static const char f32_3[] =
  {0x8d,0x76,0x00};                     /* leal 0(%esi),%esi    */
static const char f32_4[] =
  {0x8d,0x74,0x26,0x00};                /* leal 0(%esi,1),%esi  */
static const char f32_5[] =
  {0x90,                                /* nop                  */
   0x8d,0x74,0x26,0x00};                /* leal 0(%esi,1),%esi  */
static const char f32_6[] =
  {0x8d,0xb6,0x00,0x00,0x00,0x00};      /* leal 0L(%esi),%esi   */
static const char f32_7[] =
  {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */
static const char f32_8[] =
  {0x90,                                /* nop                  */
   0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */
static const char f32_9[] =
  {0x89,0xf6,                           /* movl %esi,%esi       */
   0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
static const char f32_10[] =
  {0x8d,0x76,0x00,                      /* leal 0(%esi),%esi    */
   0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
static const char f32_11[] =
  {0x8d,0x74,0x26,0x00,                 /* leal 0(%esi,1),%esi  */
   0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */

I hope this helps clear things up a little

Bill
--
Leave others their otherness.
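
P.S. To make the idea concrete, here is a rough sketch of how a table like the one above could be used to fill an alignment gap. It is not the gas code: the names `nop_patterns', `pad_needed' and `fill_pad' are made up for illustration, the lookup-by-length scheme is an assumption, and only the 1 to 11 byte patterns quoted above are included (anything longer is simply refused). The arrays use unsigned char here to avoid narrowing values like 0x8d.

```c
#include <stddef.h>
#include <string.h>

/* Byte patterns copied from the excerpt above (lengths 1 to 11 only). */
static const unsigned char nop_1[]  = {0x90};
static const unsigned char nop_2[]  = {0x89,0xf6};
static const unsigned char nop_3[]  = {0x8d,0x76,0x00};
static const unsigned char nop_4[]  = {0x8d,0x74,0x26,0x00};
static const unsigned char nop_5[]  = {0x90,0x8d,0x74,0x26,0x00};
static const unsigned char nop_6[]  = {0x8d,0xb6,0x00,0x00,0x00,0x00};
static const unsigned char nop_7[]  = {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};
static const unsigned char nop_8[]  = {0x90,0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};
static const unsigned char nop_9[]  = {0x89,0xf6,
                                       0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};
static const unsigned char nop_10[] = {0x8d,0x76,0x00,
                                       0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};
static const unsigned char nop_11[] = {0x8d,0x74,0x26,0x00,
                                       0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};

/* Hypothetical lookup table: entry n-1 holds the n-byte pattern. */
static const unsigned char *const nop_patterns[] = {
    nop_1, nop_2, nop_3, nop_4, nop_5, nop_6,
    nop_7, nop_8, nop_9, nop_10, nop_11
};

#define MAX_PAD (sizeof nop_patterns / sizeof nop_patterns[0])

/* Bytes needed to bring `offset' up to a multiple of `align'.
   `align' is assumed to be a power of two. */
size_t pad_needed(size_t offset, size_t align)
{
    return (align - (offset & (align - 1))) & (align - 1);
}

/* Copy the n-byte no-op pattern into buf.  Returns n on success, or 0
   if n is zero or longer than the patterns quoted in the excerpt. */
size_t fill_pad(unsigned char *buf, size_t n)
{
    if (n == 0 || n > MAX_PAD)
        return 0;
    memcpy(buf, nop_patterns[n - 1], n);
    return n;
}
```

For example, a label at offset 13 that wants 16-byte alignment needs `pad_needed(13, 16)' = 3 bytes, and `fill_pad(buf, 3)' would then write the three-byte `leal 0(%esi),%esi' pattern rather than three separate `nop's.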