Sender: bill AT delorie DOT com
Message-ID: <377EA4A6.EB687A6D@taniwha.org>
Date: Sun, 04 Jul 1999 12:02:46 +1200
From: Bill Currie
X-Mailer: Mozilla 4.6 [en] (X11; I; Linux 2.2.9 i486)
X-Accept-Language: en
MIME-Version: 1.0
To: djgpp-workers AT delorie DOT com
Subject: Re: .align directives in libc.a
References: <377BB217 DOT 2FFBAEA8 AT inti DOT gov DOT ar> <377C5986 DOT 1B33420B AT taniwha DOT org> <377CB4A1 DOT BB72A3EA AT inti DOT gov DOT ar>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com

salvador wrote:

> Yes, most processors uses 32, but that's also a waste if you routine is around 32
> bytes and you pad it with 60 bytes ;-) (30+30).
> 32 bytes is too much and you start losing from other things. I don't know how MSVC
> determines when 32 bytes is good idea or not, perhaps is related to the size of the
> functions. BTW MSVC also exploits "proximity" by moving functions closer to the
> caller (mostly small static ones, that's usually better than inlining if the function
> have more than a couple of lines).

I believe the goal is to land on the beginning of a cache line if you
have to have a cache miss. However, for small functions, I agree that if
you can pack more than one function into a cache line you will win more
often. You will also have a smaller

> > > But looks like the most sensitive stuff is the entry point of functions, not the
> > > align of loops or jumps.
> >
> > Nope, any destination: functions, loops and jumps are all equally
> > important.
>
> Not for K6 and not for Pentium MMX, I tried it. In fact MSVC do *not* align jumps or
> loops. In K6 processors it could be even worst (if a loop is in a xxxxxC memory
> address works slowly), in Pentium MMX de difference is very small.

Hmm, interesting. I'll believe it, as you've obviously actually measured
it. Hmmm, this implies an improvement in the handling of jump
instructions on those processors.

> I think aligning jumps and loops is only good idea for big functions, small functions
> could need more cache lines if you add bytes inside.

Agreed, mostly. I would say from your earlier comments about loop
alignment that even big functions could benefit from unaligned loops on
newer processors. I think you would have to check this out (I can't yet:
I've only got a 486 and a 386, but not for too much longer :)

> Currently I think MSVC have better ideas than gcc because generates faster code.

GCC is, unfortunately, held back by its portability and multiple
targets. MS gets to concentrate on just x86, and probably just the newer
ones, and thus can do more clever tricks than gcc can ATM.

> Take a look to my compila.html page.

Is this the one you posted recently? I had a look at that one and
noticed MSVC was a little faster, but (from memory) not always.

> I think Gas should have conditional aligment instructions, like: "align it if all the
> references are at 64 or more bytes of distance"

Probably too hard and maybe not worth the effort. I think this would
require too many passes.

Bill
--
Leave others their otherness.