From: Andrew Crabtree
Message-Id: <199705131619.AA156030341@typhoon.rose.hp.com>
Subject: Re: Any tips on optimizing C code?
To: quacci AT vera DOT com (jon)
Date: Tue, 13 May 1997 9:19:00 PDT
Cc: djgpp AT delorie DOT com
In-Reply-To: <33775c59.19219875@news.cis.yale.edu>; from "jon" at May 12, 97 6:15 pm
Precedence: bulk

> C code. In the specific thing I am writing, I've already done the
> obvious things, like switched most calcs from FP to integer, using bit
> shifting wherever possible for multiplying and dividing, etc. But is

This kind of optimization is likely unneeded these days. Intel's
optimization guide states that on a Pentium Pro a multiply should only
be replaced by a single shift. On a Pentium, a shift, a mov, and a
subtract/add beats a multiply, but on the Pentium Pro that no longer
holds. The guide has a small algorithm, based on the number of bits set
in the constant, that tells you whether to use shifts/adds on a given
processor. Unless you're using a 386 or 486 it's probably not worth it.
GCC will find the simple ones (like powers of 2) on its own.

> Like, does running a "for" loop by
> decrementing rather than incrementing actually save a cycle?

Running a for loop that terminates on 0 saves a couple of cycles.
Again, GCC figures out how to do this for you most of the time. I have
seen it do both of these.

    for (i=0; i<10; i++)

    1)        mov  ECX, 10
        loop:
              ...
              dec  ECX
              jnz  loop

    2)        mov  ECX, -10
        loop:
              ...
              inc  ECX
              jnz  loop

Both eliminate a compare instruction, because dec/inc already set the
zero flag that jnz tests.

> or does a
> "case" command actually beat a series of "if"s?

If you can get a case statement optimized to a jump table (for a lot of
cases), it is much faster than a chain of ifs.

You should probably play around with the different optimization flags
and look at the assembly they produce. You'd be surprised at how good
the compiler can be sometimes (you'd also be surprised how dumb it can
be).

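
One way to run that experiment is a small test file like the one below.
This is only a sketch: the file name opt_demo.c and every function name
in it are made up for illustration, and what the compiler emits will
depend on your GCC version and flags. Each pair of functions computes
the same thing two ways (plain multiply vs. shift/subtract, count-up
vs. count-down loop, switch vs. if chain), so you can compare the
generated assembly side by side.

```c
/* opt_demo.c (hypothetical file name; all names here are illustrative) */
#include <assert.h>

/* Multiply by 7 written plainly; the compiler chooses the instructions. */
unsigned mul7_plain(unsigned x) { return x * 7u; }

/* Hand-expanded version: 7*x == 8*x - x, i.e. one shift and one subtract. */
unsigned mul7_shift(unsigned x) { return (x << 3) - x; }

/* Sum 10 elements counting up; the loop test compares i against 10. */
unsigned sum_up(const unsigned *a)
{
    unsigned s = 0;
    int i;
    for (i = 0; i < 10; i++)
        s += a[i];
    return s;
}

/* Same sum counting down to 0, so the loop test is a compare with zero. */
unsigned sum_down(const unsigned *a)
{
    unsigned s = 0;
    int i;
    for (i = 10; i != 0; i--)
        s += a[i - 1];
    return s;
}

/* Dense case values: a candidate for a jump table or lookup table. */
int op_switch(int op)
{
    switch (op) {
    case 0: return 3;
    case 1: return 14;
    case 2: return 15;
    case 3: return 92;
    case 4: return 65;
    case 5: return 35;
    default: return -1;
    }
}

/* The same mapping as a chain of ifs, tested one value at a time. */
int op_ifs(int op)
{
    if (op == 0) return 3;
    if (op == 1) return 14;
    if (op == 2) return 15;
    if (op == 3) return 92;
    if (op == 4) return 65;
    if (op == 5) return 35;
    return -1;
}

/* Sanity check: each pair must agree before timing or reading assembly. */
void check_variants(void)
{
    unsigned a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int op;
    assert(mul7_plain(6u) == mul7_shift(6u));
    assert(sum_up(a) == sum_down(a));
    for (op = -1; op <= 6; op++)
        assert(op_switch(op) == op_ifs(op));
}
```

Compiling with something like "gcc -O2 -S opt_demo.c" writes the
assembly to opt_demo.s; repeating with -O0 or other flags shows how
much of this the compiler already does without your help. There is no
main() on purpose, since the file is meant to be read as assembly
rather than run.
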
I would be careful using -m486, as it can often lead to slower
performance on Pentium-class machines or newer (it gets alignment wrong
and uses shifts where it should multiply).

I now have an environment set up where I can rebuild gcc 2.8 with the
Pentium optimization patches, and I will make the compiled EXEs
available to whoever wants them. My preliminary testing shows a pretty
decent speed increase.

Your best bet in a high-level language is algorithm choice and overall
program structure.

Andrew