Mail Archives: pgcc/1999/09/04/00:06:11

Date: Fri, 3 Sep 1999 22:02:13 +0200
From: Marc Lehmann <pcg AT opengroup DOT org>
To: pgcc AT delorie DOT com
Subject: [xomicron AT chat DOT ru: May be new peephole optimizations in pgcc.]
Message-ID: <19990903220213.D610@cerebro.laendle>
Mail-Followup-To: pgcc AT delorie DOT com
Mime-Version: 1.0
X-Operating-System: Linux version 2.2.12 (root AT cerebro) (gcc driver version pgcc-2.95.1 19990816 (release) executing gcc version 2.7.2.3)
Sender: Marc Lehmann <pcg AT goof DOT com>
Reply-To: pgcc AT delorie DOT com

Hi all!

Vadim Suhomlinov sent me the following suggestions, most of them relatively
easy to implement.

If anybody wants to have a look at pgcc or gcc and maybe write a patch, this
would be a start!

----- Forwarded message from vadim suhomlinov -----

Subject: May be new peephole optimizations in pgcc.

1) -fschedule-insns may be desirable on the PII. On bzip2 it improves
performance by 9%.
2) Problems with too few registers can be avoided by using the MMX registers
as general-purpose registers with the -mmx option.
3) Putting emms directly after the MMX code is faster than putting emms in
the function epilogue.
-------------------------------------------------------------

These are peephole optimizations:
1) sin(arg) & cos(arg) -> fsincos
2) unroll strlen as shown in www.announce.com/agner (Agner Fog's Pentium
optimization manual). Also when targeting MMX.
3) fild mem / fop -> fiop mem on PPro, K6, Cyrix
4) fstp st / fstp st -> fucompp on Pentium / PPro
5) Anti-AGI feature, implemented in the peephole pass by supporting
shl/add/sub/inc/dec together with a complex address operand.
For example:   sal eax,2 / lea ecx,[eax*2+ebx+3]  ->
lea ecx,[eax*8+ebx+3] / lea eax,[eax*4]
6) fsqrt/fabs -> fsqrt
7) fldz / fucompp -> (?) ftst
8) op reg,imm1/op reg,imm2 -> op reg, imm1 op imm2
( add esp, -8 / add esp,-16  -> add esp,-24)
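
To make item 8 concrete, here is a minimal sketch of such a fold, written in
Python as a toy model rather than as a GCC peephole pattern. The tuple
representation (opcode, register, immediate) and the names COMBINE,
to_signed32 and fold_immediates are assumptions made for illustration only.
Note that for sub the combined immediate is imm1 + imm2, and that the fold is
only safe when the flags set by the first instruction are not used.

```python
# Toy peephole pass: fold two adjacent "op reg, imm" instructions into one.
# Instructions are modelled as (opcode, register, immediate) tuples; this is
# an illustrative assumption, not how GCC represents instructions.

MASK32 = 0xFFFFFFFF

# How two immediates of the same opcode combine into one equivalent immediate.
COMBINE = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a + b,  # (r - a) - b == r - (a + b)
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}


def to_signed32(value):
    """Wrap an integer to a signed 32-bit immediate."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def fold_immediates(insns):
    """Merge runs of the same foldable op on the same register.

    Assumes the condition flags produced by the earlier instruction are
    dead; a real pass would have to check that.
    """
    out = []
    for op, reg, imm in insns:
        if out and op in COMBINE:
            prev_op, prev_reg, prev_imm = out[-1]
            if prev_op == op and prev_reg == reg:
                merged = to_signed32(COMBINE[op](prev_imm, imm))
                out[-1] = (op, reg, merged)
                continue
        out.append((op, reg, imm))
    return out
```

Called on [("add", "esp", -8), ("add", "esp", -16)], the sketch returns the
single instruction ("add", "esp", -24), matching the example in item 8.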

This needs serious work (I think):
Like Intel C, do not push function parameters onto the stack with push.
Instead, patch the prologue of the calling function to reserve enough space
on the stack and use mov instructions to store the parameters. This also
makes the add esp after the function call unnecessary. Reserve the maximum
space that any call may need, determined by analysing all function calls in
the target function.
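
A rough sketch of that scheme, as a Python toy that emits Intel-syntax text
rather than real compiler code. It assumes a caller-cleanup convention in
which every argument fills one 4-byte slot and the first argument sits at
[esp+0] at the call; the function name lower_calls and the input format are
hypothetical.

```python
# Toy model of reserving the outgoing-argument area once per function.
# Assumptions (not from the original mail): caller-cleanup calls, one
# 4-byte stack slot per argument, first argument at [esp+0].


def lower_calls(calls, slot=4):
    """calls: list of (callee, [arg, ...]); returns assembly text lines."""
    # Reserve enough room for the call with the most arguments.
    area = max((len(args) for _, args in calls), default=0) * slot
    asm = [f"sub esp, {area}"] if area else []
    for callee, args in calls:
        # Store arguments with mov instead of push; esp does not move.
        for i, arg in enumerate(args):
            asm.append(f"mov dword ptr [esp+{i * slot}], {arg}")
        asm.append(f"call {callee}")  # no add esp needed after each call
    if area:
        asm.append(f"add esp, {area}")  # released once, in the epilogue
    return asm
```

For two calls f(eax, ebx) and g(ecx), the sketch reserves 8 bytes once and
emits no stack adjustment between the calls.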

----- End forwarded message -----

The last hint should be read with care, as pgcc once implemented this for
the amd-k6, but it was a loss on every cpu that was tested, including the
k6.  Most of the code to do this, however, should still be available.

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg AT goof DOT com      |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |
