Sender: graham AT delorie DOT com
Message-ID: <3737B56C.58A8D860@home.com>
Date: Mon, 10 May 1999 21:43:24 -0700
From: Graham TerMarsch
Organization: Internet specialist for hire.
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.7 i586)
X-Accept-Language: en
MIME-Version: 1.0
To: pgcc AT delorie DOT com
Subject: Re: What types of optimizations are present for the K6?
References: <37374C32 DOT 4D12565A AT home DOT com> <19990511001039 DOT K22062 AT cerebro DOT laendle>
Content-Type: multipart/mixed; boundary="------------84FD45CFA9E26AFECD30AE4F"
Reply-To: pgcc AT delorie DOT com

This is a multi-part message in MIME format.
--------------84FD45CFA9E26AFECD30AE4F
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Marc Lehmann wrote:
> > So, uh, wanted to find out a bit more about what types of optimizations we're
> > doing for K6 processors, and find out if anyone had other tips on cmd line
>
> A different scheduling is used for the k6.

Sounds fair. But what I'm more interested in finding out is whether the
rescheduling that is done for the K6 should actually result in any
performance gain over the standard '-m486 -O2' options. So far, I haven't
been able to find any.

> > FWIW, both the 'gzip' and 'xfree86' compiles were done with '-O2' for both
> > egcs and pgcc compiles.
>
> pgcc should be almost identical to egcs if only -O2 is used. Check the FAQ!

True, with -O2 the two are very close. -O6, however, is slower than -O2 on
my K6-III. I went back and rebuilt XFree86 with -O6 to see if it made any
difference, and ran 'x11perf' through most of the tests so that I could
compare it against what I had from previous runs.

Attached is the output from 'x11perfcomp -ro' showing the relative
performance of a 'stock' RH5.2 XFree86, a '-march=k6 -O2' version compiled
with pgcc, and a '-march=k6 -O6' version compiled with pgcc. The k6/O2
version is a bit faster for some things, slower on others.
However, from seeing the output of the k6/O6 version, I can't see any
improvement.

Don't take this as a slag; I can totally understand K6 support not being
as widespread or as fervently developed as standard Pentium support. I'm
really just curious to find out whether the results I'm seeing are more or
less what people expect to see out of these options.

-- 
Graham TerMarsch

// -----------------------------------------------------------------
// DIDI ... is that a MARTIAN name, or, are we in ISRAEL?
// -----------------------------------------------------------------

--------------84FD45CFA9E26AFECD30AE4F
Content-Type: text/plain; charset=us-ascii; name="x11perfcomp"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="x11perfcomp"

1: 07-K6-III-400+RivaTNT+Xfree3.3.3-128MB.x11perf
2: 08-K6-III-400+RivaTNT+Xfree3.3.3-k6-O2-128MB.x11perf
3: 09-K6-III-400+RivaTNT+Xfree3.3.3-k6-O6-128MB.x11perf

        1      2      3   Operation
---------   ----   ----   ---------
6600000.0   1.10   0.82   Dot
2070000.0   1.03   0.88   1x1 rectangle
 964000.0   1.02   0.95   10x10 rectangle
  38100.0   1.00   1.00   100x100 rectangle
   1890.0   1.01   1.00   500x500 rectangle
 439000.0   0.97   0.94   1x1 stippled rectangle (8x8 stipple)
  12600.0   0.95   0.95   10x10 stippled rectangle (8x8 stipple)
    140.0   0.95   0.96   100x100 stippled rectangle (8x8 stipple)
      5.6   0.96   0.96   500x500 stippled rectangle (8x8 stipple)
1050000.0   1.07   0.92   1x1 opaque stippled rectangle (8x8 stipple)
 467000.0   1.01   0.97   10x10 opaque stippled rectangle (8x8 stipple)
  15600.0   1.00   1.00   100x100 opaque stippled rectangle (8x8 stipple)
    745.0   1.00   1.00   500x500 opaque stippled rectangle (8x8 stipple)
1050000.0   1.07   0.92   1x1 tiled rectangle (4x4 tile)
 467000.0   1.01   0.97   10x10 tiled rectangle (4x4 tile)
  14500.0   1.08   1.07   100x100 tiled rectangle (4x4 tile)
    740.0   1.01   1.01   500x500 tiled rectangle (4x4 tile)
 439000.0   0.94   0.94   1x1 stippled rectangle (17x15 stipple)
  12400.0   0.92   0.96   10x10 stippled rectangle (17x15 stipple)
    140.0   0.96   0.96   100x100 stippled rectangle (17x15 stipple)
      5.7   0.95   0.95   500x500 stippled rectangle (17x15 stipple)
 982000.0   1.06   0.92   1x1 opaque stippled rectangle (17x15 stipple)
 453000.0   1.01   0.97   10x10 opaque stippled rectangle (17x15 stipple)
  14700.0   1.00   1.00   100x100 opaque stippled rectangle (17x15 stipple)
    737.0   1.00   1.00   500x500 opaque stippled rectangle (17x15 stipple)
1060000.0   1.06   0.92   1x1 tiled rectangle (17x15 tile)
 469000.0   1.00   0.97   10x10 tiled rectangle (17x15 tile)
  15100.0   1.00   1.00   100x100 tiled rectangle (17x15 tile)
    776.0   1.00   1.00   500x500 tiled rectangle (17x15 tile)
 439000.0   0.98   0.94   1x1 stippled rectangle (161x145 stipple)
  13200.0   0.95   0.95   10x10 stippled rectangle (161x145 stipple)
    148.0   0.95   0.95   100x100 stippled rectangle (161x145 stipple)
      6.0   0.95   0.95   500x500 stippled rectangle (161x145 stipple)
 511000.0   0.98   0.94   1x1 opaque stippled rectangle (161x145 stipple)
  18900.0   1.04   1.03   10x10 opaque stippled rectangle (161x145 stipple)
    237.0   0.96   0.94   100x100 opaque stippled rectangle (161x145 stipple)
      9.5   0.97   0.95   500x500 opaque stippled rectangle (161x145 stipple)
 827000.0   0.87   0.95   1x1 tiled rectangle (161x145 tile)
  85700.0   1.03   1.01   10x10 tiled rectangle (161x145 tile)
   2100.0   1.04   1.03   100x100 tiled rectangle (161x145 tile)
     95.5   1.05   1.04   500x500 tiled rectangle (161x145 tile)
 884000.0   1.00   0.96   1x1 tiled rectangle (216x208 tile)
  96000.0   1.00   1.00   10x10 tiled rectangle (216x208 tile)
   2700.0   1.03   1.03   100x100 tiled rectangle (216x208 tile)
    127.0   1.02   1.02   500x500 tiled rectangle (216x208 tile)
 835000.0   0.93   0.89   1-pixel line segment
 564000.0   0.96   0.93   10-pixel line segment
 136000.0   0.99   0.98   100-pixel line segment
  30900.0   1.00   0.99   500-pixel line segment
 131000.0   0.98   0.97   100-pixel line segment (1 kid)
 123000.0   0.97   0.98   100-pixel line segment (2 kids)
 117000.0   0.95   0.97   100-pixel line segment (3 kids)
 819000.0   1.02   0.94   10-pixel dashed segment
 150000.0   0.95   0.92   100-pixel dashed segment
 110000.0   1.05   1.05   100-pixel double-dashed segment
 737000.0   0.99   0.93   10-pixel horizontal line segment
 562000.0   0.97   0.94   100-pixel horizontal line segment
 403000.0   0.80   0.95   500-pixel horizontal line segment
 464000.0   0.97   0.94   10-pixel vertical line segment
 125000.0   1.00   0.98   100-pixel vertical line segment
  31000.0   1.00   0.99   500-pixel vertical line segment
 354000.0   0.90   0.92   10x1 wide horizontal line segment
 101000.0   0.98   1.03   100x10 wide horizontal line segment
  14400.0   1.00   1.00   500x50 wide horizontal line segment
 312000.0   1.01   1.04   10x1 wide vertical line segment
  75900.0   1.03   1.03   100x10 wide vertical line segment
  10300.0   1.00   0.99   500x50 wide vertical line segment
 843000.0   0.83   0.91   1-pixel line
 594000.0   0.98   0.94   10-pixel line
 140000.0   0.99   0.99   100-pixel line
  31700.0   1.00   1.00   500-pixel line
 811000.0   1.00   0.98   10-pixel dashed line
 150000.0   0.95   0.93   100-pixel dashed line
 115000.0   1.03   1.04   100-pixel double-dashed line
 123000.0   1.06   1.07   10x1 wide line
  26000.0   1.00   1.07   100x10 wide line
   5880.0   0.85   1.07   500x50 wide line
   8490.0   1.06   1.09   100x10 wide dashed line
   8150.0   1.03   1.07   100x10 wide double-dashed line
 713000.0   1.00   0.95   10x10 rectangle outline
  63800.0   1.01   1.00   100x100 rectangle outline
  13600.0   1.01   1.00   500x500 rectangle outline
 470000.0   1.01   0.98   10x10 wide rectangle outline
  47800.0   1.00   1.00   100x100 wide rectangle outline
   3860.0   1.00   1.00   500x500 wide rectangle outline
1000000.0   1.00   0.88   1-pixel circle
 361000.0   1.02   0.97   10-pixel circle
  53000.0   0.98   0.99   100-pixel circle
  11200.0   0.99   1.00   500-pixel circle
  10200.0   1.09   1.13   100-pixel dashed circle
   6760.0   1.08   1.12   100-pixel double-dashed circle
  94100.0   1.05   1.07   10-pixel wide circle
  10500.0   1.24   1.26   100-pixel wide circle
   2330.0   1.13   1.15   500-pixel wide circle
    785.0   1.16   1.22   100-pixel wide dashed circle
    862.0   1.16   1.22   100-pixel wide double-dashed circle
 231000.0   1.03   1.10   10-pixel partial circle
  56900.0   0.99   0.99   100-pixel partial circle
  12000.0   1.11   1.16   10-pixel wide partial circle
   4600.0   1.08   1.08   100-pixel wide partial circle
8380000.0   0.99   0.47   1-pixel solid circle
 177000.0   1.03   1.05   10-pixel solid circle
  29700.0   1.03   1.05   100-pixel solid circle
   2290.0   1.00   1.00   500-pixel solid circle
 116000.0   1.12   1.11   10-pixel fill chord partial circle
  29100.0   1.00   1.02   100-pixel fill chord partial circle
 108000.0   1.10   1.09   10-pixel fill slice partial circle
  26500.0   1.01   1.02   100-pixel fill slice partial circle
 370000.0   1.02   0.98   10-pixel ellipse
  57300.0   1.00   0.99   100-pixel ellipse
  12200.0   1.00   1.00   500-pixel ellipse
  11100.0   1.22   1.27   100-pixel dashed ellipse
   8270.0   1.07   1.11   100-pixel double-dashed ellipse
 111000.0   1.04   1.07   10-pixel wide ellipse
  16500.0   1.05   1.11   100-pixel wide ellipse
   3280.0   0.98   1.09   500-pixel wide ellipse
    815.0   1.20   1.21   100-pixel wide dashed ellipse
    781.0   1.18   1.20   100-pixel wide double-dashed ellipse
 234000.0   1.15   1.09   10-pixel partial ellipse
  67900.0   1.00   0.99   100-pixel partial ellipse
  11500.0   1.10   1.23   10-pixel wide partial ellipse
   2120.0   1.10   1.12   100-pixel wide partial ellipse
 218000.0   1.03   1.02   10-pixel filled ellipse
  39500.0   1.04   1.05   100-pixel filled ellipse
   4230.0   1.00   1.01   500-pixel filled ellipse
 124000.0   1.14   1.13   10-pixel fill chord partial ellipse
  41200.0   1.01   1.03   100-pixel fill chord partial ellipse
 117000.0   1.11   1.10   10-pixel fill slice partial ellipse
  37200.0   1.02   1.03   100-pixel fill slice partial ellipse
 297000.0   0.91   0.99   Fill 1x1 equivalent triangle
 113000.0   1.12   1.11   Fill 10x10 equivalent triangle
  19900.0   0.99   1.01   Fill 100x100 equivalent triangle
 301000.0   0.90   0.96   Fill 1x1 trapezoid
 145000.0   1.02   1.01   Fill 10x10 trapezoid
  26600.0   0.90   1.03   Fill 100x100 trapezoid
   4010.0   1.01   1.01   Fill 300x300 trapezoid
 209000.0   1.04   0.99   Fill 1x1 stippled trapezoid (8x8 stipple)
  12300.0   0.95   0.96   Fill 10x10 stippled trapezoid (8x8 stipple)
    139.0   0.96   0.96   Fill 100x100 stippled trapezoid (8x8 stipple)
     15.6   0.96   0.96   Fill 300x300 stippled trapezoid (8x8 stipple)
 173000.0   1.01   0.99   Fill 1x1 opaque stippled trapezoid (8x8 stipple)
  60000.0   1.06   1.03   Fill 10x10 opaque stippled trapezoid (8x8 stipple)
   6360.0   1.06   1.03   Fill 100x100 opaque stippled trapezoid (8x8 stipple)
   1530.0   1.03   1.01   Fill 300x300 opaque stippled trapezoid (8x8 stipple)
 234000.0   1.04   0.96   Fill 1x1 tiled trapezoid (4x4 tile)
  54100.0   1.04   1.04   Fill 10x10 tiled trapezoid (4x4 tile)
   6360.0   1.06   1.03   Fill 100x100 tiled trapezoid (4x4 tile)
   1360.0   1.15   1.14   Fill 300x300 tiled trapezoid (4x4 tile)
 209000.0   1.04   0.99   Fill 1x1 stippled trapezoid (17x15 stipple)
  12300.0   0.96   0.96   Fill 10x10 stippled trapezoid (17x15 stipple)
    140.0   0.96   0.96   Fill 100x100 stippled trapezoid (17x15 stipple)
     15.7   0.95   0.96   Fill 300x300 stippled trapezoid (17x15 stipple)
 174000.0   0.84   0.99   Fill 1x1 opaque stippled trapezoid (17x15 stipple)
  60300.0   1.06   1.02   Fill 10x10 opaque stippled trapezoid (17x15 stipple)
   6440.0   1.06   1.03   Fill 100x100 opaque stippled trapezoid (17x15 stipple)
   1420.0   1.02   1.01   Fill 300x300 opaque stippled trapezoid (17x15 stipple)
 230000.0   1.03   0.97   Fill 1x1 tiled trapezoid (17x15 tile)
  53900.0   1.04   1.02   Fill 10x10 tiled trapezoid (17x15 tile)
   6470.0   1.06   1.03   Fill 100x100 tiled trapezoid (17x15 tile)
   1490.0   1.03   1.01   Fill 300x300 tiled trapezoid (17x15 tile)
 208000.0   1.05   0.99   Fill 1x1 stippled trapezoid (161x145 stipple)
  12900.0   0.95   0.95   Fill 10x10 stippled trapezoid (161x145 stipple)
    147.0   0.96   0.95   Fill 100x100 stippled trapezoid (161x145 stipple)
     16.4   0.96   0.96   Fill 300x300 stippled trapezoid (161x145 stipple)
 174000.0   1.01   0.99   Fill 1x1 opaque stippled trapezoid (161x145 stipple)
  15100.0   1.05   1.03   Fill 10x10 opaque stippled trapezoid (161x145 stipple)
    224.0   0.99   0.97   Fill 100x100 opaque stippled trapezoid (161x145 stipple)
     26.4   0.96   0.94   Fill 300x300 opaque stippled trapezoid (161x145 stipple)
 228000.0   1.03   0.96   Fill 1x1 tiled trapezoid (161x145 tile)
  55700.0   1.04   1.02   Fill 10x10 tiled trapezoid (161x145 tile)
   1860.0   0.94   1.04   Fill 100x100 tiled trapezoid (161x145 tile)
    256.0   1.02   1.05   Fill 300x300 tiled trapezoid (161x145 tile)
 235000.0   1.03   0.94   Fill 1x1 tiled trapezoid (216x208 tile)
  59600.0   1.04   1.00   Fill 10x10 tiled trapezoid (216x208 tile)
   2310.0   1.03   1.04   Fill 100x100 tiled trapezoid (216x208 tile)
    327.0   1.02   1.02   Fill 300x300 tiled trapezoid (216x208 tile)
  89000.0   1.02   1.01   Fill 10x10 equivalent complex polygon
  13300.0   1.02   1.03   Fill 100x100 equivalent complex polygons
  60100.0   0.94   0.86   Fill 10x10 64-gon (Convex)
  20200.0   0.99   0.96   Fill 100x100 64-gon (Convex)
  46400.0   0.98   0.87   Fill 10x10 64-gon (Complex)
  19500.0   1.00   0.88   Fill 100x100 64-gon (Complex)
 554000.0   1.00   1.01   Char in 80-char line (6x13)
 535000.0   0.99   0.99   Char in 70-char line (8x13)
 376000.0   1.18   1.20   Char in 60-char line (9x15)
 206000.0   0.96   0.96   Char16 in 40-char line (k14)
  86100.0   0.98   0.98   Char16 in 23-char line (k24)
 879000.0   0.95   0.98   Char in 80-char line (TR 10)
 272000.0   0.97   0.99   Char in 30-char line (TR 24)
 629000.0   0.97   1.01   Char in 20/40/20 line (6x13, TR 10)
 116000.0   0.84   0.89   Char16 in 7/14/7 line (k14, k24)
 504000.0   0.96   1.00   Char in 80-char image line (6x13)
 476000.0   0.99   1.01   Char in 70-char image line (8x13)
 384000.0   1.01   1.04   Char in 60-char image line (9x15)
 187000.0   0.96   0.97   Char16 in 40-char image line (k14)
  76900.0   0.98   0.98   Char16 in 23-char image line (k24)
 743000.0   0.95   0.97   Char in 80-char image line (TR 10)
 222000.0   0.98   0.99   Char in 30-char image line (TR 24)
 154000.0   1.00   0.97   Scroll 10x10 pixels
  16700.0   1.00   0.99   Scroll 100x100 pixels
    940.0   1.01   1.00   Scroll 500x500 pixels
 152000.0   1.00   0.96   Copy 10x10 from window to window
  16200.0   1.00   0.97   Copy 100x100 from window to window
    933.0   1.01   1.01   Copy 500x500 from window to window
  87000.0   1.03   1.01   Copy 10x10 from pixmap to window
   2770.0   1.04   1.04   Copy 100x100 from pixmap to window
    121.0   1.04   1.03   Copy 500x500 from pixmap to window
  32400.0   1.01   1.00   Copy 10x10 from window to pixmap
    403.0   0.98   1.00   Copy 100x100 from window to pixmap
     15.9   0.84   0.96   Copy 500x500 from window to pixmap
 211000.0   1.06   0.99   Copy 10x10 from pixmap to pixmap
   8200.0   0.99   0.99   Copy 100x100 from pixmap to pixmap
    215.0   1.02   1.00   Copy 500x500 from pixmap to pixmap
  84600.0   1.01   1.03   Copy 10x10 1-bit deep plane
   2240.0   1.00   0.99   Copy 100x100 1-bit deep plane
     93.1   1.00   1.01   Copy 500x500 1-bit deep plane
  36500.0   1.05   1.12   Copy 10x10 n-bit deep plane
   1300.0   0.98   1.12   Copy 100x100 n-bit deep plane
     52.8   0.98   1.08   Copy 500x500 n-bit deep plane
  57700.0   1.01   0.93   PutImage 10x10 square
   1530.0   1.11   0.83   PutImage 100x100 square
     43.4   1.03   0.87   PutImage 500x500 square
    936.0   1.00   1.00   PutImage XY 10x10 square
     10.5   1.02   0.94   PutImage XY 100x100 square
      0.4   1.00   1.00   PutImage XY 500x500 square
  77200.0   1.05   1.01   ShmPutImage 10x10 square
   2690.0   1.05   1.05   ShmPutImage 100x100 square
    118.0   1.05   1.05   ShmPutImage 500x500 square
     25.4   1.04   1.11   ShmPutImage XY 10x10 square
     10.6   0.94   0.95   ShmPutImage XY 100x100 square
      0.7   0.86   0.86   ShmPutImage XY 500x500 square
  10400.0   1.01   0.77   GetImage 10x10 square
    381.0   1.01   0.94   GetImage 100x100 square
     15.1   1.00   0.95   GetImage 500x500 square
    664.0   1.14   1.11   GetImage XY 10x10 square
      8.4   0.96   0.96   GetImage XY 100x100 square
      0.3   1.00   1.00   GetImage XY 500x500 square
1640000.0   1.10   1.04   X protocol NoOperation
  17500.0   0.85   0.68   QueryPointer
  16200.0   1.03   0.69   GetProperty
 213000.0   1.00   1.00   Change graphics context
  59000.0   0.96   0.88   Create and map subwindows (4 kids)
  66200.0   0.96   0.90   Create and map subwindows (16 kids)
  65300.0   0.97   0.94   Create and map subwindows (25 kids)
  58400.0   0.96   0.95   Create and map subwindows (50 kids)
  52900.0   0.96   0.97   Create and map subwindows (75 kids)
  48300.0   0.96   0.97   Create and map subwindows (100 kids)
  29900.0   0.97   1.02   Create and map subwindows (200 kids)
 150000.0   0.95   0.96   Create unmapped window (4 kids)
 158000.0   0.94   0.96   Create unmapped window (16 kids)
 159000.0   0.94   0.96   Create unmapped window (25 kids)
 159000.0   0.94   0.97   Create unmapped window (50 kids)
 160000.0   0.94   0.97   Create unmapped window (75 kids)
 160000.0   0.94   0.93   Create unmapped window (100 kids)
 160000.0   0.94   0.96   Create unmapped window (200 kids)
  70900.0   0.95   0.91   Map window via parent (4 kids)
 108000.0   0.99   0.96   Map window via parent (16 kids)
 115000.0   1.01   0.97   Map window via parent (25 kids)
 118000.0   1.00   0.97   Map window via parent (50 kids)
 120000.0   1.01   0.97   Map window via parent (75 kids)
 121000.0   1.01   0.97   Map window via parent (100 kids)
 120000.0   1.01   0.97   Map window via parent (200 kids)
 231000.0   0.87   0.80   Unmap window via parent (4 kids)
 486000.0   0.94   0.84   Unmap window via parent (16 kids)
 558000.0   0.96   0.85   Unmap window via parent (25 kids)
 632000.0   0.98   0.87   Unmap window via parent (50 kids)
 662000.0   1.00   0.89   Unmap window via parent (75 kids)
 679000.0   1.01   0.89   Unmap window via parent (100 kids)
 709000.0   1.01   0.89   Unmap window via parent (200 kids)
 116000.0   0.91   0.88   Destroy window via parent (4 kids)
 183000.0   0.96   0.92   Destroy window via parent (16 kids)
 194000.0   0.97   0.94   Destroy window via parent (25 kids)
 205000.0   0.97   0.95   Destroy window via parent (50 kids)
 210000.0   0.97   0.95   Destroy window via parent (75 kids)
 211000.0   0.98   0.96   Destroy window via parent (100 kids)
 214000.0   0.98   0.91   Destroy window via parent (200 kids)
  35100.0   0.95   0.94   Hide/expose window via popup (4 kids)
  63200.0   1.00   0.96   Hide/expose window via popup (16 kids)
  71600.0   0.99   0.97   Hide/expose window via popup (25 kids)
  74800.0   0.99   0.96   Hide/expose window via popup (50 kids)
  77000.0   0.99   0.95   Hide/expose window via popup (75 kids)
  76700.0   1.00   0.97   Hide/expose window via popup (100 kids)
  77200.0   1.00   0.97   Hide/expose window via popup (200 kids)
  28300.0   0.90   0.89   Move window (4 kids)
  20000.0   0.94   0.92   Move window (16 kids)
  16900.0   0.95   0.92   Move window (25 kids)
  11200.0   1.07   1.02   Move window (50 kids)
   9600.0   0.96   0.93   Move window (75 kids)
   7790.0   1.00   0.96   Move window (100 kids)
   4220.0   1.02   0.99   Move window (200 kids)
 332000.0   1.10   1.00   Moved unmapped window (4 kids)
 333000.0   0.99   1.00   Moved unmapped window (16 kids)
 325000.0   1.12   1.02   Moved unmapped window (25 kids)
 327000.0   1.11   1.01   Moved unmapped window (50 kids)
 325000.0   1.10   1.00   Moved unmapped window (75 kids)
 323000.0   1.11   1.01   Moved unmapped window (100 kids)
 318000.0   1.10   1.00   Moved unmapped window (200 kids)
  96500.0   0.90   0.88   Move window via parent (4 kids)
 202000.0   0.95   0.92   Move window via parent (16 kids)
 230000.0   0.97   0.93   Move window via parent (25 kids)
 262000.0   0.98   0.94   Move window via parent (50 kids)

[ Note that some tests after this weren't run by all configurations. ]

--------------84FD45CFA9E26AFECD30AE4F--
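
With this many rows it is hard to judge "faster for some things, slower on others" by eye. One way to boil an 'x11perfcomp -ro' table like the one above down to a single figure per build is the geometric mean of each ratio column. What follows is only a minimal sketch, not part of the original message: the names parse_rows, geometric_mean and summarize are made up for illustration, and it assumes the three-column layout shown above (an absolute rate for run 1, then ratios for runs 2 and 3, then the test name).

```python
import math
import re

# One data row: absolute rate for run 1, ratio for run 2, ratio for run 3,
# then the operation name. The header and dashed rule do not match, because
# the ratio columns must contain a decimal point.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\S.*)$")


def parse_rows(text):
    """Return (rate, ratio2, ratio3, operation) tuples for every data row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rate, ratio2, ratio3, name = match.groups()
            rows.append((float(rate), float(ratio2), float(ratio3), name.strip()))
    return rows


def geometric_mean(values):
    """Geometric mean of positive numbers; the usual average for ratios."""
    values = list(values)
    if not values:
        raise ValueError("no values to average")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(text):
    """Count the tests and average each ratio column (hypothetical helper)."""
    rows = parse_rows(text)
    if not rows:
        return {"tests": 0, "run2": None, "run3": None}
    return {
        "tests": len(rows),
        "run2": geometric_mean(r[1] for r in rows),
        "run3": geometric_mean(r[2] for r in rows),
    }
```

Calling summarize() on the saved attachment text gives the test count plus one averaged ratio per build; a value below 1.00 means that build was slower than the stock server on balance. A geometric mean is used rather than an arithmetic one so that a 2x speedup on one test and a 2x slowdown on another cancel out.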