Mail Archives: djgpp/2005/11/06/21:16:37

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
From: "News Reader" <nospam AT aol DOT com>
Newsgroups: comp.os.msdos.djgpp
References: <1131105759 DOT 132511 DOT 231360 AT g47g2000cwa DOT googlegroups DOT com>
Subject: Re: missing optimization?
Date: Mon, 7 Nov 2005 03:04:26 +0100
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.2670
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2670
X-RFC2646: Format=Flowed; Original
Lines: 175
Message-ID: <436eb66f$0$4419$91cee783@newsreader02.highway.telekom.at>
NNTP-Posting-Host: 212.183.34.133
X-Trace: 1131329135 newsreader02.highway.telekom.at 4419 212.183.34.133:11932
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Why do you think the compiler is doing something strange?
The output is not what you would write by hand in assembly,
but it does correspond to your C code.


If the assembly listing is converted back to C code,
it looks something like this:


#include <stdio.h>

typedef struct { int entry; } data_t;
data_t data;

void function1(int val) {
  int eax=data.entry;
  eax-=val;
  eax=eax;  // stands in for testl: the value is unchanged
  data.entry=eax;
  if (eax<0) goto L5;
  return;

  L5: putchar('<');
}


void function2(int val) {
  int eax=data.entry;
  eax-=val;
  if (eax<0) goto L9;
  data.entry=eax;
  return;

  L9: data.entry=eax;
  putchar('<');
}
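
To convince yourself that the two reconstructed functions really do
the same thing, here is a minimal sketch of a check. The names
`emitted`, `run`, `variants_agree` and `check_samples` are my own and
not part of the original source, and a counter stands in for
putchar('<') so the result can be compared. It only uses small values
so that the subtraction cannot overflow.

```c
#include <assert.h>

typedef struct { int entry; } data_t;

static data_t data;
static int emitted;   /* hypothetical counter standing in for putchar('<') */

/* function1 as reconstructed: store first, then branch on the sign. */
static void function1(int val) {
    int eax = data.entry - val;
    data.entry = eax;
    if (eax < 0) emitted++;
}

/* function2 as reconstructed: branch first, store on both paths. */
static void function2(int val) {
    int eax = data.entry - val;
    if (eax < 0) {
        data.entry = eax;
        emitted++;
        return;
    }
    data.entry = eax;
}

/* Run one variant from a fixed starting value; report the stored value
   and whether the '<' branch was taken. */
static void run(void (*fn)(int), int start, int val, int *stored, int *flag) {
    data.entry = start;
    emitted = 0;
    fn(val);
    *stored = data.entry;
    *flag = emitted;
}

/* Returns 1 when both variants agree for the given inputs. */
int variants_agree(int start, int val) {
    int s1, f1, s2, f2;
    run(function1, start, val, &s1, &f1);
    run(function2, start, val, &s2, &f2);
    return s1 == s2 && f1 == f2;
}

/* Spot-check a few inputs on both sides of the sign boundary. */
void check_samples(void) {
    assert(variants_agree(5, 3));    /* result  2, no '<' */
    assert(variants_agree(3, 5));    /* result -2, '<'    */
    assert(variants_agree(0, 0));    /* result  0, no '<' */
}
```

Calling check_samples() from a main() of your own is enough to run it.
This checks only the C-level behaviour, not what any particular GCC
version emits.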



I have also annotated the assembly listing:

_function1:
    movl _data, %eax    // eax=data.entry
    subl 4(%esp), %eax  // eax-=val
    testl %eax, %eax    // redundant (subl already set SF), but harmless
    movl %eax, _data    // data.entry=eax

    js  L5              // if (eax<0) goto L5
    ret                 // return

L5:
    movl $60, 4(%esp)   // argument = '<' (ASCII 60)
    jmp _putchar        // tail call: putchar() returns to our caller


_function2:
    movl _data, %eax    // eax=data.entry
    subl 4(%esp), %eax  // eax-=val
    js L9               // if (eax<0) goto L9

    movl %eax, _data    // data.entry=eax
    ret                 // return

L9:
    movl %eax, _data    // data.entry=eax
    movl $60, 4(%esp)   // argument = '<' (ASCII 60)
    jmp _putchar        // tail call: putchar() returns to our caller
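
Why the testl changes nothing for the js: js looks only at the sign
flag (SF), subl sets SF to the top bit of its result, testl sets SF to
the top bit of (eax AND eax), which is the same value, and the movl in
between does not touch the flags at all. A small sketch that models
this in C follows. The function names are hypothetical, and this is an
emulation of the flag rule, not something extracted from the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Result and SF as left by "subl b, a": 32-bit wrapping difference,
   SF is its top bit. */
static uint32_t sub32(uint32_t a, uint32_t b, int *sf) {
    uint32_t r = (uint32_t)(a - b);
    *sf = (int)(r >> 31);
    return r;
}

/* SF as left by "testl r, r": r AND r is r, so again the top bit of r. */
static int sf_after_test(uint32_t r) {
    return (int)((r & r) >> 31);
}

/* Returns 1 if js would branch the same way with or without the testl. */
int testl_is_redundant(uint32_t a, uint32_t b) {
    int sf_sub;
    uint32_t r = sub32(a, b, &sf_sub);
    return sf_sub == sf_after_test(r);
}

/* Spot-check values on both sides of zero and at the wrap-around. */
void check_flags(void) {
    assert(testl_is_redundant(5u, 3u));
    assert(testl_is_redundant(3u, 5u));
    assert(testl_is_redundant(0u, 0u));
    assert(testl_is_redundant(0x80000000u, 1u));
}
```

The two instructions do differ in other flags (testl clears OF and CF,
subl sets them from the subtraction), but js does not read those.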


Remark:
Both function1 and function2 contain duplicated
and/or unneeded instructions in their assembly
listings (the testl in function1, the second store
in function2). That is what you should expect from
most compilers anyway. Furthermore, there is a fair
chance that modern CPUs will show no measurable
speed penalty for such clumsy-looking code.




<cbramix AT libero DOT it> wrote in message 
news:1131105759 DOT 132511 DOT 231360 AT g47g2000cwa DOT googlegroups DOT com...
> Hello,
> I discovered that GCC does something strange when compiling the sources
> for my embedded application.
> I have attached a very simple C source to demonstrate it.
> ------------------------------------
> #include <stdio.h>
>
> typedef struct {
>    int entry;
> } data_t;
>
> data_t data;
>
> void function1(int val)
> {
>    if ((data.entry -= val) < 0) {
>        printf("<");
>    }
> }
>
> void function2(int val)
> {
>    int v = data.entry;
>
>    if ( (v -= val) < 0) {
>        data.entry = v;
>        printf("<");
>    } else {
>        data.entry = v;
>    }
> }
>
> ------------------------------------
> 'function1' and 'function2' do the same thing.
> But the generated assembly isn't exactly what I wanted...
> Here is just the code of the functions:
>
> _function1:
>    movl _data, %eax
>    subl 4(%esp), %eax
>    testl %eax, %eax
>    movl %eax, _data
>    js  L5
>    ret
> .p2align 4,,15
> L5:
>    movl $60, 4(%esp)
>    jmp _putchar
>
> _function2:
>    movl _data, %eax
>    subl 4(%esp), %eax
>    js L9
>    movl %eax, _data
>    ret
> .p2align 4,,15
> L9:
>    movl %eax, _data
>    movl $60, 4(%esp)
>    jmp _putchar
>
> In the code of 'function1', the instruction 'subl 4(%esp), %eax'
> already sets the FLAGS.
> So I can't understand why it emits an additional 'testl %eax, %eax'.
> When I copy the value into a local variable, as I did in 'function2',
> it works as I expected.
> Unfortunately, with this solution I must store the modified value with
> 'data.entry=v;' in both branches, while the store could safely be
> placed between the SUB and JS opcodes.
> If the compiler is *newer* than 2.95, it emits the TEST opcode.
> Older versions work fine when accessing the structure directly, too.
> I compiled the source with:
>
> gcc -march=i486 -mtune=i486 -mpreferred-stack-boundary=2 -Wall demo.c
> -S -fomit-frame-pointer -O2
>
> I tried increasing the optimization level, but nothing changed.
> Changing 'i486' to another microprocessor made no difference either.
>
> The application seems to run without problems.
> However, these optimizations are missed at several critical points.
> I would feel better if the code at these points could be reduced to
> the minimum.
>
> Do you have any good ideas?
> What am I doing wrong, in your opinion?
> I would like to avoid changes to this old source, if possible...
>
> Sincerely,
>
> Carlo Bramini
> 

