Mail Archives: djgpp/2000/04/17/13:48:58

From: "Alexei A. Frounze" <alex DOT fru AT mtu-net DOT ru>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Mon, 17 Apr 2000 20:49:24 +0400
Organization: MTU-Intel ISP
Lines: 61
Message-ID: <38FB4094.DE7B5F4C@mtu-net.ru>
References: <Pine DOT LNX DOT 4 DOT 10 DOT 10004161837540 DOT 1138-100000 AT darkstar DOT grendel DOT net> <38F9D717 DOT 9438A3F6 AT mtu-net DOT ru> <8df84a DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de>
NNTP-Posting-Host: ppp102-60.dialup.mtu-net.ru
Mime-Version: 1.0
X-Trace: gavrilo.mtu.ru 955990260 40736 212.188.102.60 (17 Apr 2000 16:51:00 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 17 Apr 2000 16:51:00 GMT
Cc: buers AT gmx DOT de, eliz AT is DOT elta DOT co DOT il
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dieter Buerssner wrote:
> 
> Alexei A. Frounze wrote:
> 
> [Alexei has sent me an (almost) compilable set of sources. Looks
> quite nice - the graphics;)]
> 
> >Not really. The inner loop in my tmapper can not be written in pure C.
> >Believe me.
> 
> This is not true.

Okay, interpolate U and V over a group of pixels, and don't forget &0xFF to be
sure U and V don't exceed the 0...255 range (the span() function does this). I
doubt your C code will be as fast as my ASM. Tell me what you've got when you're
done.
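
(For reference, here is a minimal C sketch of the kind of span loop under
discussion. It assumes 16.16 fixed-point U and V with constant per-pixel
steps and a 256x256 texture of 8-bit texels. The name span_c and all of its
parameters are hypothetical and are not taken from the posted sources; it
only illustrates the &0xFF wrap and makes no claim about speed.)

```c
#include <stdint.h>

/* Hypothetical span filler: writes `count` texels to dst, stepping the
   texture coordinates linearly. u, v, du, dv are assumed to be 16.16
   fixed point; tex is assumed to be a 256x256 array of 8-bit texels. */
void span_c(uint8_t *dst, const uint8_t *tex, int count,
            int32_t u, int32_t v, int32_t du, int32_t dv)
{
    /* Unsigned arithmetic wraps modulo 2^32, which avoids signed
       overflow while U and V are being stepped. */
    uint32_t uu = (uint32_t)u;
    uint32_t vv = (uint32_t)v;

    while (count-- > 0) {
        unsigned tu = (uu >> 16) & 0xFF;  /* integer part of U, wrapped to 0..255 */
        unsigned tv = (vv >> 16) & 0xFF;  /* integer part of V, wrapped to 0..255 */
        *dst++ = tex[(tv << 8) | tu];     /* row = V, column = U */
        uu += (uint32_t)du;
        vv += (uint32_t)dv;
    }
}
```

Starting at U = 254 with a step of exactly one texel, four pixels read
columns 254, 255, 0 and 1: the mask wraps the coordinate instead of letting
it run past the edge of the texture.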

> >No compiler can figure out a trick like the one used in my ASM module.
> 
> Alexei, then you will surely accept the following bet.

First of all, don't giggle, laugh or grin at the following. It won't be a
bet, since I can't send what you want. Really. I don't have a job, and my
parents certainly won't give me money for the bet or for anything else. And
btw, AFAIK I'd have to pay for sending that package. :(( I'd like to do it
the way you suggest, but I can't.

> Level I:
> 
> I will replace about half of your inline assembly in T_Map() by
> C code, that does exactly the same thing. No change of algorithm.
> Most of the replacement will be your comments. To make the competition
> fair, I will put in your shift code in inline assembly (with correct
> constraints), because this was in the original source you posted here,
> and because the C replacement was my suggestion. But this is really
> a minor point. I will compile my C version with gcc (2.95.2) -O. It is
> your choice, whether you want your version compiled with -O or -O2.
> To compare the performance, I start your program (plain DOS, AMD K6-2),
> wait a little bit, write down the FPS display, and stop it again
> with ESC. I bet my version will be faster! 

Do that if you're interested. :)
Btw, I made some improvements to the code (<< in C, and improved the ceil()
replacement).
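
(As an illustration only: one common integer replacement for ceil() on 16.16
fixed-point values is sketched below. The helper name fixed_ceil and the
16.16 format are assumptions, and this is not necessarily what the posted
code does. It relies on >> of a negative int being an arithmetic shift, which
GCC documents but ISO C leaves implementation-defined.)

```c
#include <stdint.h>

/* Hypothetical helper: smallest integer >= x, where x is assumed to be
   a 16.16 fixed-point value. Adding 0xFFFF pushes any nonzero fraction
   over the next integer boundary before the shift drops the fraction.
   Assumes x + 0xFFFF does not overflow and that >> on a negative int
   is arithmetic (true for GCC). */
int32_t fixed_ceil(int32_t x)
{
    return (x + 0xFFFF) >> 16;
}
```

With this helper, 1.5 (0x18000) rounds up to 2, an exact 2.0 (0x20000) stays
at 2, and -1.5 rounds up to -1.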

> I get rid of all your inline assembly in T_Map. I will be allowed
> to add one single line (say, less than 50 characters from __asm__
> up to the closing ')') of inline assembly to your source. I bet
> the plain C code will perform about the same as your inline
> code. I win if my code is no more than 2 FPS slower than your code,
> or faster (the executable you sent reports 70 FPS here).

How many such lines are there, in your opinion? :)

bye.
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru

