Mail Archives: djgpp/1992/06/02/10:38:09

From: greve AT rs1 DOT thch DOT uni-bonn DOT de (Thomas Greve)
Subject: Re: How Fast (??) part 3
To: csaba AT vuse DOT vanderbilt DOT edu (Csaba A. Biegl)
Date: Tue, 2 Jun 92 16:24:21 NFT
Cc: djgpp AT sun DOT soe DOT clarkson DOT edu
Status: O

[...] (my text deleted)
> 
> (1) The 386 encodes the operand size using only 1 bit
>     of the opcode for the most commonly used instructions. In real and 16
>     bit protected modes this bit selects between one and two byte operands.
>     In 32 bit protected mode the bit selects between 1 byte (char) and
>     4 byte (int, long) operands. The "other" non-single byte operand size
>     is selected by an instruction prefix byte in both modes. The above code
>     when compiled with a 32 bit compiler will contain a lot of these prefix
>     bytes which increase code size and slow down execution.
					 ^^^^^^^^^^^^^^^^^^^^
This is not true. Intel's '386 data sheet says that prefixes cost 0 (zero)
processor cycles as long as instruction fetch is fast enough. (It usually
is, as instruction processing is well pipelined.)
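
To make the prefix byte concrete, here is a small sketch in C. The two
byte sequences are the standard encodings of a register-to-register `mov'
in a 32 bit code segment; the array names and the helper
has_opsize_prefix() are made up for this illustration and do not come
from any library. It only shows the size difference (one extra byte, 0x66)
and says nothing about timing.

```c
#include <stddef.h>

/* Operand-size override prefix on the 386. */
#define OPSIZE_PREFIX 0x66

/* mov eax, ebx  in a 32 bit code segment: opcode 89, ModRM D8. */
static const unsigned char mov_eax_ebx[] = { 0x89, 0xD8 };

/* mov ax, bx  in the same segment: same opcode and ModRM, plus the prefix. */
static const unsigned char mov_ax_bx[] = { 0x66, 0x89, 0xD8 };

/* Hypothetical helper: returns 1 if the encoding starts with the
   operand-size prefix, 0 otherwise. */
static int has_opsize_prefix(const unsigned char *insn, size_t len)
{
    return len > 0 && insn[0] == OPSIZE_PREFIX;
}
```

Calling has_opsize_prefix(mov_ax_bx, sizeof mov_ax_bx) gives 1, and the
16 bit form is exactly one byte longer than the 32 bit one.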

> (2) Most of the integer arithmetic in the code will be performed using
>     the native "int" precision, i.e. 32 bits. Thus, the compiler will
>     have to output conversion code whenever a 16 bit operand is fetched.
This *is* true. `movsx' (Intel notation) costs a lot more cycles than
just `mov' (5 vs. 2 or something). And misalignment (a[3] if a is short) 
costs extra wait states -- even on '386sx ;-(
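
A minimal sketch of the conversion point, assuming a 32 bit compiler
where int is 32 bits and short is 16 bits. The function names sum_short()
and sum_int() are invented for this example. In sum_short() every element
has to be widened to int before the add, which is where a compiler would
typically emit `movsx'; in sum_int() a plain `mov' (or a memory operand
on the add) is enough. What a particular compiler really generates is an
assumption here, so look at its assembly output to be sure.

```c
/* Sums 16 bit elements. Each a[i] is promoted from short to int
   (sign-extended) before it is added to the accumulator. */
static long sum_short(const short *a, int n)
{
    long s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += a[i];      /* short -> int: likely a movsx on the '386 */
    return s;
}

/* Sums native int elements. No widening step is needed. */
static long sum_int(const int *a, int n)
{
    long s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += a[i];      /* already 32 bits wide */
    return s;
}
```

Both functions return the same value for the same numbers; only the
element width, and therefore the load instruction, differs.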

> 
> Csaba Biegl
> csaba AT vuse DOT vanderbilt DOT edu
> 


				- Thomas

   greve AT rs1 DOT thch DOT uni-bonn DOT de
   unt145 AT dbnrhrz1
