Mail Archives: djgpp/1995/11/14/10:37:00

Date: Tue, 14 Nov 1995 09:49:12 -0500
From: kagel AT quasar DOT bloomberg DOT com
To: ghernan AT cariari DOT ucr DOT ac DOT cr
Cc: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: Re: Help with "normalised float" etc.
Reply-To: kagel AT ts1 DOT bloomberg DOT com

   From: "Luis G. Hernandez U." <ghernan AT cariari DOT ucr DOT ac DOT cr>
   Date: Mon, 13 Nov 1995 19:09:31 -0600 (CST)

     I wonder if someone out there can answer this question for me
     (or maybe direct me to the correct place to look):
     What's a normalised float (double and/or long double)?
     Please excuse me if I'm asking a non-djgpp question, but I'm not able
     to find an answer by other means.

     I'd also be grateful if you could explain how bits are assigned to
     the exponent part of a float (double or long double). How many bits are
     used in the exponent of a float (double or long double)?

My IEEE floating point is a little rusty but here is my best memory:

IEEE 64 bit floating point (double):

Sign bit:  Bit  0     (High order bit of the highest order byte.  This is the
                       sign of the mantissa.)
Exponent:  Bits 1-11  (An 11bit unsigned binary integer biased by 1023, so the
                       true power of 2 is the stored value minus 1023.)
Mantissa:  Bits 12-63 (Plus implied high-order bit value of '1' unless all
                       exponent bits are zero.  This gives 52 stored bits, 53
                       with the implied bit, or 15+ decimal digits of
                       precision, and permits both +/-0.)
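
As a rough illustration of the layout above, the three fields of a double can
be pulled apart in C.  This is only a sketch: the names double_fields and
split_double are made up for this example, and it assumes the compiler's
double is the 64 bit IEEE format.  Note that the shifts count from the low
order end, whereas the table above numbers bits from the high order end.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a 64 bit IEEE double. */
struct double_fields {
    unsigned sign;      /* 1 bit                                        */
    unsigned exponent;  /* 11 bits, as stored (still biased by 1023)    */
    uint64_t mantissa;  /* 52 stored bits, implied leading 1 not shown  */
};

/* Split a double into its fields.  Assumes double is 64 bit IEEE. */
static struct double_fields split_double(double d)
{
    uint64_t bits;
    struct double_fields f;

    assert(sizeof d == sizeof bits);    /* guard the 64 bit assumption  */
    memcpy(&bits, &d, sizeof bits);     /* copy the raw bit pattern     */

    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & (((uint64_t)1 << 52) - 1);
    return f;
}
```

For 1.0 this gives sign 0, stored exponent 1023 and mantissa 0, since 1.0 is
1.0 times 2 to the power 0.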

A normalized IEEE floating point number contains an implied high-order binary
1.  It is read as 1. followed by the recorded mantissa bits, multiplied by 2
raised to the power indicated by the exponent, and signed according to the sign
bit.  A number with a zero recorded mantissa and zero exponent has the value of
positive or negative zero depending on the value of the sign bit (positive
zero and negative zero compare equal and behave like ordinary zero in most
calculations).
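
The reading rule for a normalized number can be sketched in C by rebuilding
the value from its fields.  The function name build_normalized is made up for
this example, and it assumes the 64 bit double format with an exponent bias of
1023; it does not handle zero, denormals, infinities or NaNs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild a normalized double from its three fields.
   Only valid for stored exponents 1..2046 (the normalized range). */
static double build_normalized(unsigned sign, unsigned exponent,
                               uint64_t mantissa)
{
    double value;
    int e;

    assert(exponent >= 1 && exponent <= 2046);

    /* "1." followed by the 52 stored bits: 1 + mantissa / 2^52 */
    value = 1.0 + (double)mantissa / 4503599627370496.0;

    /* Remove the bias, then scale by that power of 2. */
    e = (int)exponent - 1023;
    while (e > 0) { value *= 2.0; e--; }
    while (e < 0) { value *= 0.5; e++; }

    return sign ? -value : value;
}
```

For example, sign 1, stored exponent 1024 and only the second-highest mantissa
bit set reads as -(1.01 binary) times 2 to the power 1, which is -2.5.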

IEEE 32bit floating point is similar except for the sizes of the components:

Sign bit: Bit  0
Exponent: Bits 1-8   (biased by 127)
Mantissa: Bits 9-31  (23 stored bits plus the implied '1')
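
The same kind of split works for the 32 bit format.  Again this is only a
sketch with made-up names (float_fields, split_float), assuming the compiler's
float is the 32 bit IEEE format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a 32 bit IEEE float. */
struct float_fields {
    unsigned sign;      /* 1 bit                                       */
    unsigned exponent;  /* 8 bits, as stored (still biased by 127)     */
    uint32_t mantissa;  /* 23 stored bits, implied leading 1 not shown */
};

/* Split a float into its fields.  Assumes float is 32 bit IEEE. */
static struct float_fields split_float(float x)
{
    uint32_t bits;
    struct float_fields f;

    assert(sizeof x == sizeof bits);    /* guard the 32 bit assumption */
    memcpy(&bits, &x, sizeof bits);     /* copy the raw bit pattern    */

    f.sign     = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFF);
    f.mantissa = bits & 0x7FFFFF;
    return f;
}
```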

I do not know the format for the proposed IEEE long double (128 bit) floating
point.  IEEE extended (80bit) floating point, as implemented by the x87
coprocessor, does not share its exponent with double: it has 1 sign bit, a
15 bit exponent and a 64 bit mantissa whose leading 1 is stored explicitly
rather than implied.  The extra bits improve precision and range for
intermediate results, to prevent truncation loss where the final result is to
be stored in double (64bit) format.
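
Rather than relying on memory, the widths the compiler actually uses can be
read from the standard <float.h> macros.  A small sketch (the function name
print_float_limits is made up for this example):

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: report the mantissa width (counting any implied
   bit) and the largest binary exponent for each floating type. */
static void print_float_limits(void)
{
    /* The *_MANT_DIG counts are bits only when the radix is 2. */
    assert(FLT_RADIX == 2);

    printf("float:       %d mantissa bits, max exponent %d\n",
           FLT_MANT_DIG, FLT_MAX_EXP);
    printf("double:      %d mantissa bits, max exponent %d\n",
           DBL_MANT_DIG, DBL_MAX_EXP);
    printf("long double: %d mantissa bits, max exponent %d\n",
           LDBL_MANT_DIG, LDBL_MAX_EXP);
}
```

The long double line depends on the compiler and target, so no particular
output is promised here; where long double is the x87 80 bit format one would
expect 64 mantissa bits.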

Someone posted the addresses to acquire IEEE standards documentation.  Perhaps
he/she could re-post?  I believe binary floating point is IEEE-754 (IEEE-854 is
the radix-independent generalization).

-- 
Art S. Kagel, kagel AT ts1 DOT bloomberg DOT com

Variety is the soul of pleasure.  --  Aphra Behn
