Sender: rich AT phekda DOT freeserve DOT co DOT uk
Message-ID: <3E89B1F9.1BC8C387@phekda.freeserve.co.uk>
Date: Tue, 01 Apr 2003 16:36:25 +0100
From: Richard Dawe <rich AT phekda DOT freeserve DOT co DOT uk>
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.2.23 i586)
X-Accept-Language: de,fr
MIME-Version: 1.0
To: DJGPP workers <djgpp-workers AT delorie DOT com>
Subject: Support for nan(0x[0-9a-f]*) in strto*, *printf
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com

Hello.

I've been thinking some more about how we can support nan(0x[0-9a-f]*) in
strto* and *printf.

Here are a few facts:

* NAN is a float with a mantissa of 0x7fffff.

* (double) NAN or (long double) NAN results in the exponent being extended up
to the maximum for double or long double.

* (double) NAN or (long double) NAN does not result in the mantissa being
1-filled. E.g.: for (double) NAN the mantissa becomes 0xfffffe0000000.

* (float_t) (float) (double) NAN == (float_t) NAN, i.e. casting NAN to double
and back to float gives the same bit pattern. The two cannot be compared
directly as floats, because a NaN never compares equal to anything. See the
test program t-nan2.c below.

* (float_t) (float) (long double) NAN == (float_t) NAN. Again, the bit
patterns match, but a direct floating-point comparison says they are not
equal. See the test program t-nan2.c below.

* Casting a long double NaN to double gives a double NaN. See the test
program t-nan.c below.

* Casting a long double NaN to float gives a float NaN. See the test program
t-nan.c below.

The way I've implemented support for nan(0x[0-9a-f]*) at the moment means
that:

    printf("%f", NAN);

would display:

    nan(0xfffffe0000000)

This is because the float is promoted to a double when it is passed through
printf's variable argument list. I can't see a way round this: the _doprnt
code has no way of knowing that the number started out as a float rather than
a double.

NB: The case of "nan" is correct, as specified by C99. This has changed from
the previous mixed-case "NaN". For "%F" you get "NAN". Similarly for "inf".

The current proposed implementation of strtof (see "strtof()'s NaN and Inf
support", dated Sat, 22 Mar 2003 12:32:13 +0100 (CET)) will not return NAN
when asked to parse "nan(0xfffffe0000000)". The problem is that it assumes the
data it wants (the mantissa) is in the lowest 23 bits of the hex data. This is
not the case for floats converted to doubles, where the float's mantissa ends
up in the highest 23 bits of the 52-bit double mantissa. I suspect a similar
problem exists when doubles are converted to long doubles.

I'd like to propose a solution:

Since the hex data in nan(0x[0-9a-f]*) is implementation-specific, I think
*printf should always write the mantissa from a long double representation of
the NaN there. This long double representation would then be used as the
lowest- (highest-?) common denominator. strto* would down-convert it to float
or double as appropriate: it could set up a long_double_t and cast that down
to the appropriate size. E.g.: for strtof:

    long_double_t ldt;
    long double ld;
    unsigned long long mantissa_bits;

    mantissa_bits = strtoull(&s[4], &endptr, 16);

    ldt.mantissah = mantissa_bits >> 32;
    ldt.mantissal = mantissa_bits & 0xffffffffU;
    ldt.exponent  = 0x7fffU;
    ldt.sign      = <whatever>;

    ld = *(long double *) &ldt;

    /* endptr stuff here */

    return(ld); /* <- converts to a float return value */

Does this make sense? Any thoughts?

Thanks, bye, Rich =]

---Start t-nan.c---
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <libc/ieee.h>

static void
dump (unsigned mh, unsigned ml, unsigned exp, unsigned sign)
{
  printf("0x%x 0x%x 0x%x %s\n", mh, ml, exp, sign ? "negative" : "positive");
}

int
main (void)
{
  long_double_t unenc_ld_nan = { 0xffffffffU, 0xffffffffU, 0x7fffU, 0U };
  long double   enc_ld_nan;
  double_t      unenc_d_nan;
  double        enc_d_nan;
  float_t       unenc_f_nan;
  float         enc_f_nan;

  enc_ld_nan = *(long double *) &unenc_ld_nan;

  enc_d_nan = (double) enc_ld_nan;
  enc_f_nan = (float) enc_ld_nan;

  unenc_d_nan = *(double_t *) &enc_d_nan;
  unenc_f_nan = *(float_t *) &enc_f_nan;

  dump(unenc_ld_nan.mantissah,
       unenc_ld_nan.mantissal,
       unenc_ld_nan.exponent,
       unenc_ld_nan.sign);

  dump(unenc_d_nan.mantissah,
       unenc_d_nan.mantissal,
       unenc_d_nan.exponent,
       unenc_d_nan.sign);

  dump(0,
       unenc_f_nan.mantissa,
       unenc_f_nan.exponent,
       unenc_f_nan.sign);

  return(EXIT_SUCCESS);
}
---End t-nan.c---

---Start t-nan2.c---
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>
#include <libc/ieee.h>

static void
dump (unsigned mh, unsigned ml, unsigned exp, unsigned sign)
{
  printf("0x%x 0x%x 0x%x %s\n", mh, ml, exp, sign ? "negative" : "positive");
}

int
main (void)
{
  float f;
  double d;
  long double ld;
  float_t unenc_f;

  f = NAN;
  unenc_f = *(float_t *) &f;

  puts("NaN:");
  printf("%f\n", f);
  dump(0, unenc_f.mantissa, unenc_f.exponent, unenc_f.sign);

  d  = NAN;
  ld = (long double) NAN;

  f = d;
  unenc_f = *(float_t *) &f;

  puts("double:");
  printf("%f\n", f);
  dump(0, unenc_f.mantissa, unenc_f.exponent, unenc_f.sign);

  /*assert(f == NAN);*/

  f = ld;
  unenc_f = *(float_t *) &f;

  puts("long double:");
  printf("%f\n", f);
  dump(0, unenc_f.mantissa, unenc_f.exponent, unenc_f.sign);

  /*assert(f == NAN);*/

  puts("PASS");
  return(EXIT_SUCCESS);
}
---End t-nan2.c---

-- 
Richard Dawe [ http://www.phekda.freeserve.co.uk/richdawe/ ]