Mail Archives: cygwin/2009/01/07/08:12:23

Message-ID: <4964AA21.4040900@byu.net>
Date: Wed, 07 Jan 2009 06:12:01 -0700
From: Eric Blake <ebb9 AT byu DOT net>
To: cygwin AT cygwin DOT com
Subject: Re: Inconsistency with sort -n?
References: <0105D5C1E0353146B1B222348B0411A211B41DEC AT NIHMLBX02 DOT nih DOT gov>
In-Reply-To: <0105D5C1E0353146B1B222348B0411A211B41DEC@NIHMLBX02.nih.gov>

According to Buchbinder, Barry (NIH/NIAID) [E] on 12/31/2008 2:29 PM:

[sorry for my delay in replying]

> `sort -n' and `sort -g' work inconsistently with 0 and -0 if there are
> leading spaces.  Sometimes -0 is before 0, as I would expect, and sometimes
> it is afterwards.  Adding `-b' does not seem to help.
>
> Is this where I should report it or should I go upstream?

If it were a bug, it would be an upstream issue (I reproduced your test
cases on Linux).  But it is not a bug; sort is behaving as documented.

> $ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -n
> -1
>  0
> -0
>  1

sort -n sorts the entire line based on numeric value (0 and -0 have the
same value), then breaks ties based on byte-wise values (' ' comes before
'-').
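To see that tie-break in isolation, here is a minimal sketch, assuming GNU
sort (the -s option is not in POSIX) and pinning LC_ALL=C so that the
whole-line comparison really is byte-wise.  The variable names are only
for illustration:

```shell
# Default behavior: 0 and -0 tie numerically, so the last-resort
# whole-line comparison decides, and input order does not matter.
default_1=$(printf '%s\n' ' 0' '-0' | LC_ALL=C sort -n)
default_2=$(printf '%s\n' '-0' ' 0' | LC_ALL=C sort -n)

# With -s (stable), the last-resort comparison is skipped, so lines
# that tie numerically keep their input order.
stable_1=$(printf '%s\n' ' 0' '-0' | LC_ALL=C sort -s -n)
stable_2=$(printf '%s\n' '-0' ' 0' | LC_ALL=C sort -s -n)

printf '%s\n' "$default_1" "$default_2" "$stable_1" "$stable_2"
```

The two default runs should agree with each other, while the two stable
runs should simply echo their input order.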

> $ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -g
> -1
>  0
> -0
>  1

sort -g is slower than sort -n, because it converts to floating point; and
although -0.0 and +0.0 are distinct bit patterns, they still sort equal,
so you are once again back to the fallback of bytewise comparison to
break ties (and ' ' still comes before '-').
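A rough illustration of the difference between the two options, again
assuming GNU sort under LC_ALL=C (the sample values and variable names
are mine, not from the original report):

```shell
# -n reads only the leading decimal digits, so "1e2" counts as 1;
# -g converts the whole string to floating point, so "1e2" is 100.
n_order=$(printf '%s\n' 1e2 20 | LC_ALL=C sort -n | tr '\n' ' ')
g_order=$(printf '%s\n' 1e2 20 | LC_ALL=C sort -g | tr '\n' ' ')

# -0.0 and 0.0 compare equal under -g, so -u keeps just one of them.
zeros=$(printf '%s\n' 0.0 -0.0 | LC_ALL=C sort -gu | wc -l | tr -d ' ')

printf '%s\n' "-n: $n_order" "-g: $g_order" "zeros kept: $zeros"
```

So -g only changes the outcome when the input has forms that -n cannot
parse; for plain 0 and -0 both options end up at the same tie.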

Use sort -u to see that 0 and -0 sort numerically equal, and thus why a
fallback sort must be attempted.

$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -nu
-1
 0
 1

Or, go one better - use two sort keys.  Make the primary key sort
numerically, and the second sort key break ties in favor of '-':

$ echo 0 -1 -0 1 | tr ' ' '\n' | sed -e '/^.$/s/^/ /' | sort -k1,1n -k1r
-1
-0
 0
 1
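Since the tie-break key is compared as text, its result can depend on
the locale.  A sketch that pins the locale to C, so that ' ' and '-' are
compared as bytes; the function name sort_neg_zero_first is hypothetical:

```shell
# Hypothetical helper: sort numerically on field 1, then break ties
# with a reversed byte-wise comparison so that '-' sorts before ' '.
sort_neg_zero_first() {
    LC_ALL=C sort -k1,1n -k1r
}

result=$(printf '%s\n' ' 0' -1 -0 ' 1' | sort_neg_zero_first)
printf '%s\n' "$result"
```

Without -b, the leading blank is part of field 1, which is what lets the
second key tell ' 0' and '-0' apart.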

-- 
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9 AT byu DOT net
volunteer cygwin coreutils maintainer


--------------enig64AF8F59ABD65E5F7FE7C987
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklkqicACgkQ84KuGfSFAYA5+QCgt/UyyXz075ix7hpP5Rn9T7X9
L3gAoJVTNGJQiOfLWlkf1UHzqL3fbUV4
=v2Ku
-----END PGP SIGNATURE-----

--------------enig64AF8F59ABD65E5F7FE7C987--

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019