Mail Archives: geda-user/2016/01/05/14:07:47

In-Reply-To: <201601051829.u05IT7TI021027@envy.delorie.com>
Date: Tue, 5 Jan 2016 19:07:34 +0000
Message-ID: <CAJXU7q81_G+1ndudoWKBTnbJwEkMhpD42e2b5FoF7gJ8WwwozQ@mail.gmail.com>
Subject: Re: [geda-user] A fileformat library
From: "Peter Clifton (petercjclifton AT googlemail DOT com) [via geda-user AT delorie DOT com]" <geda-user AT delorie DOT com>
To: gEDA User Mailing List <geda-user AT delorie DOT com>

--001a11369ad6460bde05289af39c
Content-Type: text/plain; charset=UTF-8

On 5 Jan 2016 18:30, "DJ Delorie" <dj AT delorie DOT com> wrote:
>
>
> > . a binary file might be smaller, but that does not matter much
>
> I wrote an app that used a tree-like data file for storage.  It
> supported both ascii and binary formats.  Not only was the binary
> format significantly smaller, but loaded 10x faster.  Parsing text
> files and adapting to the incoming data is more expensive than you
> think.

Indeed... text representations of floating point numbers take a lot of
computation to turn into the correct binary machine value.  This is one of
the main reasons big 3D models in STEP format are slow to load. (They
contain lots of irrational values, written out as long base 10 decimal
approximations.)

It is very easy to write a fast ASCII to double conversion, but only if you
make some assumptions and sacrifice accuracy. Doing a correct conversion,
one which yields the closest binary floating point number to the decimal
number described, is hard to implement correctly and time consuming to
run.
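
To make that trade-off concrete, here is a rough Python sketch (my own
illustration, not code from any gEDA tool). naive_atof() is the quick
and sloppy kind of converter: it picks up rounding error digit by digit.
fast_path_atof() shows the "make some assumptions" route: if all the
digits fit exactly in a double (at most 2**53) and the power of ten is at
most 10**22, a single correctly rounded division gives the closest
double; outside those limits it gives up and returns None, and a slower,
fully correct routine would have to take over. Both function names are
made up, and both assume plain "digits.digits" input with no exponent.

```python
def _split(text):
    """Split '[+-]digits[.digits]' into (sign, integer digits, fraction digits)."""
    sign = -1.0 if text.startswith("-") else 1.0
    int_part, _, frac_part = text.lstrip("+-").partition(".")
    return sign, int_part, frac_part


def naive_atof(text):
    """Fast and simple, but every step rounds, so the last bits may be wrong."""
    sign, int_part, frac_part = _split(text)
    value = 0.0
    for ch in int_part:
        value = value * 10.0 + (ord(ch) - ord("0"))
    scale = 0.1                      # 0.1 is already inexact in binary
    for ch in frac_part:
        value += (ord(ch) - ord("0")) * scale
        scale *= 0.1                 # and the error compounds here
    return sign * value


def fast_path_atof(text):
    """Closest double when the assumptions hold, None when they do not."""
    sign, int_part, frac_part = _split(text)
    mantissa = int(int_part + frac_part)     # exact Python integer
    if mantissa > 2 ** 53 or len(frac_part) > 22:
        return None                          # needs the slow, careful path
    # Both operands are exactly representable, so the single division
    # below is correctly rounded by IEEE 754 arithmetic.
    return sign * (mantissa / float(10 ** len(frac_part)))
```

Comparing either function against the built-in float() on the same
strings is a reasonable way to check them, since float() is supposed to
do the correct conversion.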

Hypothetically, I think the best compromise is a format which has a
lossless translation between text and binary representations.
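
As a sketch of what "lossless" could look like, assuming hexadecimal
floating point notation is acceptable as the text form (that choice is
mine, purely for illustration): Python's float.hex() and float.fromhex()
write and read every bit of the value, and struct packs the same value
as 8 raw bytes. The helper names below are made up.

```python
import struct


def to_text(x):
    """Exact text form of a double, using hexadecimal float notation."""
    return x.hex()


def from_text(s):
    """Inverse of to_text(); no decimal rounding is involved."""
    return float.fromhex(s)


def to_binary(x):
    """8-byte little-endian IEEE 754 form of a double."""
    return struct.pack("<d", x)


def from_binary(b):
    """Inverse of to_binary()."""
    return struct.unpack("<d", b)[0]


def text_to_binary(s):
    return to_binary(from_text(s))


def binary_to_text(b):
    return to_text(from_binary(b))
```

Decimal text can be made to round trip as well: Python 3's repr() prints
the shortest decimal string that reads back as the same double, but then
reading it back needs the expensive correct conversion again.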

In reality, the speed issue is for the most part irrelevant to us. We
simply don't have the quantity of floating point numerical data in our
files to cause enough slow down to warrant a binary format.

For processing 3D STEP files, I see two approaches:

1. Don't perform the conversion unless the number is needed (shunt
   strings in and out of the system).
2. Test out the idea of hashing and caching conversions. I've a
   suspicion that many coordinates and vectors get repeated a lot.

(The Autodesk DWG format special-cases 0.0 and 1.0 with a very short bit
pattern, 3 bits I recall, which gives them enough reduction in file size
to make it worthwhile for them.)
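
A minimal sketch of the two approaches, untested against real STEP
files. LazyNumber and CachedConverter are hypothetical names of mine;
the first keeps the original string and converts only on first use, the
second remembers each distinct token so a repeated coordinate is
converted only once.

```python
class LazyNumber:
    """Holds the text as read; converts to float only when asked."""

    def __init__(self, text):
        self.text = text        # can be written back out unchanged
        self._value = None

    @property
    def value(self):
        if self._value is None:
            self._value = float(self.text)
        return self._value


class CachedConverter:
    """Converts each distinct token once and reuses the result."""

    def __init__(self):
        self._cache = {}
        self.misses = 0         # number of real conversions performed

    def convert(self, token):
        try:
            return self._cache[token]
        except KeyError:
            self.misses += 1
            value = self._cache[token] = float(token)
            return value
```

Whether the cache pays off depends on how often values really repeat:
the hash lookup has a cost of its own, so this would need measuring on
real files.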

(Btw... did anyone else react with a "wtf" on realising that the DWG
binary format operates on a literal BIT stream? I.e. not even byte
alignment!)
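
To show what a bit stream with special-cased constants can look like,
here is a toy encoder. It is NOT the real DWG layout: the 2-bit tags,
their values and all the names are invented for illustration. Each
double gets a 2-bit tag; 0.0 and 1.0 are just the tag, anything else is
the tag followed by the 64 raw bits, and nothing is aligned to a byte
boundary.

```python
import struct

# Hypothetical tag values, chosen for this sketch only.
TAG_FULL, TAG_ONE, TAG_ZERO = 0b00, 0b01, 0b10


def _bits_of(x):
    """The 64-bit IEEE 754 pattern of a double, as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


class BitWriter:
    def __init__(self):
        self.bits = []                      # 0/1 values, MSB first

    def write(self, value, count):
        for shift in range(count - 1, -1, -1):
            self.bits.append((value >> shift) & 1)

    def to_bytes(self):
        padded = self.bits + [0] * (-len(self.bits) % 8)
        out = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
        return bytes(out)


class BitReader:
    def __init__(self, data):
        self.data = data
        self.pos = 0                        # position counted in bits

    def read(self, count):
        value = 0
        for _ in range(count):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def write_double(writer, x):
    bits = _bits_of(x)
    if bits == _bits_of(1.0):
        writer.write(TAG_ONE, 2)
    elif bits == 0:                         # +0.0 only; -0.0 is stored in full
        writer.write(TAG_ZERO, 2)
    else:
        writer.write(TAG_FULL, 2)
        writer.write(bits, 64)


def read_double(reader):
    tag = reader.read(2)
    if tag == TAG_ONE:
        return 1.0
    if tag == TAG_ZERO:
        return 0.0
    return struct.unpack(">d", struct.pack(">Q", reader.read(64)))[0]
```

With this toy scheme the five values 0.0, 1.0, 0.5, 1.0, 0.0 take 74
bits, so 10 bytes after padding, against 40 bytes as plain doubles.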

Peter

--001a11369ad6460bde05289af39c--
