Subject: Re: [geda-user] A fileformat library
From: John Doty
Date: Wed, 6 Jan 2016 07:36:18 -0500
To: geda-user AT delorie DOT com
Reply-To: geda-user AT delorie DOT com
On Jan 6, 2016, at 4:58 AM, Levente (leventelist AT gmail DOT com) [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:



> You can have any difftool for git, so you can do visual diffs or anything you want, independent of the file format. (as of today, we HAVE visual diff for PCB)

And then what do you do with a broken file, with a little bit of garbled data in it? Text is more robust.

> You can write your own script in whatever language, as long as it has bindings for the file format,

But that's more work.

> which is the case.

Not for every language. Certainly not for every text utility.


> When selecting a file format I would not have such constraints like "it should be pretty for a human eye".

Not pretty, but comprehensible without processing.

> CAD data is so complex, that one cannot (and I believe should not) parse them with naked eye. For a computer, binary is much more efficient.

It's more efficient until something breaks (and something always breaks). I note that Mathematica, an enormous, complex program, saves its files as text. And every once in a while a rare bug garbles such a file. Sometimes they advise "search for a line like … and delete it" as a recovery strategy.

It's more efficient if you never want to use text-oriented tools, but for files with lots of text data, like attributes and file names, I generally want to use text tools to solve some problems. There are thousands of tiny problems that you can efficiently address in this way.
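
As a rough illustration of the kind of tiny problem I mean, here is a minimal C sketch that counts how many lines of a text file set a given attribute. The function name `count_attr` and the assumption that each attribute sits on its own line as `name=value` are made up for this example; a one-line grep would do the same job, which is rather the point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count lines in a text stream that start with
 * "<name>=". Assumes one attribute per line and lines shorter than
 * the buffer; longer lines are read in pieces and may be miscounted. */
size_t count_attr(FILE *f, const char *name)
{
    char line[1024];
    size_t n, hits = 0;

    assert(f != NULL && name != NULL);
    n = strlen(name);

    while (fgets(line, sizeof line, f) != NULL) {
        /* Match the name exactly, followed immediately by '='. */
        if (strncmp(line, name, n) == 0 && line[n] == '=')
            hits++;
    }
    return hits;
}
```

Nothing here needs a binding to a format library; any language that can read lines can do it.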


> So I still prefer the SQLite database, as that library is optimized for our purpose. Yes, I prefer to have a library that parses our files and gives us the possibility to use or modify the data.

No, it gets in the way of using and modifying the data, as then you *must* go through the library with all of its limitations.

> I prefer this way because we have approx. 50 different implementations of pcb/gschem file parsers.

So what? If you have one parser, then you get 50 different implementations of the interface to its API. You only move the problem to a more difficult layer.

> There is one in pcb/gaf, and all the other power users wrote their own. It would be much better to write the parser once, and others could use that.

Why is it better? I think that is an illusion.

> If the file format changes, you have to change just one piece of code, and not 50+.

Not true, because the only sane reason to change the format is to change something in the semantics. That means that your common parser's API will have to change, and everything that uses it will have to accommodate that change.


> If we use a standard file format (SQL, YAML, JSON), the parsing library could be very thin.

But it will still get in the way.

> The independent parsing code is already written. Only the application (gEDA) specific code has to be written. Remember, all of us have very limited time to contribute code to gEDA nowadays.

That's why we should not be adding extra, unnecessary layers. That's why we should not be breaking our existing tools and scripts.


> Lev

On Wed, Jan 6, 2016 at 10:10 AM, <karl AT aspodata DOT se> wrote:
Britton Kerin:
> On Tue, Jan 5, 2016 at 9:21 AM, <karl AT aspodata DOT se> wrote:
> > Britton Kerin:
> > > On Sun, Jan 3, 2016 at 6:25 PM, John Doty <jpd AT noqsi DOT com> wrote:
> > ...
> > > > Although these are good measures, once you adopt them you may start asking
> > > > yourself why you aren't just using a binary format.  The argument for text
> > > > is that you can glance at a chunk of it and easily tell what's going on.
> > > > A stronger argument for text is that you can process it with text-oriented
> > > > tools.
> > > But ultimately the reason for wanting to use those text-oriented tools is
> > > the same: you can see what you're working on with your own eyes.  In every
> > > other respect binary is better.
> >
> > I counter that.
> > . you have to check a binary file for valid values just as you do for a
> >   text file

You have not commented on this point. Any reader, human or program, has
to verify its input. Just because you have a binary file you can't just
do

 struct some_struct_type data;
 read(fd, &data, sizeof(struct some_struct_type));

and pretend that data will contain valid values.
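
A minimal sketch of what checking looks like instead, in C. The record layout, the field names and the limits (`pad_record`, `PAD_MAGIC`, `MAX_LAYERS`) are hypothetical, invented for illustration; they do not come from any gEDA or pcb format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical on-disk record; not a real gEDA/pcb structure. */
struct pad_record {
    uint32_t magic;   /* expected to equal PAD_MAGIC */
    int32_t  x, y;    /* position, arbitrary units */
    uint32_t layer;   /* must be below MAX_LAYERS */
};

#define PAD_MAGIC  0x50414421u
#define MAX_LAYERS 16u

/* Read one record and validate it before handing it to the caller.
 * Returns 0 on success, -1 on a short read or I/O error,
 * -2 if the bytes were read but the content is not valid. */
int read_pad(FILE *f, struct pad_record *out)
{
    struct pad_record tmp;

    assert(f != NULL && out != NULL);

    if (fread(&tmp, sizeof tmp, 1, f) != 1)
        return -1;
    if (tmp.magic != PAD_MAGIC || tmp.layer >= MAX_LAYERS)
        return -2;

    *out = tmp;   /* only copy out data that passed the checks */
    return 0;
}
```

The checks are the same ones a text parser would have to make; reading the bytes in one call does not make them go away.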

> > . if your binary file is in some way invalid, you will have a greater
> >   problem correcting it than a text file
> > . discussing why a file is invalid is easier with a text file
> > . a binary file might be smaller, but that does not matter much
> > . text files are better provided for by version systems (e.g. git)
> > . it is easier to write tools that write text than binary, because
> >   debugging the output is easier
> Regarding vcs of text data files for GUI program, it's a stretch to claim
> that the fact that they're text makes them much more compatible.  The diffs
> are only useful for the most trivial of cases.

You used the word "better", but now you are judging things by
"compatible". What do you want ?
If you want "compatible", then compatible with what ?

I write sym/fp generators and I prefer text output; I can check the
output both "textually" and "visually".

The diffs of the output of those generators are valuable when
debugging.

You might think of gschem and pcb as gui programs, but there is
infrastructure around them which is not gui. Don't think about their files
in the same way as a png or jpg, where the fileformats are well
entrenched and infrastructure is available.

>  For it to be really useful
> you need a (non-text) diff viewer of  some sort.

A graphical-diff is provided by

 http://www.imagemagick.org/Usage/compare/

You can generate png's from sch/pcb files and use that program for
graphical-diffs.

It would be very useful for regression tests; other projects have been
successful at that.

> All the rest of the above still boil down to examples of things that are
> easier because you can see the data, and therefore manipulate and validate
> it more easily.

No, validating input and size is not about "easier to see".

But...
Why do you want to remove the text format for me ?
What is your complaint about having a textual file format ?

By dropping the text you lose something; what is the gain of a binary
format that outweighs that ?

> > Also, there is no reason to change a file format unless you change the
> > functionality it provides, I have to "side heavily" with John on this.
> > If you want to change the file format, you first have to provide some
> > goodies that will make people accept it. And no such "goodie"
> > thing has appeared.
> A little while back a PhD Stefan Salewski put together a very good start on
> a very nice router.  IIRC he said 300-400 hours of effort, probably about
> $100k worth with overhead in the american market.  It's not C (for good
> reasons) and currently can't talk to pcb at all.  I would like to somehow
> arrange things such that efforts like this could maybe get used.

In what way does his work relate to a decision about text versus binary
file format ?

> > You might write a library that reads and writes the files and if people
> > find it useful, they will start using it, else, it will be just your own
> > project.
> True.  It might be useless but should at least be non-destructive.

I thought you were for a fileformat library. Am I wrong ?
But now you say that such a library is useless. Did I get that right ?

I think a fileformat library would be useful, but I still
would like a textual format for sch/sym/pcb/fp files.

Regards,
/Karl Hammar

-----------------------------------------------------------------------
Aspö Data
Lilla Aspö 148
S-742 94 Östhammar
Sweden
+46 173 140 57




John Doty              Noqsi Aerospace, Ltd.

http://www.noqsi.com/

jpd AT noqsi DOT com


