X-Authentication-Warning: delorie.com: mail set sender to geda-user-bounces using -f
X-Recipient: geda-user AT delorie DOT com
X-Original-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :content-type;
        bh=I5rwl5BO496waqnXmDjBs9soor3W8vuz78Trr4ng1cg=;
        b=KnnnF+06aij9zLUg/zeK6NjK262mD2/Rn1OUx8BVdBMJ3PJx5w05V8eVwKB7dCSmAe
         mu1KkbWtEnesCSKVRxAAeCUkRULfxHE6TfU1IPPEEasLhnRdtQqjC5UYnEHt2tEPcjmU
         +Y+iX/8CeTb3uQzpwV7tt1m0e1MGAgbTEdUVKgyfeaG8WwB9h7/rn0kfttsy9nye/awI
         dsWfvOZdXVjPEvpZP0/fOxUUWeHMQqvqpugz03MnTs8XFETIe4257MojfvlFAceO2O++
         zDPgYLY0hjVa89wH8fFaaomVwX3XKfsIvPeoxZQdOpgKIr1e4y57nyod9Huur2IvzJqQ
         sesQ==
MIME-Version: 1.0
X-Received: by 10.28.1.23 with SMTP id 23mr32723748wmb.37.1451807618333;
        Sat, 02 Jan 2016 23:53:38 -0800 (PST)
In-Reply-To:
References: <1512221837 DOT AA25291 AT ivan DOT Harhan DOT ORG>
        <20151222232230 DOT 12633 DOT qmail AT stuge DOT se>
        <0F6F1D0F-4F07-48EA-90FE-836EAD4E2354 AT noqsi DOT com>
        <0FCF3774-F93C-4BFF-BB61-636F75DCCACB AT noqsi DOT com>
Date: Sat, 2 Jan 2016 22:53:38 -0900
Message-ID:
Subject: Re: [geda-user] A fileformat library
From: "Britton Kerin (britton DOT kerin AT gmail DOT com) [via geda-user AT delorie DOT com]"
To: geda-user AT delorie DOT com
Content-Type: multipart/alternative; boundary=001a113d7c3e5f81e70528694d58
Reply-To: geda-user AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: geda-user AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

--001a113d7c3e5f81e70528694d58
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Sat, Jan 2, 2016 at 8:19 PM, John Doty <jpd AT noqsi DOT com> wrote:

>
> On Jan 2, 2016, at 9:27 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via
> geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>
>
>
> On Sat, Jan 2, 2016 at 6:07 PM, John Doty <jpd AT noqsi DOT com> wrote:
>
>>
>> On Jan 2, 2016, at 7:47 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via
>> geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>
>>
>>
>> On Sat, Jan 2, 2016 at 4:38 PM, John Doty <jpd AT noqsi DOT com> wrote:
>>
>>>
>>> On Jan 2, 2016, at 6:07 PM, Britton Kerin (britton DOT kerin AT gmail DOT com)
>>> [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>>
>>> Personally I find formats like this:
>>>
>>>   device=RESISTOR
>>>   T 44400 49300 5 10 1 1 90 0 1
>>>
>>> substantially less readable than ones with field names, but they are
>>> indeed easy to parse.
>>>
>>>
>>> Personally, I rarely edit these things manually except for the text
>>> fields, which are not difficult to find. The fact that they’re easy to
>>> parse is handy for automation.
>>>
>>> The pcb format is quite a bit more elaborate and the savings from not
>>> rolling your own parser are more significant.
>>>
>>> I think your criteria for what should go in libgeda are spot-on btw.
>>> Nor do I have any problem with a C interface calling python or gschem or
>>> for that matter C++.  I do think providing a clean C interface to libgeda
>>> gets by far the best return on investment, since it's so widely known and
>>> with a little care wrappers can then be provided almost automatically for a
>>> wide variety of languages (via SWIG or some other similar mechanism -- or
>>> maybe Xorn facilitates this, I'm a little unclear).
>>>
>>>
>>> I don’t find deconstructing C data structures particularly easier than
>>> parsing the format above. Just another layer I have to penetrate to get to
>>> the data. I do significant processing with simple things like sed, which
>>> don’t handle binary data.
>>>
>>> Wrappers CAN be provided, but will they? FFI programming is not the
>>> easiest thing. I hear complaints about the need for developers to maintain
>>> code. It seems to me that one way to address these concerns is to avoid and
>>> eliminate unnecessary code.
>>>
>>
>> Good question.  It's a great result if you get it but a lot more work
>> than using a serialization library, which is why the latter approach seems
>> to me like a useful step in the right direction.
>>
>> Serialization library? Why do you want an extra, unnecessary, opaque
>> interface? What, exactly, are you trying to accomplish?
>>
>
> Two things:
>
>     1.  A human- and partial-parser-script-readable format
>
>
> We have that, I think. But you left out the most important virtue:
> *simple*.
>

I agree that it's readable enough, though it could be better.  I also agree
that simplicity is good.

>     2.  Full parsers for as many languages as possible without writing
> them by hand
>
>
> So instead, you need to write an interface between a complicated parser
> and every application by hand. Where’s the gain?
>

Here's what YAML looks like from perl:

     use YAML::XS;

     my $yaml = Dump [ 1..4 ];
     my $array = Load $yaml;

The gain is that this is a vastly easier way to vivify a saved object than
to write my own parser, or even my own partial parser for non-trivial cases.

> Now take a look at the design goals for YAML:
>
> http://www.yaml.org/spec/1.2/spec.html#id2708649
>
> It's a good fit.  If it was only a matter of the technical merits I would
> say as close to perfect as it gets with software.
>
>
> Compare it to http://wiki.geda-project.org/geda:file_format_spec
> YAML is enormously more complex to no advantage for us.
>

The point is that you don't have to deal with any of that complexity (of
which there really isn't all that much -- calling it enormously complex is
a big overstatement).  It's a library with approximately two entry points
per language for modern languages, and not much more for C.  Parsing may be
a non-issue for you if you only care about strings in .sch files, but for
many useful operations on pcbs you need the whole thing, or most of it.

> Unfortunately there's the usual good-versus-most-popular trade-off in
> deciding between YAML and JSON.  I still favor YAML in this case, largely
> because I can't look at people like you and honestly claim that JSON is in
> all respects fun to read/edit/sed over etc., and because my personal
> experience with JSON is that although the parsers are truly ubiquitous they
> have some annoying characteristics (at least the Perl one does).
>
> But since it doesn’t relieve the need of the application programmer to
> understand the interface, it is merely adding more code for no gain (or even
>

I'm not sure what you mean by this.  The programmer needs to understand
what the fields mean, sure.  YAML/JSON helps somewhat with this, because
the fields have names.  Even if you do understand the existing format, that
understanding will absolutely not get you a live editable version of what's
in a pcb file without a lot of (pointless) additional work.

> negative gain, given the added complexity). And neither YAML nor JSON is as
> universally readable and processable as the format we have.
>

There's no added complexity to speak of for clients, and YAML is far more
readable and at least as processable as what we have now.  I think your
view of things is strongly tied to your particular use case.  It sounds
like you mostly work on attributes with their own special meaning (IIRC
noqsi has attributes with their own syntax), and don't have to parse
everything.  That's fine.  I sure don't want to break anything for you.

However, if you consider the actual problem I'm hoping to address you might
sympathize at least with the thought that not reinventing the parser
everywhere might be worthwhile.  I started out to write a quick parser in
perl, in exactly the way you seem to be proposing should be the way to do
everything.  It's a significant hassle and you end up with a slow parser
that only works from one language.  As you've pointed out yourself, parsing
(and serialization) is a relatively trivial, thoroughly solved problem.
Why reinvent the solution?

I've taken some time over this because at least one other person indicated
that they shared your concern about using a generic parser rather than an
arbitrary custom format.  So I'd like to actually convince you, lest you
convince others that doing as I propose is a bad idea for pcb.  I'd also
like to apologize for the bad attitude and rudeness I've shown you in the
past, and hope you're able to view this issue in technical terms alone (I
confess that I sometimes have difficulty doing this with your emails).

Britton

--001a113d7c3e5f81e70528694d58
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--001a113d7c3e5f81e70528694d58--