Mail Archives: geda-user/2016/01/03/02:54:37
On Sat, Jan 2, 2016 at 8:19 PM, John Doty <jpd AT noqsi DOT com> wrote:
>
> On Jan 2, 2016, at 9:27 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via
> geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>
>
>
> On Sat, Jan 2, 2016 at 6:07 PM, John Doty <jpd AT noqsi DOT com> wrote:
>
>>
>> On Jan 2, 2016, at 7:47 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via
>> geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>
>>
>>
>> On Sat, Jan 2, 2016 at 4:38 PM, John Doty <jpd AT noqsi DOT com> wrote:
>>
>>>
>>> On Jan 2, 2016, at 6:07 PM, Britton Kerin (britton DOT kerin AT gmail DOT com)
>>> [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>>
>>> Personally I find formats like this:
>>>
>>> device=RESISTOR
>>> T 44400 49300 5 10 1 1 90 0 1
>>>
>>> substantially less readable than ones with field names, but they are
>>> indeed easy to parse.
>>>
>>>
>>> Personally, I rarely edit these things manually except for the text
>>> fields, which are not difficult to find. The fact that they’re easy to
>>> parse is handy for automation.
>>>
>>> The pcb format is quite a bit more elaborate and the savings from not
>>> rolling your own parser are more significant.
>>>
>>> I think your criteria for what should go in libgeda are spot-on btw.
>>> Nor do I have any problem with a C interface calling python or gschem or
>>> for that matter C++. I do think providing a clean C interface to libgeda
>>> gets by far the best return on investment, since it's so widely known and
>>> with a little care wrappers can then be provided almost automatically for a
>>> wide variety of languages (via SWIG or some other similar mechanism -- or
>>> maybe Xorn facilitates this, I'm a little unclear).
>>>
>>>
>>> I don’t find deconstructing C data structures particularly easier than
>>> parsing the format above. Just another layer I have to penetrate to get to
>>> the data. I do significant processing with simple things like sed, which
>>> don’t handle binary data.
>>>
>>> Wrappers CAN be provided, but will they? FFI programming is not the
>>> easiest thing. I hear complaints about the need for developers to maintain
>>> code. It seems to me that one way to address these concerns is to avoid and
>>> eliminate unnecessary code.
>>>
>>
>> Good question. It's a great result if you get it but a lot more work
>> than using a serialization library, which is why the latter approach seems
>> to me like a useful step in the right direction.
>>
>> Serialization library? Why do you want an extra, unnecessary, opaque
>> interface? What, exactly, are you trying to accomplish?
>>
>
> Two things:
>
> 1. A human- and partial-parser-script-readable format
>
>
> We have that, I think. But you left out the most important virtue:
> *simple*.
>
I agree that it's readable enough, though it could be better. I also agree
that simplicity is good.
> 2. Full parsers for as many languages as possible without writing
> them by hand
>
>
> So instead, you need to write an interface between a complicated parser
> and every application by hand. Where’s the gain?
>
Here's what YAML looks like from perl:

     use YAML::XS;

     my $yaml = Dump [ 1..4 ];    # serialize an array ref to a YAML string
     my $array = Load $yaml;      # parse the YAML back into an array ref

The gain is that this is a vastly easier way to vivify a saved object than
to write my own parser, or even my own partial parser for non-trivial cases.
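
For comparison, here is a minimal sketch of the same dump/load round trip in Python. It uses the standard-library json module rather than YAML, purely so the example has no third-party dependency. The record and its field names are hypothetical and are not taken from any gEDA or pcb format.

```python
import json

# Hypothetical record with named fields (illustrative only, not a real pcb object).
record = {"device": "RESISTOR", "x": 44400, "y": 49300, "angle": 90}

# Serialize to human-readable text, then parse it back into a live object.
text = json.dumps(record, indent=2)
restored = json.loads(text)

print(text)
```

As in the Perl snippet, the client code is one call in each direction; the parsing itself lives entirely in the library.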
> Now take a look at the design goals for YAML:
>
> http://www.yaml.org/spec/1.2/spec.html#id2708649
>
> It's a good fit. If it was only a matter of the technical merits I would
> say as close to perfect as it gets with software.
>
>
> Compare it to http://wiki.geda-project.org/geda:file_format_spec
> YAML is enormously more complex to no advantage for us.
>
The point is that you don't have to deal with any of that complexity (of
which there really isn't all that much -- calling it enormously complex is
a big overstatement). It's a library with approximately two entry points
per language for modern languages, and not much more for C. Parsing may be
a non-issue for you if you only care about strings in .sch files, but for
many useful operations on pcbs you need the whole thing, or most of it.
> Unfortunately there's the usual good-versus-most-popular trade-off in
> deciding between YAML and JSON. I still favor YAML in this case, largely
> because I can't look at people like you and honestly claim that JSON is in
> all respects fun to read/edit/sed over etc., and because my personal
> experience with JSON is that although the parsers are truly ubiquitous they
> have some annoying characteristics (at least the Perl one does).
>
> But since it doesn’t relieve the need of the application programmer to
> understand the interface, it is merely adding more code for no gain (or even
>
I'm not sure what you mean by this. The programmer needs to understand
what the fields mean, sure. YAML/JSON helps somewhat with this, because
the fields have names. Even if you do understand the existing format, that
understanding will absolutely not get you a live, editable version of
what's in a pcb file without a lot of (pointless) additional work.
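
To make the point about field names concrete, here is a minimal sketch that maps the positional text-object line quoted earlier in the thread onto named fields. The field order (x, y, color, size, visibility, show_name_value, angle, alignment, num_lines) is an assumption based on the text-object entry in the gEDA file format spec linked in this thread, and parse_text_line is a hypothetical helper, not part of libgeda.

```python
# Assumed field order for a "T" (text) line, taken from the gEDA file format spec.
TEXT_FIELDS = (
    "x", "y", "color", "size", "visibility",
    "show_name_value", "angle", "alignment", "num_lines",
)

def parse_text_line(line):
    """Split one positional 'T ...' line into a dict of named integer fields."""
    parts = line.split()
    if len(parts) != len(TEXT_FIELDS) + 1 or parts[0] != "T":
        raise ValueError("not a text-object line: %r" % line)
    return dict(zip(TEXT_FIELDS, (int(p) for p in parts[1:])))

fields = parse_text_line("T 44400 49300 5 10 1 1 90 0 1")
print(fields["angle"])
```

One line type is easy enough. The hand-written mapping has to be repeated for every object type and every language, which is the work a generic serialization format does for you.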
> negative gain, given the added complexity). And neither YAML nor JSON is as
> universally readable and processable as the format we have.
>
There's no added complexity to speak of for clients, and YAML is far more
readable and at least as processable as what we have now. I think your
view of things is strongly tied to your particular use case. It sounds
like you mostly work on attributes with their own special meaning (IIRC
noqsi has attributes with their own syntax), and don't have to parse
everything. That's fine. I sure don't want to break anything for you.
However, if you consider the actual problem I'm hoping to address you might
sympathize at least with the thought that not reinventing the parser
everywhere might be worthwhile. I started out to write a quick parser in
perl, in exactly the way you seem to be proposing should be the way to do
everything. It's a significant hassle and you end up with a slow parser
that only works from one language. As you've pointed out yourself, parsing
(and serialization) is a relatively trivial, thoroughly solved problem.
Why reinvent the solution?
I've taken some time over this because at least one other person indicated
that they shared your concern about using a generic parser rather than an
arbitrary custom format. So I'd like to actually convince you, lest you
convince others that doing as I propose is a bad idea for pcb. I'd also
like to apologize for bad attitude and rudeness I've shown you in the past,
and hope you're able to view this issue in technical terms alone (I confess
that I sometimes have difficulty doing this with your emails).
Britton