Date: Mon, 14 Sep 2015 13:03:34 +0200 (CEST)
From: Roland Lutz
To: geda-user AT delorie DOT com
Subject: Re: [geda-user] shortest way towards parsing .pcb files outside pcb

On Sat, 12 Sep 2015, DJ Delorie wrote:
> Scripts wouldn't *need* to use pcb's parser, they could use any old
> parser library, since the scripts already know what the parts of the
> schema they're interested in look like. Read file, fiddle with the
> parts you know, write file leaving everything else intact.

That's easier said than done. With an XML file, you have basically no choice but to use an existing parser library; the format is just too complex (and XML parser bugs are a popular exploit vector). But these parsers aren't designed to tell you which part of the file each piece of parsed data came from, so you don't know which bytes to change.

You could, of course, write the parsed data back into an XML file, but to do that without losing information, you would again need context information about the file format, which is what you wanted to avoid in the first place.
For example, you would need to know in which parts of the file whitespace is significant so you can add or remove tags while still keeping the file human-readable. And there are XML processing instructions, CDATA sections, and so on…
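
To make the point concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree with its default parser settings. The sample document and the round_trip helper are made up for illustration and have nothing to do with any real pcb file format. It parses a small file and writes it straight back without touching the data:

```python
import xml.etree.ElementTree as ET

# Hypothetical hand-edited file: a comment, a processing instruction,
# a CDATA section and a single-quoted attribute.
SOURCE = """<?xml version="1.0"?>
<!-- hand-written note -->
<board>
  <?tool keep-me?>
  <part name='R1'><![CDATA[a < b]]></part>
</board>
"""


def round_trip(text: str) -> str:
    """Parse the document and serialize it again without modifying it."""
    root = ET.fromstring(text)
    return ET.tostring(root, encoding="unicode")


result = round_trip(SOURCE)
print(result)

# The element data itself survives the round trip...
assert ET.fromstring(result).find("part").text == "a < b"
# ...but the file is no longer the one the user wrote.
assert result != SOURCE
```

With the default settings, the comment and the processing instruction are dropped, the CDATA section comes back as escaped text, and the attribute is re-quoted with double quotes. Nothing a parser-level consumer cares about has changed, yet a diff of the file shows changes in lines the script never meant to touch.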