To: geda-user AT delorie DOT com
Subject: [geda-user] pdf to sym conversion
Message-Id: <20120615085456.B62F7814BC78@turkos.aspodata.se>
Date: Fri, 15 Jun 2012 10:54:56 +0200 (CEST)
From: karl AT aspodata DOT se (Karl Hammar)
Reply-To: geda-user AT delorie DOT com

I'm experimenting with ways to simplify sym file generation.
If you have any ideas, please respond.

The actual sym file generation seems best done with tragesym or
djsymbox, but I need to find the pin number to pin name mapping.
For that I could grab an ibis file and use pybis [1] or ibis.pl [2].
The problem with that is that some pins might be missing and it
doesn't show alternate pin function names (think mcu's).

I'm trying to find some way to take a table from a pdf and filter
it through something that generates a "pin" file. Then one could
take that pin file, order it to one's liking (e.g. a separate power
symbol, a big sym with all pins etc.), and then run tragesym/djsymbox.

So this is about finding and generating something like

    pin type label
    1   pas  VSS
    ...

for each package of the device.

///

Copy and paste method.

By marking the first page in the table (table 4, page 24..29 of [3])
in xpdf and copying it to a text file I get:

                                    TRACECLK/ FSMC_A23 /
     -  - 1 1 1 A2 PE2  I/O FT PE2   ETH_MII_TXD3 /
                                     EVENTOUT
                                     TRACED0/FSMC_A19/
     -  - 2 2 2 A1 PE3  I/O FT PE3
                                     EVENTOUT
                                     TRACED1/FSMC_A20 /
     -  - 3 3 3 B1 PE4  I/O FT PE4
                                     DCMI_D4/ EVENTOUT
                                     TRACED2 / FSMC_A21 /
     -  - 4 4 4 B2 PE5  I/O FT PE5   TIM9_CH1 / DCMI_D6/
                                     EVENTOUT
                                     TRACED3 / FSMC_A22 /
     -  - 5 5 5 B3 PE6  I/O FT PE6   TIM9_CH2 / DCMI_D7/
                                     EVENTOUT
     1 A9 6 6 6 C1 VBAT S      VBAT

Which looks parsable.
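A minimal sketch of how such copied lines could be split into fields,
assuming the six-package layout of the excerpt above (LQFP64, WLCSP64+2,
LQFP100, LQFP144, LQFP176, UFBGA176). The function name parse_row and
the TYPES/LEVELS sets are hypothetical and only cover what the excerpt
shows; this is not how pins_st32f100.pl or Tabular work. Lines that do
not look like a full row (the wrapped alternate-function fragments)
return None and would still have to be attached to the right pin.

```python
# Hypothetical field layout: six package columns, pin name, type,
# optional I/O level, main function, then alternate-function text.
PACKAGES = ["LQFP64", "WLCSP64+2", "LQFP100", "LQFP144", "LQFP176", "UFBGA176"]
TYPES = {"I/O", "I", "O", "S"}   # assumed values of the type column
LEVELS = {"FT", "TT"}            # assumed; only FT appears in the excerpt


def parse_row(line):
    """Return a dict for a main table row, or None for a fragment line."""
    tok = line.split()
    n = len(PACKAGES)
    # A main row has all package columns, a name and a known type.
    if len(tok) < n + 3 or tok[n + 1] not in TYPES:
        return None
    rest = tok[n + 2:]
    level = rest.pop(0) if rest and rest[0] in LEVELS else ""
    main = rest.pop(0) if rest else ""
    # Whatever is left on the line is "/"-separated alternate functions.
    alt = [a.strip() for a in " ".join(rest).split("/") if a.strip()]
    return {
        "pins": dict(zip(PACKAGES, tok[:n])),
        "name": tok[n],
        "type": tok[n + 1],
        "level": level,
        "main": main,
        "alt": alt,
    }


if __name__ == "__main__":
    print(parse_row("- - 1 1 1 A2 PE2 I/O FT PE2 ETH_MII_TXD3 /"))
    print(parse_row("TRACECLK/ FSMC_A23 /"))
```

Splitting on whitespace only works while every cell of the row is
filled in on the same line; a row whose pin name is itself wrapped
over several lines would need column positions instead.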
Evince was worse. It was difficult to mark the right area, and the
columns did not line up:

    - - 1 1 1 A2 PE2 I/O FT PE2 TRACECLK/ FSMC_A23 / ETH_MII_TXD3 / EVENTOUT
    - - 2 2 2 A1 PE3 I/O FT PE3 TRACED0/FSMC_A19/ EVENTOUT
    - - 3 3 3 B1 PE4 I/O FT PE4 TRACED1/FSMC_A20 / DCMI_D4/ EVENTOUT
    - - 4 4 4 B2 PE5 I/O FT PE5 TRACED2 / FSMC_A21 / TIM9_CH1 / DCMI_D6/ EVENTOUT
    - - 5 5 5 B3 PE6 I/O FT PE6 TRACED3 / FSMC_A22 / TIM9_CH2 / DCMI_D7/ EVENTOUT

So, I used xpdf and created the *.tbl files in [7]. The data was taken from

    page    device
    24..28  stm32f100 (low/mid density) [3]
    24..29  stm32f100 (high density)    [4]
    26..30  stm32f105/7                 [5]
    40..51  stm32f205                   [6]

Running pins_st32f100.pl (from [7]) on those files generated nice
pin files. (I added the first three lines: file prefix, package names
and an empty line. I also put a line with "//\n" wherever there is a
page break, i.e. where column widths might change, to signal this to
the program.)

    $ head -15 st32f100h.tbl
    st32f100h
    LQFP144 LQFP100 LQFP64

    1 1 - PE2          I/O FT PE2     TRACECK/ FSMC_A23
    2 2 - PE3          I/O FT PE3     TRACED0/FSMC_A19
    3 3 - PE4          I/O FT PE4     TRACED1/FSMC_A20
    4 4 - PE5          I/O FT PE5     TRACED2/FSMC_A21
    5 5 - PE6          I/O FT PE6     TRACED3/FSMC_A22
    6 6 1 VBAT         S      VBAT
          PC13-TAMPER-
    7 7 2              I/O    PC13(6) TAMPER-RTC
          RTC(5)
          PC14-
    8 8 3              I/O    PC14(6) OSC32_IN
          OSC32_IN(5)
    $ head st32f100h.lqfp144.pins
      1 pas* PE2 (FSMC_A23 TRACECK)
      2 pas* PE3 (FSMC_A19 TRACED0)
      3 pas* PE4 (FSMC_A20 TRACED1)
      4 pas* PE5 (FSMC_A21 TRACED2)
      5 pas* PE6 (FSMC_A22 TRACED3)
      6 pwr VBAT
      7 pas PC13 (TAMPER_RTC)
      8 pas PC14 (OSC32_IN)
      9 pas PC15 (OSC32_OUT)
     10 pas* PF0 (FSMC_A0)

Since the i/o pins can be either digital or analogue, I set the
type to "pas" ("*" == 5V tolerant pin; should we define new pin types?).
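The mapping from table columns to a pin file line can be sketched as
follows. This is inferred from the sample output above and is not the
code of pins_st32f100.pl: the function names are hypothetical, the rule
"S" -> "pwr" is taken from the single VBAT example, every other type
is assumed to become "pas", and the column padding of the real pin
files is left out.

```python
import re


def pin_type(io_type, level):
    """Map the datasheet type/level columns to a pin type string."""
    if io_type == "S":           # supply pin, as for VBAT above
        return "pwr"
    # Everything else is treated as passive; "*" marks 5V tolerant (FT).
    return "pas*" if level == "FT" else "pas"


def clean(name):
    """Drop footnote markers such as "(6)" and turn "-" into "_"."""
    return re.sub(r"\(\d+\)", "", name).strip().replace("-", "_")


def pin_line(number, io_type, level, main, alt=()):
    """Build one pin file line: number, type, label (sorted alt names)."""
    names = sorted({clean(a) for a in alt if clean(a)})
    label = clean(main)
    if names:
        label += " (" + " ".join(names) + ")"
    return f"{number} {pin_type(io_type, level)} {label}"


if __name__ == "__main__":
    print(pin_line(1, "I/O", "FT", "PE2", ["TRACECK", "FSMC_A23"]))
    print(pin_line(6, "S", "", "VBAT"))
    print(pin_line(7, "I/O", "", "PC13(6)", ["TAMPER-RTC"]))
```

Sorting the alternate names makes the output independent of the order
in which the wrapped cell fragments were collected.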
The exception is stm32f205/7, where many more table cells are
multiline, and I got bogus results like:

    $ grep -C2 '^11 G8 18 29 35 M5' st32f205_7.tbl
                                         SPI2_MOSI / I2S2_SD /
                                         OTG_HS_ULPI_NXT /      ADC123_
    11 G8 18 29 35 M5 PC3(6) I/O FT PC3  ETH_MII_TX_CLK/        IN13
                                         EVENTOUT
    $ grep '^ 29' st32f205_7.lqfp144.pins
     29 pas* PC3 (ADC123_ ETH_MII_TX_CLK EVENTOUT I2S2_SD IN13 OTG_HS_ULPI_NXT SPI2_MOSI)
    $

where the ADC123_ should be joined with IN13.

So I made Tabular [8] to find column limits:

    $ head st32f205_7.tbl
    st32f205_7
    LQFP64 WLCSP64+2 LQFP100 LQFP144 LQFP176 UFBGA176

                                    TRACECLK/ FSMC_A23 /
     -  - 1 1 1 A2 PE2  I/O FT PE2   ETH_MII_TXD3 /
                                     EVENTOUT
                                     TRACED0/FSMC_A19/
     -  - 2 2 2 A1 PE3  I/O FT PE3
                                     EVENTOUT
                                     TRACED1/FSMC_A20 /
    $ Tabular st32f205_7.tbl | head
    st32f205_7
    LQFP64 WLCSP64+2 LQFP100 LQFP144 LQFP176 UFBGA176

      |  | | | |  |    |   |  |    |TRACECLK/ FSMC_A23 /
     -| -|1|1|1|A2|PE2 |I/O|FT|PE2 | ETH_MII_TXD3 /
      |  | | | |  |    |   |  |    | EVENTOUT
      |  | | | |  |    |   |  |    | TRACED0/FSMC_A19/
     -| -|2|2|2|A1|PE3 |I/O|FT|PE3
      |  | | | |  |    |   |  |    | EVENTOUT
      |  | | | |  |    |   |  |    | TRACED1/FSMC_A20 /

which worked fine except where the columns didn't line up:

    64|D9|100|144| 172 C5| VDD_3 | S |  | VDD_3
      |  |   |   |       |       |   |  |       | TIM8_BKIN / DCMI_D5/
     -| -|  -| - |173 D4 | PI4   |I/O|FT| PI4

and for pages with empty columns.

Any ideas on how to proceed?

///

pdftotext -layout

After running

    $ pdftotext -layout CD00237391.pdf st32f205_7.tbl2

and using emacs to cut away everything except the table data, I got:

    $ Tabular st32f205_7.tbl2 | cut -b-60
    ...
     -|  -|  |  |  | -|  |  | 14|20|  |  |  |J3|
     -|  -|  |  |  | -|  |  | 15|21|K3|  |  |  |
     -| H9|10|16|22|G2|  |  |   |  |  |  |  |  |
     -|  -|  |  |  |11|17|23|G3 |  |  |  |  |  |
      |   |  |  |  |  |  |  |   |  |  |  |  |  |
    ...

So that didn't solve things either.

///

Then I found [9] and then [10], which sounded promising.
It is coded in java, but the idea is to use pdftohtml:

    $ mkdir Todo
    $ cd Todo
    $ pdftohtml -c .../CD00237391.pdf a.html
    ...
    $ ls -1
    a001.png        # page 1 graphics
    ...
    a-1.html        # page 1 text
    ...
    a.html
    a_ind.html
    a-outline.html
    $

The last three files are used for frames and links, which do not
interest us. a040.png contains the table lines and boxes, and
a-40.html the text with its positions:

    $ grep Table a-40.html