Mail Archives: djgpp/1993/04/01/10:29:03

Date: Thu, 1 Apr 93 10:05:54 EST
From: Steven J. Zeil <zeil AT cs DOT odu DOT edu>
To: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: Re: Detex and the docs
References: <9303302217 DOT AA23513 AT server DOT physics DOT lsa DOT umich DOT edu>

Mike Sanders writes:
  > There *is* a DOS port of detex in the tex directory of simtel.
  > (oak.oakland.edu:/pub/msdos/tex/detex10.zip)
  > 
  > There is a port of detex v 2.5 (a better version) at ftp-os2.nmsu.edu

Tim Boutell writes:
  > I think we should use it to create an ASCII mirror of the TeX docs
  > and make that just as readily available and readily found. 


Be careful what you ask for - you just may get it.

The notion of an ASCII mirror may make sense, but detex is not the
right program to use.  My experience with it is with a much older
version, but unless the very nature of the program has changed, it
does NOT produce an ASCII version of a formatted document.  It simply
strips out all TeX and LaTeX commands, leaving behind the ASCII-only
portion of the unformatted text.  In most cases, a printout of this
would be HARDER to read than the raw TeX input.  Detex was intended
for use as a rough filter prior to running a TeX document through
spelling checkers, word count utilities, etc., where a rough
approximation would be good enough.

For example, suppose I had the following in a LaTeX file:
|  \newcommand{\igc}{incremental garbage collection algorithm}
|    % abbreviation for phrase used
|    % many times in this document
|  Figure~\ref{code-figure} 
|  shows 
|  the \igc\ employed in this
|  system. 
|  \begin{figure}\tt
|  \begin{tabbing} 
|  for \=each object X in heap loop \+ \\
|     trash X; \- \\
|  end loop;
|  \end{tabbing}
|  \caption{A Silly Algorithm}\label{code-figure}
|  \end{figure}
|  As you can see, the \igc\ is not only incorrect, it is
|  {\bf dangerous}!

Detex would present you with something like:
|  Figure 
|  shows 
|  the employed in this
|  system. 
|
|
|  for each object X in heap loop
|     trash X;
|  end loop;
|
|
|  As you can see, the is not only incorrect, it is
|  dangerous!


As you can see, simply dropping all TeX commands results in an ugly
listing with words and other information missing: the expansion of
\igc, the figure number, and the caption are all gone.
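
To make the failure mode concrete, here is a rough Python sketch of
what "strip every command" amounts to.  This is NOT detex's actual
algorithm (I have not looked at its source); naive_strip and its
regular expressions are my own illustration, written only to show why
macro expansions and cross-references vanish.

```python
import re

def naive_strip(tex: str) -> str:
    """Crude command stripper (illustrative only, not detex itself)."""
    # Comments: an unescaped % through end of line.
    tex = re.sub(r"(?<!\\)%.*", "", tex)
    # Macro definitions are dropped whole, so the expansion text is lost.
    tex = re.sub(r"\\newcommand\{\\\w+\}\{[^{}]*\}", "", tex)
    # Commands whose argument is not running text.
    tex = re.sub(r"\\(?:ref|label|caption|begin|end)\{[^{}]*\}", "", tex)
    # Remaining control words (\igc, \bf) and control symbols (\\, \=, "\ ").
    tex = re.sub(r"\\[A-Za-z]+|\\.", "", tex)
    # Ties become spaces; grouping braces disappear.
    tex = tex.replace("~", " ")
    return re.sub(r"[{}]", "", tex)

sample = (
    "\\newcommand{\\igc}{incremental garbage collection algorithm}\n"
    "Figure~\\ref{code-figure} shows the \\igc\\ employed in this system.\n"
    "As you can see, the \\igc\\ is {\\bf dangerous}!\n"
)
print(naive_strip(sample))
```

Nothing in the stripper ever expands \igc or resolves \ref, which is
exactly why the phrase and the figure number are missing from the
output.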


There is another way.  A program called dvi2tty takes the .dvi file
produced by TeX and strips out the typesetting commands, leaving the
ASCII characters behind.  Because this is done AFTER formatting, the
result contains all the words, and only the graphics and mathematical
expressions are likely to be messed up.
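
As a sketch of the workflow, the following Python helper builds the
command lines involved.  The file names are hypothetical, and the
dvi2tty options shown (-w for page width, -o for the output file) are
as I recall them from its manual page; check them against the version
you have installed.  LaTeX is run twice so that \ref can pick up the
figure numbers written on the first pass.

```python
import subprocess
from pathlib import Path

def ascii_commands(tex_file, width=80):
    """Return the command lines: LaTeX twice, then dvi2tty."""
    stem = Path(tex_file).stem  # latex writes <stem>.dvi in the current dir
    return [
        ["latex", tex_file],   # first pass records labels in the .aux file
        ["latex", tex_file],   # second pass resolves \ref
        # -w and -o are assumed from the dvi2tty man page; verify locally.
        ["dvi2tty", "-w", str(width), "-o", stem + ".txt", stem + ".dvi"],
    ]

def make_ascii(tex_file, width=80):
    """Run the commands in order, stopping at the first failure."""
    for argv in ascii_commands(tex_file, width):
        subprocess.run(argv, check=True)
```

Calling make_ascii("manual.tex") would, under those assumptions, leave
the plain-text version in manual.txt.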

You can further improve the ASCII result by forcing LaTeX to use 
fixed-width fonts when it is formatting. To do this, add the following
right after the \documentstyle command:
|  \def\rm{\protect\tt}
|  \def\it{\protect\tt}
|  \def\bf{\protect\tt}
|  \def\sl{\protect\tt}
|  \def\sf{\protect\tt}
|  \def\sc{\protect\tt}
This forces the use of the fixed-width \tt font no matter what font
commands the LaTeX document contains. If you ran this through LaTeX
and then the resulting .dvi file through dvi2tty, you would get
something like:

|  Figure 1 shows the incremental garbage collection algorithm
|  employed  in  this system.  As you can see, the incremental
|  garbage collection algorithm is not only incorrect,  it  is
|  dangerous!
|
|  -----------------------------------------------------------
|
|  for each object X in heap loop
|      trash X;
|  end loop;
|
|      Figure 1: A Silly Algorithm
|



So, in summary, if you really want to have an ASCII mirror of the
documentation, don't use detex.  Have someone who HAS installed
LaTeX prepare the ASCII docs using dvi2tty.  If no one has the
time/energy to do so, you're better off just printing the raw .tex
files than trying to use detex.


Steve Z
