HTML2TeX

Html2tex is a simple Perl script that converts simple HTML to TeX. Its primary purpose is to convert literature in HTML to TeX source so that the writing can be printed nicely. I've never seen output nicer than TeX's, and good works of literature demand to look good. My primary use for it so far has been to make nice copies of the essays of G. K. Chesterton that have fallen into the public domain. However, any time you print multiple pages of HTML just for the words in them, they will look nicer, and probably take up less space, in TeX than in HTML.

You can get the program here: html2tex. Save it to disk somewhere in your execution path (to find out what that is, type echo $PATH). Then make it executable (chmod +x /full/path/to/file). The syntax is html2tex [file] > outputfile. If the file is not specified, html2tex reads from standard input.
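The installation steps can be sketched as a short shell session. The file name ./html2tex, the directory $HOME/bin, the helper in_path, and the essay file names are all assumptions made for illustration; substitute wherever you actually saved the script.

```shell
# Hypothetical locations: adjust to wherever you saved the script.
src="./html2tex"      # the downloaded script (assumed name)
dest="$HOME/bin"      # a directory assumed to be in your execution path

# Illustrative helper: succeeds if directory $1 is an entry in $PATH.
in_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

echo "$PATH"          # show the current execution path
in_path "$dest" || echo "note: $dest is not in PATH"

# Copy the script into place and make it executable, if it is present.
if [ -f "$src" ]; then
  mkdir -p "$dest"
  cp "$src" "$dest/html2tex"
  chmod +x "$dest/html2tex"
fi

# Typical invocations (file names are placeholders):
#   html2tex essay.html > essay.tex
#   html2tex < essay.html > essay.tex   # read from standard input
```

The two invocations at the end should be equivalent, since the script falls back to standard input when no file is named.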

Html2tex creates relatively complete output. It inserts the macros it uses at the top and puts the \end at the bottom, so you can run TeX on the output file directly. You will probably still have to edit it a bit to correct the results of HTML that doesn't translate or of sloppy typing. Because HTML does not distinguish between open quotes and close quotes, people are often sloppy about making sure that they have all the quotation marks they need.
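For the quotation-mark cleanup, a rough first pass can sometimes be automated before editing by hand. The sketch below is not how html2tex itself treats quotes (this page does not describe that); it is a simple positional heuristic, and the function name fix_quotes is made up for illustration. It treats a straight double quote at the start of a line, or after whitespace or an opening parenthesis, as an opening quote, and every other one as a closing quote.

```shell
# Heuristic sketch (an assumption, not part of html2tex):
# rewrite straight double quotes as TeX-style `` and '' pairs.
fix_quotes() {
  # 1. a quote at the start of a line opens a quotation
  # 2. a quote after whitespace or "(" opens a quotation
  # 3. any quote still left closes a quotation
  sed -e 's/^"/``/' \
      -e 's/\([[:space:](]\)"/\1``/g' \
      -e "s/\"/''/g"
}

printf '%s\n' '"Hello," he said.' | fix_quotes
# prints: ``Hello,'' he said.

# Possible use on a hand-prepared text file (names are placeholders):
#   fix_quotes < essay.txt > essay-quoted.txt
```

A heuristic like this cannot supply a quotation mark that the original typist left out, so the result still needs to be read through by eye.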


lansdoct@cs.alfred.edu