[Coco] OT: stripping garbage from web pages
Brett K Heath
hcmth019 at csun.edu
Wed Nov 3 20:18:10 EST 2004
On Sat, 23 Oct 2004, Bob Devries wrote:
> Can anyone suggest a programme for windoze that will strip the unnecessary
> bloat from web pages created by M$Word? I find that Word puts in heaps of
> stuff that seems to be totally unrelated to the normal HTML code.
I feel your pain. I was once given the task of cleaning up a few pages
that had been generated by M$Word. In the end it was faster (and easier)
to cut and paste the plain ASCII text into a better editor and regenerate the rest
from scratch.
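
For anyone who would rather script the "keep only the text" step than cut
and paste by hand, here is a minimal sketch. It assumes Python 3 and uses
only the standard html.parser module. The names TextOnly and html_to_text
are made up for illustration and do not come from any tool mentioned in
this thread. It throws away all markup (including Word's style blocks and
conditional comments) and keeps one line of text per paragraph, so the
formatting still has to be regenerated afterwards.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: collect visible text, drop all markup."""

    # Elements whose contents are never visible body text.
    SKIP = {"head", "title", "style", "script"}
    # Elements that should end the current line of output.
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Comments (including Word's <!--[if ...]> blocks) never reach
        # this method, so they are dropped without any extra work.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace on each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Reading the saved Word page into a string and passing it to html_to_text
gives text that can be pasted into another editor. Word usually saves in a
Windows code page rather than UTF-8, so the encoding used when opening the
file may need adjusting.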
I ended up using LyX (a LaTeX front end, also freely available for
Windows) and converting to HTML with latex2html (a Perl script). LyX has
some fairly hefty prerequisites (LaTeX, for example) but they are all
freely available and it is well documented and easy to use.
Not sure whether it would match your requirements (I was doing lots of
math stuff) but it offers many handy facilities (like an automatically
generated TOC that can be used to navigate around the document while it's
being edited) and might be worth a look.
I don't have the URLs handy but Google knows.
Brett K. Heath