public inbox for blinux-list@redhat.com
Subject: pdf to html
From: Karl Dahlke
To: blinux-list

A couple months ago I asked you kind folks how to turn pdf into html,
since one of my customers *required* pdf documents.
(Personally I hate pdf, with a passion,
but it's getting more and more popular,
so we're going to have to find ways of dealing with it.)
You directed me to an htmldoc program that works
pretty well, if you don't use many fancy tags.
If I were grading the package, I'd give it a strong B.

Now I am in the opposite situation.
A major software developer supplies its documentation in print
or pdf, period!
I called and asked; it's pdf or hit the highway.
So I searched the net again and found the site
access.adobe.com/tools.html
In other words, Adobe knows its "standard"
is totally inaccessible, and they're trying to do something about it.
I guess that's something.
I ran my 6 megabyte pdf administrator's guide through it,
and out came the 2 megabyte html equivalent.
As a conversion utility, I'd give it a C+, maybe a B-.
But at least it worked, and I can read the manual
and get on with my job.

One of the most irritating features of the tool is its tendency
to print its error messages right in the text,
and there are plenty of them.
At this point I'm glad I wrote my own editor browser.
One line of perl code strips them out.
If you're using standard software such as lynx,
you might be able to save the html page with the -source option,
run the following perl command on it (sed won't do),
and then view the modified local file.
That will get rid of the errors.
Here's the relevant chunk of perl code from my editor.

#  One of the common problems in the translation is the following
#  meaningless string, which appears over and over again.
#  The optional group matches an angle-bracketed word that
#  sometimes precedes "action".
#  I'm removing every occurrence here.
$text =~ s/Had\strouble\sresolving\sdest\snear\sword\s(<[\w_]+>\s)?action\stype\sis\sGoToR//g;



