public inbox for blinux-list@redhat.com
* File conversion question
From: Jason White
  To: blinux-list

Is there any software available which can be compiled under Unix or Linux,
and which can convert PDF files into either ASCII text or HTML format? I
am aware of the conversion service at http://access.adobe.com/ but this
does not provide a solution when the PDF document can only be accessed on
a restricted basis (e.g. via a web site that requires client
authentication).

For those who need to read MS-Word 8.0 documents, a new conversion tool is
available at http://www.csn.ul.ie/~caolan/docs/MSWordView.html

Jason.





* Re: File conversion question
From: wlestes
  To: blinux-list

> Is there any software available which can be compiled under Unix or Linux,
> and which can convert PDF files into either ASCII text or HTML format. I

pstotext, which is available in RPM format from
ftp.redhat.com/pub/contrib/hurricane/{SRPMS,i386}

and as a .tar.Z archive from

http://www.research.digital.com/SRC/virtualpaper/pstotext.html
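
For anyone who wants to script this, here is a minimal sketch in Python.
It assumes pstotext is installed and on the PATH, and that it writes the
extracted text to standard output when given a file name. The function
names and the file names in the usage note are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pstotext_command(source):
    """Return the argv list for running pstotext on one file."""
    return ["pstotext", str(source)]


def convert_with_pstotext(source, dest):
    """Run pstotext on `source` and save its standard output to `dest`."""
    # Assumption: the binary is called "pstotext" and is on the PATH.
    if shutil.which("pstotext") is None:
        raise FileNotFoundError("pstotext is not on PATH")
    result = subprocess.run(
        pstotext_command(source),
        capture_output=True,  # collect the extracted text from stdout
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    Path(dest).write_bytes(result.stdout)
    return Path(dest)
```

Calling convert_with_pstotext("paper.pdf", "paper.txt") would then leave
the extracted text in paper.txt, provided the tool accepts the input.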



* Re: File conversion question
From: Lar Kaufman
  To: blinux-list

Aladdin Ghostscript converts PDF and PostScript to a multitude of output
formats, including plain or formatted text. It's powerful but complex,
so it is a lot easier to use pstoedit as a front end (and pstoedit can also
manage other modules for additional conversions). There is one nasty PDF
trick that is sometimes used online: a decryption key is added to the PDF
file. Ghostscript can process those PDF files too, but you have to download
the decryption module from a non-U.S. website, because it's export-restricted
technology. (Still!)

Get Ghostscript from <http://www.aladdin.com>. The GNU Ghostscript package
can't process PDF (or even PostScript Level 2), so you need Aladdin's version
4.01 or later. It's been a long time since I checked, but pstoedit should
be available from the usual Linux sites. I'll have to research where to get
the decryption module, if you should need it.
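
As a rough illustration of driving Ghostscript for text output, here is a
small Python sketch. It assumes a Ghostscript build whose executable is
named gs and which includes the txtwrite output device; that device ships
with current releases and is not something this thread confirms for
Aladdin 4.01. The function names are hypothetical.

```python
import subprocess


def ghostscript_text_command(source, dest, gs="gs"):
    """Build the argv list for a non-interactive text extraction run."""
    return [
        gs,
        "-q",                     # suppress the startup banner
        "-dNOPAUSE",              # do not pause after each page
        "-dBATCH",                # exit when the input is finished
        "-dSAFER",                # restrict file access from the document
        "-sDEVICE=txtwrite",      # assumption: this device is compiled in
        f"-sOutputFile={dest}",   # where the extracted text is written
        str(source),
    ]


def convert_with_ghostscript(source, dest, gs="gs"):
    """Run Ghostscript and raise CalledProcessError if it fails."""
    subprocess.run(ghostscript_text_command(source, dest, gs), check=True)
    return dest
```

The same command list could be handed to whatever launches external
programs in your environment; only the argument order (options first,
input file last) matters here.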

I'm interested whether anyone is using these tools "on the fly" for
netsurfing.  If so, could you share info on how to set it up?  I have
always captured the files and converted them later, but I'm interested
in integrating and automating the conversions, preferably from within
emacs.

 -lar
 Lar Kaufman        | "It's bad enough to see yourself as the world is seeing
 Polymedia Services | you now.  To see yourself as the world saw you in the
 Concord, Mass.     | immediate past is to see yourself tantalizingly beyond
 lark@walden.com    | any hope of redemption." - Roy Blount Jr.

