Converting or OCRing PDF to text
D. Michael McIntyre
michael.mcintyre at rosegardenmusic.com
Sun Jan 28 22:23:49 UTC 2007
On Sunday 28 January 2007 4:29 pm, Donn wrote:
> > I have a few articles in .pdf format I'd need to convert (or OCR) to
> > plain text, .odf or .doc format.
> > Any advice for this Linux newbie?
>
> Hi, check what these commands give you:
> pdftotext
> or
> pdftohtml
Same thing I was going to suggest, more or less. I use pdftohtml, then load
the HTML into OpenOffice and export it, or use Save As, to convert it to
OO.o-native format. (I think you have to export it or send it so you don't
wind up with an OO.o document that still behaves like a one-page HTML file,
but I can't remember the details; I just closed OO.o and am too lazy to sit
here while it warms back up.)
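
For what it's worth, here is a rough sketch of how the two commands might be wrapped up. The function name pdf_convert and the output file names are made up for illustration, and I'm assuming the poppler versions of the tools (packaged as poppler-utils on Ubuntu, as far as I know). Both tools only pull out text that is already embedded in the PDF; a scanned, image-only PDF would need a real OCR step, which this does not attempt.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): convert one PDF to
# plain text and to a single HTML file, using pdftotext and pdftohtml.
pdf_convert() {
    pdf="$1"
    base="${pdf%.pdf}"    # strip the .pdf suffix to build output names

    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi
    if ! command -v pdftotext >/dev/null 2>&1 ||
       ! command -v pdftohtml >/dev/null 2>&1; then
        # Assumption: on Ubuntu both tools come from poppler-utils.
        echo "pdftotext/pdftohtml not found" >&2
        return 2
    fi

    # -layout tries to keep the original column layout in the text output.
    pdftotext -layout "$pdf" "$base.txt" || return 3

    # -noframes writes one HTML file instead of a frameset, which should be
    # easier to open in a word processor afterwards.
    pdftohtml -noframes "$pdf" "$base.html" || return 3
}
```

Something like `pdf_convert article.pdf` should then leave article.txt and article.html next to the PDF, and the HTML file is the one to load into OpenOffice.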
--
D. Michael McIntyre
More information about the kubuntu-users
mailing list