Converting or OCRing PDF to text

D. Michael McIntyre michael.mcintyre at rosegardenmusic.com
Sun Jan 28 22:23:49 UTC 2007


On Sunday 28 January 2007 4:29 pm, Donn wrote:
> > I have a few articles in .pdf format I'd need to convert (or OCR) to
> > plain text, .odf or .doc format.
> > Any advice for this Linux newbie?
>
> Hi, check what these commands give you:
> pdftotext
> or
> pdftohtml

Same thing I was going to suggest, more or less.  I use pdftohtml, then load
the HTML into OpenOffice and use Export or Save As to convert it to
OO.o-native format.  (I think you have to export it or send it, so you don't
wind up with an OO.o document that still behaves like a one-page HTML file,
but I can't remember the details, and I just closed OO.o, and am too lazy to
sit here while it warms back up.)
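
For a handful of articles, the two commands can be wrapped in a few lines of
Python.  This is only a rough sketch, assuming the Poppler versions of
pdftotext and pdftohtml are on the PATH.  The function names and the file name
article.pdf are made up for illustration, and the -layout and -noframes
options are my own choice, not something the commands require.  Note that both
tools only pull out text already stored in the PDF; a scanned, image-only PDF
would still need real OCR.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, fmt="text"):
    """Return (argv, output_path) for one conversion (hypothetical helper)."""
    pdf = Path(pdf_path)
    if fmt == "text":
        out = pdf.with_suffix(".txt")
        # -layout asks pdftotext to keep the physical page layout where it can.
        return ["pdftotext", "-layout", str(pdf), str(out)], out
    if fmt == "html":
        out = pdf.with_suffix(".html")
        # -noframes asks pdftohtml for a single HTML file, not a frameset.
        return ["pdftohtml", "-noframes", str(pdf), str(out)], out
    raise ValueError(f"unsupported format: {fmt!r}")


def convert(pdf_path, fmt="text"):
    """Run the converter and return the path where output is expected."""
    argv, out = build_command(pdf_path, fmt)
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(argv, check=True)
    return out
```

Calling convert("article.pdf", "html") should leave an HTML file next to the
PDF, which can then be opened in OpenOffice as described above.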

-- 
D. Michael McIntyre 




More information about the kubuntu-users mailing list