Converting or OCRing PDF to text (solved)
john d. herron
paradox.herron at bluewin.ch
Mon Jan 29 00:37:39 UTC 2007
Thank you, Michael McIntyre and Donn.
pdftotext works beautifully!
jdh
D. Michael McIntyre wrote:
> On Sunday 28 January 2007 4:29 pm, Donn wrote:
>
>>> I have a few articles in .pdf format I'd need to convert (or OCR) to
>>> plain text, .odf or .doc format.
>>> Any advice for this Linux newbie?
>>>
>> Hi, check what these commands give you:
>> pdftotext
>> or
>> pdftohtml
>>
>
> Same thing I was going to suggest, more or less. I use pdftohtml, then load
> the HTML into OpenOffice and export it or save as or whatever to convert it
> to OO.o-native format. (I think you have to export it or send it, so you
> don't wind up with an OO.o document that still behaves like a one-page HTML
> file, but I can't remember the details, and I just closed OO.o, and am too
> lazy to sit here while it warms back up.)
>
>
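
For anyone with more than a few articles, the pdftotext suggestion above could be scripted roughly as follows. This is a minimal Python sketch and not something posted in the thread. The helper names pdftotext_command and convert_all are made up for illustration, and the optional -layout flag is a choice made here, not something the posters used. It assumes pdftotext is on the PATH. Note that pdftotext only extracts text already embedded in the PDF, so scanned pages would still need a separate OCR tool.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, keep_layout=True):
    """Build the pdftotext argument list for one PDF (nothing is executed here)."""
    pdf = Path(pdf_path)
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the original physical layout as far as it can
        cmd.append("-layout")
    # Output goes next to the input, with .pdf swapped for .txt
    cmd += [str(pdf), str(pdf.with_suffix(".txt"))]
    return cmd


def convert_all(folder="."):
    """Run pdftotext on every .pdf in `folder` and return the .txt paths written."""
    if shutil.which("pdftotext") is None:
        # Assumption: on current Ubuntu releases the tool is packaged in poppler-utils
        raise RuntimeError("pdftotext not found; try installing poppler-utils")
    written = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        cmd = pdftotext_command(pdf)
        subprocess.run(cmd, check=True)  # raises CalledProcessError if pdftotext fails
        written.append(Path(cmd[-1]))
    return written
```

Calling convert_all("articles") would write one .txt file beside each PDF in a hypothetical "articles" folder. The resulting text files can then be opened in OpenOffice and saved in whatever format is needed.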