Converting or OCRing PDF to text (solved)

Mon Jan 29 00:37:39 UTC 2007

Thank you, Michael McIntyre and Donn.
pdftotext works beautifully!
jdh

D. Michael McIntyre wrote:
> On Sunday 28 January 2007 4:29 pm, Donn wrote:
>   
>>> I have a few articles in .pdf format I'd need to convert (or OCR) to
>>> plain text, .odf or .doc format.
>>> Any advice for this Linux newbie?
>>>       
>> Hi, check what these commands give you:
>> pdftotext
>> or
>> pdftohtml
>>     
>
> Same thing I was going to suggest, more or less.  I use pdftohtml, then load
> the HTML into OpenOffice and export it (or use Save As) to convert it to
> OO.o-native format.  (I think you have to export it or send it so you don't
> wind up with an OO.o document that still behaves like a one-page HTML file,
> but I can't remember the details, and I just closed OO.o and am too lazy to
> sit here while it warms back up.)
>
>   
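
For anyone finding this thread later, here is a minimal sketch of the
pdftotext approach suggested above, assuming the pdftotext tool (from
poppler-utils or xpdf) is installed. The function names txt_name and
convert_pdfs and the ./articles directory are made up for illustration.
Note that pdftotext only extracts text already embedded in the PDF; a
scanned, image-only PDF would still need a separate OCR tool.

```shell
#!/bin/sh
# Sketch: convert every .pdf in a directory to a .txt file next to it.
# txt_name and convert_pdfs are hypothetical helper names, not part of pdftotext.

# Print the output path for a given PDF path (foo.pdf -> foo.txt).
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Run pdftotext on each PDF found in the directory given as $1.
convert_pdfs() {
    dir=$1
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # glob matched nothing; skip
        txt=$(txt_name "$pdf")
        # -layout asks pdftotext to keep the original physical layout
        pdftotext -layout "$pdf" "$txt" || echo "failed: $pdf" >&2
    done
}

# Example call (the directory name is an assumption):
# convert_pdfs ./articles
```

With the last line uncommented, each ./articles/NAME.pdf should end up with
a ./articles/NAME.txt beside it. pdftohtml takes a PDF file argument in the
same way if the HTML route is preferred.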
