Need advice: Ubuntu OCR techniques

Patton Echols p.echols at comcast.net
Mon Oct 10 03:19:21 UTC 2011


On 10/09/2011 06:34 PM, NoOp wrote:
> On 10/09/2011 02:39 PM, Kevin O'Gorman wrote:
>> On Sun, Oct 9, 2011 at 1:09 PM, Kevin O'Gorman <kogorman at gmail.com> wrote:
>>
>>> On Sun, Oct 9, 2011 at 11:10 AM, Icarus Alive <icarus.alive at gmail.com> wrote:
>>>
>>>> On Sun, Oct 9, 2011 at 11:04 PM, Kevin O'Gorman <kogorman at gmail.com> wrote:
>>>>> I'm new to OCR (optical character recognition) and have never done it
>>>>> before.  Suddenly I have a need.
>>>>>
>>>>> I've been digging through old papers and have found a hard copy (it appears
>>>>> to be real Courier font, laser printed on a white background) of a program
>>>>> I wrote decades ago on a Macintosh 512K in Lightspeed C.  I thought I had
>>>>> lost it completely.  I would like to recover it from the hard copy without
>>>>> typing ~100 pages of code.  I have a scanner and full Acrobat CS5 on a
>>>>> Windows machine, plus all the FOSS of Ubuntu (tesseract, gocr, plus anything
>>>>> useful in multiverse).  Does anybody know the fastest way to get usable code
>>>>> from this situation?
>>>> Use the power of the cloud... Google Docs can do OCR. For English-language
>>>> printed text that is scanned well, it works pretty well.
>>>> http://docs.google.com/support/bin/answer.py?answer=176692
>>>>
>>>> Icarus (may your wings stay on),
>>> Great idea.  I'll check it out.
>>>
>> I was unable to make it work.  I scanned one of the files as a 3-page TIFF
>> file with Irfanview, and uploaded it to Google Docs.  I marked all the
>> checkboxes for conversion, but did not get a text document.  I've marked it
>> shared to all, and the link (for me) is
>> https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0B6pbHEZND52eZWNlZGQ4MmUtMTgwZi00MTQ3LWJkMTUtNzIzOTIwMWRlOWJk&hl=en_US
>> (modulo any folding)
> ...
>
> Does:
> $ tesseract crystal.h1.tif crystal
> Tesseract Open Source OCR Engine
> Page 1
> Page 2
> $ gedit crystal.txt
> not work for you?
>
>

I stand corrected regarding my earlier post saying that multipage TIFF
files do not work well.  I suppose I could have tested it as you did!
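
For anyone with a stack of scans, the command quoted above extends to a
simple loop.  This is only a rough sketch under a few assumptions: the
helper names out_base and ocr_all and the OCR_CMD variable are made up
for illustration (they are not part of tesseract), each scan is assumed
to be a .tif file in one directory, and tesseract is assumed to be on
the PATH.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of tesseract itself.

# Strip a trailing .tif so that "crystal.h1.tif" becomes "crystal.h1".
out_base() {
    printf '%s\n' "${1%.tif}"
}

# Run the OCR command on every .tif file in a directory (default: current).
# OCR_CMD is an assumed override hook; it defaults to plain "tesseract".
ocr_all() {
    dir=${1:-.}
    for f in "$dir"/*.tif; do
        [ -e "$f" ] || continue    # no matches: the glob stays literal, skip it
        "${OCR_CMD:-tesseract}" "$f" "$(out_base "$f")"
    done
}
```

Calling ocr_all with a directory of scans should leave a .txt file next
to each .tif, since tesseract appends .txt to the output base name, just
as crystal.txt appeared in the quoted example.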



More information about the ubuntu-users mailing list