trying to OCR a simple tif file with tesseract-ocr
Nicolae Ghimbovschi
xfreebird at gmail.com
Sat Mar 26 13:28:10 UTC 2011
It was a TIFF file; imageshack converted it to PNG when I uploaded it.
I can test your image on Fedora if you wish, though I might get the same result you did.
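
For anyone who wants to repeat the test from a script, here is a minimal sketch in Python. It assumes the tesseract binary is on the PATH and uses the usual "tesseract <image> <outputbase> -l <lang>" invocation. The helper names and the file names are made up for illustration, not something from my setup.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run tesseract on one image and return the recognized text.

    Hypothetical helper: raises RuntimeError if tesseract is not installed.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # fail loudly if tesseract exits with an error
    )
    # tesseract appends ".txt" to the output base name
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling ocr_image("eurotext.tif", "eurotext_out") should write eurotext_out.txt next to the script and return its contents, provided the TIFF is in the current directory.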
On Sat, Mar 26, 2011 at 15:24, Robert P. J. Day <rpjday at crashcourse.ca> wrote:
> On Sat, 26 Mar 2011, Nicolae Ghimbovschi wrote:
>
>> In your case unpaper will not do much.
>>
>> I'm using Fedora, and tesseract 3.0 works just fine:
>>
>> Sample input:
>> http://img402.imageshack.us/f/eurotext.png/
>>
>> tesseract's output:
>> http://pastebin.com/zREhWGQq
>
> in that case, i'm baffled, but i note that you seem to be using a
> .png file as input to tesseract, whereas the man page *strongly*
> suggests that tesseract works well only with TIFF files. so that
> confuses me as well.
>
> in any event, i'm still interested in someone trying this with a
> simple example under ubuntu and letting me know (privately if they
> wish) if they got it to work properly. this really shouldn't be that
> difficult, i just don't see what i'm doing wrong.
>
> rday
>
> --
>
> ========================================================================
> Robert P. J. Day Waterloo, Ontario, CANADA
> http://crashcourse.ca
>
> Twitter: http://twitter.com/rpjday
> LinkedIn: http://ca.linkedin.com/in/rpjday
> ========================================================================
>