trying to OCR a simple tif file with tesseract-ocr
Nicolae Ghimbovschi
xfreebird at gmail.com
Sat Mar 26 13:17:19 UTC 2011
In your case, unpaper will not help much.
I'm using Fedora, and tesseract 3.0 works just fine:
Sample input:
http://img402.imageshack.us/f/eurotext.png/
tesseract's output:
http://pastebin.com/zREhWGQq
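
If anyone wants to repeat the check on their own machine, here is a rough Python sketch that runs the basic `tesseract <image> <outputbase>` invocation and flags the "single byte output" symptom. The function names, the `sample.tif` and `out` file names, and the 2-byte threshold are my own assumptions for illustration; adjust them to your setup.

```python
import os
import subprocess
import sys


def ocr_output_is_suspicious(txt_path, min_bytes=2):
    """Return True if the OCR text file is missing or smaller than min_bytes.

    min_bytes=2 is an assumed threshold: it flags the one-byte output
    described in the thread, nothing more.
    """
    if not os.path.exists(txt_path):
        return True
    return os.path.getsize(txt_path) < min_bytes


def run_tesseract(image_path, output_base):
    """Run `tesseract <image> <output_base>` and return the text file path.

    tesseract appends ".txt" to the output base name by itself.
    """
    subprocess.run(["tesseract", image_path, output_base], check=True)
    return output_base + ".txt"


if __name__ == "__main__":
    # "sample.tif" is a placeholder; pass your own image as the first argument.
    image = sys.argv[1] if len(sys.argv) > 1 else "sample.tif"
    txt = run_tesseract(image, "out")
    if ocr_output_is_suspicious(txt):
        print("%s is empty or nearly empty; OCR probably failed" % txt)
    else:
        print("%s: %d bytes of output" % (txt, os.path.getsize(txt)))
```

Run it as `python check_ocr.py yourfile.tif` (the script name is arbitrary). It only looks at the size of the output file, so it cannot tell you why tesseract produced nothing.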
On Sat, Mar 26, 2011 at 15:03, Robert P. J. Day <rpjday at crashcourse.ca> wrote:
>
> a little more googling suggests that i'm not the only person who's
> run into this issue:
>
> http://ubuntuforums.org/showthread.php?t=1599686
>
> the symptoms described there are *exactly* what i'm seeing -- the
> output file consisting of a single byte. so can anyone else try a
> simple tesseract invocation on a trivial .tif file and verify whether
> or not they get actual output?
>
> rday
>
> --
>
> ========================================================================
> Robert P. J. Day Waterloo, Ontario, CANADA
> http://crashcourse.ca
>
> Twitter: http://twitter.com/rpjday
> LinkedIn: http://ca.linkedin.com/in/rpjday
> ========================================================================
>
More information about the ubuntu-users mailing list