OCR in Linux
Pia
pmikeal at comcast.net
Mon Jan 11 05:58:18 GMT 2010
Wow, awesome! I wonder why it doesn't show up in Lenny when I search
for it. I guess only one person meant tesseract, then. Thank you very
much for your explanation; now I know there is a deb package. I
just have to figure out why I am not seeing it in my version of the distro,
and maybe download the package from a newer release. Thanks again for
the good details.
On Sun, 10 Jan 2010, Kenny Hitt wrote:
> Hi. I've been using ocropus for at least 2 years. I built the first release from source, but have used Debian packages ever since.
> Not sure why it isn't part of Lenny, but it is definitely part of Sid.
>
> kenny@blackbox:~$ apt-cache search ocropus
> ocropus - document analysis and OCR system
> ocropus-data - document analysis and OCR system --- data files
> kenny@blackbox:~$ apt-cache show ocropus
> Package: ocropus
> Status: install ok installed
> Priority: optional
> Section: graphics
> Installed-Size: 3732
> Maintainer: Jeffrey Ratcliffe <jeffrey.ratcliffe at gmail.com>
> Architecture: i386
> Version: 0.3.1-2
> Depends: libc6 (>= 2.4), libgcc1 (>= 1:4.1.1), libiulib0, libjpeg62, liblua5.1-0, libpng12-0 (>= 1.2.13-4), libstdc++6 (>= 4.1.1), libtiff4, zlib1g (>= 1:1.1.4), ocropus-data (= 0.3.1-2)
> Recommends: tesseract-ocr (>= 2.03-2)
> Breaks: ocrodjvu (<< 0.3)
> Description: document analysis and OCR system
> OCRopus(tm) is a state-of-the-art document analysis and Optical
> Character Recognition (OCR) system, featuring
> pluggable layout analysis, pluggable character recognition, statistical
> natural language modeling, and multi-lingual capabilities.
> .
> The OCRopus engine is based on two research projects: a high-performance
> handwriting recognizer developed in the mid-90's and deployed by the US Census
> bureau, and novel high-performance layout analysis methods.
> .
> OCRopus development is sponsored by Google and is initially intended for
> high-throughput, high-volume document conversion efforts. It
> will also be an excellent OCR system for many other applications.
> Homepage: http://code.google.com/p/ocropus/
>
> I knew you were talking about ocropus and not tesseract.
>
> Kenny
>
> On Sun, Jan 10, 2010 at 10:56:50PM -0500, pmikeal at comcast.net wrote:
>>> Hi. It's part of Debian. I am running Sid, but it's been a Debian package for a few years.
>>
>>
>> Huh? I thought ocropus was only created through Google Summer of Code not
>> long ago. You and others must have thought I meant tesseract, which is
>> not what the email I replied to was about: you are the second person to
>> email me about tesseract instead of ocropus in response to my ocropus
>> query here. I am even more sure the other person meant tesseract, because
>> he actually included a link to it. I appreciate everyone's kindness in
>> answering questions, but please read the message being replied to
>> before replying. Thanks.
>>
>>
>> --
>> Ubuntu-accessibility mailing list
>> Ubuntu-accessibility at lists.ubuntu.com
>> https://lists.ubuntu.com/mailman/listinfo/ubuntu-accessibility