scanner optimization and usage

Jonas Norlander jonorland at
Tue Aug 12 19:26:43 UTC 2008

2008/8/12 chuck adams <k7qo at>:
> I am in the process of converting a number of high-end math
> and physics texts to PDF.  I want to be able to sit with a laptop
> and not have to run back and forth to the bookcases lugging
> heavy books and winding up with a pile of them on the floor.
> Using Kubuntu 8.04.1 with xsane and a Canon LiDE 25 scanner.
> I scan in pages at 300 DPI to PS files, i.e., ...
> I need the 300DPI (I think) to see clean crisp text and images
> at 400% zoom factors.
> After I generate the book as many PS files, I have a script page_txt
> with the line
> gs -sPAPERSIZE=a4 -sDEVICE=pnmraw -r300 -dNOPAUSE -dBATCH -sOutputFile=- -q \
> "$1" | ocrad > "$(basename "$1" .ps).txt"
> and then I run, from the command line in that directory:
> for i in *.ps
> do
> page_txt "$i"
> done
> This gives me pages with the OCR text.  I tried gocr and tesseract and
> did not get as good results as with ocrad.
> Then I run the following
> gs -q -dNOPAUSE -dBATCH  -sDEVICE=pdfwrite -sOutputFile=out.pdf *.ps
> to consolidate all the .ps files into one PDF file.  I can live with this.
> I can NFS mount several TB of disc space, so that is not an issue at
> this time.  :-)
> Is there a way to further compress the file sizes at any point and still
> not lose the desired resolution?  Using only software that comes with
> Kubuntu or available from the Kubuntu repos.  Inquiring minds want
> to know.
> I may have reinvented the wheel or gone about this all wrong, but
> education is expensive no matter how you get it.
> Thanks in advance,
> chuck


A good program for scanning books and magazines to PDF or DjVu is
gscan2pdf. Give it a try.
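
If you would rather keep the command-line route from your message, the
two steps could be wrapped up roughly like this. It is only a sketch: it
assumes gs and ocrad are installed and that the pages are *.ps files in
the current directory. The function name scan_to_pdf is made up for
illustration, and the gs options are the ones you already use.

```shell
#!/bin/sh
# Sketch only: OCR each PostScript page with ocrad, then merge all
# pages into out.pdf. scan_to_pdf is a hypothetical name; page_txt
# mirrors the one-line script from the quoted message.

# Render one .ps page to PNM at 300 DPI and pipe it through ocrad,
# writing the recognized text to <page>.txt in the current directory.
page_txt() {
    gs -q -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pnmraw -r300 \
       -sOutputFile=- "$1" | ocrad > "$(basename "$1" .ps).txt"
}

# OCR every .ps file in the current directory, then build out.pdf.
scan_to_pdf() {
    set -- *.ps
    # If the glob matched nothing, "$1" is the literal string "*.ps".
    [ -e "$1" ] || { echo "no .ps files found" >&2; return 1; }
    for page in "$@"; do
        page_txt "$page"
    done
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=out.pdf "$@"
}
```

Source the file and call scan_to_pdf from the directory holding the
pages. Quoting "$1" and "$page" keeps file names with spaces intact.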

/ Jonas

More information about the kubuntu-users mailing list