scanner optimization and usage

Martin Laberge mlsoft at
Tue Aug 12 22:11:03 UTC 2008

On Tuesday 12 August 2008 12:56:15 chuck adams wrote:
> I am in the process of converting a number of high-end math
> and physics texts to PDF.  I want to be able to sit with a laptop
> and not have to run back and forth to the bookcases lugging
> heavy books and winding up with a pile of them on the floor.
> Using Kubuntu 8.04.1 with xsane and a Canon LiDE 25 scanner.
> I scan in pages at 300 DPI to PS files, i.e., ...
> I need the 300DPI (I think) to see clean crisp text and images
> at 400% zoom factors.  
> After I generate the book as many PS files, I use a script, page_txt,
> containing the line
> gs -sPAPERSIZE=a4 -sDEVICE=pnmraw -r300 -dNOPAUSE -dBATCH -sOutputFile=- -q \
> "$1" | ocrad > "$(basename "$1" .ps).txt"
> and then, from the command line in that directory, I run:
> for i in *.ps
> do
> page_txt "$i"
> done
> This gives me pages with the OCR text.  I tried gocr and tesseract and
> did not get as good results as with ocrad.  
> Then I run the following
> gs -q -dNOPAUSE -dBATCH  -sDEVICE=pdfwrite -sOutputFile=out.pdf *.ps
> to consolidate all the .ps files into one PDF file.  I can live with this.
> I can NFS mount several TB of disc space, so that is not an issue at
> this time.  :-)
> Is there a way to further compress the file sizes at any point and still
> not lose the desired resolution?  Using only software that comes with
> Kubuntu or available from the Kubuntu repos.  Inquiring minds want
> to know.
> I may have reinvented the wheel or gone about this all wrong, but
> education is expensive no matter how you get it.
> Thanks in advance,
> chuck

gscan2pdf is a jewel for this task, and the results are super sharp at 300 dpi.

Take a few minutes to understand how it works; after that, you're at full speed.
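
If you would rather stay on the command line, one possible way to shrink
text-only pages without dropping below 300 dpi is to store them as 1-bit
CCITT Group 4 images. What follows is only a rough sketch, not something
tested against your scans. It assumes Ghostscript's tiffg4 device plus
tiffcp and tiff2pdf from the libtiff tools are installed. The function
names and book.tiff are made up for illustration, and the gs flags are
copied from your page_txt line. Grey or colour scans are reduced to pure
black and white on the way, so this suits pages scanned in lineart mode
and not pages with photographs.

```shell
#!/bin/sh
# Sketch only: convert each PostScript page to a 1-bit, G4-compressed
# TIFF at 300 dpi, then wrap all pages in a single PDF.
# Assumptions: gs, tiffcp and tiff2pdf are installed; tif_name,
# page_to_g4, build_pdf and book.tiff are hypothetical names.

# Name of the intermediate TIFF for a page, e.g. page01.ps -> page01.tif
tif_name() {
    printf '%s.tif\n' "$(basename "$1" .ps)"
}

# Render one page at 300 dpi as a bilevel CCITT G4 TIFF.
page_to_g4() {
    gs -q -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=tiffg4 -r300 \
       -sOutputFile="$(tif_name "$1")" "$1"
}

# Convert every .ps page in the current directory, join the TIFFs,
# and write them out as one PDF.
build_pdf() {
    for f in *.ps; do
        [ -e "$f" ] || continue    # no .ps files: the glob stays literal
        page_to_g4 "$f"
    done
    # book.tiff ends in .tiff so a rerun's *.tif glob does not pick it up
    tiffcp *.tif book.tiff && tiff2pdf -o out.pdf book.tiff
}
```

Running build_pdf in the directory holding the .ps files would leave
out.pdf next to them. Comparing its size and its look at 400% zoom with
the pdfwrite result tells you whether the trade is worth it for a given
book.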

Martin Laberge
mlsoft at
30 Years of Unix Admin, and still learning...

More information about the kubuntu-users mailing list