[ubuntu-uk] Document Storage

Ian Pascoe softy.lofty.ilp at btinternet.com
Sun Sep 16 21:21:48 BST 2007


Nice one mate - let me know how you get on with it.

How do you deal with things like statements that roll over onto multiple
pages?  Put another way, do you scan each individual page, save it, and
move on to the next one without attempting to link the images together,
either by filename or some other method, or are you relying on the text
document to provide the linkage?
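
To make the "link by filename" option concrete, here is a minimal Python
sketch. It assumes a purely hypothetical naming convention of
<document>_p<NN>.<ext> (for example bank-2007-08_p01.png); the pattern and
the group_pages helper are illustrative only and are not taken from
Dave's scripts.

```python
import re
from collections import defaultdict

# Hypothetical convention: "<document>_p<page number>.<extension>"
PAGE_PATTERN = re.compile(r"^(?P<doc>.+)_p(?P<page>\d+)\.(?P<ext>\w+)$")


def group_pages(filenames):
    """Group scanned page files by document name, ordered by page number."""
    groups = defaultdict(list)
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match:
            groups[match.group("doc")].append((int(match.group("page")), name))
        else:
            # No page suffix: treat the file as a single-page document.
            groups[name.rsplit(".", 1)[0]].append((1, name))
    # Sort each document's pages numerically and drop the page numbers.
    return {doc: [n for _, n in sorted(pages)] for doc, pages in groups.items()}
```

With names like these, a multi-page statement stays together without
needing the text file to record which images belong to it.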


-----Original Message-----
From: ubuntu-uk-bounces at lists.ubuntu.com
[mailto:ubuntu-uk-bounces at lists.ubuntu.com] On Behalf Of Dave Walker
Sent: 16 September 2007 15:56
To: British Ubuntu Talk
Subject: Re: [ubuntu-uk] Document Storage

On Sun, 2007-09-16 at 12:10 +0100, Ian Pascoe wrote:
> Morning Folks
> In the list's opinion which is the best way to store documents?
> In particular, as my own filing system is, well, non-existent, I was
> thinking about scanning all necessary documents and then storing them
> either to HD or CD / DVD.
> I've been trying to work out in my own mind what would be the better way
> to store these scanned documents that will maintain the clarity and be of
> minimal size.

Hi Ian,

I have been looking into the same possibility over the last few months.
Personally I would recommend PDF, as it is easily accessible and gives a
good file size at ~300dpi.

One method of retrieval I did consider was HTTP-based searching with the
scanned image OCR'd.  This would give $filename.pdf and $filename.txt, the
latter containing the 'best effort' of OCR.  The web interface could
search the txt files for keywords and link to the corresponding PDFs.
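
As a rough illustration of the search step, here is a minimal Python
sketch.  It assumes each scan is stored as a pair of files with the same
base name (name.pdf and name.txt) in one folder; the search_documents
function and that flat layout are assumptions for the example, not the
actual scripts, and the web front end is left out.

```python
from pathlib import Path


def search_documents(folder, keyword):
    """Return the PDFs whose OCR'd .txt companion mentions the keyword."""
    matches = []
    needle = keyword.lower()
    for txt_path in sorted(Path(folder).glob("*.txt")):
        # OCR output is only 'best effort', so tolerate undecodable bytes.
        text = txt_path.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            pdf_path = txt_path.with_suffix(".pdf")
            # Only report hits that actually have a matching PDF.
            if pdf_path.exists():
                matches.append(pdf_path)
    return matches
```

A web interface would then only need to call something like
search_documents("/path/to/scans", "statement") and turn the returned
paths into links.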

Currently I have automated the scan process producing a PDF and an OCR'd
txt file.  If I find time this week, I will try and work on it further -
and maybe create a project on Launchpad to host the source.

Kind Regards
Dave Walker
