Having trouble finding a word in multiple files
Karl Auer
kauer at biplane.com.au
Mon Jun 15 12:48:28 UTC 2020
On Mon, 2020-06-15 at 13:51 +0200, Liam Proven wrote:
> Again, no. In my considered opinion, trying to attack this problem
> with conversions is completely the wrong approach and will not bring
> any good satisfying resolution, ever, under any circumstances.
> [...]
> > Two tools mentioned, AbiWord and LibreOffice - allow doc files to
> > be converted to other formats which can then be searched.
> Once again for the gallery: *conversion is not the answer here.*
Hm. Horses for courses, I think.
If someone has 12GB of DOCX documents that they need to search once or
twice a year, it doesn't matter if it takes an hour or two to get a
result. If they search them daily, storing converted versions might
well be an excellent choice, or an indexing tool or or or...
There are hundreds of ways to achieve this, and the method that is "best"
depends on the value of the result, the value of the search speed, and
the cost of the resources needed - including the cost of setup and
maintenance of the solution in some cases.
For this particular problem, conversion is definitely a possible
solution, as is decompression and text search. If the search is
something that needs repeating, that speaks for conversion as well.
Converting every time would indeed be a bit silly, but not if it really
was a once-off search. The converted files need only be text and text
compresses extremely well, if the storage cost is a concern. The
compressed converted texts could also be stored locally for added
search speed.
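
To make the "decompression and text search" option concrete, here is a
minimal sketch in Python. It relies on the fact that a DOCX file is a ZIP
archive whose main body lives in word/document.xml. The function names
(docx_text, search) are made up for this illustration, and the tag
stripping is deliberately crude: it ignores headers, footnotes and other
parts of the archive, so treat it as a starting point rather than a
finished tool.

```python
import html
import re
import zipfile
from pathlib import Path

# Matches any XML tag; used to strip markup from the document body.
TAG = re.compile(r"<[^>]+>")


def docx_text(path):
    """Return rough plain text for one DOCX file (hypothetical helper)."""
    with zipfile.ZipFile(path) as zf:
        xml = zf.read("word/document.xml").decode("utf-8", errors="replace")
    # End each paragraph with a newline, then drop all remaining tags.
    # Tags are removed without inserting spaces because Word often splits
    # a single word across several runs.
    xml = xml.replace("</w:p>", "\n")
    return html.unescape(TAG.sub("", xml))


def search(root, word):
    """Return the DOCX files under root whose text contains word."""
    needle = word.lower()
    hits = []
    for path in sorted(Path(root).rglob("*.docx")):
        try:
            text = docx_text(path)
        except (zipfile.BadZipFile, KeyError):
            # Not a valid ZIP, or no word/document.xml inside: skip it.
            continue
        if needle in text.lower():
            hits.append(path)
    return hits
```

Calling search("/path/to/docs", "invoice") returns the matching paths.
For repeated searches, the output of docx_text could be written out once
(compressed with gzip if space matters) and searched instead of the
originals.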
If the search needs to be more sophisticated, faster, more generalised,
or whatever, then maybe more thought will be needed. BTW what lets your
Mac find things quickly is an index. It sure as shooting isn't
searching the original source texts if it gets through 12GB of remote
files in seconds.
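
The idea behind an index can be shown in a few lines of Python. This is a
toy inverted index, not a description of how any particular desktop search
tool works: it maps each word to the set of files containing it, so a
lookup never touches the original texts. The names build_index and lookup
are invented for this example, and the input is assumed to be already
converted plain text.

```python
import re
from collections import defaultdict

# A "word" here is simply a run of letters, digits or underscores.
WORD = re.compile(r"\w+")


def build_index(texts):
    """Map each lowercased word to the set of file names containing it.

    texts is a dict of {file name: plain text}.
    """
    index = defaultdict(set)
    for name, text in texts.items():
        for word in WORD.findall(text.lower()):
            index[word].add(name)
    return index


def lookup(index, word):
    """Return the sorted file names that contain word (case-insensitive)."""
    return sorted(index.get(word.lower(), ()))
```

Building the index costs one pass over all the text; after that each
lookup is a single dictionary access, which is why an indexed search can
answer in seconds where a full scan takes an hour.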
It's a rare solution that is "never, ever" correct, provided that it
does actually solve the stated problem.
Regards, K.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer at biplane.com.au)
http://www.biplane.com.au/kauer
http://twitter.com/kauer389
GPG fingerprint: 2561 E9EC D868 E73C 8AF1 49CF EE50 4B1D CCA1 5170
Old fingerprint: 8D08 9CAA 649A AFEF E862 062A 2E97 42D4 A2A0 616D