Having trouble finding a word in multiple files
Robert Heller
heller at deepsoft.com
Sun Jun 14 11:22:10 UTC 2020
At Sun, 14 Jun 2020 11:34:32 +0100 "Ubuntu user technical support, not for general discussions" <ubuntu-users at lists.ubuntu.com> wrote:
>
> On Sun, Jun 14, 2020 at 05:17:57AM -0400, Pat Brown wrote:
> >
> > Unfortunately, none of those suggestions worked. Perhaps it's because
> > the files I'm searching are either .doc, .docx or .odt files. The
> > Dropbox folder is at the root of my home directory and it is an actual
> > folder that contains multiple folders under it.
> >
> Yes, well, why didn't you say to start with! :-)
>
> You can't (directly) search the above file types with grep; grep
> searches for strings in *text* files, or at least in files where the
> text you are looking for is stored 'as is'.
>
> .docx is definitely a compressed format so a tool for searching it
> will need to decompress the files (at the very least) before searching.
A .docx or .odt file is actually a Zip file, containing (amongst other things)
XML file(s), which are just text files. So a script that uses unzip might
work:
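#!/bin/bash
# $1 -- some grep expression
# $2 -- a .docx or .odt file
#
# No spaces around '=' in shell assignments
regexp=$1
document=$2
# Unpack the document into a scratch directory
tempname=$(mktemp -d)
unzip -qq "$document" -d "$tempname"
# Test grep's exit status directly: 0 means a match was found
if grep -r -q -e "$regexp" "$tempname"; then
    echo "$regexp is in $document"
else
    echo "$regexp is not in $document"
fi
rm -rf "$tempname"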
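Since the original question was about a whole folder of documents, something
along these lines might cover it. This is only a sketch: the function name
search_docs is made up for illustration, and it assumes find, unzip and GNU
grep are installed. Instead of unpacking to a temporary directory it uses
"unzip -p" to stream the archive contents straight into grep. Note that a word
processor may split a phrase across several XML tags, so a single word is more
likely to match than a long phrase.

```shell
#!/bin/bash
# search_docs PATTERN DIR  (hypothetical helper, not part of any package)
# Prints the name of every .docx or .odt file under DIR whose unzipped
# contents match PATTERN.
search_docs() {
    local pattern=$1 dir=$2
    # -print0 / read -d '' keep file names with spaces or newlines intact
    find "$dir" -type f \( -name '*.docx' -o -name '*.odt' \) -print0 |
    while IFS= read -r -d '' doc; do
        # unzip -p writes every archive member to stdout;
        # grep -a treats embedded binary data (images etc.) as text
        if unzip -p "$doc" 2>/dev/null | grep -q -a -e "$pattern"; then
            printf '%s\n' "$doc"
        fi
    done
}

# Only run when called with exactly two arguments: PATTERN and DIR
if [ "$#" -eq 2 ]; then
    search_docs "$1" "$2"
fi
```

Saved as, say, search_docs.sh (the name is arbitrary), it would be run as
"bash search_docs.sh someword ~/Dropbox". Old-style .doc files are not Zip
archives, so they are skipped here.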
>
> Isn't it possible to search multiple files with Libreoffice? That
> should manage the above file types.
>
>
> All of this is why I steer well clear of non-text ways of storing what
> is basically text. I use reStructuredText and/or Dokuwiki's own
> (text) markup language, both easy to search with grep.
As is LaTeX or Doxygen embedded in program sources (.h, .c, .tcl, etc.).
>
--
Robert Heller -- 978-544-6933 Cell: 413-658-7953 GV: 978-633-5364
Deepwoods Software -- Custom Software Services
http://www.deepsoft.com/ -- Linux Administration Services
heller at deepsoft.com -- Webhosting Services