Simplest word search application for ubuntu 12.04
Robert Heller
heller at deepsoft.com
Mon Jan 5 21:23:57 UTC 2015
At Mon, 05 Jan 2015 15:40:46 -0500 dfp10 at verizon.net, "Ubuntu user technical support, not for general discussions" <ubuntu-users at lists.ubuntu.com> wrote:
>
> I want to search some very large PDF files for words and word-combinations.
> evince does not seem to do this.
> pdfedit works but is too complex and only handles one page at a time.
> Are there any recommendations?
pdftohtml (should be in poppler-utils or something like that) + grep.
pdftohtml will convert the PDFs to HTML file(s). HTML files are just plain
text files (with HTML tags), so grep can search them. So long as the words
you are looking for are not common HTML tag names (probably the only problems
would be 'body' or 'table'; most of the other HTML tag names are not typical
natural-language words), this should work.
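
A rough sketch of how that could look in a shell, not a tested recipe for
your particular files. The function names search_html and search_pdf are
made up for illustration. The sed step that strips tags is an extra I am
adding so that 'body' or 'table' only match real text; it assumes each tag
sits on a single line, which may not hold for every file. The -i, -noframes
and -stdout options are from the poppler pdftohtml man page; check
`man pdftohtml` on your release.

```shell
#!/bin/sh

# search_html PATTERN
# Reads HTML on stdin, strips anything that looks like <...> on a single
# line, then prints matching lines with line numbers (case-insensitive).
# Hypothetical helper, not part of poppler-utils.
search_html() {
    sed -e 's/<[^>]*>//g' | grep -n -i -- "$1"
}

# search_pdf PATTERN FILE.pdf
# Converts the PDF to one HTML stream on stdout (no frames, no images)
# and hands it to search_html. Needs pdftohtml to be installed.
search_pdf() {
    pdftohtml -i -noframes -stdout "$2" | search_html "$1"
}
```

After sourcing that, something like `search_pdf 'word combination' big.pdf`
would print the matching lines. Line numbers refer to the converted HTML,
not to PDF pages, and a phrase that is split across two lines in the
converted text will not be found.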
> Thanks
> Don Parsons
>
>
--
Robert Heller -- 978-544-6933
Deepwoods Software -- Custom Software Services
http://www.deepsoft.com/ -- Linux Administration Services
heller at deepsoft.com -- Webhosting Services