How can I extract sentences from text documents
Karl Hegbloom
hegbloom at pdx.edu
Fri Dec 30 11:52:38 UTC 2005
On Thu, 2005-12-22 at 13:09 -0600, Wade Smart wrote:
> Ok, this may be totally impossible but, I have about 1800 documents that
> have sentences inside [QUOTE] and sometimes [QUOTE] [QUOTE] or [QUOTE]
> [/QUOTE]. I don't know how many lines each document has - maybe 8 to
> 20k. Is there a way to copy all the sentences between the [QUOTE]
> [QUOTE] or [QUOTE] [/QUOTE] to a new file?
>
> This is way beyond my knowledge but if someone knows how this is done,
> if they would point me in the right direction - I would greatly
> appreciate it.
Perl with libtext-delimmatch-perl (man Text::DelimMatch) would probably
do the trick.
TIMTOWTDI (there's more than one way to do it).
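As one of those other ways, here is a rough sketch in Python rather than Perl. It does not use Text::DelimMatch; it is a plain regex scan with a depth counter. It assumes every quote you want is closed with [/QUOTE], that tags may be nested, and that upper/lower case does not matter. The function name extract_quotes is made up for this example, and the "[QUOTE] ... [QUOTE]" case (no closing tag) is not handled.

```python
import re
import sys

# Matches either [QUOTE] or [/QUOTE]; group(1) is "/" for a closing tag.
TAG = re.compile(r"\[(/?)QUOTE\]", re.IGNORECASE)


def extract_quotes(text):
    """Return the text inside each outermost [QUOTE]...[/QUOTE] pair."""
    quotes = []
    depth = 0      # how many [QUOTE] tags are currently open
    start = None   # index just past the outermost opening tag
    for m in TAG.finditer(text):
        if m.group(1) == "":          # opening tag
            if depth == 0:
                start = m.end()
            depth += 1
        elif depth > 0:               # closing tag with a matching opener
            depth -= 1
            if depth == 0:
                quotes.append(text[start:m.start()].strip())
        # a stray [/QUOTE] with nothing open is ignored
    return quotes


if __name__ == "__main__":
    # Print every quoted span found in the files named on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as fh:
            for quote in extract_quotes(fh.read()):
                print(quote)
                print()
```

Saved as, say, quotes.py (a hypothetical name), it could be run as "python3 quotes.py *.txt > all-quotes.txt" to collect everything into one new file. Nested quotes come out as part of the outer quote, inner tags included; an unclosed [QUOTE] yields nothing.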
--
Karl Hegbloom <hegbloom at pdx.edu>