How can I extract sentences from text documents?

Karl Hegbloom hegbloom at pdx.edu
Fri Dec 30 11:52:38 UTC 2005


On Thu, 2005-12-22 at 13:09 -0600, Wade Smart wrote:
> Ok, this may be totally impossible but, I have about 1800 documents that 
> have sentences inside [QUOTE] and sometimes [QUOTE] [QUOTE] or [QUOTE] 
> [/QUOTE]. I don't know how many lines each document has - maybe 8 to 
> 20k. Is there a way to copy all the sentences between the [QUOTE] 
> [QUOTE] or [QUOTE] [/QUOTE] to a new file?  
> 
> This is way beyond my knowledge but if someone knows how this is done, 
> if they would point me in the right direction - I would greatly 
> appreciate it.

Perl with the Text::DelimMatch module (package libtext-delimmatch-perl;
see man Text::DelimMatch) would probably do the trick.

TIMTOWTDI (there's more than one way to do it).
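
As one rough alternative, here is a minimal Python sketch. It does not use
Text::DelimMatch; it is a plain regular-expression scan. It assumes a quote
starts at [QUOTE] and ends at the next [QUOTE] or [/QUOTE], that quotes are
never nested, and that the tags may appear in any letter case. The function
names extract_quotes and write_quotes are made up for illustration.

```python
import re
from pathlib import Path

# Assumption: a quote opens with [QUOTE] and closes at the next [QUOTE]
# or [/QUOTE]. Nested quotes are NOT handled by this pattern.
QUOTE_RE = re.compile(r"\[QUOTE\](.*?)\[/?QUOTE\]", re.DOTALL | re.IGNORECASE)


def extract_quotes(text):
    """Return the stripped text of each quoted span, skipping empty ones."""
    quotes = []
    for match in QUOTE_RE.finditer(text):
        body = match.group(1).strip()
        if body:
            quotes.append(body)
    return quotes


def write_quotes(out_path, in_paths):
    """Append every quote found in in_paths to out_path, one per line."""
    with open(out_path, "a", encoding="utf-8") as out:
        for path in in_paths:
            # errors="replace" keeps going on files with odd encodings
            text = Path(path).read_text(encoding="utf-8", errors="replace")
            for quote in extract_quotes(text):
                out.write(quote + "\n")
```

Calling write_quotes("quotes.txt", ["doc1.txt", "doc2.txt"]) (hypothetical
file names) would collect the quotes from both documents into quotes.txt.
An opening tag with no closing tag after it is silently ignored, so it is
worth checking a few documents by hand before trusting the result on all 1800.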

-- 
Karl Hegbloom <hegbloom at pdx.edu>





More information about the ubuntu-users mailing list