How can I extract sentences from text documents
Matt Morgan
minxmertzmomo at gmail.com
Thu Dec 22 21:20:14 UTC 2005
On 12/22/05, Mike Bird <mgb-ubuntu at yosemite.net> wrote:
> On Thu, 2005-12-22 at 11:09, Wade Smart wrote:
> > Ok, this may be totally impossible but, I have about 1800 documents that
> > have sentences inside [QUOTE] and sometimes [QUOTE] [QUOTE] or [QUOTE]
> > [/QUOTE]. I don't know how many lines each document has - maybe 8 to
> > 20k. Is there a way to copy all the sentences between the [QUOTE]
> > [QUOTE] or [QUOTE] [/QUOTE] to a new file?
>
> sed can probably do that, if the documents are text format rather
> than some word-processing format and depending upon line breaks
> and depending upon how you want the quotes to appear in the new
> file.
Can sed cross newlines? I thought it worked strictly within a single
line (here comes my confession: I've never read the whole book--only
the parts I needed at specific times :-)).
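
For what it's worth, sed does process input one line at a time by default, although its N command can append the next line to the pattern space. If that gets awkward, a short script that reads each file whole is another option. Below is a minimal sketch in Python, not a tested solution for these particular documents. It assumes the files are plain text and that every quote opens with [QUOTE] and closes with [/QUOTE]. The function name extract_quotes and the script name are made up for illustration. The [QUOTE] [QUOTE] case from the original question is not handled, because it is not clear from the description where such a quote ends.

```python
import re
import sys

# Assumption: each quote opens with [QUOTE] and closes with [/QUOTE].
# re.DOTALL lets "." match newlines, so a quote may span several lines.
# The non-greedy ".*?" stops at the first closing tag it finds.
QUOTE_RE = re.compile(r"\[QUOTE\](.*?)\[/QUOTE\]", re.DOTALL)


def extract_quotes(text):
    """Return the text found between each [QUOTE] ... [/QUOTE] pair."""
    return [match.strip() for match in QUOTE_RE.findall(text)]


if __name__ == "__main__":
    # Hypothetical usage:
    #   python extract_quotes.py doc1.txt doc2.txt > quotes.txt
    for path in sys.argv[1:]:
        # errors="replace" keeps going if a file has odd byte sequences.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for quote in extract_quotes(handle.read()):
                print(quote)
                print()  # blank line between quotes
```

Reading a whole file into memory is fine at 8 to 20k lines per document. Nested quotes are not treated specially: the match simply ends at the first [/QUOTE] after an opening tag.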