OT maybe: Looking for a tool or . . .?

Michael Gustafson michaelrpg at gmail.com
Tue Feb 20 00:58:19 UTC 2007


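Try running this command:

    sed -e 's#<[^>]*>##g' oldfile > newfile

This will strip out all the HTML tags and leave the text between them.
Note that it removes only the tags, so the bold text itself stays in the
file. Also, sed works one line at a time, so a tag that is split across
two lines will not be matched.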
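
If the goal is what the original question describes (drop the bold text
entirely, keep the italic text but lose its formatting), a variation
along these lines might be closer. This is only a rough sketch: it
assumes the document uses lowercase <b> and <i> tags without attributes,
that each bold run starts and ends on the same line, and that no other
tags are nested inside a bold run. The sample line written to oldfile is
made up for illustration.

```shell
# Made-up sample input, only for demonstration.
printf '%s\n' '<p>Keep <b>drop this</b> and <i>keep this</i> text.</p>' > oldfile

# First expression: delete each <b>...</b> pair together with its text
# (assumes the pair sits on one line with no tags nested inside it).
# Second expression: delete <i> and </i> themselves, keeping the text.
sed -e 's#<b>[^<]*</b>##g' -e 's#</*i>##g' oldfile > newfile

cat newfile
```

If the file uses <strong>/<em>, uppercase tags, or bold runs that wrap
across lines, the patterns would need adjusting, and a real HTML parser
would probably be more reliable than sed.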

On 2/19/07, Patton Echols <p.echols at comcast.net> wrote:
>
> I'm a newbie to Linux (Ubuntu is my first functioning install; I tried a
> Red Hat install years ago and lost patience before it would function).
>
> I have a project where I want to do some text editing, but not
> manually.  The document is in HTML and has text in normal font, bold and
> italics, mixed in each paragraph and often in each sentence.  (Basically
> a "redline" version of a lengthy document.)  What I need to be able to do
> is strip out the text in bold and remove the italic formatting but leave
> that text.  I could do it manually, but where's the fun in that?
>
> My question:  I'm sure that there is either a Linux tool or a text editor
> with scripting ability that can do what I want.  And I'd love to RTFM,
> but I have no idea where to look.  Anyone have any pointers / ideas?  If
> not, is there a better forum to ask?
>
>
> Thanks