Scripting / one liner help

Hal Burgiss hal at burgiss.net
Wed Aug 10 21:52:40 UTC 2011


On Wed, Aug 10, 2011 at 3:00 PM, Johnny Rosenberg <gurus.knugum at gmail.com> wrote:

> 2011/8/10 Hal Burgiss <hal at burgiss.net>:
> >
> > See if this gets close to extracting the image names ...
> > grep SRC *html | sed -r 's/SRC="([^"]+)"/\1/ig' | whatever_script.sh
>
> I didn't create this thread, but can you please explain that sed
> statement? I don't get it… (I'm not a beginner with regular
> expressions but I'm definitely not an expert either…)
>
>
It's attempting to capture the string in between:

 SRC="  and the next double quote: ".  The [^"]+ matches one or more
characters that are NOT a double quote, so the capture stops at the next
double quote. With something like .+ instead, the expression could get
"greedy" and run on to the last double quote on the same line. [^"]+
stops that effect. The \1 is a reference back to the capture that is in
the parentheses, in sed syntax. The substitution replaces the whole match
(SRC="..." including the quotes) with just the captured characters. Text
outside the match stays on the line, so this only gets close to a bare
list of image names. Does that make sense?
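
To make that concrete, here is a small sketch you can paste into a shell.
It assumes GNU sed and GNU grep (for -r, the i flag on s///, and grep -o).
The sample HTML line and the variable names are made up for illustration
and are not from the original thread. It compares the [^"]+ version with a
greedy .+ version, and shows one possible way to print only the names.

```shell
# Hypothetical sample line (not from the thread).
line='<p><IMG SRC="images/cat.png" ALT="cat"></p>'

# The substitution from the thread: SRC="..." is replaced by the capture.
replaced=$(printf '%s\n' "$line" | sed -r 's/SRC="([^"]+)"/\1/ig')
# replaced is: <p><IMG images/cat.png ALT="cat"></p>

# Greedy variant for comparison: .+ runs on to the LAST double quote.
greedy=$(printf '%s\n' "$line" | sed -r 's/SRC="(.+)"/\1/ig')
# greedy is: <p><IMG images/cat.png" ALT="cat></p>

# One way to keep only the names: grep -o prints just the matched
# SRC="..." pieces, then sed strips the SRC=" wrapper and closing quote.
only_names=$(printf '%s\n' "$line" \
  | grep -oi 'SRC="[^"]*"' \
  | sed -r 's/SRC="([^"]+)"/\1/i')
# only_names is: images/cat.png

printf '%s\n' "$replaced" "$greedy" "$only_names"
```

The grep -o step is just one option. The point is that sed's s/// only
rewrites the part of the line that matched, so something else has to throw
away the surrounding text.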

-- 
Hal