Scripting / one liner help
hal at burgiss.net
Wed Aug 10 21:52:40 UTC 2011
On Wed, Aug 10, 2011 at 3:00 PM, Johnny Rosenberg <gurus.knugum at gmail.com> wrote:
> 2011/8/10 Hal Burgiss <hal at burgiss.net>:
> > See if this gets close to extracting the image names ...
> > grep SRC *html | sed -r 's/SRC="([^"]+)"/\1/ig' | whatever_script.sh
> I didn't create this thread, but can you please explain that sed
> statement? I don't get it… (I'm not a beginner with regular
> expressions but I'm definitely not an expert either…)
It's attempting to capture the string between SRC=" and the next double
quote. The [^"] matches any character that is NOT a double quote, so the
capture stops at the next double quote. A looser pattern such as .+ would
be "greedy" and could run on to later double quotes on the same line; the
negated class prevents that. The \1 is sed's back-reference to whatever
was captured inside the parentheses, so the substitution keeps the
captured characters and drops the SRC=" and the closing quote around
them. Anything on the line outside the match is left as it was. Does that
make sense?
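
To see the difference, here is a rough sketch that runs the same
substitution next to a greedy variant. The sample line is made up for
illustration, and GNU sed is assumed (for -r and the case-insensitive
"i" flag used in the original command):

```shell
# Hypothetical sample line with two images, invented for this example.
line='<p><IMG SRC="a.png" ALT="first"> <img src="b.png" alt="second"></p>'

# Negated class: each match ends at the first double quote after SRC=".
careful=$(printf '%s\n' "$line" | sed -r 's/SRC="([^"]+)"/\1/ig')

# Greedy .+ for comparison: the match runs on to the LAST double quote.
greedy=$(printf '%s\n' "$line" | sed -r 's/SRC="(.+)"/\1/ig')

printf 'careful: %s\n' "$careful"
printf 'greedy:  %s\n' "$greedy"
```

In the "careful" result both SRC="..." attributes are reduced to the bare
file names and the ALT attributes are untouched. In the "greedy" result
only the first SRC=" and the very last double quote on the line are
removed, so everything in between is swallowed into one capture.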