Scripting / one liner help [solved]

Patton Echols p.echols at comcast.net
Thu Aug 11 00:06:05 UTC 2011


On 08/10/2011 03:43 PM, Jordon Bedwell wrote:
> On Wed, August 10, 2011 2:52 pm, Hal Burgiss wrote:
>> It's attempting to capture the string in between:
>>
>> SRC="  and the next doublequote: ".  The [^"] stops the capture at the
>> double quote. The capture should then include any character that is NOT a
>> double quote. If not careful, the expression could get "greedy" and start
>> matching other double quotes on the same line.  This should stop that
>> effect. The \1 is a reference back to the capture that is in the
>> parentheses, in sed syntax, which essentially just preserves the captured
>> characters, and ignores the rest. Does that make sense?
> Because it should be:
>
> grep -iPo "<img[^>]+>" file.html | \
> sed -n 's/<img src=['\''"]\([^"'\'']*\).*/\1/pgI'
>
> [COPY AND PASTE BOTH LINES AT ONCE AND PRESS THE ENTER KEY]
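
A minimal sketch of the two points Hal describes above, the [^"] negated
class and the \1 back-reference. The sample line and the variable names
(line, greedy, bounded) are made up for illustration; they are not from the
thread, and the sed expressions are simplified versions rather than the
exact one-liner quoted above.

```shell
# Hypothetical sample line with two double-quoted attributes.
line='<img src="a.png" alt="logo">'

# Greedy: \(.*\) runs on to the LAST double quote on the line,
# so the capture swallows the alt attribute as well.
greedy=$(printf '%s\n' "$line" | sed -n 's/.*src="\(.*\)".*/\1/p')

# Negated class: \([^"]*\) can only hold non-quote characters,
# so the capture stops at the FIRST double quote after src=".
bounded=$(printf '%s\n' "$line" | sed -n 's/.*src="\([^"]*\)".*/\1/p')

# \1 in the replacement is whatever the \( ... \) group captured;
# the rest of the matched line is discarded.
printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

With this sample line the greedy version prints a.png" alt="logo, while the
bounded version prints just a.png.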

Thanks, that works great and solves the immediate problem.  For purposes
of my CLE (continuing Linux education) I hope you will indulge me in the
same question you posed to Hal.  How does it work?  I get the -i and -o
grep options.  Does -P enable Perl regex?  What part of the grep pattern
is the Perl part?
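
One way to poke at the -P question, offered as a sketch rather than a
definitive answer: as far as I can tell, nothing in <img[^>]+> is specific
to Perl regex, so the same pattern under -E (extended regex) should produce
the same matches. -P would start to matter with constructs such as \d or
lookahead. The sample input and the variable names (html, pcre, ere) are
invented here, and GNU grep is assumed for -o and -P.

```shell
# Invented sample input, not from the thread.
html='<p>intro</p><IMG SRC="a.png" alt="x"><img src="b.jpg">'

# -i: ignore case, -o: print only the matched text,
# -P: interpret the pattern as a Perl-compatible regex.
pcre=$(printf '%s\n' "$html" | grep -iPo '<img[^>]+>')

# Same pattern as an extended regex; + means "one or more" here too.
ere=$(printf '%s\n' "$html" | grep -iEo '<img[^>]+>')

# Each <img ...> tag comes out on its own line.
printf '%s\n' "$ere"

# Compare the two results.
if [ "$pcre" = "$ere" ]; then
    echo "same matches with -P and -E"
fi
```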

Then I also wonder how the sed statement works.  I am still trying to
figure out sed (and plain old regex).

Even if you don't have time for the follow up, I appreciate it.

-- PE




More information about the ubuntu-users mailing list