Scripting / one liner help

Patton Echols p.echols at
Thu Aug 11 21:05:25 UTC 2011

On 08/10/2011 06:30 PM, Hal Burgiss wrote:
> On Wed, Aug 10, 2011 at 7:58 PM, Patton Echols <p.echols at> wrote:
>     On 08/10/2011 02:52 PM, Hal Burgiss wrote:
>         On Wed, Aug 10, 2011 at 3:00 PM, Johnny Rosenberg
>         <gurus.knugum at> wrote:
>            2011/8/10 Hal Burgiss <hal at>:
>         >
>         > See if this gets close to extracting the image names ...
>         > grep SRC *html | sed -r 's/SRC="([^"]+)"/\1/ig' |
>            I didn't create this thread, but can you please explain that
>            sed statement? I don't get it… (I'm not a beginner with regular
>            expressions but I'm definitely not an expert either…)
>     Thanks for the explanation Hal, unfortunately it is not doing the
>     "ignores the rest" part. It appears that it finds each occurrence
>     of a file name, then replaces it with the same occurrence, without
>     the " marks.
> Sorry something got left out, try ...
>   grep  SRC *html | sed -r 's/.*SRC="([^"]+)".*/\1/ig'
> -- 
> Hal

As mentioned in another post to this thread, I have a working solution.
So this is for info only.

The original source document has all the image tags on one line, with no
carriage return or newline, so the grep statement captures the whole
line.  The modified sed statement then outputs only the last image file
name, because the leading .* is greedy and matches everything up to the
final SRC= on the line.
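
For anyone following along, here is a minimal sketch of that behavior.
It assumes GNU sed, and the sample line and the file names a.png and
b.png are made up for illustration, not taken from my actual document:

```shell
# Hypothetical one-line input with two image tags (file names are made up).
line='<IMG SRC="a.png"><IMG SRC="b.png">'

# The leading .* is greedy, so it swallows everything up to the last SRC=
# on the line; only the final file name survives the substitution.
last_only=$(printf '%s\n' "$line" | sed -r 's/.*SRC="([^"]+)".*/\1/ig')

printf '%s\n' "$last_only"    # prints b.png
```

The first image, a.png, is lost because it falls inside the part matched
by the leading .* and is thrown away with it.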

Using the grep statement suggested by Johnny:

grep -io "<img[^>]\+>"

solves it because the -o option makes grep print each match on its own line, not the entire line.
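
A rough sketch of how the two pieces can fit together, assuming GNU grep
and GNU sed. The sample line, the file names, and the variable names are
invented for the example; this is not necessarily the exact pipeline I
ended up with:

```shell
# Hypothetical one-line input with two image tags (file names are made up).
line='<img src="a.png" alt="x"><img src="b.png">'

# grep -o prints each <img ...> tag on its own line, so sed then sees one
# tag per line and the greedy .* can no longer skip over earlier images.
names=$(printf '%s\n' "$line" \
  | grep -io '<img[^>]\+>' \
  | sed -r 's/.*src="([^"]+)".*/\1/i')

printf '%s\n' "$names"    # prints a.png, then b.png on the next line
```

With real files, the printf would be replaced by giving grep the file
names directly, for example *html as in the earlier commands.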

As I mentioned to Johnny, even though I don't understand all of this, the discussion is helping me learn, so I greatly appreciate it.

