Scripting / one liner help

Johnny Rosenberg gurus.knugum at
Wed Aug 10 19:00:54 UTC 2011

2011/8/10 Hal Burgiss <hal at>:
> On Wed, Aug 10, 2011 at 12:29 PM, Patton Echols <p.echols at>
> wrote:
>> I am looking for thoughts on how I might extract image names from an html
>> document.
>> The document started as a Word document with nothing but images, one per
>> page, randomly named.  It was saved as html using LibreOffice, so I now
>> have the images separate.  I have a script that will process them through
>> ImageMagick to clean them up, reduce them from full color to b/w and make them
>> into a pdf.  But the pages are out of order because the images are randomly
>> named.
>> What I'd like to do is have something read the html file in order and
>> either feed the names of the JPGs to the script in order or just spit them
>> out to a file that I can feed to the script.  The html source has all the
>> images listed sequentially without line breaks.  Each tag is the same except
>> for the image name and looks like this:
>> <IMG SRC="source_html_m1463afff.jpg" NAME="graphics3" ALIGN=BOTTOM
> See if this gets close to extracting the image names ...
> grep SRC *html | sed -r 's/SRC="([^"]+)"/\1/ig' |

I didn't create this thread, but can you please explain that sed
statement? I don't get it… (I'm not a beginner with regular
expressions but I'm definitely not an expert either…)

Kind regards

Johnny Rosenberg
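
For anyone else reading the archive with the same question, here is a rough
breakdown of the quoted sed expression, assuming GNU sed (the -r option and
the i flag are GNU extensions; -E is the more portable spelling of -r):

  s/PATTERN/REPLACEMENT/FLAGS   substitute PATTERN with REPLACEMENT
  SRC="                         the literal text SRC="
  ([^"]+)                       group 1: one or more characters that are not "
  "                             the closing quote
  \1                            replace the whole match with group 1 only
  i                             match case-insensitively (SRC, src, Src)
  g                             replace every match on the line, not just the first

So the command removes SRC=" and the closing quote but leaves the rest of each
tag in place, which fits the "gets close" wording in the quoted reply. The
sketch below runs the expression on a sample line and then shows one possible
way to get the bare names, one per line, using grep -o. The sample line and
the second image name are made up for illustration.

```shell
# Hypothetical sample input modelled on the tag quoted in the thread;
# the second image name (example_two.jpg) is invented for this sketch.
line='<IMG SRC="source_html_m1463afff.jpg" NAME="graphics3" ALIGN=BOTTOM><IMG SRC="example_two.jpg" NAME="graphics4" ALIGN=BOTTOM>'

# The quoted expression: drops SRC=" and the closing quote, keeps the name
# and everything else on the line.
stripped=$(printf '%s\n' "$line" | sed -r 's/SRC="([^"]+)"/\1/ig')
printf '%s\n' "$stripped"

# One possible variant (an assumption, not the command from the thread):
# grep -o prints each SRC="..." match on its own line, in document order,
# and sed then reduces each match to the captured file name.
names=$(printf '%s\n' "$line" | grep -oi 'SRC="[^"]*"' | sed -r 's/SRC="([^"]+)"/\1/i')
printf '%s\n' "$names"
```

Redirecting the second pipeline to a file would give a list of names in page
order that a conversion script could read line by line.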

> --
> Hal