Scripting / one liner help
p.echols at comcast.net
Thu Aug 11 21:05:25 UTC 2011
On 08/10/2011 06:30 PM, Hal Burgiss wrote:
> On Wed, Aug 10, 2011 at 7:58 PM, Patton Echols <p.echols at comcast.net> wrote:
> On 08/10/2011 02:52 PM, Hal Burgiss wrote:
> On Wed, Aug 10, 2011 at 3:00 PM, Johnny Rosenberg
> <gurus.knugum at gmail.com> wrote:
> 2011/8/10 Hal Burgiss <hal at burgiss.net>:
> > See if this gets close to extracting the image names ...
> > grep SRC *html | sed -r 's/SRC="([^"]+)"/\1/ig' |
> I didn't create this thread, but can you please explain
> that sed
> statement? I don't get it… (I'm not a beginner with regular
> expressions but I'm definitely not an expert either…)
> Thanks for the explanation Hal, unfortunately it is not doing the
> "ignores the rest" part. It appears that it finds each occurrence
> of a file name, then replaces it with the same occurrence, without
> the " marks.
> Sorry something got left out, try ...
> grep SRC *html | sed -r 's/.*SRC="([^"]+)".*/\1/ig'
As mentioned in another post to this thread, I have a working solution.
So this is for info only.
The original source document has all the image tags on one line w/o
carriage return or newline. So the grep statement captures the whole
line. Then the modified sed statement outputs only the last image file name.
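A small sketch of why only the last name survives, using a made-up sample line rather than the original document and assuming GNU sed: the leading .* in the pattern is greedy, so it consumes everything up to the final SRC="..." on the line, and only that last match is captured.

```shell
# Hypothetical one-line input with two image tags (not from the original file).
line='<IMG SRC="a.png"><IMG SRC="b.png">'

# The greedy leading .* swallows the first tag and everything else up to the
# last SRC="...", so the capture group only ever sees the final file name.
last_only=$(printf '%s\n' "$line" | sed -r 's/.*SRC="([^"]+)".*/\1/ig')

printf '%s\n' "$last_only"
```

The g flag does not help here, because the single match already covers the whole line.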
Using the grep statement suggested by Johnny:
grep -io "<img[^>]\+>"
solves it because grep is spitting out each match, not the entire line.
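As a rough illustration (again with a made-up sample line, and assuming GNU grep and GNU sed), the grep -o output can be piped into the earlier sed expression so that each tag arrives on its own line. This is only one possible combination, not necessarily the working solution mentioned above.

```shell
# Hypothetical one-line input with two image tags (not from the original file).
line='<IMG SRC="a.png"><IMG SRC="b.png">'

# grep -o prints each matching <img ...> tag on its own line, so the greedy
# sed pattern now has only one SRC="..." per line to work with.
names=$(printf '%s\n' "$line" \
  | grep -io '<img[^>]\+>' \
  | sed -r 's/.*SRC="([^"]+)".*/\1/i')

printf '%s\n' "$names"
```

With the sample line this should list a.png and b.png, one per line.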
As I mentioned to Johnny, even though I don't understand all of this, the discussion is helping me learn, so I greatly appreciate it.