Question about wget
prestonh at gmail.com
Mon Oct 25 16:01:31 UTC 2010
On Sat, Oct 23, 2010 at 8:53 PM, Anthony Papillion <papillion at gmail.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> Hello Everyone,
> I know this doesn't specifically have to do with Ubuntu but I hope
> someone here can offer their Kung-foo to help me solve this.
> I need to grab all images from a specific category on Craigslist (not
> used for spam but rather research). The category is located at
> When I browse the category with my browser, there are images there. Yet
> when I issue the command
> wget -r -l2 --no-parent -A.jpg http://tulsa.craigslist.com/m4w
> The result is what looks like the entire folder structure of the site
> but all in empty folders.
> Am I doing something wrong?
> Can anyone offer help on how I might structure this?
images with CSS, so there aren't "normal" image tags. Wget just sees
the text-only, non-CSS version of the site, so it never finds any
images to download. Take off the -A.jpg and look at the file that
gets downloaded and you will see what I am talking about.
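If you want to check this for yourself, here is a rough sketch of one way to do it: save the page without the -A.jpg filter and count the image tags in the saved file. The function name count_img_tags and the file name listing.html are made up for illustration, and the wget line is left as a comment so nothing is fetched unless you run it yourself.

```shell
# Hypothetical helper (name is illustrative): count <img ...> tags in a saved HTML file.
count_img_tags() {
    # grep -o prints each match on its own line; -i ignores case.
    # "|| true" keeps the pipeline from failing when there are no matches.
    { grep -o -i '<img[^>]*>' "$1" || true; } | wc -l | tr -d ' '
}

# Usage (not run here): save the page without the -A.jpg filter, then count.
#   wget -q -O listing.html http://tulsa.craigslist.com/m4w
#   count_img_tags listing.html
```

A count of 0 would be consistent with the page having no ordinary image tags for wget to follow, though this simple pattern only looks for literal img tags and says nothing about images referenced from stylesheets.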
That said, even if this is to be used "for research", you are
violating Section 7.u of the Craigslist terms of service:
and could owe them $3000 USD for each day you use your script per
Section 19.f of their Terms of Service.
I am not a lawyer, though, nor am I in any way affiliated with Craigslist.