Site Spider
Wulfy
wulfmann at tiscali.co.uk
Sun Jan 20 08:15:25 UTC 2008
[I sent this to Donn's private e-mail by mistake... sorry, Donn.]
Donn wrote:
> There is a gui that does this. It has a name so abysmal that I can't recall
> it...
>
> I used this script once a few years ago to fetch a website.
> It takes two parameters: url and level.
> The level is how far down a chain of links it should go.
> You could just replace the vars and run the command directly.
> ===
>
> #!/bin/bash
> # Try to make using wget easier than it bloody is.
> url=$1
> if [ -z "$url" ]; then echo "Bad url"; exit 1; fi
> LEV=$2
> if [ -z "$LEV" ]; then
>     LEV="2"
> fi
>
> echo "running: wget --convert-links -r -l$LEV $url -o log"
> wget --convert-links -r -l"$LEV" "$url" -o log
>
> ===
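>
> To use it, save it as, say, ~/bin/getsite, make it executable, and give
> it a URL and an optional depth:
>
> ===
> chmod +x ~/bin/getsite
> ~/bin/getsite http://www.example.com/ 3
> ===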
>
> man wget is the best plan really.
>
>
> \d
>
>
<sigh> I don't know what I'm doing wrong, but I can't get wget to fetch
more than the top layer of the site. On the archive.org site it just
brings in index.html (and robots.txt). I tried it on another site and it
brought in the two versions of the main page (dial-up and high-speed),
but the menu links weren't followed. I tried -l5 and -l15 and got the
same download.
Any idea why the -r isn't recursing?
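For reference, the command that ends up running is essentially this (the
URL here is just a stand-in for the real archive.org page):

===
wget --convert-links -r -l5 "http://www.archive.org/details/SOMEITEM" -o log
===

Since robots.txt is one of the two files it does fetch, I'm wondering
whether wget honouring robots.txt is what stops the recursion. If so, I'd
guess something like the following would be the test, though I haven't
confirmed it:

===
wget -e robots=off --convert-links -r -l5 "http://www.archive.org/details/SOMEITEM" -o log
===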
--
Blessings
Wulfmann
Wulf Credo:
Respect the elders. Teach the young. Co-operate with the pack.
Play when you can. Hunt when you must. Rest in between.
Share your affections. Voice your opinion. Leave your Mark.
Copyright July 17, 1988 by Del Goetz