wget problem

Wulfy wulfmann at tiscali.co.uk
Mon Jan 21 02:02:19 UTC 2008


Lou Katz wrote:
> On Sun, Jan 20, 2008 at 08:15:25AM +0000, Wulfy wrote:
>   
>> <sigh>  I don't know what I'm doing wrong, but I can't get wget to get
>> more than the top layer of the site.  The archive.org site just brings
>> in index.html (and robots.txt).  I tried it on another site and it
>> brought in the two versions of the main page (dialup and high speed) but
>> the menu links weren't followed.  I tried -l5 and -15 and got the same
>> download.
>>
>> Any idea why the -r isn't recursing?
>>     
>
> Have you used the underdocumented option to ignore robots.txt?
>
> put
>     robots = off
>
> in your .wgetrc, or use
>
>     -erobots=off
>
> on the command line.

It turns out the problem was with robots.txt.  I'll try your solution.  
Thanks, Lou!
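
For anyone finding this later, here is a minimal sketch of the two forms Lou describes. The URL is a placeholder and the depth of 5 is just the value I was trying; neither comes from a tested run. The script only prints the command and writes the setting to a scratch file, so nothing is downloaded and no real .wgetrc is touched.

```shell
#!/bin/sh
# Placeholder target (an assumption); substitute the site you want to mirror.
url="http://www.example.com/"

# Form 1: per-invocation. -e runs a .wgetrc-style command before the download.
# -r turns on recursion; -l 5 caps the depth (lowercase letter l, not the digit 1).
cmd="wget -r -l 5 -e robots=off $url"
echo "$cmd"

# Form 2: persistent. The same setting as a line in a wgetrc file.
# A scratch file stands in for ~/.wgetrc here so no real config is changed.
wgetrc="$(mktemp)"
printf 'robots = off\n' > "$wgetrc"
cat "$wgetrc"
```

To use the first form, run the printed command by hand. For the second, add the "robots = off" line to ~/.wgetrc. Either way wget will stop honouring the site's robots.txt, so keep the depth modest and be polite to the server.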

-- 
Blessings

Wulfmann

Wulf Credo:
Respect the elders. Teach the young. Co-operate with the pack.
Play when you can. Hunt when you must. Rest in between.
Share your affections. Voice your opinion. Leave your Mark.
Copyright July 17, 1988 by Del Goetz

More information about the kubuntu-users mailing list