how to find out dead links
Loïc Grenié
loic.grenie at gmail.com
Sat Nov 14 15:31:49 UTC 2009
2009/11/14 Derek Broughton <derek at pointerstop.ca>:
> Loïc Grenié wrote:
>
>> 2009/11/14 Eugeneapolinary Ju <eugeneapolinary81 at yahoo.com>:
>>> wget -r -p -U Firefox "http://www.somesite.com/" 2>&1 | grep 404 > 404.txt
>>>
>>>
>>> Why is 404.txt 0 bytes? How do I send wget's STDOUT to a file?
>>
>> Have you tried
>>
>> wget -r -p -U Firefox "http://www.somesite.com/"
>>
>> There is no 404 message (at least here). To be more precise, there is
>> no 404 message because there is no web server that can output the
>> 404 message. A web page can fail for (at least) three different reasons:
>
> I imagine that "somesite.com" was an example, likely because his actual site
> isn't accessible to the Internet.
>
> The real problem is:
>
> $ wget http://localhost/test.htm
> --2009-11-14 10:43:23-- http://localhost/test.htm
> Resolving localhost... 127.0.0.1, ::1
> Connecting to localhost|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2009-11-14 10:43:23 ERROR 404: Not Found.
>
>
> In this case, 404 is ONLY a status, and not a page.
Of course, but that status is still delivered by a web server.
We need a better understanding of what the original poster
wants: to detect nonexistent sites, nonexistent pages
on an existing site, or both.
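
In case it helps, here is a rough sketch of how the two cases could be
told apart from a saved wget log. It assumes the log layout shown in the
session quoted above (a "--timestamp-- URL" line per request, then status
lines) and wget's English messages; other versions, locales, or -nv output
may differ. The function name classify_wget_log and the labels
"missing-page" and "unreachable-site" are made up for illustration.

```shell
# Hypothetical helper: read a wget log on stdin and print one line per
# failed request, labelled by the kind of failure.
classify_wget_log() {
    awk '
        # A line starting with "--" opens a new request; its last field is the URL.
        /^--/ { url = $NF }

        # The server answered, but the page does not exist.
        /ERROR 404/ { kind = "missing-page" }

        # No server answered at all (DNS failure or refused connection).
        /unable to resolve host address|failed: Connection refused/ {
            kind = "unreachable-site"
        }

        # Report each (kind, URL) pair once, then reset for the next line.
        kind != "" {
            if (!seen[kind " " url]++) print kind, url
            kind = ""
        }
    '
}
```

It could then be fed from the original command, since wget writes its log
to stderr:

    wget -r -p -U Firefox "http://www.somesite.com/" 2>&1 | classify_wget_log > 404.txt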
Loïc
More information about the ubuntu-users mailing list