how to find out dead links

Loïc Grenié loic.grenie at gmail.com
Sat Nov 14 15:31:49 UTC 2009


2009/11/14 Derek Broughton <derek at pointerstop.ca>:
> Loïc Grenié wrote:
>
>> 2009/11/14 Eugeneapolinary Ju <eugeneapolinary81 at yahoo.com>:
>>> wget -r -p -U Firefox "http://www.somesite.com/" 2>&1 | grep 404 > 404.txt
>>>
>>>
>>> Why is 404.txt 0 bytes? How do I put the STDOUT in a file with wget?
>>
>>    Have you tried
>>
>> wget -r -p -U Firefox "http://www.somesite.com/"
>>
>>    There is no 404 message (at least here). To be more precise, there is
>>   no 404 message because there is no web server that can output the
>>   404 message. A web page can fail for (at least) three different reasons:
>
> I imagine that "somesite.com" was an example, likely because his actual site
> isn't accessible from the Internet.
>
> The real problem is:
>
> $ wget http://localhost/test.htm
> --2009-11-14 10:43:23--  http://localhost/test.htm
> Resolving localhost... 127.0.0.1, ::1
> Connecting to localhost|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2009-11-14 10:43:23 ERROR 404: Not Found.
>
>
> In this case, 404 is ONLY a status, and not a page.

   Of course, but the status is delivered by a web server.
  We need a better understanding of what the original poster
  wants: to detect non-existent sites, non-existent pages
  on an existing site, or both.
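
   As a rough sketch of how the two cases might be told apart, one
  could look at wget's exit status instead of grepping its output.
  This assumes a wget recent enough to document exit codes (1.12 or
  later, where 4 means network failure and 8 means the server sent an
  error response); the function names classify_wget_status and
  check_url are made up for this example.

```shell
#!/bin/sh
# Sketch only: assumes wget >= 1.12 exit codes (4 = network failure,
# 8 = server issued an error response). Older versions may differ.

# classify_wget_status CODE
# Hypothetical helper: maps a wget exit code to a short label.
classify_wget_status() {
    case "$1" in
        0) echo "ok" ;;
        4) echo "dead-site" ;;    # DNS lookup or connection failed
        8) echo "dead-page" ;;    # server replied with an error such as 404
        *) echo "other-error" ;;  # anything else (I/O, SSL, auth, ...)
    esac
}

# check_url URL
# Hypothetical helper: probes URL without saving it, prints "LABEL URL".
check_url() {
    wget --spider -q -U Firefox "$1"
    status=$?
    echo "$(classify_wget_status "$status") $1"
}
```

   With these loaded, "check_url http://localhost/test.htm" should
  print a "dead-page" line for the 404 case quoted above, and a
  "dead-site" line when no server answers at all.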

           Loïc




More information about the ubuntu-users mailing list