forum archive not accessible anonymously

Jordon Bedwell jordon at envygeeks.com
Sun Aug 7 12:45:01 UTC 2011


On 07/08/11 03:59, Alan Pope wrote:
> On 7 August 2011 08:36, Colin Law <clanlaw at googlemail.com> wrote:
>> On 2 August 2011 07:40, Mihamina de Rakotomandimby
>> <rakotomandimby at gmail.com> wrote:
>>> Hi all,
>>>
>>> When performing a Google search, I often get a "UbuntuForum" post as
>>> a result.
>>>
>>> Unfortunately, I get blocked because I must sign in to access the
>>> content.
>>>
>>> 1°) Would you know why?
> 
> Yes. Older posts are being blocked so that people don't stumble upon
> them. Many of the posts in the forums archive are old and in many ways
> just wrong. So rather than delete them, they're still there but only
> authenticated people can see them. The idea is that the forums admins
> block search engines so eventually they will forget this stuff exists
> and it won't show up in search results.
> 
>>> 2°) I think Google indexes it anonymously, so why can't I read it as
>>> anonymous?
>>
> 
> Google indexed it a long time ago. You have to go out of your way to
> make Google forget something.

Send a 410 (Gone) status when you detect Google or another bot on those
pages and the bot will drop them from its index quickly. Crawlers don't
joke around with 410s, mostly because most web software is too lazy to
distinguish between 404 and 410 (or just doesn't know the difference),
so when a bot sees a 410 it assumes you know what you're doing and does
what you ask. You should also consider adjusting your meta tags and
adding another:

<meta name="googlebot" content="noarchive, nosnippet">

Googlebot will pick up the rules in your generic robots meta tag too.
The googlebot tag is pointless on any page where you serve Googlebot a
410, but it's still worth adding if you plan on keeping things the way
they are now.
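
For what it's worth, here is a rough sketch of the 410 idea as a tiny
Python WSGI app. It is only an illustration, not how the forum software
actually works: BOT_TOKENS, is_crawler and archive_app are names I made
up, the token list is an assumed and incomplete sample, and matching on
User-Agent is easy to spoof, so treat it as a starting point.

```python
# Hypothetical names throughout: BOT_TOKENS, is_crawler and archive_app
# are illustrations, not part of any real forum software.
BOT_TOKENS = ("googlebot", "bingbot", "slurp")  # assumed, incomplete list


def is_crawler(user_agent):
    """Return True if the User-Agent string contains a known bot token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)


def archive_app(environ, start_response):
    """WSGI app: 410 Gone for crawlers, the normal page for everyone else."""
    if is_crawler(environ.get("HTTP_USER_AGENT")):
        body = b"Gone\n"
        start_response("410 Gone", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    # Everyone else still gets the page, carrying the googlebot meta tag
    # so a crawler that slips through is told not to cache or excerpt it.
    body = (b'<html><head><meta name="googlebot" '
            b'content="noarchive, nosnippet"></head>'
            b"<body>archived thread</body></html>")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally you could hand archive_app to the standard library's
wsgiref.simple_server.make_server and request a page with curl, once
with a normal User-Agent and once with one containing "Googlebot".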




More information about the ubuntu-users mailing list