Google search

Todd Slater dontodd at gmail.com
Thu Oct 6 16:22:24 UTC 2005


On 10/6/05, Rajiv Vyas <rajiv1 at gmail.com> wrote:
> On 10/6/05, Todd Slater <dontodd at gmail.com> wrote:
>
> > On 10/6/05, Rajiv Vyas <rajiv1 at gmail.com> wrote:
> > >
> > > > Would the RSS or Atom feed of the search results help you any?
> > > >
> > >
> > >
> > > Well, what I am looking for is "how many" news articles are searchable
> > > today. For example, on news.google.com, if you type "linux Ubuntu" there
> > > are 106 articles and for "Linux SuSE" there are 546.
> > >
> > > The goal is to see traction in media for open source, Ubuntu, etc. over
> > > a period of one year and derive some conclusions.
> >
> > Right, so maybe you could pull the RSS/Atom feed with wget and do
> > some processing on it (like count the occurrences of <ITEM></ITEM> or
> > whatever for each term you're searching for).
> >
> > Or do the search via wget and grep for the # of search results.
>
>
> That might help, since I don't need to read those stories. I am not sure
> though if news.google.com allows you to wget information. Also, how much
> programming would this involve, as I am not a programmer?

You have to specify a user agent (e.g. wget -U Mozilla); however, when I
do that, the page that comes back doesn't contain the "Results 1-10 of
about 1,000,000" line. Looks like you might need to resort to a screen
scraper.
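
For what it's worth, here is a rough sketch of both ideas in Python
rather than wget/grep: counting the items in a feed, and pulling the
total out of a "Results 1-10 of about N" line if the page has one. The
function names are made up for illustration, the feed layout and the
wording of the results line are assumptions, and a feed normally lists
only the most recent items, so the item count is probably not the same
as the total number of searchable articles.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET


def fetch(url: str, user_agent: str = "Mozilla/5.0") -> str:
    """Download a page while sending a browser-like User-Agent header.

    This mirrors `wget -U Mozilla`; whether a given site answers such
    requests, or permits them, is not something this sketch can promise.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def count_feed_items(feed_xml: str) -> int:
    """Count <item> (RSS) or <entry> (Atom) elements in a feed document."""
    root = ET.fromstring(feed_xml)
    count = 0
    for element in root.iter():
        # Drop any XML namespace prefix such as "{http://www.w3.org/2005/Atom}".
        local_name = element.tag.rsplit("}", 1)[-1].lower()
        if local_name in ("item", "entry"):
            count += 1
    return count


# Assumed wording: "... of about 1,000,000 ..."; adjust if the page differs.
RESULTS_RE = re.compile(r"of\s+about\s+([\d,]+)", re.IGNORECASE)


def parse_result_total(page_text: str):
    """Return the number after "of about" as an int, or None if absent."""
    match = RESULTS_RE.search(page_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

You would call fetch() once per search term, pass the text to one of
the two helpers, and log the number with the date. Run daily from cron,
that would build up the year of data points Rajiv is after.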

Good luck,

Todd



