Searching help.ubuntu.com does not work anymore..
Dustin Kirkland
dustin.kirkland at gmail.com
Sat Nov 22 17:23:45 UTC 2008
On Sat, Nov 22, 2008 at 7:44 AM, Marko Oreskovic <markoresko at gmail.com> wrote:
> I wouldn`t call it a flame but if you say so, then..
http://en.wikipedia.org/wiki/Flaming_(internet)
> Google page inside Ubuntu site is Soo ugly to see..
> Also that search result picture on search results is second most uglier
> search results page. Google is prettier that that.
...
> Ok that is matter of opinion what is ugly for you and what is not.
> But consider that if I go to the ubuntu site
> and everything is Brown and every page is carefully made to be mostly
> nicely incorporated in site style.
> And suddenly I get amateur-like google output..
> If I wanted to get search results from google, I would go to google.com
> and type in "some search site:.help.ubuntu.com"
This was actually "constructive criticism", and I have addressed it accordingly.
I have modified the search results page style sheet to match that of
the rest of the site. The colors of links and such are now more
consistent with other parts of help.ubuntu.com.
For an example, see:
* https://help.ubuntu.com/search.html?cx=003883529982892832976%3Ae2vwumte3fq&cof=FORID%3A9&ie=UTF-8&q=code+of+conduct&sa=Search
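For anyone who wants to link to a search directly, the example URL above can be rebuilt from its query parameters. This is only a minimal sketch: the parameter names and values (cx, cof, ie, q, sa) are copied from that URL, while the function name build_search_url is made up for illustration.

```python
from urllib.parse import urlencode

# Base page and engine ID are taken from the example URL above.
SEARCH_PAGE = "https://help.ubuntu.com/search.html"
ENGINE_ID = "003883529982892832976:e2vwumte3fq"


def build_search_url(query):
    """Return a help.ubuntu.com search URL for the given query string.

    Hypothetical helper: it only mirrors the parameters visible in the
    example URL and does not claim to cover every option the page accepts.
    """
    params = {
        "cx": ENGINE_ID,    # custom search engine ID
        "cof": "FORID:9",   # copied as-is from the example URL
        "ie": "UTF-8",      # input encoding
        "q": query,         # the search terms
        "sa": "Search",     # submit button value
    }
    # urlencode percent-escapes ":" and turns spaces into "+".
    return SEARCH_PAGE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("code of conduct"))
```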
Other than that, I'm sorry, but I don't find the results page "ugly".
It's under the Ubuntu Documentation logo and header. It's dark text
on a light background. The color scheme is consistent. The results
are pertinent to the term searched.
> Google is privately owned company and their supporting or non-supporting
> of GNU operating systems, Including GNU/Linux is not an issue here.
To be precise, Google is a publicly traded company, listed on the
NASDAQ stock exchange:
* http://finance.google.com/finance?q=goog
> You could say that also, Canonical itself is "strong supporter of Linux,
> cross-platform utilities, and Ubuntu in general."
Whereas Canonical is a privately held company, owned by Mark Shuttleworth.
I should probably note that I'm employed by Canonical to develop the
Ubuntu Server. I am not, however, employed to work on the Ubuntu
Documentation Project. My time here is purely voluntary, and
completely in addition to my job responsibilities. I enjoy
proactively solving problems and closing technology gaps in our community.
I have volunteered my spare time to the Ubuntu Documentation project,
and have provided working code and solutions to Matt and his team. As
you've seen above, these solutions are evolving, in the interest of
addressing the issues.
> 3. Because many of those sites including google scripts
> have 2 displaying solutions:
> - display with no java script
...
> 4. Turning on java script un-selectively on all sites will kill
> my machine by using too much cpu time.
Again, I have taken your concern to heart. Some users will not have
JavaScript enabled or available.
I have sent Matt a patch that will display the help.ubuntu.com header,
with search box, consistently with and without JavaScript enabled.
Hopefully he can apply it soon...
If you have JavaScript enabled, your search results will appear within
the help.ubuntu.com search page:
* https://help.ubuntu.com/search.html
If you do not have JavaScript enabled, your search will post to a
Google page, and your search results will be available here, with the
documentation logo and stylesheet colors:
* http://www.google.com/cse?cx=003883529982892832976:e2vwumte3fq
> How big is the base of pages that should be search for keywords and
> article title search?
We're interested in searching 3 forms of documentation, unofficial
counts in parentheses:
* The Official documentation on help.ubuntu.com (~12,000 pages)
* The Community documentation on help.ubuntu.com/community (~3,000 pages)
* The Technical documentation on manpages.ubuntu.com (~160,000 pages)
These are currently hosted on separate servers. Thus, a web crawler
is required, rather than a local process that indexes filesystem data.
As I said, I've written 2 search engines before--one to index 350,000
manpages, and another to index 30 million lines of source code.
Presentation of the results is the easy part, and yes, it's trivial to
emulate the Google-looking results that you call "ugly".
The hard part is writing a process that regularly and incrementally
scans gigabytes or terabytes of data, and then slices that into
pertinent search terms, getting rid of less valuable words. Then comes
designing and loading a performance-optimized database with billions
of search terms, and hundreds of thousands (millions?) of pages, and
then applying value, prioritization, and ordering algorithms so that
you get "good" results, rather than just "any" results. Adding advanced
operators (and, or, not, etc.) is another layer of complexity. The
processing of incoming search requests and the batch indexing jobs may
require large clusters of computing power.
And that's a grossly simplified discussion of the problem.
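To make those steps a bit more concrete, here is a toy sketch of the core idea: split pages into terms, drop low-value words, build an inverted index, and answer and/or/not queries from it. It is nowhere near a real engine (no crawling, no ranking, no persistence, everything in memory), and the stop-word list and function names are assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, tiny stop-word list; a real engine would use a far larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "is"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(pages):
    """Map each term to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search_and(index, terms):
    """Pages containing every one of the terms."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, terms):
    """Pages containing at least one of the terms."""
    return set().union(*(index.get(t, set()) for t in terms))


def search_not(index, include, exclude):
    """Pages containing `include` but not `exclude`."""
    return index.get(include, set()) - index.get(exclude, set())
```

Even this much ignores incremental re-indexing and result ordering, which is where most of the real effort goes.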
For zero monetary cost, we are able to outsource all of that
processing of search requests and indexing of sites to a Google-hosted
Custom Search Engine.
> The most constructive solution is to use whatever search engine you
> want on server side and to represent results
> in community-friendly and ubuntu-friendly designed way that does not
> forces users to go for google. it is the same as you
> put "go to google for help" and link to google on the main page of
> help.ubuntu.com . It sounds much like: "Read the f* manual" type of
> answer.. ("Go to f* google for help...")
> We want to provide people with help, not getting them back to google..
> where they came from mostly..
We've gone far, far beyond just sending someone back to Google.
Google indexes some 20 billion webpages.
We have customized the scope of the search to only include Ubuntu
Official, Community, and Technical documentation, for the current
Ubuntu release. Furthermore, we prioritize the results, such that
Official >> Community >> Technical. And finally, we provide three
simple links, such that you can refine your search to *only* include
Official, or Community, or Technical documentation.
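The ordering and the three refinement links can be pictured with a small sketch like the one below. It does not show how the Custom Search Engine does this internally. It only illustrates the idea of putting Official results first, then Community, then Technical, with an optional filter. The classification by host and path is an assumption based on the three sites named in this message, and the function names are made up.

```python
from urllib.parse import urlparse

# Lower number sorts first: Official >> Community >> Technical.
PRIORITY = {"official": 0, "community": 1, "technical": 2}


def classify(url):
    """Guess which documentation set a URL belongs to (assumed mapping)."""
    parsed = urlparse(url)
    if parsed.netloc == "manpages.ubuntu.com":
        return "technical"
    if parsed.netloc == "help.ubuntu.com":
        if parsed.path.startswith("/community"):
            return "community"
        return "official"
    return None  # not part of the searched documentation


def prioritize(urls, only=None):
    """Order results by documentation set; optionally keep just one set."""
    labelled = [(classify(u), u) for u in urls]
    kept = [
        (kind, u)
        for kind, u in labelled
        if kind is not None and (only is None or kind == only)
    ]
    # sorted() is stable, so the original order survives within each set.
    return [u for kind, u in sorted(kept, key=lambda pair: PRIORITY[pair[0]])]
```

Calling prioritize(urls, only="community") corresponds to clicking the Community refinement link.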
Search for "dvd" on Google.com and you'll get 163 million results, and
I couldn't find anything about "Ubuntu" in the first 30 pages of
results.
Search for "dvd" on help.ubuntu.com, and all of the results are
Ubuntu-related documentation.
--
If you really think you have a better way of implementing a search
engine for help.ubuntu.com, I recommend that you create a Blueprint,
Project, and Team in Launchpad.net. Provide a clear design document
for the problem you're trying to solve, and how you're going to solve
it. Write the code yourself, or recruit a team of individuals to help
you. We have well-defined processes within the Ubuntu and Open Source
Communities for improving the tools we all use. But angry diatribes
are not generally productive.
:-Dustin