brainstorming for UDS-N - Performance - disk footprint

Colin Watson cjwatson at ubuntu.com
Thu Oct 14 16:45:50 BST 2010


On Mon, Oct 04, 2010 at 10:05:33AM -0700, Dustin Kirkland wrote:
> I like the idea of moving almost all documentation to the web.
> manpages.ubuntu.com has both an HTML rendering of every manpage, as
> well as the .gz original manpage.  The 'dman' utility can remotely
> retrieve manpages from m.u.c and display them on a console.  It could
> easily be enhanced to cache them locally in /var/cache, too.

As the man-db maintainer, this deeply concerns me.  Adding support for
transparent web retrieval is one thing; that would be kind of cool.
However, I think the typical sysadmin expectation is that 'man foo'
should give you answers quickly and without any messing about (e.g. on
firewalled machines), and I feel that retrieving pages from the web as
the default mode of operation is contrary to that.  For that matter, the
times when your network connection isn't working are often the times
when a local quick reference guide in the form of a manual page is
exactly what you need, and who's to say that the one you need would be
cached?

I looked in the nearest chroot I had to hand, which was a fresh
debootstrap of natty with build-essential and a few other
build-dependencies installed; it should be fairly close to the existing
minimal install.  In this chroot, /usr/share/man consumes slightly over
11MB, while the entire chroot consumes 405MB (I don't know how the
absolute numbers correspond to a server installation, but I expect that
the ratio is similar).  Surely this can't be the lowest-hanging fruit,
or even low-enough-hanging to justify the problems it would cause?
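
For anyone who wants to repeat this kind of measurement on their own
system, here is a rough sketch in Python.  The function names are made
up for illustration, and it sums apparent file sizes rather than
allocated disk blocks, so the totals will differ somewhat from what
'du' reports.

```python
import os


def tree_size(root):
    """Sum the apparent sizes, in bytes, of all non-directory entries under root.

    Symlinks are not followed; each one contributes only its own size.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # The figures quoted above: about 11MB of manual pages in a 405MB chroot.
    print(round(share_percent(11, 405), 1))  # prints 2.7
```

On a real system, share_percent(tree_size("/usr/share/man"),
tree_size("/")) gives the same ratio, although walking "/" also
descends into mounted filesystems, which you may want to exclude.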

Manual pages are a simple, effective, and quick form of reference, and
probably the best-translated form of documentation on our system with
the exception of the high-profile guides.  They excel at orienting
sysadmins, particularly those unfamiliar with the exact details of the
operating system at hand.  In my experience, server operating systems
that don't pay attention to making sure that manual pages always Just
Work pay the price for it in frustrated sysadmins.  I haven't heard
complaints about man not working properly out of the box in Debian or
Ubuntu since shortly after I took it over (just before then, it was a
frequent cause of vocal irritation), while I still hear friends
complaining about it on proprietary Unix systems from time to time, and
that's a point of pride for me.


We should celebrate manual pages, not push them off our installed system
as a space optimisation.  So absolutely, let's make it easier to get at
manual pages stored on the web from the command line, so that you can
easily read documentation for packages you don't have installed; if
somebody wants to allocate me some work time to work on my favourite
hobby project, I have no problem with that. :-)  If somebody has
sufficiently delicate space constraints that they have to 'rm -rf
/usr/share/man' and use the new dpkg filters support (thanks to Tollef
and Martin!) to make sure it stays gone, then we should support that
choice.  But I don't think manual pages should be anywhere near the top
of our list of things to remove from single-purpose server environments,
which still go wrong from time to time and need to be repaired.
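
As a rough illustration of the dpkg filter approach: dpkg reads option
files from /etc/dpkg/dpkg.cfg.d/, and a path-exclude line there tells it
not to unpack matching paths.  The file name below is hypothetical;
only the path-exclude option itself is dpkg's.

```
# /etc/dpkg/dpkg.cfg.d/01-no-manpages  (hypothetical file name)
# Skip manual pages when unpacking packages.
path-exclude=/usr/share/man/*
```

This only affects packages installed or upgraded afterwards; files
already on disk still have to be removed by hand.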

-- 
Colin Watson                                       [cjwatson at ubuntu.com]


