Ubuntu boot speed fall in Hardy

Mon May 12 21:12:28 UTC 2008

2008/5/12 Scott James Remnant <scott at canonical.com>:
> On Sun, 2008-05-11 at 18:00 +0200, Wouter Stomp wrote:
>
>  > Quoting one of the last comments from the brainstorm idea:
>  >
>  > "
>  > Hello everyone.
>  > I am the author of Google Summer of Code 2007 prefetching for Ubuntu.
>  >
>  > I did not get any feedback on prefetch project mailing list (or any
>  > other way), so I thought it is not used, and did not have motivation
>  > to further work on it. And then I have come across this site :)
>  >
>  How weird,
>
>  The author has certainly received many detailed questions from me, and
>  has simply not answered them.

Sorry about that. Your questions just got lost in my mailbox.
As for the lack of feedback, I meant feedback from users. The kernel
has been downloaded 280+ times and I got something like 2 reports from
users. I don't know whether it works or whether it crashes people's
machines.

>  Problem with prefetch is that it's quite a lot of code, in different
>  places, and zero documentation on how it works and which bit does what.

I agree the documentation needs to be improved. I will add a
description of the implementation to the project wiki. A comparison
with other prefetching solutions would also clear things up a bit.

I have added the presentation I gave at my university to the project
downloads; it contains some slides about implementation details at
the end. See: http://prefetch.googlecode.com/files/gsoc-prefetching-presentation.pdf

>  > > 1) Documentation.  A 1000ft overview explaining how prefetch works, what
>  > >    it does and doesn't do, what the pieces are and what they do and how
>  > >    it compares (technically) to readahead.
>  >
>  > Many information is on wiki pages (http://code.google.com/p/prefetch/),
>  > but it currently lack such high-level overview.
>  >
>  I didn't find this very extensive, or explanatory.  When we reviewed it,
>  it didn't answer any of our questions about how prefetch worked.

In short, in comparison to readahead:
- Readahead works by tracing which files are used, but it works on
  whole files. Prefetch has finer resolution, as it works on pages.
- Readahead cannot do prefetching and profiling at the same time; a
  separate boot with profiling must be done.
- Readahead profiling is expensive (it uses inotify).
- Readahead needs manual intervention from the user to change the
  readahead list; prefetch adapts itself automatically.

>  For example, how does it determine which blocks need prefetching?

It monitors the page cache to see which pages are used by processes.

>  Where/how are these lists of blocks stored?

They are stored in the /prefetch directory as prefetch lists, one for
each traced app and one for each boot stage.
Each file contains a list of tuples (device, inode, start-in-pages,
length-in-pages) which describe what to prefetch.
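
For illustration only, one entry of such a list can be modelled as
below. This is a sketch of the tuple described above, not the actual
on-disk format (which is not specified here); the names PrefetchRange
and byte_span, the 4096-byte page size and the example values are all
assumptions.

```python
from typing import NamedTuple

PAGE_SIZE = 4096  # assumed page size in bytes; the real value is system-dependent


class PrefetchRange(NamedTuple):
    """One (device, inode, start-in-pages, length-in-pages) tuple.

    Hypothetical in-memory model; field names are illustrative.
    """
    device: int
    inode: int
    start_page: int
    length_pages: int

    def byte_span(self):
        """Return (byte offset, byte length) covered by this range."""
        return self.start_page * PAGE_SIZE, self.length_pages * PAGE_SIZE


# Made-up example: 16 pages of inode 12345, starting at page 4.
entry = PrefetchRange(device=0x801, inode=12345, start_page=4, length_pages=16)
offset, length = entry.byte_span()
```

Working in page ranges rather than whole files is what gives the
finer resolution mentioned in the comparison with readahead.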

>  What decides when to load blocks?

Blocks are loaded when an application starts (for application
prefetching) or when the appropriate boot script is started (for boot
prefetching).

>  What if the filesystem isn't mounted yet (/usr), how can the loading be
>  staged?

Boot prefetching is split into 3 phases: initial boot (with only root
mounted), boot with all partitions mounted, and GUI boot. Each stage
has a separate prefetching list.

>  Are the lists transferable between systems?

No, they contain inode numbers and these differ between systems.
If it is a matter of supplying a predefined list, it would be easy to
write a tool which converts paths to inodes upon first boot.
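
A minimal sketch of what such a conversion could look like, assuming
the predefined list is given as (path, start-in-pages,
length-in-pages) entries. No such tool is described in this thread;
the function name and the input format are hypothetical.

```python
import os


def paths_to_entries(path_ranges):
    """Convert (path, start_page, length_pages) entries into
    (device, inode, start_page, length_pages) tuples for this system.

    Hypothetical helper; the input format is an assumption.
    """
    entries = []
    for path, start_page, length_pages in path_ranges:
        try:
            st = os.stat(path)  # resolves the path on the local filesystem
        except OSError:
            continue  # file not present on this system, so skip it
        entries.append((st.st_dev, st.st_ino, start_page, length_pages))
    return entries
```

Because os.stat() is evaluated on the target machine, the resulting
device and inode numbers are valid only there, which is why the
conversion would have to happen on first boot.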

>  Could we use the lists to sort the LiveCD filesystem generation?

It depends on what you want to do with it. If you want to feed the
list to mksquashfs, it can be done. If you want to add a prefetching
list to the live CD, this would be harder, as inode numbers are
generated during generation of the SquashFS image.

>  Could we use the lists to sort the order in which we copy files during
>  the install?

You mean copying in such an order that, after booting from disk, the
system boots faster?
This is an interesting issue. The list contains page ranges and I am
not aware of any tool which allows specifying which ranges of files to
copy and when. The ext3 allocator would reorganize it anyway. IMO
running my reordering tool after copying would be simpler.

>  Is prefetching done in block order to minimise disk head movement?

The prefetch file is sorted in (device, inode, start) lexicographical
order, which should in general correspond to disk order. It could be
extended to take the block number into account, but I am not sure it
is necessary. The disk scheduler will sort disk requests anyway. And
if the reordering tool is run, the data will be in proper order on
disk and in large chunks, so requests will be merged.
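
As a rough illustration of that ordering, the sketch below sorts
entries by (device, inode, start) and then coalesces ranges of the
same file that touch or overlap. The coalescing step is an assumption
added to show the effect of merged requests; it is not taken from the
prefetch code, and the function name is hypothetical.

```python
def sort_and_merge(entries):
    """Sort (device, inode, start_page, length_pages) tuples
    lexicographically and merge touching or overlapping ranges
    that belong to the same (device, inode).
    """
    merged = []
    for dev, ino, start, length in sorted(entries):
        if merged:
            pdev, pino, pstart, plength = merged[-1]
            same_file = (pdev, pino) == (dev, ino)
            if same_file and start <= pstart + plength:
                # Extend the previous range to cover this one as well.
                end = max(pstart + plength, start + length)
                merged[-1] = (dev, ino, pstart, end - pstart)
                continue
        merged.append((dev, ino, start, length))
    return merged


# Made-up example values.
ranges = [(8, 20, 0, 4), (8, 10, 8, 4), (8, 10, 0, 8), (8, 10, 20, 2)]
ordered = sort_and_merge(ranges)
# ordered == [(8, 10, 0, 12), (8, 10, 20, 2), (8, 20, 0, 4)]
```

Note that sorting by inode number only approximates on-disk order;
taking the block number into account would need extra filesystem
information.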

>  How necessary is ext3 defrag to this working?

It is completely optional, but it speeds up boot further, because the
necessary files can be read in large chunks without head movement.

>  Do we still need readahead or preload with prefetch?

Readahead should not be used together with prefetch as it uses its own
prefetch lists. It could read unnecessary data and spoil performance.

Preload has some heuristics to predict which programs will be run, so
this could be useful. But I don't know how it will behave (in terms of
performance) together with prefetch - prefetch for apps might think
preload is loading the files for itself and this could make prefetch
perform poorly.

HTH

-- 

	Krzysztof Lichota