status performance and ... chdir

Robert Collins robertc at robertcollins.net
Thu Sep 11 01:39:18 BST 2008


On Wed, 2008-09-10 at 08:37 +0100, Paul LeoNerd Evans wrote:
> On Wed, 10 Sep 2008 15:52:11 +1000
> Robert Collins <robertc at robertcollins.net> wrote:
> 
> > The nice thing about experiments is you can be staggered by the result.
> > 
> > Executive summary: we should chdir. Lots.
> 
> Or maybe even open() and fchdir(). It doesn't help the first time, but it
> helps when you come "back" again; it avoids another directory name =>
> inode lookup in the kernel. Keep a stack of opened directory fds, rather
> than names.
> 
> Only real downside to this is it starts to eat up your fd limit - 1024
> usually. Shouldn't be a major issue except for -really- deep trees. At
> which point perhaps you can switch to names if you have a stack taller
> than, say, 200 dirs?

Well, a couple of things:
 - we don't want to process a given dir more than once - that's wasteful.
 - we want to process everything in a dir at once, because we've been 
   told that ext3 [arguably our most common deployed filesystem] tends 
   to put all the contents of a dir in the same block group
 - mozilla has 5770 directories; a simple cache would get thrashed :P
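
To make the first two points concrete, here is a minimal Python sketch of a stat pass that visits each directory exactly once, chdirs into it, and lstat()s everything there by its short relative name before moving on. It is an illustration under assumptions, not bzr's actual code: the name `stat_tree` and the returned dict layout are made up for this example.

```python
import os
import stat


def stat_tree(root):
    """Visit each directory under root once and lstat all of its entries.

    Returns a dict mapping paths relative to root to os.stat_result objects.
    (Hypothetical helper for illustration only.)
    """
    results = {}
    start = os.getcwd()
    root_abs = os.path.abspath(root)
    pending = [root_abs]  # directories still to be visited
    try:
        while pending:
            dirpath = pending.pop()
            os.chdir(dirpath)  # one chdir per directory
            for name in sorted(os.listdir(".")):
                # Short relative name: the kernel resolves one component,
                # not the whole path from the tree root.
                st = os.lstat(name)
                full = os.path.join(dirpath, name)
                results[os.path.relpath(full, root_abs)] = st
                # lstat does not follow symlinks, so a symlinked directory
                # is recorded but never descended into.
                if stat.S_ISDIR(st.st_mode):
                    pending.append(full)
    finally:
        os.chdir(start)  # always restore the caller's working directory
    return results
```

Every entry of a directory is statted in one batch, and no directory is entered twice, so a cache of open directories would never get a second hit during this pass.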

So fchdir isn't that helpful with our current traversal structure, as we
only see a dir once during the stat pass. When reading files it might be
more useful, but at that point there is every likelihood we'll hit fd
limits at low cache hit rates.
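
For reference, the open() + fchdir() idea from the quoted message could look roughly like the sketch below (Unix only). It keeps an fd for each directory on the current descent path so that coming "back" to a parent is an fchdir rather than a name lookup, and it falls back to path names beyond a depth cap. The names `count_entries` and `MAX_OPEN_DIRS` are hypothetical, and the cap of 200 is simply the figure suggested in the quote, not a measured value.

```python
import os
import stat

MAX_OPEN_DIRS = 200  # assumed cap, taken from the "say, 200 dirs" suggestion


def _walk(depth, max_open):
    """lstat everything in the current directory, then recurse into subdirs."""
    count = 0
    subdirs = []
    for name in os.listdir("."):
        st = os.lstat(name)
        count += 1
        if stat.S_ISDIR(st.st_mode):
            subdirs.append(name)
    if not subdirs:
        return count
    # Remember how to get back here: by fd while under the cap, else by name.
    here_fd = os.open(".", os.O_RDONLY) if depth < max_open else None
    here_path = os.getcwd() if here_fd is None else None
    try:
        for name in subdirs:
            os.chdir(name)
            count += _walk(depth + 1, max_open)
            if here_fd is not None:
                os.fchdir(here_fd)  # return without a name => inode lookup
            else:
                os.chdir(here_path)  # fallback: return by path name
    finally:
        if here_fd is not None:
            os.close(here_fd)
    return count


def count_entries(root, max_open=MAX_OPEN_DIRS):
    """Count all entries below root, restoring the original cwd afterwards."""
    start_fd = os.open(".", os.O_RDONLY)
    try:
        os.chdir(root)
        return _walk(0, max_open)
    finally:
        os.fchdir(start_fd)
        os.close(start_fd)
```

Note that the open fds here track only the current descent path, so the count is bounded by tree depth rather than by the total number of directories. A cache meant to serve later revisits across the whole tree would be bounded by the directory count instead, which is where the fd limit becomes a concern.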

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.