[RFC] Removing hash prefix in storage vastly improves performance
John Arbash Meinel
john at arbash-meinel.com
Fri Aug 18 15:31:34 BST 2006
Matthew D. Fuller wrote:
> On Fri, Aug 18, 2006 at 09:51:49AM +1000 I heard the voice of
> Robert Collins, and lo! it spake thus:
>> What are the performance changes on kernel sized trees for
>> - BSD (perhaps)
>
> I can't speak for trying it (my systems would revolt against me), but
> here's some stats size-wise for FreeBSD.
>
> Fresh "cvs export -rHEAD":
>
> src (kernel + userland):
> % find src -type d -print | wc -l
> 3286
> % find src -type f -print | wc -l
> 22600
>
> ports tree:
> % find ports -type d -print | wc -l
> 22969
> % find ports -type f -print | wc -l
> 84257
> (obviously, this is the far side of the curve from "put everything in
> one directory" ;)
Well, there is also the fact that we create 2 files (index + knit) for
every versioned file, so all these numbers double. And we create these
files for directories as well, so for the ports tree you would end up with:
(22969 + 84257) * 2 = 214,452, or >200k files in one directory, which is
a bit much for any filesystem without directory indexing.
With a perfect distribution over the 256 hash prefixes, you have only about
840 files per dir (214,452 / 256), which is probably a lot more reasonable.
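
To make the arithmetic concrete, here is a rough Python sketch. The counts come from the ports tree figures quoted above. The bucketing function is only an illustration: it assumes a two-hex-digit prefix taken from the low byte of a CRC32 of a made-up file id, which is not necessarily how the store computes its prefix, and the "file-N" ids are hypothetical.

```python
import zlib
from collections import Counter

DIRS = 22969          # directories in the ports tree (from the stats above)
FILES = 84257         # regular files in the ports tree
FILES_PER_ENTRY = 2   # one index file + one knit file per versioned entry
BUCKETS = 256         # two hex digits of hash prefix

# Everything in a single flat directory.
total = (DIRS + FILES) * FILES_PER_ENTRY

# Ideal case: a perfectly even spread over all prefix directories.
per_bucket = total / BUCKETS


def prefix_for(file_id: bytes) -> str:
    """Hypothetical prefix: low byte of CRC32, as two hex digits."""
    return "%02x" % (zlib.crc32(file_id) & 0xFF)


# Simulate with made-up ids to see how far a real hash strays from the ideal.
counts = Counter(
    prefix_for(("file-%d" % i).encode("ascii")) for i in range(DIRS + FILES)
)
largest = max(counts.values()) * FILES_PER_ENTRY

print("flat directory:   %d files" % total)
print("ideal per prefix: %.0f files" % per_bucket)
print("largest simulated prefix dir: %d files" % largest)
```

A real hash will never be perfectly even, so the largest prefix directory ends up somewhat above the ideal figure, but it stays orders of magnitude below the flat layout.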
John
=:->