FreeBSD Ports statistics

John Arbash Meinel john at arbash-meinel.com
Thu Aug 31 06:47:21 BST 2006


I thought I would share a few of the FreeBSD ports statistics that I
have been able to extract so far.


1) Time to 'cvs co' a complete source tree
13min 28s

2) For ports, a cvs checkout actually creates more control files and
directories than versioned files and directories. It is close, but CVS
adds a CVS control dir holding 3 control files to every versioned
directory, and the FreeBSD ports tree has *lots* of directories with
very few files in each.

In ports, the ratio of files to directories is around 2.6:1 (84614:32058)


Here are the specific numbers:
180810  total files            find . -type f | wc -l
 84614  total non-CVS files    find . -type f ! -path '*CVS*' | wc -l
------
 96196  total CVS control *files*

 64120  total directories      find . -type d | wc -l
 32058  non-CVS directories    find . -type d ! -path '*CVS*' | wc -l
------
 32062  CVS directories (the 4 extra are CVSROOT/ directories)

244930  total nodes            find . | wc -l
116672  non-CVS nodes          find . ! -path '*CVS*' | wc -l
------
128258  CVS nodes
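
For anyone who wants to reproduce the counts in one pass instead of six
'find' runs, here is a rough Python sketch. The function name
count_nodes and the dictionary keys are my own invention, not part of
any tool. Like -path '*CVS*', it does a plain substring match on the
path, so CVSROOT/ and everything below a CVS directory land in the CVS
buckets.

```python
import os


def count_nodes(root):
    """Count regular files and directories under root, split by 'CVS' in the path."""
    counts = {"files": 0, "cvs_files": 0, "dirs": 0, "cvs_dirs": 0}
    for dirpath, dirnames, filenames in os.walk(root):
        # Match on the path relative to root, so a 'CVS' somewhere above
        # the checkout does not skew the result.
        rel_dir = os.path.relpath(dirpath, root)
        counts["cvs_dirs" if "CVS" in rel_dir else "dirs"] += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            # 'find -type f' only counts regular files, so skip symlinks
            # and anything else that is not a plain file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            rel_file = os.path.relpath(path, root)
            counts["cvs_files" if "CVS" in rel_file else "files"] += 1
    return counts
```

counts["files"] + counts["cvs_files"] should correspond to
'find . -type f | wc -l', and counts["dirs"] + counts["cvs_dirs"] to
'find . -type d | wc -l' (the root directory itself is counted, as
'find .' does).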

3) 'cvs' does something very bad to the filesystem with this many files.

time 'find .' in a CVS checkout
1:29.08

time 'find .' in a copy with no CVS meta files
4s

I thought it was really weird that 'find .' would take 22 times longer
in a cvs checkout than in a 'cp -al' copy with the CVS directories
removed. So I did 'cp -al' without removing the CVS directories, and I
got:

time 'find .' in a copy of a CVS checkout
7.5s

So something about how CVS is creating the working directory is making
it *very* slow to work in. I don't know exactly how we can make this
work to our advantage, but I have the feeling it has to do with exactly
what the mercurial guys were talking about. If you create files in the
wrong places, then your disk system gets all screwy.

In CVS, don't they create a temporary file and then rename into place?
Do they create these in the CVS directory, or in the working directory?

Actually, if I copy my copy, the time goes down to 1.5s (down from 4s).
So maybe this is inode order stuff coming into play. Regardless, weird
stuff is happening.
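
One way to poke at the inode-order guess is to stat every entry twice:
once in the order the directories hand the names back, and once sorted
by inode number. The sketch below is only an illustration of that idea,
not something I have run against the ports tree; the helper names
(list_entries, inode_sorted, time_lstat) are made up. Note that
'cp -al' hard-links the files and only creates the directories afresh,
so the copies differ from the checkout mainly in their directories.

```python
import os
import time


def list_entries(root):
    """Return (inode, path) pairs for everything below root, in directory order."""
    entries = []
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            for entry in it:
                entries.append((entry.inode(), entry.path))
                # Descend into real directories only, never through symlinks.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return entries


def inode_sorted(entries):
    """Return the same pairs ordered by inode number."""
    return sorted(entries)


def time_lstat(entries):
    """Time one lstat() per entry, visiting them in the given order."""
    start = time.perf_counter()
    for _inode, path in entries:
        os.lstat(path)
    return time.perf_counter() - start
```

Each call to time_lstat needs a cold cache to mean anything; if the two
orders are timed back to back, the second pass mostly measures the
cache that the first pass warmed up.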

4) Bazaar timings.

In a directory with no CVS files,
time bzr -q add
45.3s

Size of the working inventory:
du -ksh .bzr/checkout
17M

time bzr -q commit -m 'first'
41m11.35s

du -ksh .bzr/
995M    .bzr/

size of raw ports tree: 426M

time bzr-no-hash-prefix -q commit -m 'first'
16m20s
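
To put the two commit timings side by side, here is a small Python
sketch that converts the duration strings used in this mail into
seconds and compares them. The helper parse_duration is hypothetical,
and I am assuming the 995M figure belongs to the first (41 minute)
commit.

```python
import re


def parse_duration(text):
    """Convert strings like '41m11.35s', '16m20s', '1:29.08' or '7.5s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)[m:])?(\d+(?:\.\d+)?)s?", text)
    if match is None:
        raise ValueError("unrecognised duration: %r" % text)
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


default_commit = parse_duration("41m11.35s")  # plain 'bzr commit'
flat_commit = parse_duration("16m20s")        # 'bzr-no-hash-prefix commit'

# How much slower the default commit was than the no-hash-prefix one.
commit_ratio = default_commit / flat_commit   # about 2.5

# Size of .bzr/ relative to the raw ports tree (995M vs 426M).
size_ratio = 995 / 426                        # about 2.3
```

The same helper handles the 'find .' numbers: parse_duration("1:29.08")
divided by parse_duration("4s") gives the roughly 22x difference
mentioned earlier.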

So on reiserfs at least, even with 200k files in one directory (the
no-hash-prefix layout), commit is still quite fast.
However, 'rm -rf .bzr' took a really long time (10 min) on the
all-in-one layout and much less time (30s) with the separate hash
directories.
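
For readers who have not seen the two layouts, here is a minimal Python
sketch of the difference. It is not bzr's actual code: the CRC-32 based
prefix, the function names, and the example file id are all assumptions
made for illustration.

```python
import os
import zlib


def flat_path(store_root, file_id):
    """All-in-one layout: every file sits directly in one directory."""
    return os.path.join(store_root, file_id)


def hash_prefixed_path(store_root, file_id):
    """Hash-prefix layout: files are spread over up to 256 subdirectories."""
    # Hypothetical scheme: the low 8 bits of a CRC-32 of the file id,
    # written as two hex digits, name the subdirectory.
    bucket = zlib.crc32(file_id.encode("utf-8")) & 0xFF
    return os.path.join(store_root, "%02x" % bucket, file_id)


example = hash_prefixed_path(".bzr/store", "makefile-20060831-example")
```

Under this made-up scheme, 200k files spread over 256 buckets would
come to roughly 780 entries per directory instead of 200k in one.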

Anyway, I now have a new tree to use for bzr punishment. And maybe if I
get some time to let it churn, I'll even try to turn it into a bazaar
branch.

John
=:->

