Naive questions re hard-linking repositories

Ian Clatworthy ian.clatworthy at canonical.com
Tue Apr 14 16:04:05 BST 2009


Given it takes ~ 4 minutes to branch Emacs outside a shared repo
and 6 seconds to branch within one, I'd like to better understand
why we don't just hard link the .bzr/repository directory when
conditions permit it, e.g. when both the source and target branches
are local and on the same filesystem.

Why is it the right thing to *always* penalise users (w.r.t. performance
and space) who chose not to use a shared repo or who simply don't
know any better? If we're not comfortable deciding when hard linking
is safe, could we add an option (e.g. --common-repo) to allow
power users to hard link repositories when they want to?

If hard links are completely out, can't we at least do a vanilla
copy of the .bzr/repository directory under appropriate conditions?
Maybe we had a theory that it would be faster to only transfer what
you needed but that theory just doesn't hold on most local-to-local
transfers.
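
To make the idea concrete, here is a rough sketch of "hard link when
conditions permit, otherwise do a vanilla copy". It is not Bazaar code:
clone_tree and same_filesystem are hypothetical names, it uses only the
Python standard library, and it assumes the files being linked are never
modified in place afterwards (otherwise both trees would see the change).

```python
import os
import shutil


def same_filesystem(src, dst_parent):
    """Return True when both paths are on the same device (hard links possible)."""
    return os.stat(src).st_dev == os.stat(dst_parent).st_dev


def clone_tree(src, dst):
    """Hypothetical helper: mirror src at dst, hard-linking files when possible.

    Returns "hardlink" or "copy" to report which strategy was used.
    Assumes dst does not exist yet and that its parent directory does.
    """
    dst_parent = os.path.dirname(os.path.abspath(dst))
    if same_filesystem(src, dst_parent):
        try:
            # copytree creates the directories itself; each regular file
            # is created with os.link instead of being copied.
            shutil.copytree(src, dst, copy_function=os.link)
            return "hardlink"
        except OSError:
            # For example a filesystem without hard-link support:
            # discard any partial result and fall through to a copy.
            shutil.rmtree(dst, ignore_errors=True)
    # Fallback: plain recursive copy of the whole directory.
    shutil.copytree(src, dst)
    return "copy"
```

Calling clone_tree("source/.bzr/repository", "target/.bzr/repository")
would then pick the cheap path when it can and quietly fall back to a
copy when it can't. Whether that is actually safe for a given repository
format is exactly the open question above.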

Also, why should branch be overloaded to mean "clean out any garbage"
instead of that being an explicit command or option that you pay for if
and when you need it? If you branch inside a shared repo, we don't see
the need to clean out garbage then so why is it *required* when branching
outside a shared repo?

I realise we've probably covered this ground umpteen times but
4 mins - on a desktop, not laptop - is just *crazy* slow and our
competitors don't seem as paranoid about hard linking as we are.
Is our approach winning us more users than we're losing?

More broadly, I guess I'm asking us to revisit our assumptions about
what branch must do vs what it does now. Shared repositories are
cool but we ought to have a system that benefits from them, not
*requires* them for acceptable performance. I don't have the answers,
or even all the questions, so I thought I'd start back at the basics ...

Ian C.


