[MERGE/RFC] Userdoc Driven Design on the Bazaar 2.0 UI

Robert Collins robert.collins at canonical.com
Thu Apr 16 00:40:46 BST 2009


On Thu, 2009-04-16 at 00:40 +1000, Ian Clatworthy wrote:

I do like the docs. However I have some concerns too.

> 1. checkouts should be lightweight by default (ala svn)

There are serious performance implications in this. For starters, right
now this scales with tree size * latency, i.e. it's about as slow as it's
possible to be. Is it something we could aim for? Yes, but then we'll
have to download data over the network for every diff operation.
Lightweight checkouts are really only suitable for high-bandwidth,
low-latency environments. Fixing them to have a local cache is roughly
equivalent to using a stacked bound branch with sufficient fixes to
allow disconnected use. I don't think we have the manpower to achieve
those fixes in the sort of time frame we'd like to have a 2.0 happening
in.
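
To put rough numbers on "tree size * latency", here is a
back-of-the-envelope sketch in Python. It assumes one network round
trip per file, which is a simplification for illustration and not a
measurement of bzr; the function name and the figures are made up.

```python
def lightweight_checkout_ms(num_files, round_trip_ms, round_trips_per_file=1):
    """Estimated wall-clock cost when each file costs its own round trips.

    Assumption for illustration only: requests are issued one after
    another, so latency is paid once per round trip and never overlapped.
    """
    return num_files * round_trips_per_file * round_trip_ms


# A 10,000-file tree over a 200 ms link versus a 1 ms LAN.
wan_ms = lightweight_checkout_ms(10_000, 200)
lan_ms = lightweight_checkout_ms(10_000, 1)
print(wan_ms / 1000, "seconds over the WAN")
print(lan_ms / 1000, "seconds over the LAN")
```

Under those assumptions the same operation is 200 times slower on the
high-latency link, regardless of how much bandwidth is available.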

> 2. a branch could reuse the repository of a dependent branch
>    without needing to explicitly create a shared repo in a parent
>    directory.

I think this would make it harder for people to reason about 'where is
my data' - which is important when deleting branches. If users get it
wrong they will delete data they care about. It's not /so/ different from
a lightweight checkout, so perhaps the change isn't that large.

However, I think the root cause of the issue is that we have an
expensive resource (the repository) and we want to make more reuse of
it, but we don't by default make it explicit - so people need to learn
about it through trial and error, and our toolchain isn't as polished
for working with one repo, one work area, and multiple branches as it
should be.

Misquoting Martin from a couple of days ago, we can either tackle the
root cause, or we can tackle specific symptoms.

Tackling the root cause requires effort and thought but should lead to a
better result - consider the brisbane-core project, which tackled
repository performance issues, and has a really good result - just
tackling serialisation of inventories, or data compression alone
wouldn't have gotten the same result [and there are benchmarks from the
past showing that :)]. It took us focusing down and fixing the actual
root cause (comparing trees is O(tree) not O(delta)) to get a good
solution.

I think we have to fix the root cause here - our [well intentioned]
hiding of repositories by default.

As an expensive resource, repositories are a performance optimisation.
It is fine (and correct in my opinion) to keep them as just a
performance optimisation, but we should make them available for reuse by
default.

Now, as to how we do that, we can look at the suggestion in this story
of having some sort of pointer to the repository in arbitrary branches
[but this invalidates some assumptions repositories hold dear, so we
would not be able to do garbage collection at all anymore]. We already
violate this for stacked-upon repositories - they cannot be garbage
collected, *but* we also only stack on branches which means that it's
fairly hard to get references to garbage-collectable revisions.

We can have a list of branches in the repository - we currently have
that by 'ls -lR' under the repository - it's the 'branches are contained'
rule. This is important because it means a repository can reason about
what data is transient and removable, and what is used. Jelmer's
colocated branches plugin is a way of extending this without adding
directories where a user has their source code. There are other ways to
do this - a list of named heads, or references, can also manage this,
and would be more efficient at a number of operations (list-branches
would be _much_ faster).
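
As a sketch of why a known list of heads matters for garbage
collection (hypothetical names throughout; this is not bzrlib's API or
on-disk format), a repository that knows every head can work out what
is removable by walking ancestry:

```python
def reachable(parents, heads):
    """Return every revision reachable from the given heads.

    `parents` maps a revision id to a list of its parent revision ids.
    """
    seen = set()
    todo = list(heads)
    while todo:
        rev = todo.pop()
        if rev in seen:
            continue
        seen.add(rev)
        todo.extend(parents.get(rev, ()))
    return seen


def garbage(parents, named_heads):
    """Revisions stored in the repository that no known head refers to."""
    return set(parents) - reachable(parents, named_heads.values())


# Toy ancestry graph: a <- b <- c on trunk, plus x branching off a.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}

only_trunk = garbage(parents, {"trunk": "c"})
with_feature = garbage(parents, {"trunk": "c", "feature": "x"})
print(sorted(only_trunk))
print(sorted(with_feature))
```

With only 'trunk' registered, revision 'x' looks removable. A branch
somewhere else that points at 'x' without the repository knowing about
it would lose its data, which is the assumption the pointer approach
breaks.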

And we can stop creating repository+branch at the same place, which
makes the repository immediately reusable with all of our normal tools.

In short I think it's important we solve the 'reuse the repository' use
case, but we should aim to do so by reducing the permutations available,
not increasing them. So some design is needed, including evaluation of
current proposals on the table, and things like the repository design
Aaron favoured back when we first added shared repositories; then we
pick one, and run with it.

-Rob
