[MERGE] Port across errors for shallow branch support.

Aaron Bentley aaron at aaronbentley.com
Wed Feb 27 04:41:48 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Wed, 2008-02-20 at 00:05 -0500, Aaron Bentley wrote:
>> If remote stacking locations are stored on a per-branch basis, then you
>> can easily wind up in situations where a branch can't access the
>> information it needs to construct the revision it wants.
> 
> Can you enlarge on this; I don't see how it can happen more or less
> easily than with external references being stored in the repositories
> control dir.

This was based on the assumption that we don't do ghost-filling, which
is our current practice.  Under that assumption, if a repository has
part of revision FOO, and a non-stacked branch BAR derived from FOO is
added to it, BAR will not fetch the remaining data needed for FOO and
its ancestors.

> First of all a lemma: Both approaches provide the same set of all
> external references. The reason: all branches of a repository are within
> the repository. Repositories have a find_branches method to find the
> branches using the repository.

That's a bit uncomfortable for me, as find_branches was not intended to
break our abstraction that repositories are merely a cloud of revisions.
Or at least, it was meant to do so only in certain rare cases like
garbage collection and recursive upgrade.

> Now, consider a repository with a number of branches, 5 or 6 of which
> have external references.
> 
> In a repository centric implementation, all those external references
> will be active always. This means that even entirely local operations
> will access /all/ those external references whenever any missed key
> lookup occurs.

This seems oversimplified.  It will access all external references only
when none of the remote repositories has a match for the key.

We already know that bloom filters can tell us quickly whether there are
any matches to be had.
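
A minimal sketch of that idea, assuming each external reference carries
a small Bloom filter of the keys it holds.  BloomFilter,
ExternalReference and find_key are hypothetical names for illustration,
not bzrlib APIs.  A filter never gives a false negative, so a reference
whose filter rejects the key is skipped without any remote round trip:

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-1 with an index.
        for i in range(self.num_hashes):
            salted = ("%d:%s" % (i, key)).encode("utf-8")
            digest = hashlib.sha1(salted).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class ExternalReference:
    """Hypothetical stand-in for one remote repository."""

    def __init__(self, name, keys):
        self.name = name
        self._keys = set(keys)
        self.filter = BloomFilter()
        for key in self._keys:
            self.filter.add(key)
        self.remote_lookups = 0

    def has_key(self, key):
        self.remote_lookups += 1  # stands in for a network round trip
        return key in self._keys


def find_key(key, references):
    """Return the first reference holding key, or None.

    References whose filter rules the key out are never contacted.
    """
    for ref in references:
        if ref.filter.might_contain(key) and ref.has_key(key):
            return ref
    return None
```

With this shape, a miss costs a remote lookup only on a filter false
positive, so the cost of a missed key depends on filter quality rather
than on the number of external references.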


> It forces repository-wide scaling on branch-wide
> operations, which is fundamentally bad.

But for many purposes, there won't be many external references.  And
even when there are many external references, many of them will match a
given key.

And this ignores the prospect of memoizing indices locally, which would
make the index-scanning portion essentially always a local operation,
hitting only the external references that are known to have the desired
data, and avoiding the remote index altogether.
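
A rough sketch of what local memoization could look like, assuming each
remote index can be fetched once and kept locally.  RemoteReference and
MemoizedIndexes are hypothetical names for illustration, not bzrlib
classes, and cache invalidation is deliberately ignored here:

```python
class RemoteReference:
    """Hypothetical external reference that counts its remote accesses."""

    def __init__(self, name, keys):
        self.name = name
        self._keys = frozenset(keys)
        self.index_fetches = 0
        self.data_fetches = 0

    def fetch_index(self):
        self.index_fetches += 1  # stands in for downloading the index
        return self._keys

    def fetch_data(self, key):
        self.data_fetches += 1  # stands in for downloading the data
        return "data for %s from %s" % (key, self.name)


class MemoizedIndexes:
    """Keep a local copy of each remote index after the first fetch."""

    def __init__(self, references):
        self._references = list(references)
        self._cache = {}  # reference name -> set of keys it holds

    def _index_for(self, ref):
        if ref.name not in self._cache:
            self._cache[ref.name] = ref.fetch_index()
        return self._cache[ref.name]

    def get(self, key):
        # The scan runs against local copies; only a reference known
        # to hold the key is contacted for data.
        for ref in self._references:
            if key in self._index_for(ref):
                return ref.fetch_data(key)
        return None
```

After the first scan, a lookup touches the network only to fetch data
from a reference already known to have it; a missed key costs nothing
remotely.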

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHxOoM0F+nu1YWqI0RAsxbAJsF53XGm09rdlEbA55Ztdv6g0ucFQCfWUDt
6tvWPutfXFD9qfvKrnEIgGc=
=v7j3
-----END PGP SIGNATURE-----