[MERGE] Support fetching a restricted set of revisions

John Arbash Meinel john at arbash-meinel.com
Tue Jul 17 17:09:46 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> Aaron Bentley has voted +0.
> Status is now: Waiting
> Comment:
> Jelmer Vernooij wrote:
>> What the -rX..Y code (which I haven't submitted yet) does is create a
>> bread-first search over all the ancestors of Y until it hits X
>> (bzrlib.graph was very useful in that regard). In other words, it will
>> fetch all ancestors of Y that have a distance equal to or less than that
>> of X to Y.
> 
> Why not just select the ancestors of Y that are not ancestors of X? That
> seems like a simpler definition, and should behave similarly most of the
> time.  Also, much more likely to be reused.
> 

I also think it gives better bounds on the ghosts that you end up
with/the revisions that you would need in the future to determine merge
bases, etc. It *is* slightly more expensive to determine, since you need
to at least spider out from each. But I think Aaron has already written
code as part of the 'find_lca*' functions that would give you this sort
of thing in a reasonably efficient manner.

You could do a BFS and just keep a list of nodes that you need to
evaluate, and do the same from both nodes. Once you run out of nodes
that you are searching from the source, then you are done.


>> === modified file bzrlib/fetch.py //
> last-changed:jelmer at samba.org-200707131714
>> ... 44-00gl8k8t8j43rl3m
>> --- bzrlib/fetch.py
>> +++ bzrlib/fetch.py
>> @@ -267,6 +267,12 @@
>>      copy revision texts.
>>      """
>>
>> +    def __init__(self, to_repository, from_repository,
> last_revision=None, pb=None,
>> +                 limit_to_revisions=None):
>> +        self._limit_to_revisions = limit_to_revisions
>> +        super(KnitRepoFetcher, self).__init__(to_repository,
> from_repository,
>> +            last_revision, pb)
>> +
>>      def _fetch_revision_texts(self, revs):
>>          # may need to be a InterRevisionStore call here.
>>          from_transaction = self.from_repository.get_transaction()
>> @@ -280,7 +286,11 @@
>>              to_transaction)
>>          from_rf =
> self.from_repository._revision_store.get_revision_file(
>>              from_transaction)
>> -        to_rf.join(from_rf, version_ids=revs)
>> +        if self._limit_to_revisions is not None:
>> +            to_rf.join(from_rf, version_ids=self._limit_to_revisions,
>> +                       restrict_revisions=True)
> 
> I don't like this very much, style-wise.  Changing the meaning if a
> parameter based on another parameter doesn't seem like good form.
> 
> What would be wrong with adding a limit_to_revisions parameter to
> versionedfile.join?
> 
>>
>> -            self.source_ancestry =
> set(self.source.get_ancestry(version_ids))
>> +            if restrict_revisions:
>> +                self.source_ancestry = set(
>> + self.source._get_components_positions(version_ids).keys())
> 
> ^^^ Are you ensuring that every revision that is present has all its
> inventories and file texts present also?  It doesn't look like that to me.
> 
> By including the build-dependencies of the desired revisions, you're
> including the revisions themselves, and that means their inventories and
> file texts must also be included.
> 
> I think it would be simpler to create a snapshots of the oldest revisions.
> 
> The only reason your later test cases are passing is because revisions
> *happen to be* all snapshots, and therefore have no build dependencies.
> This is not true for inventories, however.

I thought that was something he mentioned. That he had it working for
'revisions' but was working to expand it to inventories and file texts.
Such that this was more of a 'work in progress'.

I think the idea is that when copying the data, you have to pull back to
a snapshot in order to figure out how to build the target anyway. So you
may as well save that information.

Again, it is a bit of the distinction between what you've defined (don't
store any data past this point) and Jelmer (don't store data that I
don't have to, but if I've downloaded it, keep it).

I personally prefer Jelmer's form, but I can see a benefit for both.

John
=:->


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGnOnKJdeBCYSNAAMRAqZvAJ0QrFQCRsfahsbfhBv67fbbLKV6sACgr3Hr
M3B0exP/mKZ+rQN1OhffvMU=
=tj4k
-----END PGP SIGNATURE-----



More information about the bazaar mailing list