[MERGE] Tree.revision_tree

Mon Sep 11 17:12:49 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> Aaron Bentley wrote:
>>My suggestion is to register working trees with their repositories, so
>>that repository.revision_tree() will use a cached inventory if one is
>>available in any live working tree that uses the repository.

> There are a few issues, which I think we need to work out:
> 
> 1) WorkingTree will most likely be storing it in a different format. So
> at the least, it can only register what revision ids it knows about, and
> expect to be called at the appropriate time.

The current handling is hybrid, with Repository storing the file texts,
but the tree holding the inventory.  There's an advantage to using the
same inventory format in both repository and the cached basis, because
that allows us to copy the inventory text directly.  However, there may
be more powerful arguments for using disjoint formats.  I don't know.

> 2) dirstate can be queried reasonably cheaply to see what revision ids
> are cached. (You have to read a certain number of lines, but that is
> fairly fixed).
> However, we have to figure out when it should be registering itself with
> the repository, and whether it registers just that it is a cache, or
> that it has an explicit set of revision ids.

I would think "just a cache", because the set of revision ids may change
during the lifetime of the repository.

> Anyway, because of lightweight checkouts, it isn't like we can write a
> control file into the repository to tell it where the cache is held. So
> we have to have some sort of runtime registration going on.

That was what I meant, yes.  Maybe as a weakref, so that the cache
doesn't get too greedy.

> I suppose it
> might be possible for this to be during the tree constructor.

Yes, this is where I was thinking of putting it.
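
As a rough illustration of runtime registration from the tree
constructor, here is a minimal sketch.  The Repository and WorkingTree
classes below are hypothetical stand-ins, not the real bzrlib classes,
and register_cache / live_caches are names made up for the example.
It assumes a Python with weakref.WeakSet.

```python
import weakref


class Repository:
    """Hypothetical stand-in, not the real bzrlib Repository."""

    def __init__(self):
        # Weak references only: registering a tree must not keep it alive.
        self._cache_trees = weakref.WeakSet()

    def register_cache(self, tree):
        # Record "this tree is a cache", not which revision ids it holds.
        self._cache_trees.add(tree)

    def live_caches(self):
        # Trees that have been garbage collected drop out automatically.
        return list(self._cache_trees)


class WorkingTree:
    """Hypothetical stand-in; registers itself in the constructor."""

    def __init__(self, repository):
        self.repository = repository
        repository.register_cache(self)
```

Because the repository only knows that a tree is a cache, it has to
ask each live tree at lookup time which revisions it can supply.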

> 4) Dirstate (at least) is written in such a way as to make it cheaper to
> do revision_trees(), because each tree is nested side-by-side with the
> other ones. I do wonder if it wouldn't be better to have another file
> for each one. I know we wanted to do it this way for faster
> 'changes_from' performance, because you get the parent entry at the same
> time as the child entry....

Well, if revision_trees is the most efficient API, let's support that,
by all means.  But instead of raising NoSuchRevision (as
Repository.revision_trees does), we'd just return the revision_trees
that could be found.  (Maybe list the missing ones, but that can also be
calculated easily.)

The point, though, is that it wouldn't be Tree.revision_tree, it would
be WorkingTree._revision_trees, which would be passed to the repository
at registration time.
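
Something like the following sketch shows the shape I have in mind.
It is only an illustration: the classes are hypothetical stand-ins,
register_lookup is an invented name, plain dicts stand in for the real
storage, and KeyError stands in for NoSuchRevision.  The lookup is
held strongly here for brevity; a real version would want a weak
reference.

```python
class WorkingTree:
    """Hypothetical stand-in holding some cached basis trees."""

    def __init__(self, cached):
        self._cached = dict(cached)  # revision id -> tree-like object

    def _revision_trees(self, revision_ids):
        # Return only the trees that are cached; never raise for the rest.
        return {rid: self._cached[rid]
                for rid in revision_ids if rid in self._cached}


class Repository:
    """Hypothetical stand-in, not the real bzrlib Repository."""

    def __init__(self, stored):
        self._stored = dict(stored)  # revision id -> tree-like object
        self._lookups = []

    def register_lookup(self, lookup):
        # 'lookup' is a callable such as WorkingTree._revision_trees.
        self._lookups.append(lookup)

    def revision_trees(self, revision_ids):
        found = {}
        for lookup in self._lookups:
            # Only ask each cache for the ids still missing.
            remaining = [rid for rid in revision_ids if rid not in found]
            found.update(lookup(remaining))
        for rid in revision_ids:
            if rid not in found:
                # Fall back to the repository's own storage.
                # KeyError plays the role of NoSuchRevision in this sketch.
                found[rid] = self._stored[rid]
        return [found[rid] for rid in revision_ids]
```

The missing ids are just the requested ids that are not keys of the
returned dict, so the cache does not need to list them separately.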

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFBYsB0F+nu1YWqI0RAo6wAKCH6OvwswfAP75QmJDkmPuhus8nPACfY3+k
/T51Zau5IRNSTDJRPuA2JGY=
=6nJD
-----END PGP SIGNATURE-----