sketch of a new interface between loggerhead and bzrlib
Aaron Bentley
aaron.bentley at utoronto.ca
Fri Jun 22 18:05:50 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Michael Hudson wrote:
> Aaron Bentley wrote:
>> Surprising. I would have expected that a database cache would
>> complement Bazaar well.
>
> Oh, these are eminently practical concerns, not theoretical ones. I
> don't know where in the stack the problem is, but the changes caches
> routinely get corrupted and have to be thrown away (one or two a day on
> average on codebrowse.launchpad.net).
Well, as a temporary measure, sure. But you might want to just switch
to a different database. I'm kinda surprised it's using pdb rather than
SQLite.
>> I explicitly do not like organizing things by change type. I like to
>> see them organized by file.
>
> Well, the reason I put it like that is that this is how it's likely to be
> displayed:
>
> added: ...
> removed: ...
> renamed: ...
> modified: ...
Yeah, I also don't like *displays* that are organized by change type...
>> Also, are you sure it's a good idea to restrict yourself to the lefthand
>> parent? I'd want it to be *possible* to compare with the righthand
>> parent at least.
>
> I'm not sure. Given that there will be a way of comparing an arbitrary
> pair of revisions, this will certainly be possible, but that's different
> from it being easy, or cached.
I thought you were proposing that this would be the complete interface.
>> BranchRevisionInfo looks interesting, but I would look at putting
>> where_merged into the RevisionInfo object instead. (And this is an
>> example of where a database cache wins)
>
> But where_merged isn't stable, is it?
It's append-only.
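A minimal sketch of what an append-only where_merged cache could look like in SQLite. The table layout and the function names (open_cache, record_merge, where_merged) are assumptions made up for illustration; they are not loggerhead's or bzrlib's API. The cache only ever inserts rows, so an existing entry never has to be invalidated.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the hypothetical where_merged cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS where_merged ("
        " revision_id TEXT NOT NULL,"
        " merged_in TEXT NOT NULL,"
        " PRIMARY KEY (revision_id, merged_in))"
    )
    return conn


def record_merge(conn, revision_id, merged_in):
    """Append one fact; recording the same fact twice is a no-op."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO where_merged VALUES (?, ?)",
            (revision_id, merged_in),
        )


def where_merged(conn, revision_id):
    """Return the revisions that merged revision_id, in insertion order."""
    rows = conn.execute(
        "SELECT merged_in FROM where_merged"
        " WHERE revision_id = ? ORDER BY rowid",
        (revision_id,),
    )
    return [row[0] for row in rows]


conn = open_cache()
record_merge(conn, "rev-a", "rev-b")
record_merge(conn, "rev-a", "rev-d")
record_merge(conn, "rev-a", "rev-b")  # duplicate, ignored
merged = where_merged(conn, "rev-a")
```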
>> History.getRevisionInfo: Much of this info is not "cheap". Especially
>> dotted-revno and tree comparison data requires a lot of computation.
>
> But dotted-revno _has_ to be cheap, as it's going to be the most
> frequently requested thing.
Well, I'm sorry, but it's *not*, even if it *has* to be. It requires a
topological sort of the entire revision history, which scales as O(n) with
the number of nodes, at best. Relative to, say,
Repository.get_revisions, it is very pricey.
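To illustrate where the cost comes from, here is a minimal sketch of a topological sort over a toy revision graph. This is plain Kahn's algorithm on a made-up dict of parents, not bzrlib's own sorting code, and topo_sort and toy_graph are hypothetical names. The point is only that every revision gets visited before any ordering, and so any numbering, is available.

```python
from collections import deque


def topo_sort(parents):
    """Return revision ids ordered so that parents precede children.

    `parents` maps revision id -> list of parent ids. It is a toy
    stand-in for a real revision graph.
    """
    children = {rev: [] for rev in parents}
    pending = {}
    for rev, rev_parents in parents.items():
        # Ignore parents that are not present in the graph (ghosts).
        known = [p for p in rev_parents if p in parents]
        pending[rev] = len(known)
        for p in known:
            children[p].append(rev)

    # Start from revisions with no unprocessed parents.
    ready = deque(rev for rev, count in pending.items() if count == 0)
    order = []
    while ready:
        rev = ready.popleft()
        order.append(rev)
        for child in children[rev]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return order


toy_graph = {
    "rev-a": [],
    "rev-b": ["rev-a"],
    "rev-c": ["rev-a"],
    "rev-d": ["rev-b", "rev-c"],  # a merge revision
}
order = topo_sort(toy_graph)
```

Each revision and each parent link is touched once, so the work grows with the size of the whole history rather than with the number of revisions being displayed.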
> I was thinking of storing
> branch.get_revision_id_to_revno_map() on the History object somewhere --
> even for launchpad (my archetypal large repository, if you hadn't
> noticed already) it only takes 700ms, which seems a reasonable cost to
> pay once per 'bzr push'.
How are you paying this once per push? I thought you were paying it
once per revision you display.
>> In contexts where you want only a single revision, a getRevisionInfo call
>> makes sense. For the log view, I would strongly recommend providing an
>> iterator or list.
>
> Why, particularly?
So that you can pay the cost of branch.get_revision_id_to_revno_map up
front, and retrieve multiple revisions at once.
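A rough sketch of the difference, using a toy stand-in for a branch. ToyBranch, get_revision_info and iter_revision_info are hypothetical names for illustration; only the method name get_revision_id_to_revno_map comes from the discussion above, and here it is faked with a counter instead of doing real work.

```python
class ToyBranch:
    """Stand-in for a branch; counts how often the expensive map is built."""

    def __init__(self, revnos):
        self._revnos = dict(revnos)
        self.map_builds = 0

    def get_revision_id_to_revno_map(self):
        self.map_builds += 1  # pretend this is the whole-history pass
        return dict(self._revnos)


def get_revision_info(branch, revision_id):
    """Single-revision lookup: rebuilds the map on every call."""
    revno_map = branch.get_revision_id_to_revno_map()
    return revision_id, revno_map[revision_id]


def iter_revision_info(branch, revision_ids):
    """Batch lookup: builds the map once, then yields each revision."""
    revno_map = branch.get_revision_id_to_revno_map()
    for revision_id in revision_ids:
        yield revision_id, revno_map[revision_id]


branch = ToyBranch({"rev-a": (1,), "rev-b": (2,), "rev-c": (1, 1, 1)})
wanted = ["rev-a", "rev-b", "rev-c"]

one_by_one = [get_revision_info(branch, r) for r in wanted]
builds_single = branch.map_builds

batched = list(iter_revision_info(branch, wanted))
builds_batch = branch.map_builds - builds_single
```

Both paths return the same data, but the per-revision call builds the map three times for three revisions while the iterator builds it once.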
> I don't know what BzrInspect is :)
It is a web-based repository viewer for, say, debugging repositories. I
mentioned it in my 18/06/07 message to you.
http://panoramicfeedback.com/opensource/bzr/repo/BzrInspect/
> Sure, but Python is quite good at being lazy. Just because you return
> something with a .changes attribute doesn't mean you have to compute it :)
Certainly, but I think that an API should express which operations are
expensive and which are not, to aid people using it.
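One way to make the cost visible, sketched with made-up names (this RevisionInfo class, compute_changes and fake_compare_trees are illustrations, not the proposed interface): cheap data stays as plain attributes, while the expensive tree comparison sits behind an explicitly named method that is still lazy and cached.

```python
class RevisionInfo:
    """Toy revision record; names are illustrative only."""

    def __init__(self, revision_id, committer, compare_trees):
        # Cheap data: plain attributes, already in hand.
        self.revision_id = revision_id
        self.committer = committer
        # Expensive data: computed on demand by the callable below.
        self._compare_trees = compare_trees
        self._changes = None

    def compute_changes(self):
        """Expensive on the first call (tree comparison), cached after."""
        if self._changes is None:
            self._changes = self._compare_trees(self.revision_id)
        return self._changes


calls = []


def fake_compare_trees(revision_id):
    calls.append(revision_id)  # record each time the costly step runs
    return {"modified": ["README"]}


info = RevisionInfo("rev-d", "aaron", fake_compare_trees)
committer = info.committer       # no comparison triggered
first = info.compute_changes()   # comparison runs here
second = info.compute_changes()  # served from the cached result
```

The laziness is the same as with a .changes attribute; the difference is that the call site shows which access may be slow.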
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGfAFu0F+nu1YWqI0RAghCAJ93pSOe6anPWWF0m1SQwpGXH3/YbQCghqAZ
+QY2RqjvlbGzjbHnhkTaV5M=
=4RqP
-----END PGP SIGNATURE-----