sketch of a new interface between loggerhead and bzrlib

Michael Hudson michael.hudson at canonical.com
Fri Jun 22 18:32:22 BST 2007


Aaron Bentley wrote:
> Michael Hudson wrote:
>> Aaron Bentley wrote:
>>> Surprising.  I would have expected that a database cache would
>>> complement Bazaar well.
>> Oh, these are eminently practical concerns, not theoretical ones.  I
>> don't know where in the stack the problem is, but the changes caches
>> routinely get corrupted and have to be thrown away (one or two a day on
>> average on codebrowse.launchpad.net).
> 
> Well, as a temporary measure, sure.  But you might want to just switch
> to a different database.  I'm kinda surprised it's using bdb rather than
> SQLite.

It uses the stdlib module 'shelve', which on the machine codebrowse runs
on at least, ends up using bdb.  Not my code :)

>>> I explicitly do not like organizing things by change type.  I like to
>>> see them organized by file.
>> Well, the reason I put it like that is that is how it's likely to be
>> displayed:
> 
>>     added: ...
>>   removed: ...
>>   renamed: ...
>>  modified: ...
> 
> Yeah, I also don't like *displays* that are organized by change type...

I guessed as much :)

>>> Also, are you sure it's a good idea to restrict yourself to the lefthand
>>> parent?  I'd want it to be *possible* to compare with the righthand
>>> parent at least.
>> I'm not sure.  Given that there will be a way of comparing an arbitrary
>> pair of revisions, this will certainly be possible, but that's different
>> from it being easy, or cached.
> 
> I thought you were proposing that this would be the complete interface.

Yeah, I thought a bit more and changed to a getChanges(self, revidA,
revidB=None) style interface.
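
As a rough sketch of that style of interface (ToyHistory and its
in-memory dicts are hypothetical stand-ins, not loggerhead's or bzrlib's
actual code): with revidB omitted the comparison falls back to the
left-hand parent, and passing revidB allows comparison against any other
revision, such as a right-hand parent.

```python
class ToyHistory:
    """Hypothetical stand-in for a History object (illustration only)."""

    def __init__(self, parents, trees):
        # parents: revid -> list of parent revids, left-hand parent first
        # trees:   revid -> {path: file content}
        self._parents = parents
        self._trees = trees

    def getChanges(self, revidA, revidB=None):
        """Return {path: change kind} describing revidA relative to revidB.

        If revidB is None, compare against revidA's left-hand parent,
        or against an empty tree when revidA has no parents.
        """
        if revidB is None:
            parents = self._parents.get(revidA, [])
            old = self._trees[parents[0]] if parents else {}
        else:
            old = self._trees[revidB]
        new = self._trees[revidA]

        changes = {}
        for path in sorted(set(old) | set(new)):
            if path not in old:
                changes[path] = 'added'
            elif path not in new:
                changes[path] = 'removed'
            elif old[path] != new[path]:
                changes[path] = 'modified'
        return changes
```

The result here is keyed by file rather than by change type; that is
only a choice made for the sketch, and renames are not modelled at all.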

>>> BranchRevisionInfo looks interesting, but I would look at putting
>>> where_merged into the RevisionInfo object instead.  (And this is an
>>> example of where a database cache wins)
>> But where_merged isn't stable, is it?
> 
> It's append-only.

Indeed.  Maybe this distinction of stable/unstable data shouldn't really
poke through into the interface.

[...]
>>  I was thinking of storing
>> branch.get_revision_id_to_revno_map() on the History object somewhere --
>> even for launchpad (my archetypal large repository, if you hadn't
>> noticed already) it only takes 700ms, which seems a reasonable cost to
>> pay once per 'bzr push'.
> 
> How are you paying this once per push?  I thought you were paying it
> once per revision you display.

History objects are expected to live until outOfDate() returns true.
Loggerhead is a long running process, not a cgi script (for this reason,
if none other).
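
A minimal sketch of that lifetime, under stated assumptions:
HistoryHolder, CachedHistory, and the two callables are hypothetical
names, and outOfDate() is given an argument here purely to keep the
example self-contained. The expensive map is built once, kept on the
long-lived object, and rebuilt only after the branch tip moves.

```python
class CachedHistory:
    """Hypothetical cached view of a branch at one particular tip."""

    def __init__(self, last_revid, revno_map):
        self.last_revid = last_revid
        self.revno_map = revno_map

    def outOfDate(self, current_last_revid):
        # Stale as soon as the branch tip differs from the one we cached.
        return current_last_revid != self.last_revid


class HistoryHolder:
    """Lives as long as the server process and hands out CachedHistory."""

    def __init__(self, get_last_revid, build_revno_map):
        # Both callables are assumptions standing in for branch queries.
        self._get_last_revid = get_last_revid
        self._build_revno_map = build_revno_map
        self._history = None
        self.builds = 0  # how many times the expensive map was built

    def get(self):
        tip = self._get_last_revid()
        if self._history is None or self._history.outOfDate(tip):
            # Pay the expensive cost only when the branch has changed.
            self._history = CachedHistory(tip, self._build_revno_map())
            self.builds += 1
        return self._history
```

Every request between two pushes then reuses the same object, which is
only possible because the process stays alive between requests.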

>>> In contexts where you want only a single revision, a getRevisionInfo call
>>> makes sense.  For the log view, I would strongly recommend providing an
>>> iterator or list.
>> Why, particularly?
> 
> So that you can pay the cost of branch.get_revision_id_to_revno_map up
> front, and retrieve multiple revisions at once.

See above?

>> I don't know what BzrInspect is :)
> 
> It is a web-based repository viewer for, say, debugging repositories.  

I see.

> I mentioned it in my 18/06/07 message to you.

Yes, but that message didn't explain what it was either...

> http://panoramicfeedback.com/opensource/bzr/repo/BzrInspect/

Wee, more TurboGears.

>> Sure, but Python is quite good at being lazy.  Just because you return
>> something with a .changes attribute doesn't mean you have to compute it :)
> 
> Certainly, but I think that an API should express which operations are
> expensive and which are not, to aid people using it.

In any case, the desire to compare against revisions other than the left
hand parent pushed me into adding a .getChanges method.
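
For reference, the laziness mentioned in the quoted text can be sketched
with a property that computes on first access and remembers the result.
LazyRevisionInfo and compute_changes are hypothetical names for
illustration, not part of the proposed interface.

```python
class LazyRevisionInfo:
    """Hypothetical object whose .changes is computed only on first use."""

    def __init__(self, revid, compute_changes):
        self.revid = revid
        self._compute_changes = compute_changes  # assumed expensive callable
        self._changes = None
        self._computed = False

    @property
    def changes(self):
        # Looks like a plain attribute, but the work happens here, once.
        if not self._computed:
            self._changes = self._compute_changes(self.revid)
            self._computed = True
        return self._changes
```

The trade-off raised in the thread still applies: attribute access hides
the cost from the caller, whereas an explicit method call makes it
visible.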

I've updated

http://starship.python.net/crew/mwh/hacks/history2.py.txt

and

http://starship.python.net/crew/mwh/hacks/history2/history2.html

Cheers,
mwh


