iterating over revisions in a branch

Wed Jan 4 17:18:45 GMT 2006

Jamie Wilkinson wrote:
> This one time, at band camp, Jelmer Vernooij wrote:
> 
>>On Wed, Jan 04, 2006 at 11:56:24PM +1100, Jamie Wilkinson wrote about 'iterating over revisions in a branch':
>>
>>>Rob gave me a hint on how to iterate over revisions in a branch:
>>
>>>  for revision_id in branch.revision_history():
>>>    revision = branch.get_revision(revision_id)
>>
>>>It occurred to me just now that it doesn't feel right.  If I'm iterating
>>>over revisions, I shouldn't have to extract the revision during the body of
>>>the iterator loop, surely instead of the revision id I should get back the
>>>whole revision?
>>
>>Efficiency. The list of revision ids can be read directly from the
>>.bzr/revision-history file. Reading the revision itself requires reading 
>>the appropriate file in .bzr/revisions/ (amongst other files). 
>>
>>If, for example, you would only want to do something with the last two 
>>revisions, that would not require reading all revisions that exist in
>>the revision history. Also, for things like 'revno' (which is
>>basically len(branch.revision_history())), the revision contents don't
>>matter.
> 
> 
> Ok.
> 
> Still, I think that something that returns an iterator over the *id*s of
> revisions should be named as such, from an OO point of view.  I'm still
> surprised that revision_history() didn't give me a list of Revision objects.
> 
> Would it be worth making Branch.__iter__() perform this function?
> 
>   for revision in branch:
>     print revision.id
> 
> Or if that's not readable, a .revisions() method?
> 
>   for revision in branch.revisions():
>     print revision.id
> 
> Also, the efficient revision_history() as it stands doesn't feel useful
> outside of bzr internals.  Well, sure it's efficient, but as a programmer
> using the bzrlib API I don't care about the internals of bzr, and well, I'm
> back to the OO argument: if I'm iterating over revisions, I want a Revision
> object :)
> 

Sometimes you are only iterating over revision ids. :)

I'm fine with a branch.revisions(), but you have to be careful with how
you want to iterate over the history. Do you want to start with the Null
revision, or with the current revision?
Do you want to go through all of the ancestry, or just the mainline for
this branch (the 'revision-history')?
Going through all of the ancestry is better defined if you start at the
current revision, and at that point, do you go depth first or breadth
first?

The problem is that revisions actually form a DAG. You can stick to
what we are currently using as the primary path (the path of first
parents), which is the revision-history.
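
To make the difference concrete, here is a minimal sketch on a toy history.
It does not use bzrlib at all: GRAPH, mainline() and
ancestry_breadth_first() are hypothetical names made up for illustration,
and the graph is just a dict mapping each revision id to its parent ids,
first parent first.

```python
from collections import deque

# Hypothetical toy history: revision id -> list of parent ids.
# "rev-3" is a merge of the mainline ("rev-2") and a side branch ("feature-1").
GRAPH = {
    "rev-1": [],
    "rev-2": ["rev-1"],
    "feature-1": ["rev-1"],
    "rev-3": ["rev-2", "feature-1"],
}


def mainline(graph, tip):
    """Follow only first parents from the tip back to the root."""
    path = []
    rev_id = tip
    while rev_id is not None:
        path.append(rev_id)
        parents = graph[rev_id]
        rev_id = parents[0] if parents else None
    return path


def ancestry_breadth_first(graph, tip):
    """Visit every ancestor of the tip exactly once, nearest first."""
    seen = {tip}
    queue = deque([tip])
    order = []
    while queue:
        rev_id = queue.popleft()
        order.append(rev_id)
        for parent in graph[rev_id]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


print(mainline(GRAPH, "rev-3"))
print(ancestry_breadth_first(GRAPH, "rev-3"))
```

The mainline walk never sees "feature-1", while the breadth-first walk
does, and it needs the seen set because "rev-1" is reachable by two paths.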

Another way to iterate is to walk back from the tip along first parents:

rev_id = branch.last_revision()
while rev_id:
  rev = branch.get_revision(rev_id)
  # do stuff
  if rev.parent_ids:
    rev_id = rev.parent_ids[0]  # follow the first parent
  else:
    rev_id = None  # no parents left, so stop
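
The same walk could be wrapped in a generator, which would give the
"for revision in ..." style asked for while still loading revisions only
on demand. This is only a sketch under assumptions: FakeBranch and
FakeRevision are hypothetical in-memory stand-ins (not bzrlib classes),
iter_mainline() is a made-up name, and it is written for Python 3.

```python
from itertools import islice


class FakeRevision:
    """Hypothetical stand-in for a revision object."""

    def __init__(self, revision_id, parent_ids):
        self.revision_id = revision_id
        self.parent_ids = parent_ids


class FakeBranch:
    """Hypothetical in-memory stand-in for a branch; not bzrlib's Branch."""

    def __init__(self, parents_by_id, tip):
        self._parents = parents_by_id
        self._tip = tip
        self.loads = 0  # how many revisions were actually loaded

    def last_revision(self):
        return self._tip

    def get_revision(self, rev_id):
        self.loads += 1
        return FakeRevision(rev_id, self._parents[rev_id])


def iter_mainline(branch):
    """Lazily yield revision objects from the tip back along first parents."""
    rev_id = branch.last_revision()
    while rev_id:
        rev = branch.get_revision(rev_id)
        yield rev
        rev_id = rev.parent_ids[0] if rev.parent_ids else None


branch = FakeBranch({"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}, "d")

# Only the two newest revisions are requested, so only two are loaded.
newest_two = [rev.revision_id for rev in islice(iter_mainline(branch), 2)]
print(newest_two, branch.loads)
```

Because the generator is lazy, a caller that stops early does not pay for
the revisions it never looks at, which is the efficiency concern raised
earlier in the thread.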

John
=:->
