[MERGE][BUG 51980] bzr log <file> returns inappropriate revisions
John Arbash Meinel
john at arbash-meinel.com
Mon Apr 23 19:07:00 BST 2007
Kent Gibson wrote:
>
>
> John Arbash Meinel wrote:
>> Interestingly enough, because of the extra work, for a short number of
>> revisions of a heavily modified file, your version is actually slower.
>
> Yup, the ancestry has to be calculated for all revisions, no matter
> how many are going to be displayed. For a small number of revisions
> to display, the delta approach, slow as it is, will be faster than
> ancestry generation over thousands of revisions.
>
> Unfortunately I can't think of any way to simplify the ancestry
> generation for a small number of recent revisions. If you prune the
> revision graph at some arbitrary point you can lose whole branches of
> ancestry.
>
> And I'd rather not have to use some heuristic to switch between the
> two approaches.
> Since the delta approach can miss ancestry changes, there is the
> potential of different output depending on which algorithm is used.
>> At least for me:
>
>> bzr log --short -r-10..-1 NEWS
>
>> takes 5.6s with bzr.dev and 14.8s for your version.
>
> For me it is 1.32 for bzr.dev and 2.66 with my patch.
> On my machine the cross-over point seems to be about -r-30..-1
>> Now, it is modified by almost every version in the mainline, so it is
>> sort of an extreme case.
>
>> bzr log --short -r-10..-1 INSTALL
>
>> Is 2.875s versus 6.143s for bzr.dev.
> My patch is faster for this case on my machine as well - 0.76sec
> versus 1.2 for bzr.dev.
> So it should be good for files/projects with a small number of
> revisions - bzr.dev NEWS being a prime example of the other end of the
> spectrum.
>
>> I'll see what we can do to make correctness not cost so much. (And get
>> everything well tested).
>
> Any luck with that?
Actually, yeah. I factored out the important function you had into a
separate function that could be tested.
The branch is available from:
https://code.launchpad.net/~jameinel/bzr/log-ancestry
or
http://bzr.arbash-meinel.com/branches/bzr/0.16-dev/log_ancestry/
in case LP hasn't updated yet.
One thing I was hoping for is to be able to not generate a set() for
every revision in the ancestry, since often the set will not change
(it is only updated when new entries are added, or there is a merge).
I was also hoping to avoid extracting the revision graph twice, with a
'tsort' in one place and a 'merge_sort' in the other.
tsort is faster than 'merge_sort' because it has less sorting to do.
However, if we already need a merge_sort, we might as well use it.
I did find one big performance improvement, which is to not generate a
new set for every revision. If nothing has changed, you can just use a
reference to the existing parent set(). It dropped "bzr log NEWS" from
22s => 8s for me.
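To make the set-sharing idea concrete, here is a minimal sketch. It is
not the code from the branch; the function name, the argument names and
the use of frozenset are all assumptions made for illustration. It walks
the revisions parents-first and only builds a new set when a revision
touches the file or a merge brings in entries the chosen parent set
lacks:

```python
EMPTY = frozenset()


def file_ancestry_sets(sorted_revs, parents, modified):
    """Map each revision to the set of its ancestors (inclusive) that
    modified the file.

    Hypothetical helper, not a bzrlib API.
    sorted_revs: revision ids in topological order, parents first.
    parents: dict of revision id -> list of parent revision ids.
    modified: set of revision ids whose own delta changed the file.
    """
    ancestry = {}
    for rev in sorted_revs:
        parent_sets = [ancestry[p] for p in parents.get(rev, ())]
        # Start from the largest parent set and share it by reference.
        current = max(parent_sets, key=len) if parent_sets else EMPTY
        for other in parent_sets:
            # Only allocate a new set if a merge adds unseen entries.
            if other is not current and not other <= current:
                current = current | other
        if rev in modified:
            # This revision touched the file, so it needs its own set.
            current = current | {rev}
        ancestry[rev] = current
    return ancestry


# Tiny example graph: a - b - c - e, with d branching from a and
# merged into e. Only a and d modify the file.
example_parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"], "e": ["c", "d"]}
example = file_ancestry_sets(["a", "b", "c", "d", "e"], example_parents, {"a", "d"})
```

In the example, b and c share a's set object, and the merge e shares d's,
so only two sets are built for five revisions. Presumably a revision
whose set differs from that of its left-hand parent is then a candidate
for display, but that part is my reading rather than something spelled
out above.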
That is enough for me to want to merge this now, since it seems to be
universally faster than the existing "bzr log foo" and it is
significantly more correct.
>
> Cheers,
> Kent.
We'll see if we can get a review in for 0.16.
One very interesting thing about this change, though, is that if you
have a revision which makes a change, and then another revision which
reverts that change, you still end up showing that file as "modified" by
the top-most revision.
Specifically, I'm seeing this:
$ ./bzr log bzr
2362 Kent Gibson 2007-04-11
merge bzr.dev
However a '--verbose' doesn't show bzr as modified.
It turns out that the merge has a rather involved ancestry:
------------------------------------------------------------
revno: 2362
merge bzr.dev
------------------------------------------------------------
revno: 2359.1.9
Update NEWS to match bzr 0.15.
------------------------------------------------------------
** revno: 2323.2.3
Merge Martins 0.15rc2 release branch.
------------------------------------------------------------
revno: 2358.2.1
prepare to release 0.15rc2
------------------------------------------------------------
revno: 2359.1.8.1.1
Update NEWS to match bzr 0.15.
------------------------------------------------------------
revno: 2334.1.5
[merge] bzr.dev 2371
------------------------------------------------------------
revno: 1551.2.49.1.40.1.22.1.42.1.31.1.39.1.4
Merge bzr.dev
------------------------------------------------------------
revno: 1551.2.49.1.40.1.22.1.42.1.31.1.21.1.54.1.12
Merge bzr.dev
------------------------------------------------------------
revno: 2359.1.31
merge 0.15 back to dev
------------------------------------------------------------
revno: 2323.2.3.1.2
(mbp) various integrated fixes for 0.15
------------------------------------------------------------
revno: 2323.2.3.1.1.1.6
merge jam's integrated changes from trunk
------------------------------------------------------------
revno: 2323.2.3.1.3
(mbp) more integrated 0.15 fixes
------------------------------------------------------------
revno: 2323.2.3.1.1.1.12
These changes still intended for 0.15 branch
------------------------------------------------------------
revno: 2164.2.10
merge bzr.dev
------------------------------------------------------------
revno: 2164.2.20
merge bzr.dev
------------------------------------------------------------
revno: 2164.2.27
Merge bzr.dev
...
And it seems that the merge of 0.15rc2 reverts the change to the bzr
version string.
Anyway, this still seems "correct" to me. What would have helped is if
"bzr log --verbose bzr" had shown the deltas for each revision, so that
I could have seen which merged revision actually modified 'bzr'.
As is, I had to manually do "bzr status -r before:X..X" until I found
one that claimed a change.
But that can wait for 0.17 :)
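In the meantime, the manual search could be scripted along these lines.
This is only a sketch: the helper names are made up, and it assumes that
"bzr status" lists each changed path on its own indented line, so the
match on the stripped line may need adjusting (for renames, say):

```python
import subprocess


def status_command(revno, bzr="bzr"):
    """Build the 'bzr status -r before:X..X' command for one revision."""
    return [bzr, "status", "-r", "before:%s..%s" % (revno, revno)]


def first_revision_touching(path, revnos, run=None):
    """Return the first revno whose status output lists path, else None.

    Hypothetical helper. 'run' takes an argv list and returns the
    command's output as text; it defaults to actually invoking bzr.
    """
    if run is None:
        def run(argv):
            return subprocess.run(argv, capture_output=True, text=True).stdout
    for revno in revnos:
        for line in run(status_command(revno)).splitlines():
            # Assumption: a changed path appears alone on an indented line.
            if line.strip() == path:
                return revno
    return None
```

Called with the revnos from the log output above, for example
first_revision_touching("bzr", ["2359.1.9", "2323.2.3", "2358.2.1"]),
it would stop at the first revision whose status claims a change to the
file.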
John
=:->