[PATCH] do binary search for -r date:....

Olaf Conradi olaf at conradi.org
Sat May 20 02:16:15 BST 2006


On Sat, May 20, 2006 at 12:58:54AM +0200, Robert Widhopf-Fenk wrote:
Content-Description: message body text
> The current implementation does a linear search to find
> the specified date, which is dead slow for a tree with
> may revisions.
> 
> The attached patch does a binary search to find it.

+1 on binary searching

-1 on this implementation

These routines are very easy to make a mistake in.

Your binary search does not work in all cases:
I added a line inside the loop to print the boundaries.

$ ./bzr log -r date:2005-06-06
l= 1  m= 837  r= 1674
l= 1  m= 418  r= 836
l= 419  m= 627  r= 836
l= 419  m= 522  r= 626
l= 523  m= 574  r= 626
l= 575  m= 600  r= 626
l= 601  m= 613  r= 626
l= 614  m= 620  r= 626
l= 621  m= 623  r= 626
l= 621  m= 621  r= 622
l= 621  m= 620  r= 620
l= 621  m= 620  r= 620
l= 621  m= 620  r= 620
l= 621  m= 620  r= 620
...loops for ever...

$ ./bzr log -r date:2005-01-01
l= 1  m= 837  r= 1674
l= 1  m= 418  r= 836
l= 1  m= 209  r= 417
l= 1  m= 104  r= 208
l= 1  m= 52  r= 103
l= 1  m= 26  r= 51
l= 1  m= 13  r= 25
l= 1  m= 6  r= 12
l= 1  m= 3  r= 5
l= 1  m= 1  r= 2
l= 1  m= 0  r= 0
l= 1  m= 0  r= -1
l= 1  m= 0  r= -1
l= 1  m= 0  r= -1
...loops for ever...

This also demonstrates why I don't like while True loops ;)

I prefer the implementation of Guillaume Pinot, which uses python's bisect:
https://lists.ubuntu.com/archives/bazaar-ng/2006q2/011432.html

When Guillaume brought up the slow searching for dates, I also made a
version of binary search which does not use bisect. I'll attach it for
reference.

Cheers
 -Olaf
 
-------------- next part --------------
=== modified file 'bzrlib/revisionspec.py'
--- bzrlib/revisionspec.py	
+++ bzrlib/revisionspec.py	
@@ -295,13 +295,22 @@
 
             dt = datetime.datetime(year=year, month=month, day=day,
                     hour=hour, minute=minute, second=second)
-        first = dt
-        for i in range(len(revs)):
-            r = branch.repository.get_revision(revs[i])
+        wanted = dt
+        left = 0
+        right = last = len(revs);
+        while left < right:
+            mid = (left + right)/2
+            r = branch.repository.get_revision(revs[mid])
             # TODO: Handle timezone.
             dt = datetime.datetime.fromtimestamp(r.timestamp)
-            if first <= dt:
-                return RevisionInfo(branch, i+1)
+            if dt == wanted:
+                return RevisionInfo(branch, mid + 1)
+            if wanted < dt:
+                right = mid
+            else:
+                left = mid + 1
+        if left == right and left != last:
+            return RevisionInfo(branch, left + 1);
         return RevisionInfo(branch, None)
 
 SPEC_TYPES.append(RevisionSpec_date)



More information about the bazaar mailing list