Rev 2528: Optimize the simple case that the strings are the same object. in http://bzr.arbash-meinel.com/branches/bzr/0.17-dev/dirstate_pyrex

John Arbash Meinel john at arbash-meinel.com
Tue May 8 00:01:00 BST 2007


At http://bzr.arbash-meinel.com/branches/bzr/0.17-dev/dirstate_pyrex

------------------------------------------------------------
revno: 2528
revision-id: john at arbash-meinel.com-20070507230047-53ozoz7og6n2j24i
parent: john at arbash-meinel.com-20070507221117-l6pjpggfs9p2dtwy
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Mon 2007-05-07 18:00:47 -0500
message:
  Optimize the simple case that the strings are the same object.
  Add some TODO statements that we might consider.
modified:
  bzrlib/compiled/dirstate_helpers.pyx dirstate_helpers.pyx-20070503201057-u425eni465q4idwn-3
-------------- next part --------------
=== modified file 'bzrlib/compiled/dirstate_helpers.pyx'
--- a/bzrlib/compiled/dirstate_helpers.pyx	2007-05-07 22:11:17 +0000
+++ b/bzrlib/compiled/dirstate_helpers.pyx	2007-05-07 23:00:47 +0000
@@ -91,6 +91,9 @@
     cdef int *end_int1
     cdef int *end_int2
 
+    if path1 == path2:
+        return 0
+
     cur_int1 = <int*>path1
     cur_int2 = <int*>path2
     end_int1 = <int*>(path1 + size1 - (size1%4))
@@ -101,6 +104,9 @@
     # Use 32-bit comparisons for the matching portion of the string.
     # Almost all CPU's are faster at loading and comparing 32-bit integers,
     # than they are at 8-bit integers.
+    # TODO: jam 2007-05-07 Do we need to change this so we always start at an
+    #       integer offset in memory? I seem to remember that being done in
+    #       some C libraries for strcmp()
     while cur_int1 < end_int1 and cur_int2 < end_int2:
         if cur_int1[0] != cur_int2[0]:
             break
@@ -348,6 +354,13 @@
         new_block = 0
         entry_count = 0
 
+        # TODO: jam 2007-05-07 Consider pre-allocating some space for the
+        #       members, and then growing and shrinking from there. If most
+        #       directories have close to 10 entries in them, it would save a
+        #       few mallocs if we default our list size to something
+        #       reasonable. Or we could malloc it to something large (100 or
+        #       so), and then truncate. That would give us a malloc + realloc,
+        #       rather than lots of reallocs.
         while self.cur < self.end_str:
             entry = self._get_entry(num_trees, &current_dirname, &new_block)
             if new_block:
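
The first two hunks can be illustrated in plain C. This is only a sketch under stated assumptions: `cmp_bytes_sketch` is a hypothetical name, it does a plain byte-wise ordering rather than the per-directory ordering of the real helper, and it also checks the lengths before taking the same-pointer shortcut, which the patch itself does not do. It shows the pointer-identity early exit and one way to sidestep the alignment question raised in the TODO: loading each 32-bit word through memcpy() is well defined at any address, and compilers typically turn it into a single load where the hardware permits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the bzrlib function.
 * Returns <0, 0 or >0; a string that is a proper prefix sorts first. */
static int cmp_bytes_sketch(const char *p1, size_t n1,
                            const char *p2, size_t n2)
{
    size_t min_len = n1 < n2 ? n1 : n2;
    size_t i = 0;

    /* Same buffer and same length: equal without reading any bytes. */
    if (p1 == p2 && n1 == n2)
        return 0;

    /* Skip the matching prefix one 32-bit word at a time.
     * memcpy() avoids undefined behaviour on unaligned addresses. */
    for (; i + sizeof(uint32_t) <= min_len; i += sizeof(uint32_t)) {
        uint32_t w1, w2;
        memcpy(&w1, p1 + i, sizeof w1);
        memcpy(&w2, p2 + i, sizeof w2);
        if (w1 != w2)
            break;
    }

    /* Finish byte by byte, starting at the first word that differed. */
    for (; i < min_len; i++) {
        unsigned char c1 = (unsigned char)p1[i];
        unsigned char c2 = (unsigned char)p2[i];
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }

    /* Common prefix is identical: the shorter string sorts first. */
    if (n1 == n2)
        return 0;
    return n1 < n2 ? -1 : 1;
}
```

The word loop is only used to find where the strings diverge; the ordering decision is always made on unsigned bytes, so the result does not depend on the machine's byte order.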

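The pre-allocation idea in the last TODO can be sketched the same way. Everything here is an assumption made for illustration: `entry_list`, its functions and the initial capacity of 16 are hypothetical, and a plain C array of pointers stands in for whatever container the real code fills. The point is only the allocation pattern: one malloc() up front, doubling on the rare overflow, and a single realloc() at the end to give back the unused tail.

```c
#include <stdlib.h>

/* Hypothetical growable array of entry pointers. */
typedef struct {
    void **items;
    size_t count;
    size_t capacity;
} entry_list;

enum { INITIAL_CAPACITY = 16 };  /* assumed default, not a measured value */

/* Allocate the initial block. Returns 0 on success, -1 on failure. */
static int entry_list_init(entry_list *list)
{
    list->items = malloc(INITIAL_CAPACITY * sizeof *list->items);
    list->count = 0;
    list->capacity = list->items ? INITIAL_CAPACITY : 0;
    return list->items ? 0 : -1;
}

/* Append one item, doubling the capacity only when the block is full. */
static int entry_list_append(entry_list *list, void *item)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2
                                        : INITIAL_CAPACITY;
        void **grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;  /* the old block is still valid */
        list->items = grown;
        list->capacity = new_cap;
    }
    list->items[list->count++] = item;
    return 0;
}

/* Truncate to the exact size. If the shrink fails, keep the old block. */
static void entry_list_shrink(entry_list *list)
{
    void **shrunk;

    if (list->count == 0 || list->count == list->capacity)
        return;
    shrunk = realloc(list->items, list->count * sizeof *shrunk);
    if (shrunk != NULL) {
        list->items = shrunk;
        list->capacity = list->count;
    }
}
```

With a default of 16, a directory of about 10 entries needs exactly one allocation plus the final truncation, which is the saving the TODO is after.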