Rev 5850: With profiling, now I run _sort_entries multiple times to get timing info. in http://bazaar.launchpad.net/~jameinel/bzr/2.4-uncommit-faster

John Arbash Meinel john at arbash-meinel.com
Tue May 10 14:57:53 UTC 2011


At http://bazaar.launchpad.net/~jameinel/bzr/2.4-uncommit-faster

------------------------------------------------------------
revno: 5850
revision-id: john at arbash-meinel.com-20110510145746-tg3vtkmgiofm0xhb
parent: john at arbash-meinel.com-20110510142027-jvdwzqeq5yr2zqmm
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.4-uncommit-faster
timestamp: Tue 2011-05-10 16:57:46 +0200
message:
  With profiling, now I run _sort_entries multiple times to get timing info.
  Using StaticTuple as the sort-key object has a surprising benefit: it drops
  us from ~700ms down to 400ms. My guess is that the 65k tuples were causing
  us to hit a minor gc run, and using StaticTuple instead avoids it.
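
One rough way to probe the gc guess above without bzrlib is to time the same
sort with the cyclic collector switched on and then off. The sketch below is
only an illustration under stated assumptions: it is written for Python 3 (so
time.perf_counter stands in for time.clock), it uses plain tuples rather than
StaticTuple, and the names sort_entries, time_sort and the generated entries
are hypothetical stand-ins for the dirstate data, not code from this branch.

```python
import gc
import time


def sort_entries(entries):
    """Sort entries keyed by (dirpath, fname, file_id).

    Order is: directory parts, then file name, then file id.
    """
    split_dirs = {}  # cache: dirpath -> tuple of path components

    def _key(entry):
        dirpath, fname, file_id = entry[0]
        try:
            split = split_dirs[dirpath]
        except KeyError:
            split = split_dirs[dirpath] = tuple(dirpath.split('/'))
        return (split, fname, file_id)

    return sorted(entries, key=_key)


def time_sort(entries, use_gc):
    """Return (seconds, result) for one sort with the cyclic GC on or off."""
    was_enabled = gc.isenabled()
    if use_gc:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        result = sort_entries(entries)
        return time.perf_counter() - start, result
    finally:
        # Restore whatever GC state the caller had.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()


# Hypothetical data: 65k entries shaped like ((dirpath, fname, file_id), details).
entries = [
    (('dir%d/sub%d' % (i % 50, i % 7), 'f%d' % i, 'id-%d' % i), None)
    for i in range(65000)
]

t_on, r_on = time_sort(entries, use_gc=True)
t_off, r_off = time_sort(entries, use_gc=False)
print('gc on: %.3fs, gc off: %.3fs' % (t_on, t_off))
```

If the gap between the two timings is comparable to the saving reported above,
that would be consistent with the guess, though it would not prove it. Results
will vary with the interpreter version and its collector thresholds.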
-------------- next part --------------
=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py	2011-05-10 14:20:27 +0000
+++ b/bzrlib/dirstate.py	2011-05-10 14:57:46 +0000
@@ -2611,6 +2611,11 @@
         # --- end generation of full tree mappings
 
         # sort and output all the entries
+        items = by_path.items()
+        self._sort_entries(items)
+        self._sort_entries(items)
+        self._sort_entries(items)
+
         new_entries = self._sort_entries(by_path.items())
         self._entries_to_current_state(new_entries)
         self._parents = [rev_id for rev_id, tree in trees]
@@ -2626,16 +2631,22 @@
         it's easier to sort after the fact.
         """
         split_dirs = {}
-        def _key(entry, _split_dirs=split_dirs):
+        stats = [0, 0]
+        def _key(entry, _split_dirs=split_dirs, st=static_tuple.StaticTuple,
+                 as_st=static_tuple.StaticTuple.from_sequence):
             # sort by: directory parts, file name, file id
             dirpath, fname, file_id = entry[0]
             try:
                 split = _split_dirs[dirpath]
             except KeyError:
-                split = dirpath.split('/')
+                split = as_st(dirpath.split('/'))
                 _split_dirs[dirpath] = split
-            return (split, fname, file_id)
-        return sorted(entry_list, key=_key)
+            return st(split, fname, file_id)
+        t = time.clock()
+        sort_vals = sorted(entry_list, key=_key)
+        t = time.clock() - t
+        trace.note('%.3fs Hit %d, miss %d' % (t, stats[0], stats[1]))
+        return sort_vals
 
     def set_state_from_inventory(self, new_inv):
         """Set new_inv as the current state.


