Rev 5850: With profiling, now I run _sort_entries multiple times to get timing info. in http://bazaar.launchpad.net/~jameinel/bzr/2.4-uncommit-faster
John Arbash Meinel
john at arbash-meinel.com
Tue May 10 14:57:53 UTC 2011
At http://bazaar.launchpad.net/~jameinel/bzr/2.4-uncommit-faster
------------------------------------------------------------
revno: 5850
revision-id: john at arbash-meinel.com-20110510145746-tg3vtkmgiofm0xhb
parent: john at arbash-meinel.com-20110510142027-jvdwzqeq5yr2zqmm
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.4-uncommit-faster
timestamp: Tue 2011-05-10 16:57:46 +0200
message:
With profiling, now I run _sort_entries multiple times to get timing info.
Using StaticTuple as the sort-key object has a surprising benefit: the sort
drops from ~700ms down to 400ms. My guess is that the 65k tuples were causing
us to hit a minor gc run, and using StaticTuple instead avoids it.
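
One way to probe the gc guess is to time the same sort with the collector enabled and disabled. The following is only a sketch under stated assumptions: it uses plain tuples rather than bzrlib's StaticTuple, time.perf_counter rather than time.clock (which no longer exists in current Python), and the names make_entries, sort_entries and best_time are hypothetical helpers, not part of bzrlib. It will not reproduce the numbers quoted above.

```python
import gc
import time


def make_entries(n_dirs=650, files_per_dir=100):
    """Build hypothetical ((dirpath, fname, file_id), details) entries."""
    return [
        (("dir%03d/sub" % d, "file%03d" % f, "id-%d-%d" % (d, f)), None)
        for d in range(n_dirs)
        for f in range(files_per_dir)
    ]


def sort_entries(entry_list):
    """Sort by directory parts, file name, file id, caching each split."""
    split_dirs = {}

    def _key(entry):
        dirpath, fname, file_id = entry[0]
        try:
            split = split_dirs[dirpath]
        except KeyError:
            # Plain tuple stands in for StaticTuple.from_sequence here.
            split = split_dirs[dirpath] = tuple(dirpath.split("/"))
        return (split, fname, file_id)

    return sorted(entry_list, key=_key)


def best_time(func, arg, repeats=3, disable_gc=False):
    """Return the fastest of `repeats` wall-clock timings of func(arg)."""
    was_enabled = gc.isenabled()
    if disable_gc:
        gc.disable()
    try:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            func(arg)
            timings.append(time.perf_counter() - start)
        return min(timings)
    finally:
        # Restore the collector to the state it was in on entry.
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    entries = make_entries()  # roughly 65k entries
    print("gc on : %.3fs" % best_time(sort_entries, entries))
    print("gc off: %.3fs" % best_time(sort_entries, entries, disable_gc=True))
```

If the two timings differ noticeably, that is at least consistent with collector runs contributing to the cost of building the key tuples; if they do not, the difference probably comes from somewhere else.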
-------------- next part --------------
=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py 2011-05-10 14:20:27 +0000
+++ b/bzrlib/dirstate.py 2011-05-10 14:57:46 +0000
@@ -2611,6 +2611,11 @@
         # --- end generation of full tree mappings
         # sort and output all the entries
+        items = by_path.items()
+        self._sort_entries(items)
+        self._sort_entries(items)
+        self._sort_entries(items)
+
         new_entries = self._sort_entries(by_path.items())
         self._entries_to_current_state(new_entries)
         self._parents = [rev_id for rev_id, tree in trees]
@@ -2626,16 +2631,22 @@
         it's easier to sort after the fact.
         """
         split_dirs = {}
-        def _key(entry, _split_dirs=split_dirs):
+        stats = [0, 0]
+        def _key(entry, _split_dirs=split_dirs, st=static_tuple.StaticTuple,
+                 as_st=static_tuple.StaticTuple.from_sequence):
             # sort by: directory parts, file name, file id
             dirpath, fname, file_id = entry[0]
             try:
                 split = _split_dirs[dirpath]
             except KeyError:
-                split = dirpath.split('/')
+                split = as_st(dirpath.split('/'))
                 _split_dirs[dirpath] = split
-            return (split, fname, file_id)
-        return sorted(entry_list, key=_key)
+            return st(split, fname, file_id)
+        t = time.clock()
+        sort_vals = sorted(entry_list, key=_key)
+        t = time.clock() - t
+        trace.note('%.3fs Hit %d, miss %d' % (t, stats[0], stats[1]))
+        return sort_vals
     def set_state_from_inventory(self, new_inv):
         """Set new_inv as the current state.
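
The trace.note call in the diff reports stats[0] and stats[1] as hit and miss counts, but nothing in the hunk as shown updates them, so the note would print zeros as written. A minimal sketch of how the split-directory cache hits and misses could be tallied, assuming plain tuples in place of StaticTuple and a hypothetical function name sort_entries_with_stats:

```python
def sort_entries_with_stats(entry_list):
    """Sort entries and count how often the split-directory cache is hit.

    Returns (sorted_entries, (hits, misses)).
    """
    split_dirs = {}
    stats = [0, 0]  # [hits, misses]

    def _key(entry):
        # sort by: directory parts, file name, file id
        dirpath, fname, file_id = entry[0]
        try:
            split = split_dirs[dirpath]
        except KeyError:
            # First time this directory is seen: split it and cache it.
            split = split_dirs[dirpath] = tuple(dirpath.split("/"))
            stats[1] += 1
        else:
            stats[0] += 1
        return (split, fname, file_id)

    return sorted(entry_list, key=_key), tuple(stats)
```

Since sorted() calls the key function once per entry, the miss count equals the number of distinct directories and the hit count is the remainder.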