Rev 3887: Bring in brisbane-core 3196 in http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack3
John Arbash Meinel
john at arbash-meinel.com
Fri Mar 27 04:33:50 GMT 2009
At http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack3
------------------------------------------------------------
revno: 3887
revision-id: john at arbash-meinel.com-20090327042407-o04qojo548gekl0c
parent: john at arbash-meinel.com-20090326195751-slpkjxd39uqlewki
parent: john at arbash-meinel.com-20090327040528-88uc1za4ep2fj6gh
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: hack3
timestamp: Thu 2009-03-26 23:24:07 -0500
message:
Bring in brisbane-core 3196
modified:
bzrlib/chk_map.py chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 3869.7.28
revision-id: john at arbash-meinel.com-20090327040528-88uc1za4ep2fj6gh
parent: john at arbash-meinel.com-20090327014543-9b216wm9q4olu3ib
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Thu 2009-03-26 23:05:28 -0500
message:
Set 'combine_backing_indices=False' as the default for text and chk indices.
We may want combined indices for something like commit, according to Robert,
though a commit would have to add more than 100k new texts for it to matter,
and really more than 200k to trigger a combine. Turning it off makes a very
big difference to 'fetch' performance.
Also, set random_id=True for 'insert_record_stream'. This makes another
big win for fetch performance, though we may need to decide if it is
genuinely safe.
modified:
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 3869.7.27
revision-id: john at arbash-meinel.com-20090327014543-9b216wm9q4olu3ib
parent: john at arbash-meinel.com-20090326201840-ddb2uqof335ysvnu
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Thu 2009-03-26 20:45:43 -0500
message:
fix a bug in iter_interesting_nodes.
If one of your CHK roots is a leaf node, it can be transmitted
twice when, after a split, one of the child nodes ends up with
the same content.
Needs tests, though.
modified:
bzrlib/chk_map.py chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
------------------------------------------------------------
revno: 3869.7.26
revision-id: john at arbash-meinel.com-20090326201840-ddb2uqof335ysvnu
parent: john at arbash-meinel.com-20090326195952-w0qea66iw597ipza
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Thu 2009-03-26 15:18:40 -0500
message:
max() shows up under lsprof as more expensive than creating an object.
timeit also says if x < y is faster than y = max(x, y).
Small win, but I'll take it.
modified:
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
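
The max() versus comparison claim in revno 3869.7.26 can be checked with a small timeit sketch like the one below. The class name Tracker, the helper run(), and the loop sizes are assumptions made for illustration and are not bzrlib code. Absolute timings depend on the interpreter and machine, so none are quoted here.

```python
import timeit


class Tracker(object):
    """Hypothetical stand-in for an object tracking the largest end offset."""

    def __init__(self):
        self._last_byte = 0

    def update_with_max(self, end):
        # Function-call form: always calls max() and always assigns.
        self._last_byte = max(end, self._last_byte)

    def update_with_compare(self, end):
        # Comparison form: assigns only when the new value is larger.
        if end > self._last_byte:
            self._last_byte = end


def run(method_name, count=100000):
    """Feed `count` increasing offsets through one update method."""
    tracker = Tracker()
    update = getattr(tracker, method_name)
    for end in range(count):
        update(end)
    return tracker._last_byte


if __name__ == '__main__':
    for name in ('update_with_max', 'update_with_compare'):
        seconds = timeit.timeit(lambda: run(name), number=10)
        print('%s: %.3fs' % (name, seconds))
```

Both methods leave the same value in _last_byte, so the only difference being measured is the cost of the function call against the inline comparison.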
-------------- next part --------------
=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py 2009-03-26 19:13:04 +0000
+++ b/bzrlib/chk_map.py 2009-03-27 01:45:43 +0000
@@ -1418,6 +1418,14 @@
if records or interesting_items:
yield records, interesting_items
interesting_keys.difference_update(all_uninteresting_chks)
+ # TODO: We need a test for this
+ # This handles the case where after a split, one of the child trees
+ # is identical to one of the interesting root keys. Like if you had a
+ # leaf node, with "aa" "ab", that then overflowed at "bb". You would
+ # get a new internal node, but it would have one leaf node with
+ # ("aa", "ab") and another leaf node with "bb". And you don't want to
+ # re-transmit that ("aa", "ab") node again
+ all_uninteresting_chks.update(interesting_root_keys)
chks_to_read = interesting_keys
counter = 0
=== modified file 'bzrlib/groupcompress.py'
--- a/bzrlib/groupcompress.py 2009-03-26 19:57:51 +0000
+++ b/bzrlib/groupcompress.py 2009-03-27 04:24:07 +0000
@@ -538,7 +538,11 @@
# Note that this creates a reference cycle....
factory = _LazyGroupCompressFactory(key, parents, self,
start, end, first=first)
- self._last_byte = max(end, self._last_byte)
+ # max() works here, but as a function call, doing a compare seems to be
+ # significantly faster, timeit says 250ms for max() and 100ms for the
+ # comparison
+ if end > self._last_byte:
+ self._last_byte = end
self._factories.append(factory)
def get_record_stream(self):
@@ -1399,7 +1403,7 @@
:return: None
:seealso VersionedFiles.get_record_stream:
"""
- for _ in self._insert_record_stream(stream):
+ for _ in self._insert_record_stream(stream, random_id=True):
pass
def _insert_record_stream(self, stream, random_id=False, nostore_sha=None,
@@ -1456,10 +1460,18 @@
insert_manager = None
block_start = None
block_length = None
+ # XXX: TODO: remove this, it is just for safety checking for now
+ inserted_keys = set()
for record in stream:
# Raise an error when a record is missing.
if record.storage_kind == 'absent':
raise errors.RevisionNotPresent(record.key, self)
+ if random_id:
+ if record.key in inserted_keys:
+ trace.note('Insert claimed random_id=True, but then inserted'
+ ' %r two times', record.key)
+ continue
+ inserted_keys.add(record.key)
if reuse_blocks:
# If the reuse_blocks flag is set, check to see if we can just
# copy a groupcompress block as-is.
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py 2009-03-26 19:59:52 +0000
+++ b/bzrlib/repofmt/pack_repo.py 2009-03-27 04:05:28 +0000
@@ -2000,12 +2000,14 @@
self._new_pack)
self.text_index.add_writable_index(self._new_pack.text_index,
self._new_pack)
+ self._new_pack.text_index.set_optimize(combine_backing_indices=False)
self.signature_index.add_writable_index(self._new_pack.signature_index,
self._new_pack)
if self.chk_index is not None:
self.chk_index.add_writable_index(self._new_pack.chk_index,
self._new_pack)
self.repo.chk_bytes._index._add_callback = self.chk_index.add_callback
+ self._new_pack.chk_index.set_optimize(combine_backing_indices=False)
self.repo.inventories._index._add_callback = self.inventory_index.add_callback
self.repo.revisions._index._add_callback = self.revision_index.add_callback
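
The safety check added to _insert_record_stream above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the bzrlib implementation: the function name drop_repeated_keys, the (key, payload) tuple shape, and the note callback are all hypothetical, standing in for the record objects and trace.note used in the patch.

```python
def drop_repeated_keys(stream, random_id=True, note=None):
    """Yield (key, payload) pairs, skipping repeated keys when random_id is set.

    Hypothetical helper: when the caller claims keys are unique
    (random_id=True), remember every key seen and skip any repeat
    instead of inserting it a second time.
    """
    inserted_keys = set()
    for key, payload in stream:
        if random_id:
            if key in inserted_keys:
                if note is not None:
                    note('Insert claimed random_id=True, but then inserted'
                         ' %r two times' % (key,))
                continue
            inserted_keys.add(key)
        yield key, payload


if __name__ == '__main__':
    messages = []
    records = [(('file-a',), 'text 1'),
               (('file-b',), 'text 2'),
               (('file-a',), 'text 3')]
    for key, payload in drop_repeated_keys(records, note=messages.append):
        print(key, payload)
    print(messages)
```

The set grows with every key in the stream, which matches the patch's own comment that the check is there for safety for now and is meant to be removed.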