Rev 3878: Bring in brisbane-core 3895 in http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack3
John Arbash Meinel
john at arbash-meinel.com
Fri Mar 20 15:56:15 GMT 2009
At http://bzr.arbash-meinel.com/branches/bzr/brisbane/hack3
------------------------------------------------------------
revno: 3878
revision-id: john at arbash-meinel.com-20090320154811-znms4757w29gmc4b
parent: john at arbash-meinel.com-20090319194720-4esxj7gnrmfaykww
parent: john at arbash-meinel.com-20090320155300-2qdojs8r4loamvmw
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: hack3
timestamp: Fri 2009-03-20 10:48:11 -0500
message:
Bring in brisbane-core 3895
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/chk_map.py chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
bzrlib/diff-delta.c diffdelta.c-20090226042143-l9wzxynyuxnb5hus-1
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/inventory.py inventory.py-20050309040759-6648b84ca2005b37
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/tests/inventory_implementations/basics.py basics.py-20070903044446-kdjwbiu1p1zi9phs-1
bzrlib/tests/test_groupcompress.py test_groupcompress.p-20080705181503-ccbxd6xuy1bdnrpu-13
------------------------------------------------------------
revno: 3869.7.7
revision-id: john at arbash-meinel.com-20090320155300-2qdojs8r4loamvmw
parent: john at arbash-meinel.com-20090320154310-q5ye037radsy052j
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Fri 2009-03-20 10:53:00 -0500
message:
Remove an isinstance(..., tuple) assertion.
According to lsprof it was actually a bit expensive, and didn't help much anyway.
modified:
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
------------------------------------------------------------
revno: 3869.7.6
revision-id: john at arbash-meinel.com-20090320154310-q5ye037radsy052j
parent: john at arbash-meinel.com-20090320032107-bm9wg421rtcacy5i
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Fri 2009-03-20 10:43:10 -0500
message:
Remove support for passing None for end in GroupCompressBlock.extract.
I decided that removing the extra int from the wire-bytes and indices was not a
worthwhile trade-off versus the ability to _prepare_for_extract and cheaply
filter bytes during fetch. And it makes the code simpler/easier to maintain.
Also, add support for an 'empty content' record, which has start=end=0.
Supporting it costs very little, and simplifies things.
And now GroupCompressBlock.extract() just returns the bytes. It doesn't try to
sha the content, nor does it return a GCBEntry. We weren't using it anyway.
And it can save ~50 seconds of sha-ing all the content during 'bzr pack' of
a launchpad branch.
modified:
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/tests/test_groupcompress.py test_groupcompress.p-20080705181503-ccbxd6xuy1bdnrpu-13
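The conventions this message describes (a mandatory end offset, and start=end=0
meaning empty content) can be sketched in Python. extract_bytes below is a
hypothetical helper illustrating the convention, not bzrlib's actual API:

```python
# Hypothetical sketch of the extract() conventions above: callers must
# always pass a concrete end offset, and start == end == 0 denotes the
# 'empty content' record.  Not bzrlib's real API.
def extract_bytes(content, start, end):
    """Return the raw bytes for a record spanning content[start:end)."""
    if start == end == 0:
        return b''  # the 'empty content' record
    if end is None:
        raise ValueError('passing end=None is no longer supported')
    return content[start:end]
```

Returning plain bytes (rather than an (entry, bytes) pair) matches the
simplification in this revision, which also drops the per-record sha1
computation.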
------------------------------------------------------------
revno: 3869.7.5
revision-id: john at arbash-meinel.com-20090320032107-bm9wg421rtcacy5i
parent: john at arbash-meinel.com-20090320031652-jjy97n2zsjq1ouxp
parent: john at arbash-meinel.com-20090319233050-tf8ah6zasmeaetr0
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Thu 2009-03-19 22:21:07 -0500
message:
Merge the updates to the groupcompress DeltaIndex.
modified:
bzrlib/delta.h delta.h-20090227173129-qsu3u43vowf1q3ay-1
bzrlib/diff-delta.c diffdelta.c-20090226042143-l9wzxynyuxnb5hus-1
bzrlib/tests/test__groupcompress_pyx.py test__groupcompress_-20080724145854-koifwb7749cfzrvj-1
------------------------------------------------------------
revno: 3869.8.1
revision-id: john at arbash-meinel.com-20090319233050-tf8ah6zasmeaetr0
parent: john at arbash-meinel.com-20090319145132-e7eu3p75btuidhu2
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: gc_delta_index_room
timestamp: Thu 2009-03-19 18:30:50 -0500
message:
*grow* the local hmask if it is smaller than expected, don't *shrink* it.
modified:
bzrlib/diff-delta.c diffdelta.c-20090226042143-l9wzxynyuxnb5hus-1
------------------------------------------------------------
revno: 3869.7.4
revision-id: john at arbash-meinel.com-20090320031652-jjy97n2zsjq1ouxp
parent: ian.clatworthy at canonical.com-20090320015656-xrypfxtcwk0poi4z
parent: john at arbash-meinel.com-20090319203157-h1b6rtdqm3wjjgli
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: brisbane-core
timestamp: Thu 2009-03-19 22:16:52 -0500
message:
Merge the _LazyGroupContentManager, et al.
This allows us to stream GroupCompressBlocks in their compressed form, and unpack them
during insert, rather than during get().
modified:
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/tests/__init__.py selftest.py-20050531073622-8d0e3c8845c97a64
bzrlib/tests/test_groupcompress.py test_groupcompress.p-20080705181503-ccbxd6xuy1bdnrpu-13
bzrlib/tests/test_versionedfile.py test_versionedfile.py-20060222045249-db45c9ed14a1c2e5
bzrlib/versionedfile.py versionedfile.py-20060222045106-5039c71ee3b65490
------------------------------------------------------------
revno: 3869.6.28
revision-id: john at arbash-meinel.com-20090319203157-h1b6rtdqm3wjjgli
parent: john at arbash-meinel.com-20090319030602-stjxub1g3yhq0u32
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: lazy_gc_stream
timestamp: Thu 2009-03-19 15:31:57 -0500
message:
We can use 'random_id=True' when copying the streams.
This is because the 'get_stream' code is responsible for ensuring
the keys are truly non-overlapping, and we know we are creating a
new pack file.
It might mean that we have some overlap with yet-another existing
pack file, but only if some other operation inserted it accidentally,
and that doesn't hurt anything. When we autopack or fetch, we will
skip one of those records anyway.
This saves quite a bit of time, since we don't have to look up
texts in the index we are writing. It matters mostly for large
projects, where we have already spilled some of the index nodes
to disk.
modified:
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
------------------------------------------------------------
revno: 3869.7.3
revision-id: ian.clatworthy at canonical.com-20090320015656-xrypfxtcwk0poi4z
parent: ian.clatworthy at canonical.com-20090319193106-4bwt29ovr1b710ky
committer: Ian Clatworthy <ian.clatworthy at canonical.com>
branch nick: brisbane-core
timestamp: Fri 2009-03-20 11:56:56 +1000
message:
Inventory.iter_just_entries() API & test
modified:
bzrlib/inventory.py inventory.py-20050309040759-6648b84ca2005b37
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/tests/inventory_implementations/basics.py basics.py-20070903044446-kdjwbiu1p1zi9phs-1
------------------------------------------------------------
revno: 3869.7.2
revision-id: ian.clatworthy at canonical.com-20090319193106-4bwt29ovr1b710ky
parent: ian.clatworthy at canonical.com-20090318095149-y903o2ecqqcslikf
committer: Ian Clatworthy <ian.clatworthy at canonical.com>
branch nick: brisbane-core
timestamp: Fri 2009-03-20 05:31:06 +1000
message:
fix chk_map Node %r formatting
modified:
bzrlib/chk_map.py chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
------------------------------------------------------------
revno: 3869.7.1
revision-id: ian.clatworthy at canonical.com-20090318095149-y903o2ecqqcslikf
parent: john at arbash-meinel.com-20090317201340-amjnj1wl78iwcxae
committer: Ian Clatworthy <ian.clatworthy at canonical.com>
branch nick: brisbane-core
timestamp: Wed 2009-03-18 19:51:49 +1000
message:
fix add's interaction with filtered views
modified:
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
-------------- next part --------------
=== modified file 'bzrlib/builtins.py'
--- a/bzrlib/builtins.py 2009-03-17 20:13:40 +0000
+++ b/bzrlib/builtins.py 2009-03-18 09:51:49 +0000
@@ -83,9 +83,10 @@
tree = WorkingTree.open_containing(file_list[0])[0]
if tree.supports_views():
view_files = tree.views.lookup_view()
- for filename in file_list:
- if not osutils.is_inside_any(view_files, filename):
- raise errors.FileOutsideView(filename, view_files)
+ if view_files:
+ for filename in file_list:
+ if not osutils.is_inside_any(view_files, filename):
+ raise errors.FileOutsideView(filename, view_files)
else:
tree = WorkingTree.open_containing(u'.')[0]
if tree.supports_views():
=== modified file 'bzrlib/chk_map.py'
--- a/bzrlib/chk_map.py 2009-03-12 07:03:10 +0000
+++ b/bzrlib/chk_map.py 2009-03-19 19:31:06 +0000
@@ -523,7 +523,7 @@
def __repr__(self):
items_str = str(sorted(self._items))
if len(items_str) > 20:
- items_str = items_str[16] + '...]'
+ items_str = items_str[:16] + '...]'
return '%s(key:%s len:%s size:%s max:%s prefix:%s items:%s)' % (
self.__class__.__name__, self._key, self._len, self._raw_size,
self._maximum_size, self._search_prefix, items_str)
@@ -607,9 +607,9 @@
self._search_key_func = search_key_func
def __repr__(self):
- items_str = sorted(self._items)
+ items_str = str(sorted(self._items))
if len(items_str) > 20:
- items_str = items_str[16] + '...]'
+ items_str = items_str[:16] + '...]'
return \
'%s(key:%s len:%s size:%s max:%s prefix:%s keywidth:%s items:%s)' \
% (self.__class__.__name__, self._key, self._len, self._raw_size,
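The chk_map fix above is a one-character change from indexing to slicing; a
minimal illustration of why it matters:

```python
# str[16] is a single character, while str[:16] is the first 16
# characters -- the old code glued one stray character onto '...]'.
items_str = "[('a', 1), ('b', 2), ('c', 3)]"
truncated = items_str[:16] + '...]'  # fixed behaviour: real truncation
wrong = items_str[16] + '...]'       # old bug: one character + '...]'
assert truncated == "[('a', 1), ('b',...]"
assert wrong == " ...]"
```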
=== modified file 'bzrlib/diff-delta.c'
--- a/bzrlib/diff-delta.c 2009-03-19 14:51:32 +0000
+++ b/bzrlib/diff-delta.c 2009-03-19 23:30:50 +0000
@@ -393,7 +393,7 @@
for (i = 4; (1u << i) < hsize && i < 31; i++);
hsize = 1 << i;
hmask = hsize - 1;
- if (old && old->hash_mask < hmask) {
+ if (old && old->hash_mask > hmask) {
hmask = old->hash_mask;
hsize = hmask + 1;
}
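The diff-delta.c hunk above flips a comparison so that a hash mask inherited
from an old index is only ever grown. A rough Python translation of that
sizing logic (a sketch of the C code's behaviour, not an exact port):

```python
def pick_hash_mask(hsize, old_mask=None):
    # Find the smallest power of two >= hsize (capped at 2**31), as the
    # C for-loop does, then derive the mask from it.
    i = 4
    while (1 << i) < hsize and i < 31:
        i += 1
    hsize = 1 << i
    hmask = hsize - 1
    # The fix: if the old index used a *larger* mask, grow to match it;
    # the previous '<' comparison wrongly shrank it instead.
    if old_mask is not None and old_mask > hmask:
        hmask = old_mask
    return hmask
```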
=== modified file 'bzrlib/groupcompress.py'
--- a/bzrlib/groupcompress.py 2009-03-19 18:38:49 +0000
+++ b/bzrlib/groupcompress.py 2009-03-20 15:48:11 +0000
@@ -339,20 +339,11 @@
:param sha1: TODO (should we validate only when sha1 is supplied?)
:return: The bytes for the content
"""
- # Make sure we have enough bytes for this record
- # TODO: if we didn't want to track the end of this entry, we could
- # _ensure_content(start+enough_bytes_for_type_and_length), and
- # then decode the entry length, and
- # _ensure_content(start+1+length)
- # It is 2 calls to _ensure_content(), but we always buffer a bit
- # extra anyway, and it means 1 less offset stored in the index,
- # and transmitted over the wire
- if end is None:
- # it takes 5 bytes to encode 2^32, so we need 1 byte to hold the
- # 'f' or 'd' declaration, and then 5 more for the record length.
- self._ensure_content(start + 6)
- else:
- self._ensure_content(end)
+ # Handle the 'Empty Content' record, even if we don't always write it
+ # yet.
+ if start == end == 0:
+ return ''
+ self._ensure_content(end)
# The bytes are 'f' or 'd' for the type, then a variable-length
# base128 integer for the content size, then the actual content
# We know that the variable-length integer won't be longer than 5
@@ -368,23 +359,15 @@
content_len, len_len = decode_base128_int(
self._content[start + 1:start + 6])
content_start = start + 1 + len_len
- if end is None:
- end = content_start + content_len
- self._ensure_content(end)
- else:
- if end != content_start + content_len:
- raise ValueError('end != len according to field header'
- ' %s != %s' % (end, content_start + content_len))
- entry = GroupCompressBlockEntry(key, type, sha1=None,
- start=start, length=end-start)
+ if end != content_start + content_len:
+ raise ValueError('end != len according to field header'
+ ' %s != %s' % (end, content_start + content_len))
content = self._content[content_start:end]
if c == 'f':
bytes = content
elif c == 'd':
bytes = _groupcompress_pyx.apply_delta(self._content, content)
- # if entry.sha1 is None:
- # entry.sha1 = sha_string(bytes)
- return entry, bytes
+ return bytes
def add_entry(self, key, type, sha1, start, length):
"""Add new meta info about an entry.
@@ -515,7 +498,7 @@
if storage_kind in ('fulltext', 'chunked'):
self._manager._prepare_for_extract()
block = self._manager._block
- _, bytes = block.extract(self.key, self._start, self._end)
+ bytes = block.extract(self.key, self._start, self._end)
if storage_kind == 'fulltext':
return bytes
else:
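The extract() hunk above relies on the record layout: a type byte ('f' for
fulltext, 'd' for delta), then a base128 length, then the content. Below is a
hedged sketch of parsing that header; decode_base128_int here is my stand-in
for bzrlib's helper, assuming the usual little-endian varint with a
continuation high bit:

```python
def decode_base128_int(data):
    # Little-endian base-128: 7 payload bits per byte, high bit set on
    # all but the final byte.  Returns (value, bytes_consumed).
    result = shift = offset = 0
    while True:
        byte = ord(data[offset:offset + 1])
        result |= (byte & 0x7F) << shift
        offset += 1
        if not byte & 0x80:
            return result, offset
        shift += 7

def parse_record_header(block, start):
    # A base128 int for 2**32 needs at most 5 bytes, hence start + 6.
    kind = block[start:start + 1]  # b'f' (fulltext) or b'd' (delta)
    length, len_len = decode_base128_int(block[start + 1:start + 6])
    content_start = start + 1 + len_len
    return kind, content_start, content_start + length
```

With this layout, the "end != len according to field header" check in the
diff amounts to requiring end == content_start + length.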
=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py 2009-03-17 20:13:40 +0000
+++ b/bzrlib/inventory.py 2009-03-20 01:56:56 +0000
@@ -1143,6 +1143,19 @@
"""Iterate over all file-ids."""
return iter(self._byid)
+ def iter_just_entries(self):
+ """Iterate over all entries.
+
+ Unlike iter_entries(), just the entries are returned (not (path, ie))
+ and the order of entries is undefined.
+
+ XXX: We may not want to merge this into bzr.dev.
+ """
+ if self.root is None:
+ return
+ for _, ie in self._byid.iteritems():
+ yield ie
+
def __len__(self):
"""Returns number of entries."""
return len(self._byid)
@@ -1722,6 +1735,22 @@
for key, _ in self.id_to_entry.iteritems():
yield key[-1]
+ def iter_just_entries(self):
+ """Iterate over all entries.
+
+ Unlike iter_entries(), just the entries are returned (not (path, ie))
+ and the order of entries is undefined.
+
+ XXX: We may not want to merge this into bzr.dev.
+ """
+ for key, entry in self.id_to_entry.iteritems():
+ file_id = key[0]
+ ie = self._entry_cache.get(file_id, None)
+ if ie is None:
+ ie = self._bytes_to_entry(entry)
+ self._entry_cache[file_id] = ie
+ yield ie
+
def iter_changes(self, basis):
"""Generate a Tree.iter_changes change list between this and basis.
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- a/bzrlib/repofmt/groupcompress_repo.py 2009-03-19 19:47:20 +0000
+++ b/bzrlib/repofmt/groupcompress_repo.py 2009-03-20 15:48:11 +0000
@@ -254,9 +254,6 @@
next_keys = set()
def handle_internal_node(node):
for prefix, value in node._items.iteritems():
- if not isinstance(value, tuple):
- raise AssertionError("value is %s when a tuple"
- " is expected" % (value.__class__))
# We don't want to request the same key twice, and we
# want to order it by the first time it is seen.
# Even further, we don't want to request a key which is
@@ -290,13 +287,6 @@
handle_internal_node(node)
elif parse_leaf_nodes:
handle_leaf_node(node)
- # XXX: We don't walk the chk map to determine
- # referenced (file_id, revision_id) keys.
- # We don't do it yet because you really need to
- # filter out the ones that are present in the
- # parents of the rev just before the ones you are
- # copying, otherwise the filter is grabbing too
- # many keys...
counter[0] += 1
if pb is not None:
pb.update('chk node', counter[0], total_keys)
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py 2009-03-17 20:33:54 +0000
+++ b/bzrlib/repofmt/pack_repo.py 2009-03-20 03:16:52 +0000
@@ -2498,7 +2498,7 @@
total = len(revision_ids)
for pos, inv in enumerate(self.iter_inventories(revision_ids)):
pb.update("Finding text references", pos, total)
- for _, entry in inv.iter_entries():
+ for entry in inv.iter_just_entries():
if entry.revision != inv.revision_id:
continue
if not rich_roots and entry.file_id == inv.root_id:
=== modified file 'bzrlib/tests/inventory_implementations/basics.py'
--- a/bzrlib/tests/inventory_implementations/basics.py 2009-03-12 08:12:18 +0000
+++ b/bzrlib/tests/inventory_implementations/basics.py 2009-03-20 01:56:56 +0000
@@ -208,6 +208,23 @@
('src/hello.c', 'hello-id'),
], [(path, ie.file_id) for path, ie in inv.iter_entries()])
+ def test_iter_just_entries(self):
+ inv = self.make_inventory('tree-root')
+ for args in [('src', 'directory', 'src-id'),
+ ('doc', 'directory', 'doc-id'),
+ ('src/hello.c', 'file', 'hello-id'),
+ ('src/bye.c', 'file', 'bye-id'),
+ ('Makefile', 'file', 'makefile-id')]:
+ inv.add_path(*args)
+ self.assertEqual([
+ 'bye-id',
+ 'doc-id',
+ 'hello-id',
+ 'makefile-id',
+ 'src-id',
+ 'tree-root',
+ ], sorted([ie.file_id for ie in inv.iter_just_entries()]))
+
def test_iter_entries_by_dir(self):
inv = self.make_inventory('tree-root')
for args in [('src', 'directory', 'src-id'),
=== modified file 'bzrlib/tests/test_groupcompress.py'
--- a/bzrlib/tests/test_groupcompress.py 2009-03-19 03:06:02 +0000
+++ b/bzrlib/tests/test_groupcompress.py 2009-03-20 15:43:10 +0000
@@ -329,21 +329,6 @@
'length:100\n'
'\n', raw_bytes)
- def test_extract_no_end(self):
- # We should be able to extract a record, even if we only know the start
- # of the bytes.
- texts = {
- ('key1',): 'text for key1\nhas bytes that are common\n',
- ('key2',): 'text for key2\nhas bytes that are common\n',
- }
- entries, block = self.make_block(texts)
- self.assertEqualDiff('text for key1\nhas bytes that are common\n',
- block.extract(('key1',), entries[('key1',)].start,
- end=None)[1])
- self.assertEqualDiff('text for key2\nhas bytes that are common\n',
- block.extract(('key2',), entries[('key2',)].start,
- end=None)[1])
-
def test_partial_decomp(self):
content_chunks = []
# We need a sufficient amount of data so that zlib.decompress has
More information about the bazaar-commits mailing list