Rev 4470: (robertc) Pack 2a repositories after fetching from a different format in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Tue Jun 23 01:35:23 BST 2009
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 4470
revision-id: pqm at pqm.ubuntu.com-20090623003517-lrjel82rf7q6qjlc
parent: pqm at pqm.ubuntu.com-20090622171120-fuxez9ylfqpxynqn
parent: robertc at robertcollins.net-20090622232400-3v66jsa4bdorxcn6
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Tue 2009-06-23 01:35:17 +0100
message:
(robertc) Pack 2a repositories after fetching from a different format
(bug 376748) and fix problems with autopacking 2a repositories
(bug 365615). (Robert Collins)
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/remote.py remote.py-20060720103555-yeeg2x51vn0rbtdp-1
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/per_repository/test_pack.py test_pack.py-20070712120702-0c7585lh56p894mo-2
bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
bzrlib/tests/per_repository/test_write_group.py test_write_group.py-20070716105516-89n34xtogq5frn0m-1
bzrlib/tests/test_pack_repository.py test_pack_repository-20080801043947-eaw0e6h2gu75kwmy-1
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.10
revision-id: robertc at robertcollins.net-20090622232400-3v66jsa4bdorxcn6
parent: robertc at robertcollins.net-20090622215537-f7kxi0tui92ysiec
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Tue 2009-06-23 09:24:00 +1000
message:
Add explicit test for autopack of CHK repositories when CHK pages are not in the source packs.
modified:
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.9
revision-id: robertc at robertcollins.net-20090622215537-f7kxi0tui92ysiec
parent: robertc at robertcollins.net-20090622061541-mri46zc9w30imk3l
parent: pqm at pqm.ubuntu.com-20090622171120-fuxez9ylfqpxynqn
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Tue 2009-06-23 07:55:37 +1000
message:
Resolve NEWS.
renamed:
generate_docs.py => tools/generate_docs.py bzrinfogen.py-20051211224525-78e7c14f2c955e55
tools/doc_generate => bzrlib/doc_generate bzrinfogen-20051211214907-45ff5f0af3a80b32
modified:
Makefile Makefile-20050805140406-d96e3498bb61c5bb
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/_known_graph_py.py _known_graph_py.py-20090610185421-vw8vfda2cgnckgb1-1
bzrlib/_known_graph_pyx.pyx _known_graph_pyx.pyx-20090610194911-yjk73td9hpjilas0-1
bzrlib/bugtracker.py bugtracker.py-20070410073305-vu1vu1qosjurg8kb-1
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/bzrdir.py bzrdir.py-20060131065624-156dfea39c4387cb
bzrlib/commands.py bzr.py-20050309040720-d10f4714595cf8c3
bzrlib/doc_generate/__init__.py __init__.py-20051211214907-df9e0e6b493553f1
bzrlib/doc_generate/autodoc_bash_completion.py big_bash_completion.py-20051211223059-00ecfbfcc8056b78
bzrlib/doc_generate/autodoc_man.py bzrman.py-20050601153041-0ff7f74de456d15e
bzrlib/doc_generate/autodoc_rstx.py autodoc_rstx.py-20060420024836-3e0d4a526452193c
bzrlib/groupcompress.py groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
bzrlib/help.py help.py-20050505025907-4dd7a6d63912f894
bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
bzrlib/hooks.py hooks.py-20070325015548-ix4np2q0kd8452au-1
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
bzrlib/pack.py container.py-20070607160755-tr8zc26q18rn0jnb-1
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/revision.py revision.py-20050309040759-e77802c08f3999d5
bzrlib/tests/blackbox/test_push.py test_push.py-20060329002750-929af230d5d22663
bzrlib/tests/test__known_graph.py test__known_graph.py-20090610185421-vw8vfda2cgnckgb1-2
bzrlib/tests/test_generate_docs.py test_generate_docs.p-20070102123151-cqctnsrlqwmiljd7-1
bzrlib/tests/test_pack.py test_container.py-20070607160755-tr8zc26q18rn0jnb-2
bzrlib/tests/test_tuned_gzip.py test_tuned_gzip.py-20060418042056-c576dfc708984968
bzrlib/tests/test_versionedfile.py test_versionedfile.py-20060222045249-db45c9ed14a1c2e5
bzrlib/tuned_gzip.py tuned_gzip.py-20060407014720-5aadc518e928e8d2
bzrlib/versionedfile.py versionedfile.py-20060222045106-5039c71ee3b65490
setup.py setup.py-20050314065409-02f8a0a6e3f9bc70
tools/time_graph.py time_graph.py-20090608210127-6g0epojxnqjo0f0s-1
tools/generate_docs.py bzrinfogen.py-20051211224525-78e7c14f2c955e55
------------------------------------------------------------
revno: 4462.2.8
revision-id: robertc at robertcollins.net-20090622061541-mri46zc9w30imk3l
parent: robertc at robertcollins.net-20090622061438-3v9hl1pe2ph72ik4
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 16:15:41 +1000
message:
Review corrections.
modified:
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.7
revision-id: robertc at robertcollins.net-20090622061438-3v9hl1pe2ph72ik4
parent: robertc at robertcollins.net-20090622052704-32rm1mbm9mgfk1v3
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 16:14:38 +1000
message:
Both StreamSink and InterDifferingSerialiser now pack after fetching when it is beneficial
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.6
revision-id: robertc at robertcollins.net-20090622052704-32rm1mbm9mgfk1v3
parent: robertc at robertcollins.net-20090622045621-plce53iif067uod1
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 15:27:04 +1000
message:
Cause StreamSink to partially pack repositories after cross format fetches when beneficial.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.5
revision-id: robertc at robertcollins.net-20090622045621-plce53iif067uod1
parent: robertc at robertcollins.net-20090622022509-qn2rjozy7g1hsmpv
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 14:56:21 +1000
message:
Teach groupcompress repositories to honour pack hints, and also not error when a CHK page is not in the packs being repacked by partial pack operations.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
------------------------------------------------------------
revno: 4462.2.4
revision-id: robertc at robertcollins.net-20090622022509-qn2rjozy7g1hsmpv
parent: robertc at robertcollins.net-20090621235117-zvjywxin20usblpn
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 12:25:09 +1000
message:
Teach commit_write_group to return hint information for pack().
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/per_repository/test_write_group.py test_write_group.py-20070716105516-89n34xtogq5frn0m-1
bzrlib/tests/test_pack_repository.py test_pack_repository-20080801043947-eaw0e6h2gu75kwmy-1
------------------------------------------------------------
revno: 4462.2.3
revision-id: robertc at robertcollins.net-20090621235117-zvjywxin20usblpn
parent: robertc at robertcollins.net-20090619042602-dicz171b8vhj1s71
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Mon 2009-06-22 09:51:17 +1000
message:
Add a hint parameter to Repository.pack.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/remote.py remote.py-20060720103555-yeeg2x51vn0rbtdp-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/per_repository/test_pack.py test_pack.py-20070712120702-0c7585lh56p894mo-2
------------------------------------------------------------
revno: 4462.2.2
revision-id: robertc at robertcollins.net-20090619042602-dicz171b8vhj1s71
parent: robertc at robertcollins.net-20090619041922-acr6p23jah4z2gc8
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Fri 2009-06-19 14:26:02 +1000
message:
Change CHK already-packed check to be generic using the pack_compresses flag.
modified:
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 4462.2.1
revision-id: robertc at robertcollins.net-20090619041922-acr6p23jah4z2gc8
parent: pqm at pqm.ubuntu.com-20090618213920-8d1p9f28uomzfkvl
committer: Robert Collins <robertc at robertcollins.net>
branch nick: autopack-cross-format-fetch
timestamp: Fri 2009-06-19 14:19:22 +1000
message:
Add new attribute to RepositoryFormat pack_compresses, hinting when pack can be useful.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/remote.py remote.py-20060720103555-yeeg2x51vn0rbtdp-1
bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
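
The log entries above describe a two-step flow: commit_write_group returns a hint naming what was just written, and pack(hint=...) then combines only those packs. A minimal sketch of that flow follows, assuming a toy in-memory collection. ToyPackCollection, add_revisions and the "pack-N" names are hypothetical illustrations, not bzrlib APIs, and the sketch is written for a current Python rather than the Python 2 of the diff below.

```python
class ToyPackCollection:
    """Hypothetical stand-in for a pack collection; not bzrlib code."""

    def __init__(self):
        self.packs = {}      # pack name -> number of revisions it holds
        self._pending = 0    # revisions added in the open write group
        self._counter = 0    # used only to invent unique pack names

    def _new_name(self):
        self._counter += 1
        return "pack-%d" % self._counter

    def add_revisions(self, count):
        self._pending += count

    def commit_write_group(self):
        """Write pending revisions to a new pack and return its name as a hint."""
        if not self._pending:
            return []
        name = self._new_name()
        self.packs[name] = self._pending
        self._pending = 0
        return [name]

    def pack(self, hint=None):
        """Combine packs into one; with a hint, only the packs it names.

        Names in the hint that no longer exist are ignored, matching the
        documented behaviour that out-of-date hints are not an error.
        """
        selected = [name for name in self.packs if not hint or name in hint]
        if not selected:
            return
        total = sum(self.packs.pop(name) for name in selected)
        self.packs[self._new_name()] = total


collection = ToyPackCollection()
collection.add_revisions(1)
collection.commit_write_group()      # an older pack that should stay untouched

hints = []
for _ in range(2):                   # two batches, as in a batched fetch
    collection.add_revisions(1)
    hints.extend(collection.commit_write_group())

# Only the two freshly written packs are combined; the stale name is skipped.
collection.pack(hint=hints + ["stale-name"])
```

The design point the sketch tries to show is that the hint is opaque to the caller: it is collected from commit_write_group and handed back to pack unchanged, so the repository alone decides what the names mean.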
=== modified file 'NEWS'
--- a/NEWS 2009-06-22 17:11:20 +0000
+++ b/NEWS 2009-06-22 21:55:37 +0000
@@ -47,6 +47,10 @@
``--2a`` formats should be down to exactly 2x the size. Related to bug
#109114. (John Arbash Meinel)
+* Repositories using CHK pages (which includes the new 2a format) will no
+ longer error during commit or push operations when an autopack operation
+ is triggered. (Robert Collins, #365615)
+
* Unshelve works correctly when multiple zero-length files are present on
the shelf. (Aaron Bentley, #363444)
@@ -66,16 +70,37 @@
for files with long ancestry and 'cherrypicked' changes.)
(John Arbash Meinel, Vincent Ladeuil)
+* ``GroupCompress`` repositories now take advantage of the pack hints
+ parameter to permit cross-format fetching to incrementally pack the
+ converted data. (Robert Collins)
+
* pack <=> pack fetching is now done via a ``PackStreamSource`` rather
than the ``Packer`` code. The user visible change is that we now
properly fetch the minimum number of texts for non-smart fetching.
(John Arbash Meinel)
+* ``Repository.commit_write_group`` now returns opaque data about what
+ was committed, for passing to the ``Repository.pack``. Repositories
+ without atomic commits will still return None. (Robert Collins)
+
+* ``Repository.pack`` now takes an optional ``hint`` parameter
+ which will support doing partial packs for repositories that can do
+ that. (Robert Collins)
+
+* RepositoryFormat has a new attribute 'pack_compresses' which is True
+ when doing a pack operation changes the compression of content in the
+ repository. (Robert Collins)
+
+* ``StreamSink`` and ``InterDifferingSerialiser`` will call
+ ``Repository.pack`` with the hint returned by
+ ``Repository.commit_write_group`` if the formats were different and the
+ repository can increase compression by doing a pack operation.
+ (Robert Collins, #376748)
+
* ``VersionedFiles._add_text`` is a new api that lets us insert text into
the repository as a single string, rather than a list of lines. This can
improve memory overhead and performance of committing large files.
(Currently a private api, used only by commit). (John Arbash Meinel)
-
Improvements
=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py 2009-06-17 03:53:51 +0000
+++ b/bzrlib/remote.py 2009-06-21 23:51:17 +0000
@@ -566,6 +566,11 @@
return self._creating_repo._real_repository._format.network_name()
@property
+ def pack_compresses(self):
+ self._ensure_real()
+ return self._custom_format.pack_compresses
+
+ @property
def _serializer(self):
self._ensure_real()
return self._custom_format._serializer
@@ -1491,13 +1496,13 @@
return self._real_repository.inventories
@needs_write_lock
- def pack(self):
+ def pack(self, hint=None):
"""Compress the data within the repository.
This is not currently implemented within the smart server.
"""
self._ensure_real()
- return self._real_repository.pack()
+ return self._real_repository.pack(hint=hint)
@property
def revisions(self):
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- a/bzrlib/repofmt/groupcompress_repo.py 2009-06-22 15:13:45 +0000
+++ b/bzrlib/repofmt/groupcompress_repo.py 2009-06-22 21:55:37 +0000
@@ -218,6 +218,7 @@
p_id_roots_set = set()
stream = source_vf.get_record_stream(keys, 'groupcompress', True)
for idx, record in enumerate(stream):
+ # Inventories should always be with revisions; assume success.
bytes = record.get_bytes_as('fulltext')
chk_inv = inventory.CHKInventory.deserialise(None, bytes,
record.key)
@@ -294,6 +295,11 @@
stream = source_vf.get_record_stream(cur_keys,
'as-requested', True)
for record in stream:
+ if record.storage_kind == 'absent':
+ # An absent CHK record: we assume that the missing
+ # record is in a different pack - e.g. a page not
+ # altered by the commit we're packing.
+ continue
bytes = record.get_bytes_as('fulltext')
# We don't care about search_key_func for this code,
# because we only care about external references.
@@ -558,11 +564,6 @@
pack_factory = GCPack
resumed_pack_factory = ResumedGCPack
- def _already_packed(self):
- """Is the collection already packed?"""
- # Always repack GC repositories for now
- return False
-
def _execute_pack_operations(self, pack_operations,
_packer_class=GCCHKPacker,
reload_func=None):
@@ -1048,6 +1049,7 @@
_fetch_order = 'unordered'
_fetch_uses_deltas = False # essentially ignored by the groupcompress code.
fast_deltas = True
+ pack_compresses = True
def _get_matching_bzrdir(self):
return bzrdir.format_registry.make_bzrdir('development6-rich-root')
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py 2009-06-17 17:57:15 +0000
+++ b/bzrlib/repofmt/pack_repo.py 2009-06-22 04:56:21 +0000
@@ -1459,12 +1459,12 @@
in synchronisation with certain steps. Otherwise the names collection
is not flushed.
- :return: True if packing took place.
+ :return: Something evaluating true if packing took place.
"""
while True:
try:
return self._do_autopack()
- except errors.RetryAutopack, e:
+ except errors.RetryAutopack:
# If we get a RetryAutopack exception, we should abort the
# current action, and retry.
pass
@@ -1474,7 +1474,7 @@
total_revisions = self.revision_index.combined_index.key_count()
total_packs = len(self._names)
if self._max_pack_count(total_revisions) >= total_packs:
- return False
+ return None
# determine which packs need changing
pack_distribution = self.pack_distribution(total_revisions)
existing_packs = []
@@ -1502,10 +1502,10 @@
'containing %d revisions. Packing %d files into %d affecting %d'
' revisions', self, total_packs, total_revisions, num_old_packs,
num_new_packs, num_revs_affected)
- self._execute_pack_operations(pack_operations,
+ result = self._execute_pack_operations(pack_operations,
reload_func=self._restart_autopack)
mutter('Auto-packing repository %s completed', self)
- return True
+ return result
def _execute_pack_operations(self, pack_operations, _packer_class=Packer,
reload_func=None):
@@ -1513,7 +1513,7 @@
:param pack_operations: A list of [revision_count, packs_to_combine].
:param _packer_class: The class of packer to use (default: Packer).
- :return: None.
+ :return: The new pack names.
"""
for revision_count, packs in pack_operations:
# we may have no-ops from the setup logic
@@ -1535,10 +1535,11 @@
self._remove_pack_from_memory(pack)
# record the newly available packs and stop advertising the old
# packs
- self._save_pack_names(clear_obsolete_packs=True)
+ result = self._save_pack_names(clear_obsolete_packs=True)
# Move the old packs out of the way now they are no longer referenced.
for revision_count, packs in pack_operations:
self._obsolete_packs(packs)
+ return result
def _flush_new_pack(self):
if self._new_pack is not None:
@@ -1554,29 +1555,26 @@
def _already_packed(self):
"""Is the collection already packed?"""
- return len(self._names) < 2
+ return not (self.repo._format.pack_compresses or (len(self._names) > 1))
- def pack(self):
+ def pack(self, hint=None):
"""Pack the pack collection totally."""
self.ensure_loaded()
total_packs = len(self._names)
if self._already_packed():
- # This is arguably wrong because we might not be optimal, but for
- # now lets leave it in. (e.g. reconcile -> one pack. But not
- # optimal.
return
total_revisions = self.revision_index.combined_index.key_count()
# XXX: the following may want to be a class, to pack with a given
# policy.
mutter('Packing repository %s, which has %d pack files, '
- 'containing %d revisions into 1 packs.', self, total_packs,
- total_revisions)
+ 'containing %d revisions with hint %r.', self, total_packs,
+ total_revisions, hint)
# determine which packs need changing
- pack_distribution = [1]
pack_operations = [[0, []]]
for pack in self.all_packs():
- pack_operations[-1][0] += pack.get_revision_count()
- pack_operations[-1][1].append(pack)
+ if not hint or pack.name in hint:
+ pack_operations[-1][0] += pack.get_revision_count()
+ pack_operations[-1][1].append(pack)
self._execute_pack_operations(pack_operations, OptimisingPacker)
def plan_autopack_combinations(self, existing_packs, pack_distribution):
@@ -1938,6 +1936,7 @@
:param clear_obsolete_packs: If True, clear out the contents of the
obsolete_packs directory.
+ :return: A list of the names saved that were not previously on disk.
"""
self.lock_names()
try:
@@ -1958,6 +1957,7 @@
self._unlock_names()
# synchronise the memory packs list with what we just wrote:
self._syncronize_pack_names_from_disk_nodes(disk_nodes)
+ return [new_node[0][0] for new_node in new_nodes]
def reload_pack_names(self):
"""Sync our pack listing with what is present in the repository.
@@ -2097,7 +2097,7 @@
if not self.autopack():
# when autopack takes no steps, the names list is still
# unsaved.
- self._save_pack_names()
+ return self._save_pack_names()
def _suspend_write_group(self):
tokens = [pack.name for pack in self._resumed_packs]
@@ -2348,13 +2348,13 @@
raise NotImplementedError(self.dont_leave_lock_in_place)
@needs_write_lock
- def pack(self):
+ def pack(self, hint=None):
"""Compress the data within the repository.
This will pack all the data to a single pack. In future it may
recompress deltas or do other such expensive operations.
"""
- self._pack_collection.pack()
+ self._pack_collection.pack(hint=hint)
@needs_write_lock
def reconcile(self, other=None, thorough=False):
=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py 2009-06-22 15:47:25 +0000
+++ b/bzrlib/repository.py 2009-06-22 21:55:37 +0000
@@ -1404,8 +1404,9 @@
raise errors.BzrError('mismatched lock context %r and '
'write group %r.' %
(self.get_transaction(), self._write_group))
- self._commit_write_group()
+ result = self._commit_write_group()
self._write_group = None
+ return result
def _commit_write_group(self):
"""Template method for per-repository write group cleanup.
@@ -2418,7 +2419,7 @@
keys = tsort.topo_sort(parent_map)
return [None] + list(keys)
- def pack(self):
+ def pack(self, hint=None):
"""Compress the data within the repository.
This operation only makes sense for some repository types. For other
@@ -2427,6 +2428,13 @@
This stub method does not require a lock, but subclasses should use
@needs_write_lock as this is a long running call its reasonable to
implicitly lock for the user.
+
+ :param hint: If not supplied, the whole repository is packed.
+ If supplied, the repository may use the hint parameter as a
+ hint for the parts of the repository to pack. A hint can be
+ obtained from the result of commit_write_group(). Out of
+ date hints are simply ignored, because concurrent operations
+ can obsolete them rapidly.
"""
def get_transaction(self):
@@ -2835,6 +2843,11 @@
# Does this format have < O(tree_size) delta generation. Used to hint what
# code path for commit, amongst other things.
fast_deltas = None
+ # Does doing a pack operation compress data? Useful for the pack UI command
+ # (so if there is one pack, the operation can still proceed because it may
+ # help), and for fetching when data won't have come from the same
+ # compressor.
+ pack_compresses = False
def __str__(self):
return "<%s>" % self.__class__.__name__
@@ -3666,6 +3679,7 @@
cache = lru_cache.LRUCache(100)
cache[basis_id] = basis_tree
del basis_tree # We don't want to hang on to it here
+ hints = []
for offset in range(0, len(revision_ids), batch_size):
self.target.start_write_group()
try:
@@ -3677,7 +3691,11 @@
self.target.abort_write_group()
raise
else:
- self.target.commit_write_group()
+ hint = self.target.commit_write_group()
+ if hint:
+ hints.extend(hint)
+ if hints and self.target._format.pack_compresses:
+ self.target.pack(hint=hints)
pb.update('Transferring revisions', len(revision_ids),
len(revision_ids))
@@ -4025,7 +4043,10 @@
# missing keys can handle suspending a write group).
write_group_tokens = self.target_repo.suspend_write_group()
return write_group_tokens, missing_keys
- self.target_repo.commit_write_group()
+ hint = self.target_repo.commit_write_group()
+ if (to_serializer != src_serializer and
+ self.target_repo._format.pack_compresses):
+ self.target_repo.pack(hint=hint)
return [], set()
def _extract_and_insert_inventories(self, substream, serializer):
=== modified file 'bzrlib/tests/per_repository/test_pack.py'
--- a/bzrlib/tests/per_repository/test_pack.py 2009-03-23 14:59:43 +0000
+++ b/bzrlib/tests/per_repository/test_pack.py 2009-06-21 23:51:17 +0000
@@ -24,3 +24,14 @@
def test_pack_empty_does_not_error(self):
repo = self.make_repository('.')
repo.pack()
+
+ def test_pack_accepts_opaque_hint(self):
+ # For requesting packs of a repository where some data is known to be
+ # unoptimal we permit packing just some data via a hint. If the hint is
+ # illegible it is ignored.
+ tree = self.make_branch_and_tree('tree')
+ rev1 = tree.commit('1')
+ rev2 = tree.commit('2')
+ rev3 = tree.commit('3')
+ rev4 = tree.commit('4')
+ tree.branch.repository.pack(hint=[rev3, rev4])
=== modified file 'bzrlib/tests/per_repository/test_repository.py'
--- a/bzrlib/tests/per_repository/test_repository.py 2009-06-17 21:33:03 +0000
+++ b/bzrlib/tests/per_repository/test_repository.py 2009-06-19 04:19:22 +0000
@@ -66,29 +66,29 @@
class TestRepository(TestCaseWithRepository):
+ def assertFormatAttribute(self, attribute, allowed_values):
+ """Assert that the format has an attribute 'attribute'."""
+ repo = self.make_repository('repo')
+ self.assertSubset([getattr(repo._format, attribute)], allowed_values)
+
def test_attribute__fetch_order(self):
"""Test the the _fetch_order attribute."""
- tree = self.make_branch_and_tree('tree')
- repo = tree.branch.repository
- self.assertTrue(repo._format._fetch_order in ('topological', 'unordered'))
+ self.assertFormatAttribute('_fetch_order', ('topological', 'unordered'))
def test_attribute__fetch_uses_deltas(self):
"""Test the the _fetch_uses_deltas attribute."""
- tree = self.make_branch_and_tree('tree')
- repo = tree.branch.repository
- self.assertTrue(repo._format._fetch_uses_deltas in (True, False))
+ self.assertFormatAttribute('_fetch_uses_deltas', (True, False))
def test_attribute_fast_deltas(self):
"""Test the format.fast_deltas attribute."""
- tree = self.make_branch_and_tree('tree')
- repo = tree.branch.repository
- self.assertTrue(repo._format.fast_deltas in (True, False))
+ self.assertFormatAttribute('fast_deltas', (True, False))
def test_attribute__fetch_reconcile(self):
"""Test the the _fetch_reconcile attribute."""
- tree = self.make_branch_and_tree('tree')
- repo = tree.branch.repository
- self.assertTrue(repo._format._fetch_reconcile in (True, False))
+ self.assertFormatAttribute('_fetch_reconcile', (True, False))
+
+ def test_attribute_format_pack_compresses(self):
+ self.assertFormatAttribute('pack_compresses', (True, False))
def test_attribute_inventories_store(self):
"""Test the existence of the inventories attribute."""
=== modified file 'bzrlib/tests/per_repository/test_write_group.py'
--- a/bzrlib/tests/per_repository/test_write_group.py 2009-06-10 03:56:49 +0000
+++ b/bzrlib/tests/per_repository/test_write_group.py 2009-06-22 02:25:09 +0000
@@ -68,11 +68,14 @@
repo.commit_write_group()
repo.unlock()
- def test_commit_write_group_gets_None(self):
+ def test_commit_write_group_does_not_error(self):
repo = self.make_repository('.')
repo.lock_write()
repo.start_write_group()
- self.assertEqual(None, repo.commit_write_group())
+ # commit_write_group can either return None (for repositories without
+ # isolated transactions) or a hint for pack(). So we only check it
+ # works in this interface test, because all repositories are exercised.
+ repo.commit_write_group()
repo.unlock()
def test_unlock_in_write_group(self):
=== modified file 'bzrlib/tests/test_pack_repository.py'
--- a/bzrlib/tests/test_pack_repository.py 2009-06-17 17:57:15 +0000
+++ b/bzrlib/tests/test_pack_repository.py 2009-06-22 02:25:09 +0000
@@ -238,6 +238,35 @@
pack_names = [node[1][0] for node in index.iter_all_entries()]
self.assertTrue(large_pack_name in pack_names)
+ def test_commit_write_group_returns_new_pack_names(self):
+ format = self.get_format()
+ tree = self.make_branch_and_tree('foo', format=format)
+ tree.commit('first post')
+ repo = tree.branch.repository
+ repo.lock_write()
+ try:
+ repo.start_write_group()
+ try:
+ inv = inventory.Inventory(revision_id="A")
+ inv.root.revision = "A"
+ repo.texts.add_lines((inv.root.file_id, "A"), [], [])
+ rev = _mod_revision.Revision(timestamp=0, timezone=None,
+ committer="Foo Bar <foo at example.com>", message="Message",
+ revision_id="A")
+ rev.parent_ids = ()
+ repo.add_revision("A", rev, inv=inv)
+ except:
+ repo.abort_write_group()
+ raise
+ else:
+ old_names = repo._pack_collection._names.keys()
+ result = repo.commit_write_group()
+ cur_names = repo._pack_collection._names.keys()
+ new_names = list(set(cur_names) - set(old_names))
+ self.assertEqual(new_names, result)
+ finally:
+ repo.unlock()
+
def test_fail_obsolete_deletion(self):
# failing to delete obsolete packs is not fatal
format = self.get_format()
=== modified file 'bzrlib/tests/test_repository.py'
--- a/bzrlib/tests/test_repository.py 2009-06-18 18:00:01 +0000
+++ b/bzrlib/tests/test_repository.py 2009-06-22 23:24:00 +0000
@@ -673,10 +673,14 @@
self.assertFalse(repo._format.supports_external_lookups)
-class TestDevelopment6(TestCaseWithTransport):
+class Test2a(TestCaseWithTransport):
+
+ def test_format_pack_compresses_True(self):
+ repo = self.make_repository('repo', format='2a')
+ self.assertTrue(repo._format.pack_compresses)
def test_inventories_use_chk_map_with_parent_base_dict(self):
- tree = self.make_branch_and_tree('repo', format="development6-rich-root")
+ tree = self.make_branch_and_tree('repo', format="2a")
revid = tree.commit("foo")
tree.lock_read()
self.addCleanup(tree.unlock)
@@ -688,14 +692,41 @@
self.assertEqual(65536,
inv.parent_id_basename_to_file_id._root_node.maximum_size)
+ def test_autopack_unchanged_chk_nodes(self):
+ # at 20 unchanged commits, chk pages are packed that are split into
+ # two groups such that the new pack being made doesn't have all its
+ # pages in the source packs (though they are in the repository).
+ tree = self.make_branch_and_tree('tree', format='2a')
+ for pos in range(20):
+ tree.commit(str(pos))
+
+ def test_pack_with_hint(self):
+ tree = self.make_branch_and_tree('tree', format='2a')
+ # 1 commit to leave untouched
+ tree.commit('1')
+ to_keep = tree.branch.repository._pack_collection.names()
+ # 2 to combine
+ tree.commit('2')
+ tree.commit('3')
+ all = tree.branch.repository._pack_collection.names()
+ combine = list(set(all) - set(to_keep))
+ self.assertLength(3, all)
+ self.assertLength(2, combine)
+ tree.branch.repository.pack(hint=combine)
+ final = tree.branch.repository._pack_collection.names()
+ self.assertLength(2, final)
+ self.assertFalse(combine[0] in final)
+ self.assertFalse(combine[1] in final)
+ self.assertSubset(to_keep, final)
+
def test_stream_source_to_gc(self):
- source = self.make_repository('source', format='development6-rich-root')
- target = self.make_repository('target', format='development6-rich-root')
+ source = self.make_repository('source', format='2a')
+ target = self.make_repository('target', format='2a')
stream = source._get_source(target._format)
self.assertIsInstance(stream, groupcompress_repo.GroupCHKStreamSource)
def test_stream_source_to_non_gc(self):
- source = self.make_repository('source', format='development6-rich-root')
+ source = self.make_repository('source', format='2a')
target = self.make_repository('target', format='rich-root-pack')
stream = source._get_source(target._format)
# We don't want the child GroupCHKStreamSource
@@ -703,7 +734,7 @@
def test_get_stream_for_missing_keys_includes_all_chk_refs(self):
source_builder = self.make_branch_builder('source',
- format='development6-rich-root')
+ format='2a')
# We have to build a fairly large tree, so that we are sure the chk
# pages will have split into multiple pages.
entries = [('add', ('', 'a-root-id', 'directory', None))]
@@ -726,7 +757,7 @@
source_branch = source_builder.get_branch()
source_branch.lock_read()
self.addCleanup(source_branch.unlock)
- target = self.make_repository('target', format='development6-rich-root')
+ target = self.make_repository('target', format='2a')
source = source_branch.repository._get_source(target._format)
self.assertIsInstance(source, groupcompress_repo.GroupCHKStreamSource)
@@ -1354,3 +1385,83 @@
self.assertTrue(new_pack.inventory_index._optimize_for_size)
self.assertTrue(new_pack.text_index._optimize_for_size)
self.assertTrue(new_pack.signature_index._optimize_for_size)
+
+
+class TestCrossFormatPacks(TestCaseWithTransport):
+
+ def log_pack(self, hint=None):
+ self.calls.append(('pack', hint))
+ self.orig_pack(hint=hint)
+ if self.expect_hint:
+ self.assertTrue(hint)
+
+ def run_stream(self, src_fmt, target_fmt, expect_pack_called):
+ self.expect_hint = expect_pack_called
+ self.calls = []
+ source_tree = self.make_branch_and_tree('src', format=src_fmt)
+ source_tree.lock_write()
+ self.addCleanup(source_tree.unlock)
+ tip = source_tree.commit('foo')
+ target = self.make_repository('target', format=target_fmt)
+ target.lock_write()
+ self.addCleanup(target.unlock)
+ source = source_tree.branch.repository._get_source(target._format)
+ self.orig_pack = target.pack
+ target.pack = self.log_pack
+ search = target.search_missing_revision_ids(
+ source_tree.branch.repository, tip)
+ stream = source.get_stream(search)
+ from_format = source_tree.branch.repository._format
+ sink = target._get_sink()
+ sink.insert_stream(stream, from_format, [])
+ if expect_pack_called:
+ self.assertLength(1, self.calls)
+ else:
+ self.assertLength(0, self.calls)
+
+ def run_fetch(self, src_fmt, target_fmt, expect_pack_called):
+ self.expect_hint = expect_pack_called
+ self.calls = []
+ source_tree = self.make_branch_and_tree('src', format=src_fmt)
+ source_tree.lock_write()
+ self.addCleanup(source_tree.unlock)
+ tip = source_tree.commit('foo')
+ target = self.make_repository('target', format=target_fmt)
+ target.lock_write()
+ self.addCleanup(target.unlock)
+ source = source_tree.branch.repository
+ self.orig_pack = target.pack
+ target.pack = self.log_pack
+ target.fetch(source)
+ if expect_pack_called:
+ self.assertLength(1, self.calls)
+ else:
+ self.assertLength(0, self.calls)
+
+ def test_sink_format_hint_no(self):
+ # When the target format says packing makes no difference, pack is not
+ # called.
+ self.run_stream('1.9', 'rich-root-pack', False)
+
+ def test_sink_format_hint_yes(self):
+ # When the target format says packing makes a difference, pack is
+ # called.
+ self.run_stream('1.9', '2a', True)
+
+ def test_sink_format_same_no(self):
+ # When the formats are the same, pack is not called.
+ self.run_stream('2a', '2a', False)
+
+ def test_IDS_format_hint_no(self):
+ # When the target format says packing makes no difference, pack is not
+ # called.
+ self.run_fetch('1.9', 'rich-root-pack', False)
+
+ def test_IDS_format_hint_yes(self):
+ # When the target format says packing makes a difference, pack is
+ # called.
+ self.run_fetch('1.9', '2a', True)
+
+ def test_IDS_format_same_no(self):
+ # When the formats are the same, pack is not called.
+ self.run_fetch('2a', '2a', False)
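
The TestCrossFormatPacks cases above all check one decision: pack after a fetch only when the serializers differ and the target format reports pack_compresses. A small sketch of that condition on plain values follows. The function name should_pack_after_fetch and the serializer labels are hypothetical placeholders, and the pack_compresses value given for each format is taken from what the tests expect rather than read from bzrlib.

```python
def should_pack_after_fetch(src_serializer, to_serializer, pack_compresses):
    """Mirror the StreamSink condition from the diff, using plain values."""
    return to_serializer != src_serializer and pack_compresses


# (source serializer, target serializer, target pack_compresses)
# The labels are placeholders; only equality between them matters here.
cases = {
    "1.9 -> rich-root-pack": ("serializer-a", "serializer-b", False),
    "1.9 -> 2a": ("serializer-a", "serializer-chk", True),
    "2a -> 2a": ("serializer-chk", "serializer-chk", True),
}

results = {
    name: should_pack_after_fetch(*args) for name, args in cases.items()
}
```

Under these assumptions only the 1.9 to 2a case yields True, which is the one case where the tests expect exactly one pack call.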