Rev 4200: (andrew) Buffer writes when pushing to a pack repository on a pre-1.12 smart server, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Wed Mar 25 02:03:47 GMT 2009
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 4200
revision-id: pqm at pqm.ubuntu.com-20090325020341-dmq0yek061gtungf
parent: pqm at pqm.ubuntu.com-20090324231912-rb0kgktzkvge8aea
parent: andrew.bennetts at canonical.com-20090324222046-mhx6gqyu7qm4ngkt
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2009-03-25 02:03:41 +0000
message:
(andrew) Buffer writes when pushing to a pack repository on a
pre-1.12 smart server.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
------------------------------------------------------------
revno: 4187.3.6
revision-id: andrew.bennetts at canonical.com-20090324222046-mhx6gqyu7qm4ngkt
parent: andrew.bennetts at canonical.com-20090324024839-oiwvj7rkob17on3x
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Wed 2009-03-25 09:20:46 +1100
message:
Move the flush in KnitVersionedFiles.insert_record_stream so that it covers the add_lines call of the fallback case, not just the adapter.get_bytes.
modified:
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
------------------------------------------------------------
revno: 4187.3.5
revision-id: andrew.bennetts at canonical.com-20090324024839-oiwvj7rkob17on3x
parent: andrew.bennetts at canonical.com-20090324024609-fo3q9opym5srqudy
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Tue 2009-03-24 13:48:39 +1100
message:
Add NEWS entry.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
------------------------------------------------------------
revno: 4187.3.4
revision-id: andrew.bennetts at canonical.com-20090324024609-fo3q9opym5srqudy
parent: andrew.bennetts at canonical.com-20090324023447-5m39wlirz4kzj16a
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Tue 2009-03-24 13:46:09 +1100
message:
Better docstrings and comments.
modified:
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
------------------------------------------------------------
revno: 4187.3.3
revision-id: andrew.bennetts at canonical.com-20090324023447-5m39wlirz4kzj16a
parent: andrew.bennetts at canonical.com-20090324020946-h3vkfs75tq0ghqul
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Tue 2009-03-24 13:34:47 +1100
message:
In KnitVersionedFiles.insert_record_stream, flush the access object before expanding a delta into a fulltext.
modified:
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 4187.3.2
revision-id: andrew.bennetts at canonical.com-20090324020946-h3vkfs75tq0ghqul
parent: andrew.bennetts at canonical.com-20090324020232-ndz0vognhdqsbj2o
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Tue 2009-03-24 13:09:46 +1100
message:
Only enable the hack when the serializers match, otherwise we cause ShortReadvErrors.
modified:
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
------------------------------------------------------------
revno: 4187.3.1
revision-id: andrew.bennetts at canonical.com-20090324020232-ndz0vognhdqsbj2o
parent: pqm at pqm.ubuntu.com-20090323202515-uwlqu9w037ndukz4
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: remote-pack-hack
timestamp: Tue 2009-03-24 13:02:32 +1100
message:
Add set_write_cache_size hack in StreamSink to avoid too many round trips with old HPSS servers.
modified:
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
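The series above implements one idea: accumulate writes in memory and send them to the transport as a single large append, while making sure any read path flushes the buffer first so it never sees stale data. A minimal sketch of that pattern, with hypothetical names (this is illustrative, not bzrlib code):

```python
# Illustrative sketch (not bzrlib code) of the buffered-append pattern
# the commit series implements. backend_write stands in for an expensive
# operation such as one network round trip on a RemoteTransport.

class BufferedWriter:
    def __init__(self, backend_write, cache_size=1024 * 1024):
        self._backend_write = backend_write
        self._cache_size = cache_size
        self._buffer = []
        self._buffered_bytes = 0

    def write(self, data):
        # Accumulate in memory; only hit the backend when the cache fills.
        self._buffer.append(data)
        self._buffered_bytes += len(data)
        if self._buffered_bytes >= self._cache_size:
            self.flush()

    def flush(self):
        # Send everything buffered so far as one combined write. Readers
        # must call this before reading back, or they may miss data.
        if self._buffer:
            self._backend_write(b"".join(self._buffer))
            self._buffer = []
            self._buffered_bytes = 0
```

Many small writes collapse into one backend call, which is why the round-trip count drops on unbuffered transports.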
=== modified file 'NEWS'
--- a/NEWS 2009-03-24 23:19:12 +0000
+++ b/NEWS 2009-03-25 02:03:41 +0000
@@ -67,6 +67,10 @@
* Progress bars now show the rate of network activity for
``bzr+ssh://`` and ``bzr://`` connections. (Andrew Bennetts)
+* Pushing to a stacked pack-format branch on a 1.12 or older smart server
+ now takes many fewer round trips. (Andrew Bennetts, Robert Collins,
+ #294479)
+
* Streaming push can be done to older repository formats. This is
implemented using a new ``Repository.insert_stream_locked`` RPC.
(Andrew Bennetts, Robert Collins)
=== modified file 'bzrlib/knit.py'
--- a/bzrlib/knit.py 2009-03-23 14:59:43 +0000
+++ b/bzrlib/knit.py 2009-03-25 02:03:41 +0000
@@ -1609,6 +1609,7 @@
# KnitVersionedFiles doesn't permit deltas (_max_delta_chain ==
# 0) or because it depends on a base only present in the
# fallback kvfs.
+ self._access.flush()
try:
# Try getting a fulltext directly from the record.
bytes = record.get_bytes_as('fulltext')
@@ -3049,6 +3050,13 @@
result.append((key, base, size))
return result
+ def flush(self):
+ """Flush pending writes on this access object.
+
+ For .knit files this is a no-op.
+ """
+ pass
+
def get_raw_records(self, memos_for_retrieval):
"""Get the raw bytes for a records.
@@ -3079,7 +3087,7 @@
class _DirectPackAccess(object):
"""Access to data in one or more packs with less translation."""
- def __init__(self, index_to_packs, reload_func=None):
+ def __init__(self, index_to_packs, reload_func=None, flush_func=None):
"""Create a _DirectPackAccess object.
:param index_to_packs: A dict mapping index objects to the transport
@@ -3092,6 +3100,7 @@
self._write_index = None
self._indices = index_to_packs
self._reload_func = reload_func
+ self._flush_func = flush_func
def add_raw_records(self, key_sizes, raw_data):
"""Add raw knit bytes to a storage area.
@@ -3119,6 +3128,14 @@
result.append((self._write_index, p_offset, p_length))
return result
+ def flush(self):
+ """Flush pending writes on this access object.
+
+ This will flush any buffered writes to a NewPack.
+ """
+ if self._flush_func is not None:
+ self._flush_func()
+
def get_raw_records(self, memos_for_retrieval):
"""Get the raw bytes for a records.
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py 2009-03-24 01:53:42 +0000
+++ b/bzrlib/repofmt/pack_repo.py 2009-03-25 02:03:41 +0000
@@ -532,7 +532,7 @@
# XXX: Probably 'can be written to' could/should be separated from 'acts
# like a knit index' -- mbp 20071024
- def __init__(self, reload_func=None):
+ def __init__(self, reload_func=None, flush_func=None):
"""Create an AggregateIndex.
:param reload_func: A function to call if we find we are missing an
@@ -543,7 +543,8 @@
self.index_to_pack = {}
self.combined_index = CombinedGraphIndex([], reload_func=reload_func)
self.data_access = _DirectPackAccess(self.index_to_pack,
- reload_func=reload_func)
+ reload_func=reload_func,
+ flush_func=flush_func)
self.add_callback = None
def replace_indices(self, index_to_pack, indices):
@@ -1322,10 +1323,11 @@
# when a pack is being created by this object, the state of that pack.
self._new_pack = None
# aggregated revision index data
- self.revision_index = AggregateIndex(self.reload_pack_names)
- self.inventory_index = AggregateIndex(self.reload_pack_names)
- self.text_index = AggregateIndex(self.reload_pack_names)
- self.signature_index = AggregateIndex(self.reload_pack_names)
+ flush = self._flush_new_pack
+ self.revision_index = AggregateIndex(self.reload_pack_names, flush)
+ self.inventory_index = AggregateIndex(self.reload_pack_names, flush)
+ self.text_index = AggregateIndex(self.reload_pack_names, flush)
+ self.signature_index = AggregateIndex(self.reload_pack_names, flush)
# resumed packs
self._resumed_packs = []
@@ -1452,6 +1454,10 @@
for revision_count, packs in pack_operations:
self._obsolete_packs(packs)
+ def _flush_new_pack(self):
+ if self._new_pack is not None:
+ self._new_pack.flush()
+
def lock_names(self):
"""Acquire the mutex around the pack-names index.
=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py 2009-03-24 01:53:42 +0000
+++ b/bzrlib/repository.py 2009-03-25 02:03:41 +0000
@@ -3750,6 +3750,24 @@
def _locked_insert_stream(self, stream, src_format):
to_serializer = self.target_repo._format._serializer
src_serializer = src_format._serializer
+ if to_serializer == src_serializer:
+ # If serializers match and the target is a pack repository, set the
+ # write cache size on the new pack. This avoids poor performance
+ # on transports where append is unbuffered (such as
+ # RemoteTransport). This is safe to do because nothing should read
+ # back from the target repository while a stream with matching
+ # serialization is being inserted.
+ # The exception is that a delta record from the source that should
+ # be a fulltext may need to be expanded by the target (see
+ # test_fetch_revisions_with_deltas_into_pack); but we take care to
+ # explicitly flush any buffered writes first in that rare case.
+ try:
+ new_pack = self.target_repo._pack_collection._new_pack
+ except AttributeError:
+ # Not a pack repository
+ pass
+ else:
+ new_pack.set_write_cache_size(1024*1024)
for substream_type, substream in stream:
if substream_type == 'texts':
self.target_repo.texts.insert_record_stream(substream)
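The repository.py hunk detects a pack repository by duck typing rather than an isinstance check: it reaches for ``_pack_collection._new_pack`` and treats ``AttributeError`` as "not a pack repository". A minimal sketch of that guard, using hypothetical stand-in classes:

```python
# Hypothetical stand-ins illustrating the duck-typed guard used in
# _locked_insert_stream: buffering is enabled only when the serializers
# match and the target exposes a pack collection.

class NewPack:
    def __init__(self):
        self.write_cache_size = None

    def set_write_cache_size(self, size):
        self.write_cache_size = size

def maybe_enable_buffering(target_repo, src_serializer, to_serializer):
    if to_serializer != src_serializer:
        # A cross-format stream may need to read back from the target
        # (e.g. to expand deltas), so buffering would be unsafe.
        return False
    try:
        new_pack = target_repo._pack_collection._new_pack
    except AttributeError:
        # Not a pack repository: nothing to buffer.
        return False
    new_pack.set_write_cache_size(1024 * 1024)
    return True
```

Duck typing keeps the sink format-agnostic: any repository without a ``_pack_collection`` simply falls through to the unbuffered path.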