Rev 2885: (robertc) Transport improvements from packs: new trace+ transport, and readv adjust_for_latency feature, as well as test harness cleanup. (Robert Collins) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Thu Oct 4 08:12:16 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2885
revision-id: pqm at pqm.ubuntu.com-20071004071214-i0icltanhq59qtwt
parent: pqm at pqm.ubuntu.com-20071004063818-ymke1egbfap2m31x
parent: robertc at robertcollins.net-20071004053008-aquje8fyntc8q69h
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2007-10-04 08:12:14 +0100
message:
  (robertc) Transport improvements from packs: new trace+ transport, and readv adjust_for_latency feature, as well as test harness cleanup. (Robert Collins)
added:
  bzrlib/transport/trace.py      trace.py-20070828055009-7kt0bbc4t4b92apz-1
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/registry.py             lazy_factory.py-20060809213415-2gfvqadtvdn0phtg-1
  bzrlib/tests/test_knit.py      test_knit.py-20051212171302-95d4c00dd5f11f2b
  bzrlib/tests/test_transport.py testtransport.py-20050718175618-e5cdb99f4555ddce
  bzrlib/tests/test_transport_implementations.py test_transport_implementations.py-20051227111451-f97c5c7d5c49fce7
  bzrlib/transport/__init__.py   transport.py-20050711165921-4978aa7ce1285ad5
  bzrlib/transport/decorator.py  decorator.py-20060402223305-e913a0f25319ab42
  bzrlib/transport/fakevfat.py   fakevfat.py-20060407072414-d59939fa1d6c79d9
  bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
  bzrlib/transport/remote.py     ssh.py-20060608202016-c25gvf1ob7ypbus6-1
  bzrlib/transport/sftp.py       sftp.py-20051019050329-ab48ce71b7e32dfe
    ------------------------------------------------------------
    revno: 2745.5.6
    merged: robertc at robertcollins.net-20071004053008-aquje8fyntc8q69h
    parent: robertc at robertcollins.net-20071004050958-3uxw4hyf0f6xq7fy
    parent: robertc at robertcollins.net-20071004030526-25d0wxbbvqtjezau
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Thu 2007-10-04 15:30:08 +1000
    message:
      Fix knit test fallout from final readv api change.
    ------------------------------------------------------------
    revno: 2745.5.5
    merged: robertc at robertcollins.net-20071004050958-3uxw4hyf0f6xq7fy
    parent: robertc at robertcollins.net-20071004045943-1wxlrsr37yppwp64
    parent: pqm at pqm.ubuntu.com-20071003232450-c831pepea3skddct
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Thu 2007-10-04 15:09:58 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2745.5.4
    merged: robertc at robertcollins.net-20071004045943-1wxlrsr37yppwp64
    parent: robertc at robertcollins.net-20071002053339-vnyjf4jrxv0jeekf
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Thu 2007-10-04 14:59:43 +1000
    message:
      Review feedback.
    ------------------------------------------------------------
    revno: 2745.5.3
    merged: robertc at robertcollins.net-20071002053339-vnyjf4jrxv0jeekf
    parent: robertc at robertcollins.net-20070827041949-br263tkuxayldxoc
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Tue 2007-10-02 15:33:39 +1000
    message:
      * Move transport logging into a new transport class
        TransportTraceDecorator (trace+ to get it from a url).
      * Give Registry a useful __repr__.
      * Fix a bug introduced by the change to use a Registry for transport where
        the transport loading tests left global state behind due to the
        _get_protocol_handlers method returning a reference, not a value object.
      * Add an upper_limit parameter to readv, because when asking for byte
        ranges within the latency-adjustment window near the end of the file
        causes errors that are tricky to manage.
      * A few minor drive-by formatting fixes.
      * The TransportDecorator constructor now accepts a _from_transport
        parameter for decorators that want to share state (used by the trace
        decorator)
    ------------------------------------------------------------
    revno: 2745.5.2
    merged: robertc at robertcollins.net-20070827041949-br263tkuxayldxoc
    parent: robertc at robertcollins.net-20070826221051-46uq33p3oqkscdd0
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Mon 2007-08-27 14:19:49 +1000
    message:
      * ``bzrlib.transport.Transport.put_file`` now returns the number of bytes
        put by the method call, to allow avoiding stat-after write or
        housekeeping in callers. (Robert Collins)
    ------------------------------------------------------------
    revno: 2745.5.1
    merged: robertc at robertcollins.net-20070826221051-46uq33p3oqkscdd0
    parent: pqm at pqm.ubuntu.com-20070823005013-ada9x55rc31yiwou
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: transport-get-file
    timestamp: Mon 2007-08-27 08:10:51 +1000
    message:
      * New parameter on ``bzrlib.transport.Transport.readv``
        ``adjust_for_latency`` which changes readv from returning strictly the
        requested data to inserted return larger ranges and in forward read order
        to reduce the effect of network latency. (Robert Collins)
=== added file 'bzrlib/transport/trace.py'
--- a/bzrlib/transport/trace.py	1970-01-01 00:00:00 +0000
+++ b/bzrlib/transport/trace.py	2007-10-04 04:59:43 +0000
@@ -0,0 +1,162 @@
+# Copyright (C) 2007 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+
+"""Implementation of Transport that traces transport operations.
+
+This does not change the transport behaviour at all, merely records every call
+and then delegates it.
+"""
+
+from bzrlib.transport.decorator import TransportDecorator, DecoratorServer
+
+
+class TransportTraceDecorator(TransportDecorator):
+    """A tracing decorator for Transports.
+
+    Calls that potentially perform IO are logged to self._activity. The
+    _activity attribute is shared as the transport is cloned, but not if a new
+    transport is created without cloning.
+
+    Not all operations are logged at this point, if you need an unlogged
+    operation please add a test to the tests of this transport, for the logging
+    of the operation you want logged.
+
+    Another future enhancement would be to log to bzrlib.trace.mutter when
+    trace+ is used from the command line (or perhaps as well/instead use
+    -Dtransport), to make tracing operations of the entire program easily.
+    """
+
+    def __init__(self, url, _decorated=None, _from_transport=None):
+        """Set the 'base' path where files will be stored.
+        
+        _decorated is a private parameter for cloning.
+        """
+        TransportDecorator.__init__(self, url, _decorated)
+        if _from_transport is None:
+            # newly created
+            self._activity = []
+        else:
+            # cloned
+            self._activity = _from_transport._activity
+
+    def append_file(self, relpath, f, mode=None):
+        """See Transport.append_file()."""
+        return self._decorated.append_file(relpath, f, mode=mode)
+
+    def append_bytes(self, relpath, bytes, mode=None):
+        """See Transport.append_bytes()."""
+        return self._decorated.append_bytes(relpath, bytes, mode=mode)
+
+    def delete(self, relpath):
+        """See Transport.delete()."""
+        return self._decorated.delete(relpath)
+
+    def delete_tree(self, relpath):
+        """See Transport.delete_tree()."""
+        return self._decorated.delete_tree(relpath)
+
+    @classmethod
+    def _get_url_prefix(self):
+        """Tracing transports are identified by 'trace+'"""
+        return 'trace+'
+
+    def get(self, relpath):
+        """See Transport.get()."""
+        self._activity.append(('get', relpath))
+        return self._decorated.get(relpath)
+
+    def get_smart_client(self):
+        return self._decorated.get_smart_client()
+
+    def has(self, relpath):
+        """See Transport.has()."""
+        return self._decorated.has(relpath)
+
+    def is_readonly(self):
+        """See Transport.is_readonly."""
+        return self._decorated.is_readonly()
+
+    def mkdir(self, relpath, mode=None):
+        """See Transport.mkdir()."""
+        return self._decorated.mkdir(relpath, mode)
+
+    def open_write_stream(self, relpath, mode=None):
+        """See Transport.open_write_stream."""
+        return self._decorated.open_write_stream(relpath, mode=mode)
+
+    def put_file(self, relpath, f, mode=None):
+        """See Transport.put_file()."""
+        return self._decorated.put_file(relpath, f, mode)
+    
+    def put_bytes(self, relpath, bytes, mode=None):
+        """See Transport.put_bytes()."""
+        self._activity.append(('put_bytes', relpath, len(bytes), mode))
+        return self._decorated.put_bytes(relpath, bytes, mode)
+
+    def listable(self):
+        """See Transport.listable."""
+        return self._decorated.listable()
+
+    def iter_files_recursive(self):
+        """See Transport.iter_files_recursive()."""
+        return self._decorated.iter_files_recursive()
+    
+    def list_dir(self, relpath):
+        """See Transport.list_dir()."""
+        return self._decorated.list_dir(relpath)
+
+    def readv(self, relpath, offsets, adjust_for_latency=False,
+        upper_limit=None):
+        """See Transport.readv."""
+        self._activity.append(('readv', relpath, offsets, adjust_for_latency,
+            upper_limit))
+        return self._decorated.readv(relpath, offsets, adjust_for_latency,
+            upper_limit)
+
+    def recommended_page_size(self):
+        """See Transport.recommended_page_size()."""
+        return self._decorated.recommended_page_size()
+
+    def rename(self, rel_from, rel_to):
+        return self._decorated.rename(rel_from, rel_to)
+    
+    def rmdir(self, relpath):
+        """See Transport.rmdir."""
+        return self._decorated.rmdir(relpath)
+
+    def stat(self, relpath):
+        """See Transport.stat()."""
+        return self._decorated.stat(relpath)
+
+    def lock_read(self, relpath):
+        """See Transport.lock_read."""
+        return self._decorated.lock_read(relpath)
+
+    def lock_write(self, relpath):
+        """See Transport.lock_write."""
+        return self._decorated.lock_write(relpath)
+
+
+class TraceServer(DecoratorServer):
+    """Server for the TransportTraceDecorator for testing with."""
+
+    def get_decorator_class(self):
+        return TransportTraceDecorator
+
+
+def get_test_permutations():
+    """Return the permutations to be used in testing."""
+    return [(TransportTraceDecorator, TraceServer)]

=== modified file 'NEWS'
--- a/NEWS	2007-10-04 03:05:26 +0000
+++ b/NEWS	2007-10-04 05:30:08 +0000
@@ -127,6 +127,9 @@
      enable_cache() has been called - the caching feature is now exclusively for
      reading existing data. (Robert Collins)
 
+   * Removed ``bzrlib.transport.TransportLogger`` - please see the new
+     ``trace+`` transport instead. (Robert Collins)
+
    * Removed previously deprecated varargs interface to ``TestCase.run_bzr`` and
      deprecated methods ``TestCase.capture`` and ``TestCase.run_bzr_captured``.
      (Martin Pool)
@@ -155,9 +158,9 @@
 
   INTERNALS:
 
-    * ``bzrlib.transport.Transport.put_file`` now returns the number of bytes
-      put by the method call, to allow avoiding stat-after-write or
-      housekeeping in callers. (Robert Collins)
+   * ``bzrlib.transport.Transport.put_file`` now returns the number of bytes
+     put by the method call, to allow avoiding stat-after-write or
+     housekeeping in callers. (Robert Collins)
 
    * New class ``bzrlib.errors.InternalBzrError`` which is just a convenient
      shorthand for deriving from BzrError and setting internal_error = True.
@@ -182,6 +185,11 @@
      and returns a gzipped version of the same. This is used to avoid a bunch
      of api friction during adding of knit hunks. (Robert Collins)
 
+   * New parameter on ``bzrlib.transport.Transport.readv``
+     ``adjust_for_latency`` which changes readv from returning strictly the
+     requested data to inserted return larger ranges and in forward read order
+     to reduce the effect of network latency. (Robert Collins)
+
    * New parameter yield_parents on ``Inventory.iter_entries_by_dir`` which
      causes the parents of a selected id to be returned recursively, so all the
      paths from the root down to each element of selected_file_ids are
@@ -198,6 +206,9 @@
 
   TESTING:
 
+   * New transport implementation ``trace+`` which is useful for testing,
+     logging activity taken to its _activity attribute. (Robert Collins)
+
    * When running bzr commands within the test suite, internal exceptions are
      not caught and reported in the usual way, but rather allowed to propagate
      up and be visible to the test suite.  A new API ``run_bzr_catch_user_errors``

=== modified file 'bzrlib/registry.py'
--- a/bzrlib/registry.py	2007-01-25 00:35:22 +0000
+++ b/bzrlib/registry.py	2007-10-02 05:33:39 +0000
@@ -69,6 +69,11 @@
         self._obj = obj
         self._imported = True
 
+    def __repr__(self):
+        return "<%s.%s object at %x, module=%r attribute=%r>" % (
+            self.__class__.__module__, self.__class__.__name__, id(self),
+            self._module_name, self._member_name)
+
 
 class Registry(object):
     """A class that registers objects to a name.

=== modified file 'bzrlib/tests/test_knit.py'
--- a/bzrlib/tests/test_knit.py	2007-09-25 06:41:04 +0000
+++ b/bzrlib/tests/test_knit.py	2007-10-04 05:30:08 +0000
@@ -57,7 +57,7 @@
     TestCaseWithMemoryTransport,
     TestCaseWithTransport,
     )
-from bzrlib.transport import TransportLogger, get_transport
+from bzrlib.transport import get_transport
 from bzrlib.transport.memory import MemoryTransport
 from bzrlib.util import bencode
 from bzrlib.weave import Weave
@@ -1390,18 +1390,22 @@
         k1.get_texts(('%d' % t) for t in range(3))
         
     def test_iter_lines_reads_in_order(self):
-        t = MemoryTransport()
-        instrumented_t = TransportLogger(t)
+        instrumented_t = get_transport('trace+memory:///')
         k1 = KnitVersionedFile('id', instrumented_t, create=True, delta=True)
-        self.assertEqual([('id.kndx',)], instrumented_t._calls)
+        self.assertEqual([('get', 'id.kndx',)], instrumented_t._activity)
         # add texts with no required ordering
         k1.add_lines('base', [], ['text\n'])
         k1.add_lines('base2', [], ['text2\n'])
         k1.clear_cache()
-        instrumented_t._calls = []
+        # clear the logged activity, but preserve the list instance in case of
+        # clones pointing at it.
+        del instrumented_t._activity[:]
         # request a last-first iteration
-        results = list(k1.iter_lines_added_or_present_in_versions(['base2', 'base']))
-        self.assertEqual([('id.knit', [(0, 87), (87, 89)])], instrumented_t._calls)
+        results = list(k1.iter_lines_added_or_present_in_versions(
+            ['base2', 'base']))
+        self.assertEqual(
+            [('readv', 'id.knit', [(0, 87), (87, 89)], False, None)],
+            instrumented_t._activity)
         self.assertEqual(['text\n', 'text2\n'], results)
 
     def test_create_empty_annotated(self):

=== modified file 'bzrlib/tests/test_transport.py'
--- a/bzrlib/tests/test_transport.py	2007-10-04 05:50:44 +0000
+++ b/bzrlib/tests/test_transport.py	2007-10-04 07:12:14 +0000
@@ -39,7 +39,8 @@
                            UnsupportedProtocol,
                            )
 from bzrlib.tests import TestCase, TestCaseInTempDir
-from bzrlib.transport import (_CoalescedOffset,
+from bzrlib.transport import (_clear_protocol_handlers,
+                              _CoalescedOffset,
                               ConnectedTransport,
                               _get_protocol_handlers,
                               _set_protocol_handlers,
@@ -78,6 +79,8 @@
 
     def test_get_transport_modules(self):
         handlers = _get_protocol_handlers()
+        # don't pollute the current handlers
+        _clear_protocol_handlers()
         class SampleHandler(object):
             """I exist, isnt that enough?"""
         try:
@@ -94,6 +97,8 @@
     def test_transport_dependency(self):
         """Transport with missing dependency causes no error"""
         saved_handlers = _get_protocol_handlers()
+        # don't pollute the current handlers
+        _clear_protocol_handlers()
         try:
             register_transport_proto('foo')
             register_lazy_transport('foo', 'bzrlib.tests.test_transport',
@@ -760,12 +765,44 @@
         self.assertEquals('bzr://host.com:666/', t.base)
 
 
-
-def get_test_permutations():
-    """Return transport permutations to be used in testing.
-
-    This module registers some transports, but they're only for testing
-    registration.  We don't really want to run all the transport tests against
-    them.
-    """
-    return []
+class TestTransportTrace(TestCase):
+
+    def test_get(self):
+        transport = get_transport('trace+memory://')
+        self.assertIsInstance(
+            transport, bzrlib.transport.trace.TransportTraceDecorator)
+
+    def test_clone_preserves_activity(self):
+        transport = get_transport('trace+memory://')
+        transport2 = transport.clone('.')
+        self.assertTrue(transport is not transport2)
+        self.assertTrue(transport._activity is transport2._activity)
+
+    # the following specific tests are for the operations that have made use of
+    # logging in tests; we could test every single operation but doing that
+    # still won't cause a test failure when the top level Transport API
+    # changes; so there is little return doing that.
+    def test_get(self):
+        transport = get_transport('trace+memory:///')
+        transport.put_bytes('foo', 'barish')
+        transport.get('foo')
+        expected_result = []
+        # put_bytes records the bytes, not the content to avoid memory
+        # pressure.
+        expected_result.append(('put_bytes', 'foo', 6, None))
+        # get records the file name only.
+        expected_result.append(('get', 'foo'))
+        self.assertEqual(expected_result, transport._activity)
+
+    def test_readv(self):
+        transport = get_transport('trace+memory:///')
+        transport.put_bytes('foo', 'barish')
+        list(transport.readv('foo', [(0, 1), (3, 2)], adjust_for_latency=True,
+            upper_limit=6))
+        expected_result = []
+        # put_bytes records the bytes, not the content to avoid memory
+        # pressure.
+        expected_result.append(('put_bytes', 'foo', 6, None))
+        # readv records the supplied offset request
+        expected_result.append(('readv', 'foo', [(0, 1), (3, 2)], True, 6))
+        self.assertEqual(expected_result, transport._activity)

=== modified file 'bzrlib/tests/test_transport_implementations.py'
--- a/bzrlib/tests/test_transport_implementations.py	2007-09-24 20:13:29 +0000
+++ b/bzrlib/tests/test_transport_implementations.py	2007-10-04 05:09:58 +0000
@@ -1488,6 +1488,71 @@
         self.assertEqual(d[2], (0, '0'))
         self.assertEqual(d[3], (3, '34'))
 
+    def test_readv_with_adjust_for_latency(self):
+        transport = self.get_transport()
+        # the adjust for latency flag expands the data region returned
+        # according to a per-transport heuristic, so testing is a little
+        # tricky as we need more data than the largest combining that our
+        # transports do. To accomodate this we generate random data and cross
+        # reference the returned data with the random data. To avoid doing
+        # multiple large random byte look ups we do several tests on the same
+        # backing data.
+        content = osutils.rand_bytes(200*1024)
+        content_size = len(content)
+        if transport.is_readonly():
+            file('a', 'w').write(content)
+        else:
+            transport.put_bytes('a', content)
+        def check_result_data(result_vector):
+            for item in result_vector:
+                data_len = len(item[1])
+                self.assertEqual(content[item[0]:item[0] + data_len], item[1])
+
+        # start corner case
+        result = list(transport.readv('a', ((0, 30),),
+            adjust_for_latency=True, upper_limit=content_size))
+        # we expect 1 result, from 0, to something > 30
+        self.assertEqual(1, len(result))
+        self.assertEqual(0, result[0][0])
+        self.assertTrue(len(result[0][1]) >= 30)
+        check_result_data(result)
+        # end of file corner case
+        result = list(transport.readv('a', ((204700, 100),),
+            adjust_for_latency=True, upper_limit=content_size))
+        # we expect 1 result, from 204800- its length, to the end
+        self.assertEqual(1, len(result))
+        data_len = len(result[0][1])
+        self.assertEqual(204800-data_len, result[0][0])
+        self.assertTrue(data_len >= 100)
+        check_result_data(result)
+        # out of order ranges are made in order
+        result = list(transport.readv('a', ((204700, 100), (0, 50)),
+            adjust_for_latency=True, upper_limit=content_size))
+        # we expect 2 results, in order, start and end.
+        self.assertEqual(2, len(result))
+        # start
+        data_len = len(result[0][1])
+        self.assertEqual(0, result[0][0])
+        self.assertTrue(data_len >= 30)
+        # end
+        data_len = len(result[1][1])
+        self.assertEqual(204800-data_len, result[1][0])
+        self.assertTrue(data_len >= 100)
+        check_result_data(result)
+        # close ranges get combined (even if out of order)
+        for request_vector in [((400,50), (800, 234)), ((800, 234), (400,50))]:
+            result = list(transport.readv('a', request_vector,
+                adjust_for_latency=True, upper_limit=content_size))
+            self.assertEqual(1, len(result))
+            data_len = len(result[0][1])
+            # minimmum length is from 400 to 1034 - 634
+            self.assertTrue(data_len >= 634)
+            # must contain the region 400 to 1034
+            self.assertTrue(result[0][0] <= 400)
+            self.assertTrue(result[0][0] + data_len >= 1034)
+            check_result_data(result)
+        
+
     def test_get_with_open_write_stream_sees_all_content(self):
         t = self.get_transport()
         if t.is_readonly():

=== modified file 'bzrlib/transport/__init__.py'
--- a/bzrlib/transport/__init__.py	2007-10-03 05:25:50 +0000
+++ b/bzrlib/transport/__init__.py	2007-10-04 05:09:58 +0000
@@ -99,7 +99,7 @@
                 modules.add(factory._module_name)
             else:
                 modules.add(factory._obj.__module__)
-    # Add chroot directly, because there is not handler registered for it.
+    # Add chroot directly, because there is no handler registered for it.
     modules.add('bzrlib.transport.chroot')
     result = list(modules)
     result.sort()
@@ -125,7 +125,7 @@
         self.get(key).insert(0, registry._ObjectGetter(obj))
 
     def register_lazy_transport_provider(self, key, module_name, member_name):
-        self.get(key).insert(0, 
+        self.get(key).insert(0,
                 registry._LazyObjectGetter(module_name, member_name))
 
     def register_transport(self, key, help=None, default_port=None):
@@ -143,7 +143,7 @@
             return None
 
 
-transport_list_registry = TransportListRegistry( )
+transport_list_registry = TransportListRegistry()
 
 
 def register_transport_proto(prefix, help=None, info=None, default_port=None):
@@ -642,10 +642,72 @@
         """
         raise errors.NoSmartMedium(self)
 
-    def readv(self, relpath, offsets):
-        """Get parts of the file at the given relative path.
-
-        :offsets: A list of (offset, size) tuples.
+    def readv(self, relpath, offsets, adjust_for_latency=False,
+        upper_limit=None):
+        """Get parts of the file at the given relative path.
+
+        :param relpath: The path to read data from.
+        :param offsets: A list of (offset, size) tuples.
+        :param adjust_for_latency: Adjust the requested offsets to accomdate
+            transport latency. This may re-order the offsets, expand them to
+            grab adjacent data when there is likely a high cost to requesting
+            data relative to delivering it.
+        :param upper_limit: When adjust_for_latency is True setting upper_limit
+            allows the caller to tell the transport about the length of the
+            file, so that requests are not issued for ranges beyond the end of
+            the file. This matters because some servers and/or transports error
+            in such a case rather than just satisfying the available ranges.
+            upper_limit should always be provided when adjust_for_latency is
+            True, and should be the size of the file in bytes.
+        :return: A list or generator of (offset, data) tuples
+        """
+        if adjust_for_latency:
+            offsets = sorted(offsets)
+            # short circuit empty requests
+            if len(offsets) == 0:
+                def empty_yielder():
+                    # Quick thunk to stop this function becoming a generator
+                    # itself, rather we return a generator that has nothing to
+                    # yield.
+                    if False:
+                        yield None
+                return empty_yielder()
+            # expand by page size at either end
+            expansion = self.recommended_page_size()
+            reduction = expansion / 2
+            new_offsets = []
+            for offset, length in offsets:
+                new_offset = offset - reduction
+                new_length = length + expansion
+                if new_offset < 0:
+                    # don't ask for anything < 0
+                    new_length -= new_offset
+                    new_offset = 0
+                if (upper_limit is not None and
+                    new_offset + new_length > upper_limit):
+                    new_length = upper_limit - new_offset
+                new_offsets.append((new_offset, new_length))
+            # combine the expanded offsets
+            offsets = []
+            current_offset, current_length = new_offsets[0]
+            current_finish = current_length + current_offset
+            for offset, length in new_offsets[1:]:
+                if offset > current_finish:
+                    offsets.append((current_offset, current_length))
+                    current_offset = offset
+                    current_length = length
+                    continue
+                finish = offset + length
+                if finish > current_finish:
+                    current_finish = finish
+            offsets.append((current_offset, current_length))
+        return self._readv(relpath, offsets)
+
+    def _readv(self, relpath, offsets):
+        """Get parts of the file at the given relative path.
+
+        :param relpath: The path to read.
+        :param offsets: A list of (offset, size) tuples.
         :return: A list or generator of (offset, data) tuples
         """
         if not offsets:
@@ -1609,30 +1671,6 @@
         raise NotImplementedError
 
 
-class TransportLogger(object):
-    """Adapt a transport to get clear logging data on api calls.
-    
-    Feel free to extend to log whatever calls are of interest.
-    """
-
-    def __init__(self, adapted):
-        self._adapted = adapted
-        self._calls = []
-
-    def get(self, name):
-        self._calls.append((name,))
-        return self._adapted.get(name)
-
-    def __getattr__(self, name):
-        """Thunk all undefined access through to self._adapted."""
-        # raise AttributeError, name 
-        return getattr(self._adapted, name)
-
-    def readv(self, name, offsets):
-        self._calls.append((name, offsets))
-        return self._adapted.readv(name, offsets)
-
-
 # None is the default transport, for things with no url scheme
 register_transport_proto('file://',
             help="Access using the standard filesystem (default)")
@@ -1689,6 +1727,8 @@
 
 register_transport_proto('memory://')
 register_lazy_transport('memory://', 'bzrlib.transport.memory', 'MemoryTransport')
+
+# chroots cannot be implicitly accessed, they must be explicitly created:
 register_transport_proto('chroot+')
 
 register_transport_proto('readonly+',
@@ -1699,6 +1739,9 @@
 register_transport_proto('fakenfs+')
 register_lazy_transport('fakenfs+', 'bzrlib.transport.fakenfs', 'FakeNFSTransportDecorator')
 
+register_transport_proto('trace+')
+register_lazy_transport('trace+', 'bzrlib.transport.trace', 'TransportTraceDecorator')
+
 register_transport_proto('unlistable+')
 register_lazy_transport('unlistable+', 'bzrlib.transport.unlistable', 'UnlistableTransportDecorator')
 

=== modified file 'bzrlib/transport/decorator.py'
--- a/bzrlib/transport/decorator.py	2007-08-22 01:41:24 +0000
+++ b/bzrlib/transport/decorator.py	2007-10-02 05:33:39 +0000
@@ -37,10 +37,13 @@
     method to return the url prefix for the subclass.
     """
 
-    def __init__(self, url, _decorated=None):
-        """Set the 'base' path where files will be stored.
+    def __init__(self, url, _decorated=None, _from_transport=None):
+        """Set the 'base' path of the transport.
         
-        _decorated is a private parameter for cloning."""
+        :param _decorated: A private parameter for cloning.
+        :param _from_transport: Is available for subclasses that
+            need to share state across clones.
+        """
         prefix = self._get_url_prefix()
         assert url.startswith(prefix), \
                 "url %r doesn't start with decorator prefix %r" % \
@@ -73,7 +76,8 @@
         """See Transport.clone()."""
         decorated_clone = self._decorated.clone(offset)
         return self.__class__(
-            self._get_url_prefix() + decorated_clone.base, decorated_clone)
+            self._get_url_prefix() + decorated_clone.base, decorated_clone,
+            self)
 
     def delete(self, relpath):
         """See Transport.delete()."""
@@ -138,6 +142,10 @@
         """See Transport.list_dir()."""
         return self._decorated.list_dir(relpath)
 
+    def _readv(self, relpath, offsets):
+        """See Transport._readv."""
+        return self._decorated._readv(relpath, offsets)
+
     def recommended_page_size(self):
         """See Transport.recommended_page_size()."""
         return self._decorated.recommended_page_size()

=== modified file 'bzrlib/transport/fakevfat.py'
--- a/bzrlib/transport/fakevfat.py	2007-08-15 06:53:07 +0000
+++ b/bzrlib/transport/fakevfat.py	2007-08-26 22:10:51 +0000
@@ -92,7 +92,7 @@
     def has(self, relpath):
         return self._decorated.has(self._squash_name(relpath))
 
-    def readv(self, relpath, offsets):
+    def _readv(self, relpath, offsets):
         return self._decorated.readv(self._squash_name(relpath), offsets)
 
     def put_file(self, relpath, f, mode=None):

=== modified file 'bzrlib/transport/http/__init__.py'
--- a/bzrlib/transport/http/__init__.py	2007-08-05 01:47:30 +0000
+++ b/bzrlib/transport/http/__init__.py	2007-08-26 22:10:51 +0000
@@ -252,7 +252,7 @@
     # to avoid downloading the whole file.
     _max_readv_combined = 0
 
-    def readv(self, relpath, offsets):
+    def _readv(self, relpath, offsets):
         """Get parts of the file at the given relative path.
 
         :param offsets: A list of (offset, size) tuples.

=== modified file 'bzrlib/transport/remote.py'
--- a/bzrlib/transport/remote.py	2007-10-03 05:25:50 +0000
+++ b/bzrlib/transport/remote.py	2007-10-04 05:09:58 +0000
@@ -291,7 +291,7 @@
         # the external path for RemoteTransports is the base
         return self.base
 
-    def readv(self, relpath, offsets):
+    def _readv(self, relpath, offsets):
         if not offsets:
             return
 

=== modified file 'bzrlib/transport/sftp.py'
--- a/bzrlib/transport/sftp.py	2007-09-24 20:13:29 +0000
+++ b/bzrlib/transport/sftp.py	2007-10-04 05:09:58 +0000
@@ -247,7 +247,7 @@
             self._translate_io_exception(e, path, ': error retrieving',
                 failure_exc=errors.ReadError)
 
-    def readv(self, relpath, offsets):
+    def _readv(self, relpath, offsets):
         """See Transport.readv()"""
         # We overload the default readv() because we want to use a file
         # that does not have prefetch enabled.




More information about the bazaar-commits mailing list