Rev 2741: (robertc) Add two new transport methods to help pack repositories, get_recommended_page_size and open_write_stream. (Robert Collins). in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Wed Aug 22 03:49:22 BST 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 2741
revision-id: pqm at pqm.ubuntu.com-20070822024917-nw7dh478y4d8cjeg
parent: pqm at pqm.ubuntu.com-20070822013256-6w9yisc450hwqf2b
parent: robertc at robertcollins.net-20070822014124-wiinlne4nin2f2tm
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2007-08-22 03:49:17 +0100
message:
(robertc) Add two new transport methods to help pack repositories, get_recommended_page_size and open_write_stream. (Robert Collins).
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/tests/test_transport_implementations.py test_transport_implementations.py-20051227111451-f97c5c7d5c49fce7
bzrlib/transport/__init__.py transport.py-20050711165921-4978aa7ce1285ad5
bzrlib/transport/chroot.py chroot.py-20061011104729-0us9mgm97z378vnt-1
bzrlib/transport/decorator.py decorator.py-20060402223305-e913a0f25319ab42
bzrlib/transport/fakevfat.py fakevfat.py-20060407072414-d59939fa1d6c79d9
bzrlib/transport/ftp.py ftp.py-20051116161804-58dc9506548c2a53
bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
bzrlib/transport/local.py local_transport.py-20050711165921-9b1f142bfe480c24
bzrlib/transport/memory.py memory.py-20051016101338-cd008dbdf69f04fc
bzrlib/transport/remote.py ssh.py-20060608202016-c25gvf1ob7ypbus6-1
bzrlib/transport/sftp.py sftp.py-20051019050329-ab48ce71b7e32dfe
------------------------------------------------------------
revno: 2671.3.10
merged: robertc at robertcollins.net-20070822014124-wiinlne4nin2f2tm
parent: robertc at robertcollins.net-20070815065307-8xwdhnm2qmpi5nk2
parent: pqm at pqm.ubuntu.com-20070822013256-6w9yisc450hwqf2b
committer: Robert Collins <robertc at robertcollins.net>
branch nick: integration
timestamp: Wed 2007-08-22 11:41:24 +1000
message:
Merge bzr.dev to resolve conflicts.
------------------------------------------------------------
revno: 2671.3.9
merged: robertc at robertcollins.net-20070815065307-8xwdhnm2qmpi5nk2
parent: robertc at robertcollins.net-20070815012630-xqjtm5z2c4718n8s
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Wed 2007-08-15 16:53:07 +1000
message:
Review feedback and fix VFat emulated transports to not claim to have unix permissions.
------------------------------------------------------------
revno: 2671.3.8
merged: robertc at robertcollins.net-20070815012630-xqjtm5z2c4718n8s
parent: robertc at robertcollins.net-20070808071757-qfrx4dwms024ccy5
parent: pqm at pqm.ubuntu.com-20070814221506-6rw0b0oolfdeqrdw
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Wed 2007-08-15 11:26:30 +1000
message:
Merge bzr.dev.
------------------------------------------------------------
revno: 2671.3.7
merged: robertc at robertcollins.net-20070808071757-qfrx4dwms024ccy5
parent: robertc at robertcollins.net-20070808071618-4e1jopgxjj6g16ug
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Wed 2007-08-08 17:17:57 +1000
message:
Remove references to close_file_stream.
------------------------------------------------------------
revno: 2671.3.6
merged: robertc at robertcollins.net-20070808071618-4e1jopgxjj6g16ug
parent: robertc at robertcollins.net-20070805081501-ipg5fapwuigozr50
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Wed 2007-08-08 17:16:18 +1000
message:
Review feedback.
------------------------------------------------------------
revno: 2671.3.5
merged: robertc at robertcollins.net-20070805081501-ipg5fapwuigozr50
parent: robertc at robertcollins.net-20070805055353-k382i5ur5no56nnx
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Sun 2007-08-05 18:15:01 +1000
message:
* New methods on ``bzrlib.transport.Transport`` ``open_file_stream`` and
``close_file_stream`` allow incremental addition of data to a file
without requiring that all the data be buffered in memory.
(Robert Collins)
------------------------------------------------------------
revno: 2671.3.4
merged: robertc at robertcollins.net-20070805055353-k382i5ur5no56nnx
parent: robertc at robertcollins.net-20070805053815-jeb19qdogkh5zrq5
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Sun 2007-08-05 15:53:53 +1000
message:
Sync up with open file streams on get/get_bytes.
------------------------------------------------------------
revno: 2671.3.3
merged: robertc at robertcollins.net-20070805053815-jeb19qdogkh5zrq5
parent: robertc at robertcollins.net-20070805025745-eg2qmr8jzsky39y2
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Sun 2007-08-05 15:38:15 +1000
message:
Add mode parameter to Transport.open_file_stream.
------------------------------------------------------------
revno: 2671.3.2
merged: robertc at robertcollins.net-20070805025745-eg2qmr8jzsky39y2
parent: robertc at robertcollins.net-20070805014730-qjx8zkquv3pagglo
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Sun 2007-08-05 12:57:45 +1000
message:
Start open_file_stream logic.
------------------------------------------------------------
revno: 2671.3.1
merged: robertc at robertcollins.net-20070805014730-qjx8zkquv3pagglo
parent: pqm at pqm.ubuntu.com-20070803043116-l7u1uypblmx1uxnr
committer: Robert Collins <robertc at robertcollins.net>
branch nick: transport-get-file
timestamp: Sun 2007-08-05 11:47:30 +1000
message:
* New method ``bzrlib.transport.Transport.get_recommended_page_size``.
This provides a hint to users of transports as to the reasonable
minimum data to read. In principle this can take latency and
bandwidth into account on a per-connection basis, but for now it
just has hard coded values based on the url. (e.g. http:// has a large
page size, file:// has a small one.) (Robert Collins)
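
A minimal sketch of how a caller might use the page-size hint, assuming it wants to round a requested byte range outwards to whole pages before reading. The diff below defines the method as `recommended_page_size` (the log message calls it `get_recommended_page_size`), returning 4 * 1024 by default and 64 * 1024 for HTTP, FTP and SFTP. `FakeTransport` and `expand_to_pages` are hypothetical names used only for illustration; they are not part of bzrlib.

```python
class FakeTransport(object):
    """Hypothetical stand-in for a transport (not bzrlib code)."""

    def __init__(self, page_size):
        self._page_size = page_size

    def recommended_page_size(self):
        # Mirrors the method added in this revision: a size in bytes.
        return self._page_size


def expand_to_pages(transport, offset, length):
    """Round (offset, length) outwards to whole recommended pages.

    Hypothetical helper: returns a (start, length) pair covering the
    requested range, aligned to the transport's page size.
    """
    page = transport.recommended_page_size()
    start = (offset // page) * page
    # Ceiling division to find the end of the last page touched.
    end = -(-(offset + length) // page) * page
    return start, end - start


local_like = FakeTransport(4 * 1024)    # default value in the diff
http_like = FakeTransport(64 * 1024)    # value used for HTTP/FTP/SFTP

print(expand_to_pages(local_like, 5000, 100))
print(expand_to_pages(http_like, 5000, 100))
```

The same 100-byte request grows to one 4 KiB page on the low-latency transport and to one 64 KiB page on the high-latency one, which is the trade-off the hint is meant to express.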
=== modified file 'NEWS'
--- a/NEWS 2007-08-21 23:18:35 +0000
+++ b/NEWS 2007-08-22 01:41:24 +0000
@@ -64,6 +64,17 @@
* ``bzrlib.pack.ContainerWriter`` now tracks how many records have been
added via a public attribute records_written. (Robert Collins)
+ * New method ``bzrlib.transport.Transport.get_recommended_page_size``.
+ This provides a hint to users of transports as to the reasonable
+ minimum data to read. In principle this can take latency and
+ bandwidth into account on a per-connection basis, but for now it
+ just has hard coded values based on the url. (e.g. http:// has a large
+ page size, file:// has a small one.) (Robert Collins)
+
+ * New method on ``bzrlib.transport.Transport`` ``open_write_stream`` allows
+ incremental addition of data to a file without requiring that all the
+ data be buffered in memory. (Robert Collins)
+
bzr 0.90 2007-08-??
===================
=== modified file 'bzrlib/tests/test_transport_implementations.py'
--- a/bzrlib/tests/test_transport_implementations.py 2007-08-15 04:33:34 +0000
+++ b/bzrlib/tests/test_transport_implementations.py 2007-08-22 01:41:24 +0000
@@ -229,6 +229,29 @@
self.assertRaises(NoSuchFile, t.get_bytes, 'c')
+ def test_get_with_open_write_stream_sees_all_content(self):
+ t = self.get_transport()
+ if t.is_readonly():
+ return
+ handle = t.open_write_stream('foo')
+ try:
+ handle.write('b')
+ self.assertEqual('b', t.get('foo').read())
+ finally:
+ handle.close()
+
+ def test_get_bytes_with_open_write_stream_sees_all_content(self):
+ t = self.get_transport()
+ if t.is_readonly():
+ return
+ handle = t.open_write_stream('foo')
+ try:
+ handle.write('b')
+ self.assertEqual('b', t.get_bytes('foo'))
+ self.assertEqual('b', t.get('foo').read())
+ finally:
+ handle.close()
+
def test_put_bytes(self):
t = self.get_transport()
@@ -556,6 +579,33 @@
t.mkdir('dnomode', mode=None)
self.assertTransportMode(t, 'dnomode', 0777 & ~umask)
+ def test_opening_a_file_stream_creates_file(self):
+ t = self.get_transport()
+ if t.is_readonly():
+ return
+ handle = t.open_write_stream('foo')
+ try:
+ self.assertEqual('', t.get_bytes('foo'))
+ finally:
+ handle.close()
+
+ def test_opening_a_file_stream_can_set_mode(self):
+ t = self.get_transport()
+ if t.is_readonly():
+ return
+ if not t._can_roundtrip_unix_modebits():
+ # Can't roundtrip, so no need to run this test
+ return
+ def check_mode(name, mode, expected):
+ handle = t.open_write_stream(name, mode=mode)
+ handle.close()
+ self.assertTransportMode(t, name, expected)
+ check_mode('mode644', 0644, 0644)
+ check_mode('mode666', 0666, 0666)
+ check_mode('mode600', 0600, 0600)
+ # The default permissions should be based on the current umask
+ check_mode('nomode', None, 0666 & ~osutils.get_umask())
+
def test_copy_to(self):
# FIXME: test: same server to same server (partly done)
# same protocol two servers
@@ -769,6 +819,11 @@
# plain "listdir".
# self.assertEqual([], os.listdir('.'))
+ def test_recommended_page_size(self):
+ """Transports recommend a page size for partial access to files."""
+ t = self.get_transport()
+ self.assertIsInstance(t.recommended_page_size(), int)
+
def test_rmdir(self):
t = self.get_transport()
# Not much to do with a readonly transport
@@ -1430,6 +1485,17 @@
self.assertEqual(d[2], (0, '0'))
self.assertEqual(d[3], (3, '34'))
+ def test_get_with_open_write_stream_sees_all_content(self):
+ t = self.get_transport()
+ if t.is_readonly():
+ return
+ handle = t.open_write_stream('foo')
+ try:
+ handle.write('bcd')
+ self.assertEqual([(0, 'b'), (2, 'd')], list(t.readv('foo', ((0,1), (2,1)))))
+ finally:
+ handle.close()
+
def test_get_smart_medium(self):
"""All transports must either give a smart medium, or know they can't.
"""
=== modified file 'bzrlib/transport/__init__.py'
--- a/bzrlib/transport/__init__.py 2007-08-19 20:38:10 +0000
+++ b/bzrlib/transport/__init__.py 2007-08-22 01:41:24 +0000
@@ -66,6 +66,11 @@
from bzrlib import registry
+# a dictionary of open file streams. Keys are absolute paths, values are
+# transport defined.
+_file_streams = {}
+
+
def _get_protocol_handlers():
"""Return a dictionary of {urlprefix: [factory]}"""
return transport_list_registry
@@ -252,6 +257,49 @@
self._fail()
+class FileStream(object):
+ """Base class for FileStreams."""
+
+ def __init__(self, transport, relpath):
+ """Create a FileStream for relpath on transport."""
+ self.transport = transport
+ self.relpath = relpath
+
+ def _close(self):
+ """A hook point for subclasses that need to take action on close."""
+
+ def close(self):
+ self._close()
+ del _file_streams[self.transport.abspath(self.relpath)]
+
+
+class FileFileStream(FileStream):
+ """A file stream object returned by open_write_stream.
+
+ This version uses a file like object to perform writes.
+ """
+
+ def __init__(self, transport, relpath, file_handle):
+ FileStream.__init__(self, transport, relpath)
+ self.file_handle = file_handle
+
+ def _close(self):
+ self.file_handle.close()
+
+ def write(self, bytes):
+ self.file_handle.write(bytes)
+
+
+class AppendBasedFileStream(FileStream):
+ """A file stream object returned by open_write_stream.
+
+ This version uses append on a transport to perform writes.
+ """
+
+ def write(self, bytes):
+ self.transport.append_bytes(self.relpath, bytes)
+
+
class Transport(object):
"""This class encapsulates methods for retrieving or putting a file
from/to a storage location.
@@ -455,6 +503,18 @@
path = '/' + path
return path
+ def recommended_page_size(self):
+ """Return the recommended page size for this transport.
+
+ This is potentially different for every path in a given namespace.
+ For example, local transports might use an operating system call to
+ get the block size for a given path, which can vary due to mount
+ points.
+
+ :return: The page size in bytes.
+ """
+ return 4 * 1024
+
def relpath(self, abspath):
"""Return the local path portion from a given absolute path.
@@ -785,6 +845,24 @@
self.mkdir(path, mode=mode)
return len(self._iterate_over(relpaths, mkdir, pb, 'mkdir', expand=False))
+ def open_write_stream(self, relpath, mode=None):
+ """Open a writable file stream at relpath.
+
+ A file stream is a file like object with a write() method that accepts
+ bytes to write. Buffering may occur internally until the stream is
+ closed with stream.close(). Calls to readv or the get_* methods will
+ be synchronised with any internal buffering that may be present.
+
+ :param relpath: The relative path to the file.
+ :param mode: The mode for the newly created file,
+ None means just use the default
+ :return: A FileStream. FileStream objects have two methods, write() and
+ close(). There is no guarantee that data is committed to the file
+ if close() has not been called (even if get() is called on the same
+ path).
+ """
+ raise NotImplementedError(self.open_write_stream)
+
def append_file(self, relpath, f, mode=None):
"""Append bytes from a file-like object to a file at relpath.
=== modified file 'bzrlib/transport/chroot.py'
--- a/bzrlib/transport/chroot.py 2007-07-20 03:20:20 +0000
+++ b/bzrlib/transport/chroot.py 2007-08-15 06:53:07 +0000
@@ -89,6 +89,9 @@
def append_file(self, relpath, f, mode=None):
return self._call('append_file', relpath, f, mode)
+ def _can_roundtrip_unix_modebits(self):
+ return self.server.backing_transport._can_roundtrip_unix_modebits()
+
def clone(self, relpath):
return ChrootTransport(self.server, self.abspath(relpath))
@@ -133,6 +136,9 @@
def mkdir(self, relpath, mode=None):
return self._call('mkdir', relpath, mode)
+ def open_write_stream(self, relpath, mode=None):
+ return self._call('open_write_stream', relpath, mode)
+
def put_file(self, relpath, f, mode=None):
return self._call('put_file', relpath, f, mode)
=== modified file 'bzrlib/transport/decorator.py'
--- a/bzrlib/transport/decorator.py 2007-08-15 04:56:08 +0000
+++ b/bzrlib/transport/decorator.py 2007-08-22 01:41:24 +0000
@@ -65,6 +65,10 @@
"""See Transport.append_bytes()."""
return self._decorated.append_bytes(relpath, bytes, mode=mode)
+ def _can_roundtrip_unix_modebits(self):
+ """See Transport._can_roundtrip_unix_modebits()."""
+ return self._decorated._can_roundtrip_unix_modebits()
+
def clone(self, offset=None):
"""See Transport.clone()."""
decorated_clone = self._decorated.clone(offset)
@@ -110,6 +114,10 @@
"""See Transport.mkdir()."""
return self._decorated.mkdir(relpath, mode)
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ return self._decorated.open_write_stream(relpath, mode=mode)
+
def put_file(self, relpath, f, mode=None):
"""See Transport.put_file()."""
return self._decorated.put_file(relpath, f, mode)
@@ -130,6 +138,10 @@
"""See Transport.list_dir()."""
return self._decorated.list_dir(relpath)
+ def recommended_page_size(self):
+ """See Transport.recommended_page_size()."""
+ return self._decorated.recommended_page_size()
+
def rename(self, rel_from, rel_to):
return self._decorated.rename(rel_from, rel_to)
=== modified file 'bzrlib/transport/fakevfat.py'
--- a/bzrlib/transport/fakevfat.py 2007-02-11 16:06:13 +0000
+++ b/bzrlib/transport/fakevfat.py 2007-08-15 06:53:07 +0000
@@ -64,6 +64,10 @@
which actually stored the files.
"""
+ def _can_roundtrip_unix_modebits(self):
+ """See Transport._can_roundtrip_unix_modebits()."""
+ return False
+
@classmethod
def _get_url_prefix(self):
"""Readonly transport decorators are invoked via 'vfat+'"""
=== modified file 'bzrlib/transport/ftp.py'
--- a/bzrlib/transport/ftp.py 2007-08-15 04:56:08 +0000
+++ b/bzrlib/transport/ftp.py 2007-08-22 01:41:24 +0000
@@ -46,6 +46,8 @@
)
from bzrlib.trace import mutter, warning
from bzrlib.transport import (
+ AppendBasedFileStream,
+ _file_streams,
Server,
ConnectedTransport,
)
@@ -318,6 +320,21 @@
self._translate_perm_error(e, abspath,
unknown_exc=errors.FileExists)
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ self.put_bytes(relpath, "", mode)
+ result = AppendBasedFileStream(self, relpath)
+ _file_streams[self.abspath(relpath)] = result
+ return result
+
+ def recommended_page_size(self):
+ """See Transport.recommended_page_size().
+
+ For FTP we suggest a large page size to reduce the overhead
+ introduced by latency.
+ """
+ return 64 * 1024
+
def rmdir(self, rel_path):
"""Delete the directory at rel_path"""
abspath = self._remote_path(rel_path)
=== modified file 'bzrlib/transport/http/__init__.py'
--- a/bzrlib/transport/http/__init__.py 2007-07-20 18:59:29 +0000
+++ b/bzrlib/transport/http/__init__.py 2007-08-05 01:47:30 +0000
@@ -294,6 +294,14 @@
# After one or more tries, we get the data.
yield start, data
+ def recommended_page_size(self):
+ """See Transport.recommended_page_size().
+
+ For HTTP we suggest a large page size to reduce the overhead
+ introduced by latency.
+ """
+ return 64 * 1024
+
@staticmethod
@deprecated_method(zero_seventeen)
def offsets_to_ranges(offsets):
=== modified file 'bzrlib/transport/local.py'
--- a/bzrlib/transport/local.py 2007-08-15 04:56:08 +0000
+++ b/bzrlib/transport/local.py 2007-08-22 01:41:24 +0000
@@ -33,6 +33,7 @@
osutils,
urlutils,
symbol_versioning,
+ transport,
)
from bzrlib.trace import mutter
from bzrlib.transport import LateReadError
@@ -135,6 +136,9 @@
:param relpath: The relative path to the file
"""
+ canonical_url = self.abspath(relpath)
+ if canonical_url in transport._file_streams:
+ transport._file_streams[canonical_url].flush()
try:
path = self._abspath(relpath)
return open(path, 'rb')
@@ -298,6 +302,14 @@
"""Create a directory at the given path."""
self._mkdir(self._abspath(relpath), mode=mode)
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ # initialise the file
+ self.put_bytes_non_atomic(relpath, "", mode=mode)
+ handle = open(self._abspath(relpath), 'wb')
+ transport._file_streams[self.abspath(relpath)] = handle
+ return transport.FileFileStream(self, relpath, handle)
+
def _get_append_file(self, relpath, mode=None):
"""Call os.open() for the given relpath"""
file_abspath = self._abspath(relpath)
=== modified file 'bzrlib/transport/memory.py'
--- a/bzrlib/transport/memory.py 2007-07-20 03:20:20 +0000
+++ b/bzrlib/transport/memory.py 2007-08-15 06:53:07 +0000
@@ -36,6 +36,8 @@
)
from bzrlib.trace import mutter
from bzrlib.transport import (
+ AppendBasedFileStream,
+ _file_streams,
LateReadError,
register_transport,
Server,
@@ -165,6 +167,13 @@
raise FileExists(relpath)
self._dirs[_abspath]=mode
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ self.put_bytes(relpath, "", mode)
+ result = AppendBasedFileStream(self, relpath)
+ _file_streams[self.abspath(relpath)] = result
+ return result
+
def listable(self):
"""See Transport.listable."""
return True
=== modified file 'bzrlib/transport/remote.py'
--- a/bzrlib/transport/remote.py 2007-07-30 14:36:04 +0000
+++ b/bzrlib/transport/remote.py 2007-08-15 06:53:07 +0000
@@ -213,6 +213,13 @@
self._serialise_optional_mode(mode))
self._translate_error(resp)
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ self.put_bytes(relpath, "", mode)
+ result = transport.AppendBasedFileStream(self, relpath)
+ transport._file_streams[self.abspath(relpath)] = result
+ return result
+
def put_bytes(self, relpath, upload_contents, mode=None):
# FIXME: upload_file is probably not safe for non-ascii characters -
# should probably just pass all parameters as length-delimited
=== modified file 'bzrlib/transport/sftp.py'
--- a/bzrlib/transport/sftp.py 2007-08-16 06:43:12 +0000
+++ b/bzrlib/transport/sftp.py 2007-08-22 01:41:24 +0000
@@ -54,6 +54,8 @@
)
from bzrlib.trace import mutter, warning
from bzrlib.transport import (
+ FileFileStream,
+ _file_streams,
local,
register_urlparse_netloc_protocol,
Server,
@@ -264,6 +266,14 @@
except (IOError, paramiko.SSHException), e:
self._translate_io_exception(e, path, ': error retrieving')
+ def recommended_page_size(self):
+ """See Transport.recommended_page_size().
+
+ For SFTP we suggest a large page size to reduce the overhead
+ introduced by latency.
+ """
+ return 64 * 1024
+
def _sftp_readv(self, fp, offsets, relpath='<unknown>'):
"""Use the readv() member of fp to do async readv.
@@ -532,6 +542,28 @@
"""Create a directory at the given path."""
self._mkdir(self._remote_path(relpath), mode=mode)
+ def open_write_stream(self, relpath, mode=None):
+ """See Transport.open_write_stream."""
+ # initialise the file to zero-length
+ # this is three round trips, but we don't use this
+ # api more than once per write_group at the moment so
+ # it is a tolerable overhead. Better would be to truncate
+ # the file after opening. RBC 20070805
+ self.put_bytes_non_atomic(relpath, "", mode)
+ abspath = self._remote_path(relpath)
+ # TODO: jam 20060816 paramiko doesn't publicly expose a way to
+ # set the file mode at create time. If it does, use it.
+ # But for now, we just chmod later anyway.
+ handle = None
+ try:
+ handle = self._get_sftp().file(abspath, mode='wb')
+ handle.set_pipelined(True)
+ except (paramiko.SSHException, IOError), e:
+ self._translate_io_exception(e, abspath,
+ ': unable to open')
+ _file_streams[self.abspath(relpath)] = handle
+ return FileFileStream(self, relpath, handle)
+
def _translate_io_exception(self, e, path, more_info='',
failure_exc=PathError):
"""Translate a paramiko or IOError into a friendlier exception.
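
The append-based variant of `open_write_stream` used above for the FTP, memory and remote transports can be sketched in isolation as follows. This is a self-contained illustration under stated assumptions, not the bzrlib implementation: `MemoryStore` and `AppendStream` are hypothetical names, the store is a plain dictionary, and the module-level `_file_streams` registry and the `mode` parameter are left out.

```python
class AppendStream(object):
    """Hypothetical write stream that forwards each write as an append."""

    def __init__(self, store, relpath):
        self.store = store
        self.relpath = relpath

    def write(self, data):
        # No local buffering: every chunk goes straight to the store.
        self.store.append_bytes(self.relpath, data)

    def close(self):
        # Nothing to flush in this sketch.
        pass


class MemoryStore(object):
    """Hypothetical in-memory stand-in for a transport."""

    def __init__(self):
        self._files = {}

    def put_bytes(self, relpath, data):
        self._files[relpath] = data

    def append_bytes(self, relpath, data):
        self._files[relpath] = self._files.get(relpath, b"") + data

    def get_bytes(self, relpath):
        return self._files[relpath]

    def open_write_stream(self, relpath):
        # Create (or truncate) the file first, as the diff does with
        # put_bytes(relpath, ""), then hand back an append-based stream.
        self.put_bytes(relpath, b"")
        return AppendStream(self, relpath)


store = MemoryStore()
handle = store.open_write_stream("foo")
try:
    created_empty = store.get_bytes("foo") == b""
    handle.write(b"b")
    handle.write(b"cd")
    seen_while_open = store.get_bytes("foo")
finally:
    handle.close()
```

Because each write is an immediate append, a reader sees all content written so far while the stream is still open, which is the behaviour the new `test_get_with_open_write_stream_sees_all_content` tests check.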