Rev 2572: (Andrew Bennetts, Aaron Bentley) Add container format as described in doc/developers/container-format.txt in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Tue Jul 3 06:25:00 BST 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 2572
revision-id: pqm at pqm.ubuntu.com-20070703052458-wh36exfav0xnj9nf
parent: pqm at pqm.ubuntu.com-20070702183615-qkiquhju4t2grtf9
parent: andrew.bennetts at canonical.com-20070703041219-4zsjgrup4k6sdlzk
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Tue 2007-07-03 06:24:58 +0100
message:
(Andrew Bennetts, Aaron Bentley) Add container format as described in doc/developers/container-format.txt
added:
bzrlib/pack.py container.py-20070607160755-tr8zc26q18rn0jnb-1
bzrlib/tests/test_pack.py test_container.py-20070607160755-tr8zc26q18rn0jnb-2
modified:
bzrlib/errors.py errors.py-20050309040759-20512168c4e14fbd
bzrlib/tests/__init__.py selftest.py-20050531073622-8d0e3c8845c97a64
bzrlib/tests/test_errors.py test_errors.py-20060210110251-41aba2deddf936a8
doc/developers/container-format.txt containerformat.txt-20070601074309-7n7w1jiyayud6xdn-1
------------------------------------------------------------
revno: 2506.2.12
merged: andrew.bennetts at canonical.com-20070703041219-4zsjgrup4k6sdlzk
parent: andrew.bennetts at canonical.com-20070703040601-62bbp6gt9ivf3vja
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: abentley-container-format
timestamp: Tue 2007-07-03 14:12:19 +1000
message:
Update docstring for Aaron's changes.
------------------------------------------------------------
revno: 2506.2.11
merged: andrew.bennetts at canonical.com-20070703040601-62bbp6gt9ivf3vja
parent: andrew.bennetts at canonical.com-20070703040508-11q9cdef4og1qry8
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: abentley-container-format
timestamp: Tue 2007-07-03 14:06:01 +1000
message:
Keep container-format.txt up to date with changes to the code.
------------------------------------------------------------
revno: 2506.2.10
merged: andrew.bennetts at canonical.com-20070703040508-11q9cdef4og1qry8
parent: abentley at panoramicfeedback.com-20070628171306-scpsxn9g89cchzz8
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: abentley-container-format
timestamp: Tue 2007-07-03 14:05:08 +1000
message:
Add '(introduced in 0.18)' to pack format string.
------------------------------------------------------------
revno: 2506.2.9
merged: abentley at panoramicfeedback.com-20070628171306-scpsxn9g89cchzz8
parent: abentley at panoramicfeedback.com-20070628165006-m7bd56ngqs26rd91
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: container-format
timestamp: Thu 2007-06-28 13:13:06 -0400
message:
Use file-like objects as container input, not callables
------------------------------------------------------------
revno: 2506.2.8
merged: abentley at panoramicfeedback.com-20070628165006-m7bd56ngqs26rd91
parent: andrew.bennetts at canonical.com-20070614132802-bas89f67tqq4p3s6
parent: pqm at pqm.ubuntu.com-20070628082903-b21gad45bimzvmgu
committer: Aaron Bentley <abentley at panoramicfeedback.com>
branch nick: container-format
timestamp: Thu 2007-06-28 12:50:06 -0400
message:
Merge bzr.dev
------------------------------------------------------------
revno: 2506.2.7
merged: andrew.bennetts at canonical.com-20070614132802-bas89f67tqq4p3s6
parent: andrew.bennetts at canonical.com-20070614055245-rtwk0vgz74fyyimo
parent: andrew.bennetts at canonical.com-20070614125513-nua0p6bw9cw3jeaq
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 23:28:02 +1000
message:
Change read/iter_records to return a callable, add more validation, and
improve docstrings.
------------------------------------------------------------
revno: 2506.2.6.1.2
merged: andrew.bennetts at canonical.com-20070614125513-nua0p6bw9cw3jeaq
parent: andrew.bennetts at canonical.com-20070614112338-6u3900u6nkag66u8
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 22:55:13 +1000
message:
Docstring improvements.
------------------------------------------------------------
revno: 2506.2.6.1.1
merged: andrew.bennetts at canonical.com-20070614112338-6u3900u6nkag66u8
parent: andrew.bennetts at canonical.com-20070614055245-rtwk0vgz74fyyimo
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 21:23:38 +1000
message:
Return a callable instead of a str from read, and add more validation.
------------------------------------------------------------
revno: 2506.2.6
merged: andrew.bennetts at canonical.com-20070614055245-rtwk0vgz74fyyimo
parent: andrew.bennetts at canonical.com-20070614022816-ne4h4qk0j50x6n26
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 15:52:45 +1000
message:
Add validate method to ContainerReader and BytesRecordReader.
------------------------------------------------------------
revno: 2506.2.5
merged: andrew.bennetts at canonical.com-20070614022816-ne4h4qk0j50x6n26
parent: andrew.bennetts at canonical.com-20070614015707-hncvkzg0mn4w0w31
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 12:28:16 +1000
message:
Update format marker in container-format.txt to be in sync with the code.
------------------------------------------------------------
revno: 2506.2.4
merged: andrew.bennetts at canonical.com-20070614015707-hncvkzg0mn4w0w31
parent: andrew.bennetts at canonical.com-20070612015639-z378i21fmcnd5j4x
parent: andrew.bennetts at canonical.com-20070613105602-1bagfibob1rh21mg
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Thu 2007-06-14 11:57:07 +1000
message:
Some small improvements to the pack format, and also merge in bzr.dev.
------------------------------------------------------------
revno: 2506.2.3.1.4
merged: andrew.bennetts at canonical.com-20070613105602-1bagfibob1rh21mg
parent: andrew.bennetts at canonical.com-20070613105446-ukb9knp9dmy57v74
parent: pqm at pqm.ubuntu.com-20070613061627-xx5xk6q0oxcy1etm
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Wed 2007-06-13 20:56:02 +1000
message:
Merge from bzr.dev.
------------------------------------------------------------
revno: 2506.2.3.1.3
merged: andrew.bennetts at canonical.com-20070613105446-ukb9knp9dmy57v74
parent: andrew.bennetts at canonical.com-20070613105312-z94x8g4y5mlg4ukg
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Wed 2007-06-13 20:54:46 +1000
message:
Change format marker to use the word 'Bazaar' rather than 'bzr'.
------------------------------------------------------------
revno: 2506.2.3.1.2
merged: andrew.bennetts at canonical.com-20070613105312-z94x8g4y5mlg4ukg
parent: andrew.bennetts at canonical.com-20070613075835-sb1o923hdtwnmlrv
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Wed 2007-06-13 20:53:12 +1000
message:
Raise InvalidRecordError on invalid names.
------------------------------------------------------------
revno: 2506.2.3.1.1
merged: andrew.bennetts at canonical.com-20070613075835-sb1o923hdtwnmlrv
parent: andrew.bennetts at canonical.com-20070612015639-z378i21fmcnd5j4x
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Wed 2007-06-13 17:58:35 +1000
message:
Remove duplicate definition of ContainerWriter.
------------------------------------------------------------
revno: 2506.2.3
merged: andrew.bennetts at canonical.com-20070612015639-z378i21fmcnd5j4x
parent: andrew.bennetts at canonical.com-20070609034820-t7u540w5pyhvtgn3
parent: andrew.bennetts at canonical.com-20070611072208-9p73cf5vcu7zh0ys
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Tue 2007-06-12 11:56:39 +1000
message:
Fix docstring markup, remove obsolete comment.
------------------------------------------------------------
revno: 2506.2.2.1.1
merged: andrew.bennetts at canonical.com-20070611072208-9p73cf5vcu7zh0ys
parent: andrew.bennetts at canonical.com-20070609034820-t7u540w5pyhvtgn3
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Mon 2007-06-11 17:22:08 +1000
message:
Fix docstring markup, remove obsolete comment.
------------------------------------------------------------
revno: 2506.2.2
merged: andrew.bennetts at canonical.com-20070609034820-t7u540w5pyhvtgn3
parent: andrew.bennetts at canonical.com-20070607160934-jfs1wrxxtulso9nw
parent: andrew.bennetts at canonical.com-20070609034525-j9d7i5dlk6ou97eb
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Sat 2007-06-09 13:48:20 +1000
message:
More improvements, especially in error handling.
------------------------------------------------------------
revno: 2506.2.1.1.3
merged: andrew.bennetts at canonical.com-20070609034525-j9d7i5dlk6ou97eb
parent: andrew.bennetts at canonical.com-20070608064547-vzhyegqx2vl6pni3
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Sat 2007-06-09 13:45:25 +1000
message:
Deal with EOF in the middle of a bytes record.
------------------------------------------------------------
revno: 2506.2.1.1.2
merged: andrew.bennetts at canonical.com-20070608064547-vzhyegqx2vl6pni3
parent: andrew.bennetts at canonical.com-20070608063359-s5ps81a8i85w7by0
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Fri 2007-06-08 16:45:47 +1000
message:
Test docstring tweaks, inspired by looking over the output of jml's testdoc tool.
------------------------------------------------------------
revno: 2506.2.1.1.1
merged: andrew.bennetts at canonical.com-20070608063359-s5ps81a8i85w7by0
parent: andrew.bennetts at canonical.com-20070607160934-jfs1wrxxtulso9nw
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Fri 2007-06-08 16:33:59 +1000
message:
More progress:
* Rename container.py to pack.py
* Refactor bytes record reading into a separate class for ease of unit testing.
* Start handling error conditions such as invalid content lengths in byte
records.
------------------------------------------------------------
revno: 2506.2.1
merged: andrew.bennetts at canonical.com-20070607160934-jfs1wrxxtulso9nw
parent: pqm at pqm.ubuntu.com-20070604194535-ihhpf84qp0icoj2t
committer: Andrew Bennetts <andrew.bennetts at canonical.com>
branch nick: container-format
timestamp: Fri 2007-06-08 02:09:34 +1000
message:
Start implementing container format reading and writing.
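
For orientation, the byte layout that the new ContainerWriter emits can be sketched in a few lines of standalone Python 3. This is only an illustration under stated assumptions: it does not import bzrlib, the helper name write_container is hypothetical, and it uses bytes objects where the 2007 code uses Python 2 str. Unlike the real add_bytes_record it performs no name validation.

```python
import io

# Format marker line, copied from FORMAT_ONE in the diff below.
FORMAT_ONE = b"Bazaar pack format 1 (introduced in 0.18)"


def write_container(write, records):
    """Hypothetical helper: serialise (names, body) pairs as one container.

    `write` is any callable accepting bytes, mirroring the write_func
    argument of ContainerWriter.
    """
    write(FORMAT_ONE + b"\n")                          # begin()
    for names, body in records:
        write(b"B")                                    # kind marker
        write(str(len(body)).encode("ascii") + b"\n")  # content length
        for name in names:
            write(name + b"\n")                        # one name per line
        write(b"\n")                                   # end of headers
        write(body)                                    # the contents
    write(b"E")                                        # end(): End Marker


out = io.BytesIO()
write_container(out.write, [([b"name1", b"name2"], b"abc")])
serialised = out.getvalue()
```

The resulting bytes are the format line, then "B3", the two names, an empty line, the three body bytes, and a final "E", which is the same shape the tests in test_pack.py assert on.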
=== added file 'bzrlib/pack.py'
--- a/bzrlib/pack.py 1970-01-01 00:00:00 +0000
+++ b/bzrlib/pack.py 2007-07-03 04:12:19 +0000
@@ -0,0 +1,269 @@
+# Copyright (C) 2007 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+"""Container format for Bazaar data.
+
+"Containers" and "records" are described in doc/developers/container-format.txt.
+"""
+
+import re
+
+from bzrlib import errors
+
+
+FORMAT_ONE = "Bazaar pack format 1 (introduced in 0.18)"
+
+
+_whitespace_re = re.compile('[\t\n\x0b\x0c\r ]')
+
+
+def _check_name(name):
+ """Do some basic checking of 'name'.
+
+ At the moment, this just checks that there are no whitespace characters in a
+ name.
+
+ :raises InvalidRecordError: if name is not valid.
+ :seealso: _check_name_encoding
+ """
+ if _whitespace_re.search(name) is not None:
+ raise errors.InvalidRecordError("%r is not a valid name." % (name,))
+
+
+def _check_name_encoding(name):
+ """Check that 'name' is valid UTF-8.
+
+ This is separate from _check_name because UTF-8 decoding is relatively
+ expensive, and we usually want to avoid it.
+
+ :raises InvalidRecordError: if name is not valid UTF-8.
+ """
+ try:
+ name.decode('utf-8')
+ except UnicodeDecodeError, e:
+ raise errors.InvalidRecordError(str(e))
+
+
+class ContainerWriter(object):
+ """A class for writing containers."""
+
+ def __init__(self, write_func):
+ """Constructor.
+
+ :param write_func: a callable that will be called when this
+ ContainerWriter needs to write some bytes.
+ """
+ self.write_func = write_func
+
+ def begin(self):
+ """Begin writing a container."""
+ self.write_func(FORMAT_ONE + "\n")
+
+ def end(self):
+ """Finish writing a container."""
+ self.write_func("E")
+
+ def add_bytes_record(self, bytes, names):
+ """Add a Bytes record with the given names."""
+ # Kind marker
+ self.write_func("B")
+ # Length
+ self.write_func(str(len(bytes)) + "\n")
+ # Names
+ for name in names:
+ # Make sure we're writing valid names. Note that we will leave a
+ # half-written record if a name is bad!
+ _check_name(name)
+ self.write_func(name + "\n")
+ # End of headers
+ self.write_func("\n")
+ # Finally, the contents.
+ self.write_func(bytes)
+
+
+class BaseReader(object):
+
+ def __init__(self, source_file):
+ """Constructor.
+
+ :param source_file: a file-like object with `read` and `readline`
+ methods.
+ """
+ self._source = source_file
+
+ def reader_func(self, length=None):
+ return self._source.read(length)
+
+ def _read_line(self):
+ line = self._source.readline()
+ if not line.endswith('\n'):
+ raise errors.UnexpectedEndOfContainerError()
+ return line.rstrip('\n')
+
+
+class ContainerReader(BaseReader):
+ """A class for reading Bazaar's container format."""
+
+ def iter_records(self):
+ """Iterate over the container, yielding each record as it is read.
+
+ Each yielded record will be a 2-tuple of (names, callable), where names
+ is a ``list`` and callable is a function that takes one argument,
+ ``max_length``.
+
+ You **must not** call the callable after advancing the iterator to the
+ next record. That is, this code is invalid::
+
+ record_iter = container.iter_records()
+ names1, callable1 = record_iter.next()
+ names2, callable2 = record_iter.next()
+ bytes1 = callable1(None)
+
+ Doing so will give incorrect results and invalidate the state of the
+ ContainerReader.
+
+ :raises ContainerError: if any sort of container corruption is
+ detected, e.g. UnknownContainerFormatError if the format of the
+ container is unrecognised.
+ :seealso: BytesRecordReader.read
+ """
+ self._read_format()
+ return self._iter_records()
+
+ def iter_record_objects(self):
+ """Iterate over the container, yielding each record as it is read.
+
+ Each yielded record will be an object with ``read`` and ``validate``
+ methods. As with iter_records, it is not safe to use a record object
+ after advancing the iterator to yield the next record.
+
+ :raises ContainerError: if any sort of container corruption is
+ detected, e.g. UnknownContainerFormatError if the format of the
+ container is unrecognised.
+ :seealso: iter_records
+ """
+ self._read_format()
+ return self._iter_record_objects()
+
+ def _iter_records(self):
+ for record in self._iter_record_objects():
+ yield record.read()
+
+ def _iter_record_objects(self):
+ while True:
+ record_kind = self.reader_func(1)
+ if record_kind == 'B':
+ # Bytes record.
+ reader = BytesRecordReader(self._source)
+ yield reader
+ elif record_kind == 'E':
+ # End marker. There are no more records.
+ return
+ elif record_kind == '':
+ # End of stream encountered, but no End Marker record seen, so
+ # this container is incomplete.
+ raise errors.UnexpectedEndOfContainerError()
+ else:
+ # Unknown record type.
+ raise errors.UnknownRecordTypeError(record_kind)
+
+ def _read_format(self):
+ format = self._read_line()
+ if format != FORMAT_ONE:
+ raise errors.UnknownContainerFormatError(format)
+
+ def validate(self):
+ """Validate this container and its records.
+
+ Validating consumes the data stream just like iter_records and
+ iter_record_objects, so you cannot call it after
+ iter_records/iter_record_objects.
+
+ :raises ContainerError: if something is invalid.
+ """
+ all_names = set()
+ for record_names, read_bytes in self.iter_records():
+ read_bytes(None)
+ for name in record_names:
+ _check_name_encoding(name)
+ # Check that the name is unique. Note that Python will refuse
+ # to decode non-shortest forms of UTF-8 encoding, so there is no
+ # risk that the same unicode string has been encoded two
+ # different ways.
+ if name in all_names:
+ raise errors.DuplicateRecordNameError(name)
+ all_names.add(name)
+ excess_bytes = self.reader_func(1)
+ if excess_bytes != '':
+ raise errors.ContainerHasExcessDataError(excess_bytes)
+
+
+class BytesRecordReader(BaseReader):
+
+ def read(self):
+ """Read this record.
+
+ You can either validate or read a record; you can't do both.
+
+ :returns: A tuple of (names, callable). The callable can be called
+ repeatedly to obtain the bytes for the record, with a max_length
+ argument. If max_length is None, returns all the bytes. Because
+ records can be arbitrarily large, using None is not recommended
+ unless you have reason to believe the content will fit in memory.
+ """
+ # Read the content length.
+ length_line = self._read_line()
+ try:
+ length = int(length_line)
+ except ValueError:
+ raise errors.InvalidRecordError(
+ "%r is not a valid length." % (length_line,))
+
+ # Read the list of names.
+ names = []
+ while True:
+ name = self._read_line()
+ if name == '':
+ break
+ _check_name(name)
+ names.append(name)
+
+ self._remaining_length = length
+ return names, self._content_reader
+
+ def _content_reader(self, max_length):
+ if max_length is None:
+ length_to_read = self._remaining_length
+ else:
+ length_to_read = min(max_length, self._remaining_length)
+ self._remaining_length -= length_to_read
+ bytes = self.reader_func(length_to_read)
+ if len(bytes) != length_to_read:
+ raise errors.UnexpectedEndOfContainerError()
+ return bytes
+
+ def validate(self):
+ """Validate this record.
+
+ You can either validate or read; you can't do both.
+
+ :raises ContainerError: if this record is invalid.
+ """
+ names, read_bytes = self.read()
+ for name in names:
+ _check_name_encoding(name)
+ read_bytes(None)
+
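
The max_length contract of the callable returned by BytesRecordReader.read can be illustrated on its own. The following is a minimal Python 3 sketch, not the bzrlib code: make_content_reader is a hypothetical name, EOFError stands in for errors.UnexpectedEndOfContainerError, and the stream is assumed to return exactly the requested number of bytes unless it has hit end of file.

```python
import io


def make_content_reader(stream, length):
    """Hypothetical helper: return a callable that reads a record body.

    The callable never consumes more than `length` bytes in total from
    `stream`, however often it is called.
    """
    remaining = length

    def read_bytes(max_length=None):
        nonlocal remaining
        if max_length is None:
            to_read = remaining            # everything that is left
        else:
            to_read = min(max_length, remaining)
        remaining -= to_read
        data = stream.read(to_read)
        if len(data) != to_read:
            # The stream ended before the declared length was reached.
            raise EOFError("unexpected end of container stream")
        return data

    return read_bytes


# A six-byte body followed by the start of another record.
stream = io.BytesIO(b"abcdefB3\nnext-record\n\nXXX")
read_bytes = make_content_reader(stream, 6)
first = read_bytes(4)
second = read_bytes(4)
third = read_bytes(None)
next_marker = stream.read(1)
```

Because the remaining length is tracked in the closure, a caller can pull a large record through in bounded chunks and stop when an empty result comes back, and the stream is left positioned at the kind marker of the following record.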
=== added file 'bzrlib/tests/test_pack.py'
--- a/bzrlib/tests/test_pack.py 1970-01-01 00:00:00 +0000
+++ b/bzrlib/tests/test_pack.py 2007-07-03 04:05:08 +0000
@@ -0,0 +1,377 @@
+# Copyright (C) 2007 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+"""Tests for bzrlib.pack."""
+
+
+from cStringIO import StringIO
+
+from bzrlib import pack, errors, tests
+
+
+class TestContainerWriter(tests.TestCase):
+
+ def test_construct(self):
+ """Test constructing a ContainerWriter.
+
+ This uses None as the output stream to show that the constructor doesn't
+ try to use the output stream.
+ """
+ writer = pack.ContainerWriter(None)
+
+ def test_begin(self):
+ """The begin() method writes the container format marker line."""
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ self.assertEqual('Bazaar pack format 1 (introduced in 0.18)\n',
+ output.getvalue())
+
+ def test_end(self):
+ """The end() method writes an End Marker record."""
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ writer.end()
+ self.assertEqual('Bazaar pack format 1 (introduced in 0.18)\nE',
+ output.getvalue())
+
+ def test_add_bytes_record_no_name(self):
+ """Add a bytes record with no name."""
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ writer.add_bytes_record('abc', names=[])
+ self.assertEqual('Bazaar pack format 1 (introduced in 0.18)\nB3\n\nabc',
+ output.getvalue())
+
+ def test_add_bytes_record_one_name(self):
+ """Add a bytes record with one name."""
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ writer.add_bytes_record('abc', names=['name1'])
+ self.assertEqual(
+ 'Bazaar pack format 1 (introduced in 0.18)\n'
+ 'B3\nname1\n\nabc',
+ output.getvalue())
+
+ def test_add_bytes_record_two_names(self):
+ """Add a bytes record with two names."""
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ writer.add_bytes_record('abc', names=['name1', 'name2'])
+ self.assertEqual(
+ 'Bazaar pack format 1 (introduced in 0.18)\n'
+ 'B3\nname1\nname2\n\nabc',
+ output.getvalue())
+
+ def test_add_bytes_record_invalid_name(self):
+ """Adding a Bytes record with a name with whitespace in it raises
+ InvalidRecordError.
+ """
+ output = StringIO()
+ writer = pack.ContainerWriter(output.write)
+ writer.begin()
+ self.assertRaises(
+ errors.InvalidRecordError,
+ writer.add_bytes_record, 'abc', names=['bad name'])
+
+
+class TestContainerReader(tests.TestCase):
+
+ def get_reader_for(self, bytes):
+ stream = StringIO(bytes)
+ reader = pack.ContainerReader(stream)
+ return reader
+
+ def test_construct(self):
+ """Test constructing a ContainerReader.
+
+ This uses None as the input stream to show that the constructor doesn't
+ try to use the input stream.
+ """
+ reader = pack.ContainerReader(None)
+
+ def test_empty_container(self):
+ """Read an empty container."""
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nE")
+ self.assertEqual([], list(reader.iter_records()))
+
+ def test_unknown_format(self):
+ """Unrecognised container formats raise UnknownContainerFormatError."""
+ reader = self.get_reader_for("unknown format\n")
+ self.assertRaises(
+ errors.UnknownContainerFormatError, reader.iter_records)
+
+ def test_unexpected_end_of_container(self):
+ """Containers that don't end with an End Marker record should cause
+ UnexpectedEndOfContainerError to be raised.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\n")
+ iterator = reader.iter_records()
+ self.assertRaises(
+ errors.UnexpectedEndOfContainerError, iterator.next)
+
+ def test_unknown_record_type(self):
+ """Unknown record types cause UnknownRecordTypeError to be raised."""
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nX")
+ iterator = reader.iter_records()
+ self.assertRaises(
+ errors.UnknownRecordTypeError, iterator.next)
+
+ def test_container_with_one_unnamed_record(self):
+ """Read a container with one Bytes record.
+
+ Parsing Bytes records is more thoroughly exercised by
+ TestBytesRecordReader. This test is here to ensure that
+ ContainerReader's integration with BytesRecordReader is working.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nB5\n\naaaaaE")
+ expected_records = [([], 'aaaaa')]
+ self.assertEqual(
+ expected_records,
+ [(names, read_bytes(None))
+ for (names, read_bytes) in reader.iter_records()])
+
+ def test_validate_empty_container(self):
+ """validate does not raise an error for a container with no records."""
+ reader = self.get_reader_for("Bazaar pack format 1 (introduced in 0.18)\nE")
+ # No exception raised
+ reader.validate()
+
+ def test_validate_non_empty_valid_container(self):
+ """validate does not raise an error for a container with a valid record.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nB3\nname\n\nabcE")
+ # No exception raised
+ reader.validate()
+
+ def test_validate_bad_format(self):
+ """validate raises an error for unrecognised format strings.
+
+ It may raise either UnexpectedEndOfContainerError or
+ UnknownContainerFormatError, depending on exactly what the string is.
+ """
+ inputs = ["", "x", "Bazaar pack format 1 (introduced in 0.18)", "bad\n"]
+ for input in inputs:
+ reader = self.get_reader_for(input)
+ self.assertRaises(
+ (errors.UnexpectedEndOfContainerError,
+ errors.UnknownContainerFormatError),
+ reader.validate)
+
+ def test_validate_bad_record_marker(self):
+ """validate raises UnknownRecordTypeError for unrecognised record
+ types.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nX")
+ self.assertRaises(errors.UnknownRecordTypeError, reader.validate)
+
+ def test_validate_data_after_end_marker(self):
+ """validate raises ContainerHasExcessDataError if there are any bytes
+ after the end of the container.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nEcrud")
+ self.assertRaises(
+ errors.ContainerHasExcessDataError, reader.validate)
+
+ def test_validate_no_end_marker(self):
+ """validate raises UnexpectedEndOfContainerError if there's no end of
+ container marker, even if the container up to this point has been valid.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\n")
+ self.assertRaises(
+ errors.UnexpectedEndOfContainerError, reader.validate)
+
+ def test_validate_duplicate_name(self):
+ """validate raises DuplicateRecordNameError if the same name occurs
+ multiple times in the container.
+ """
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\n"
+ "B0\nname\n\n"
+ "B0\nname\n\n"
+ "E")
+ self.assertRaises(errors.DuplicateRecordNameError, reader.validate)
+
+ def test_validate_undecodeable_name(self):
+ """Names that aren't valid UTF-8 cause validate to fail."""
+ reader = self.get_reader_for(
+ "Bazaar pack format 1 (introduced in 0.18)\nB0\n\xcc\n\nE")
+ self.assertRaises(errors.InvalidRecordError, reader.validate)
+
+
+class TestBytesRecordReader(tests.TestCase):
+ """Tests for reading and validating Bytes records with BytesRecordReader."""
+
+ def get_reader_for(self, bytes):
+ stream = StringIO(bytes)
+ reader = pack.BytesRecordReader(stream)
+ return reader
+
+ def test_record_with_no_name(self):
+ """Reading a Bytes record with no name returns an empty list of
+ names.
+ """
+ reader = self.get_reader_for("5\n\naaaaa")
+ names, get_bytes = reader.read()
+ self.assertEqual([], names)
+ self.assertEqual('aaaaa', get_bytes(None))
+
+ def test_record_with_one_name(self):
+ """Reading a Bytes record with one name returns a list of just that
+ name.
+ """
+ reader = self.get_reader_for("5\nname1\n\naaaaa")
+ names, get_bytes = reader.read()
+ self.assertEqual(['name1'], names)
+ self.assertEqual('aaaaa', get_bytes(None))
+
+ def test_record_with_two_names(self):
+ """Reading a Bytes record with two names returns a list of both names.
+ """
+ reader = self.get_reader_for("5\nname1\nname2\n\naaaaa")
+ names, get_bytes = reader.read()
+ self.assertEqual(['name1', 'name2'], names)
+ self.assertEqual('aaaaa', get_bytes(None))
+
+ def test_invalid_length(self):
+ """If the length-prefix is not a number, parsing raises
+ InvalidRecordError.
+ """
+ reader = self.get_reader_for("not a number\n")
+ self.assertRaises(errors.InvalidRecordError, reader.read)
+
+ def test_early_eof(self):
+ """Tests for premature EOF occurring during parsing Bytes records with
+ BytesRecordReader.
+
+ An incomplete container might be interrupted at any point. The
+ BytesRecordReader needs to cope with the input stream running out no
+ matter where it is in the parsing process.
+
+ In all cases, UnexpectedEndOfContainerError should be raised.
+ """
+ complete_record = "6\nname\n\nabcdef"
+ for count in range(0, len(complete_record)):
+ incomplete_record = complete_record[:count]
+ reader = self.get_reader_for(incomplete_record)
+ # We don't use assertRaises to make diagnosing failures easier
+ # (assertRaises doesn't allow a custom failure message).
+ try:
+ names, read_bytes = reader.read()
+ read_bytes(None)
+ except errors.UnexpectedEndOfContainerError:
+ pass
+ else:
+ self.fail(
+ "UnexpectedEndOfContainerError not raised when parsing %r"
+ % (incomplete_record,))
+
+ def test_initial_eof(self):
+ """EOF before any bytes read at all."""
+ reader = self.get_reader_for("")
+ self.assertRaises(errors.UnexpectedEndOfContainerError, reader.read)
+
+ def test_eof_after_length(self):
+ """EOF after reading the length and before reading name(s)."""
+ reader = self.get_reader_for("123\n")
+ self.assertRaises(errors.UnexpectedEndOfContainerError, reader.read)
+
+ def test_eof_during_name(self):
+ """EOF during reading a name."""
+ reader = self.get_reader_for("123\nname")
+ self.assertRaises(errors.UnexpectedEndOfContainerError, reader.read)
+
+ def test_read_invalid_name_whitespace(self):
+ """Names must have no whitespace."""
+ # A name with a space.
+ reader = self.get_reader_for("0\nbad name\n\n")
+ self.assertRaises(errors.InvalidRecordError, reader.read)
+
+ # A name with a tab.
+ reader = self.get_reader_for("0\nbad\tname\n\n")
+        self.assertRaises(errors.InvalidRecordError, reader.read)
+
+        # A name with a vertical tab.
+        reader = self.get_reader_for("0\nbad\vname\n\n")
+        self.assertRaises(errors.InvalidRecordError, reader.read)
+
+    def test_validate_whitespace_in_name(self):
+        """Names must have no whitespace."""
+        reader = self.get_reader_for("0\nbad name\n\n")
+        self.assertRaises(errors.InvalidRecordError, reader.validate)
+
+    def test_validate_interrupted_prelude(self):
+        """EOF during reading a record's prelude causes validate to fail."""
+        reader = self.get_reader_for("")
+        self.assertRaises(
+            errors.UnexpectedEndOfContainerError, reader.validate)
+
+    def test_validate_interrupted_body(self):
+        """EOF during reading a record's body causes validate to fail."""
+        reader = self.get_reader_for("1\n\n")
+        self.assertRaises(
+            errors.UnexpectedEndOfContainerError, reader.validate)
+
+    def test_validate_unparseable_length(self):
+        """An unparseable record length causes validate to fail."""
+        reader = self.get_reader_for("\n\n")
+        self.assertRaises(
+            errors.InvalidRecordError, reader.validate)
+
+    def test_validate_undecodeable_name(self):
+        """Names that aren't valid UTF-8 cause validate to fail."""
+        reader = self.get_reader_for("0\n\xcc\n\n")
+        self.assertRaises(errors.InvalidRecordError, reader.validate)
+
+    def test_read_max_length(self):
+        """If the max_length passed to the callable returned by read is not
+        None, then no more than that many bytes will be read.
+        """
+        reader = self.get_reader_for("6\n\nabcdef")
+        names, get_bytes = reader.read()
+        self.assertEqual('abc', get_bytes(3))
+
+    def test_read_no_max_length(self):
+        """If the max_length passed to the callable returned by read is None,
+        then all the bytes in the record will be read.
+        """
+        reader = self.get_reader_for("6\n\nabcdef")
+        names, get_bytes = reader.read()
+        self.assertEqual('abcdef', get_bytes(None))
+
+    def test_repeated_read_calls(self):
+        """Repeated calls to the callable returned from BytesRecordReader.read
+        will not read beyond the end of the record.
+        """
+        reader = self.get_reader_for("6\n\nabcdefB3\nnext-record\nXXX")
+        names, get_bytes = reader.read()
+        self.assertEqual('abcdef', get_bytes(None))
+        self.assertEqual('', get_bytes(None))
+        self.assertEqual('', get_bytes(99))
+
+
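Annotation (not part of the patch): the tests above feed the reader strings of the form ``<length>\n<name>\n...\n\n<body>``. The record layout they imply can be sketched roughly as follows. This is a minimal illustration in Python 3 using bytes, not the code in bzrlib/pack.py. The names ``read_bytes_record``, ``InvalidRecord`` and ``UnexpectedEnd`` are hypothetical stand-ins, and unlike ``BytesRecordReader.read`` the sketch returns the body directly instead of a ``get_bytes`` callable.

```python
import io


class InvalidRecord(Exception):
    """Hypothetical stand-in for errors.InvalidRecordError."""


class UnexpectedEnd(Exception):
    """Hypothetical stand-in for errors.UnexpectedEndOfContainerError."""


def _read_line(stream):
    # Read one newline-terminated line. A missing trailing newline means
    # the stream ended before the line was complete.
    line = stream.readline()
    if not line.endswith(b"\n"):
        raise UnexpectedEnd()
    return line[:-1]


def read_bytes_record(stream):
    """Parse one record (length, names, blank line, body) from a stream."""
    length_line = _read_line(stream)
    if not length_line.isdigit():
        raise InvalidRecord("%r is not a valid length." % (length_line,))
    length = int(length_line)

    names = []
    while True:
        name = _read_line(stream)
        if name == b"":
            break  # an empty line ends the list of names
        try:
            text = name.decode("utf-8")
        except UnicodeDecodeError:
            raise InvalidRecord("%r is not valid UTF-8." % (name,))
        if any(ch.isspace() for ch in text):
            raise InvalidRecord("%r contains whitespace." % (name,))
        names.append(name)

    body = stream.read(length)
    if len(body) != length:
        raise UnexpectedEnd()  # the stream ended inside the body
    return names, body
```

For example, ``read_bytes_record(io.BytesIO(b"6\n\nabcdef"))`` yields an empty name list and the six body bytes, while an empty stream or a truncated body raises ``UnexpectedEnd``, mirroring the interrupted-prelude and interrupted-body tests.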
=== modified file 'bzrlib/errors.py'
--- a/bzrlib/errors.py 2007-06-26 08:52:20 +0000
+++ b/bzrlib/errors.py 2007-06-28 16:50:06 +0000
@@ -2150,6 +2150,57 @@
self.response_tuple = response_tuple
+class ContainerError(BzrError):
+    """Base class of container errors."""
+
+
+class UnknownContainerFormatError(ContainerError):
+
+    _fmt = "Unrecognised container format: %(container_format)r"
+
+    def __init__(self, container_format):
+        self.container_format = container_format
+
+
+class UnexpectedEndOfContainerError(ContainerError):
+
+    _fmt = "Unexpected end of container stream"
+
+    internal_error = False
+
+
+class UnknownRecordTypeError(ContainerError):
+
+    _fmt = "Unknown record type: %(record_type)r"
+
+    def __init__(self, record_type):
+        self.record_type = record_type
+
+
+class InvalidRecordError(ContainerError):
+
+    _fmt = "Invalid record: %(reason)s"
+
+    def __init__(self, reason):
+        self.reason = reason
+
+
+class ContainerHasExcessDataError(ContainerError):
+
+    _fmt = "Container has data after end marker: %(excess)r"
+
+    def __init__(self, excess):
+        self.excess = excess
+
+
+class DuplicateRecordNameError(ContainerError):
+
+    _fmt = "Container has multiple records with the same name: \"%(name)s\""
+
+    def __init__(self, name):
+        self.name = name
+
+
class NoDestinationAddress(BzrError):
_fmt = "Message does not have a destination address."
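Annotation (not part of the patch): each error class above pairs a class-level ``_fmt`` template with attributes set in ``__init__``. The sketch below shows how such a template might be turned into a message. It assumes the base class interpolates the instance's attribute dictionary into ``_fmt``; ``SketchError`` is a simplified, hypothetical stand-in and not bzrlib's actual ``BzrError``.

```python
class SketchError(Exception):
    """Simplified, hypothetical stand-in for bzrlib's BzrError."""

    _fmt = "Unspecified error"

    def __str__(self):
        # Interpolate the instance attributes into the class-level template.
        return self._fmt % self.__dict__


class UnknownRecordType(SketchError):

    _fmt = "Unknown record type: %(record_type)r"

    def __init__(self, record_type):
        self.record_type = record_type


class UnexpectedEnd(SketchError):

    # No placeholders, so no attributes are needed.
    _fmt = "Unexpected end of container stream"


print(str(UnknownRecordType("X")))
print(str(UnexpectedEnd()))
```

Under this assumption, ``%(record_type)r`` produces the quoted ``'X'`` that the formatting test for ``UnknownRecordTypeError`` expects, and a template without placeholders is returned unchanged.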
=== modified file 'bzrlib/tests/__init__.py'
--- a/bzrlib/tests/__init__.py 2007-07-02 05:53:55 +0000
+++ b/bzrlib/tests/__init__.py 2007-07-03 05:24:58 +0000
@@ -2278,6 +2278,7 @@
'bzrlib.tests.test_commit_merge',
'bzrlib.tests.test_config',
'bzrlib.tests.test_conflicts',
+ 'bzrlib.tests.test_pack',
'bzrlib.tests.test_counted_lock',
'bzrlib.tests.test_decorators',
'bzrlib.tests.test_delta',
=== modified file 'bzrlib/tests/test_errors.py'
--- a/bzrlib/tests/test_errors.py 2007-06-26 08:52:20 +0000
+++ b/bzrlib/tests/test_errors.py 2007-06-28 16:50:06 +0000
@@ -273,6 +273,47 @@
"Could not understand response from smart server: ('not yes',)",
str(e))
+    def test_unknown_container_format(self):
+        """Test the formatting of UnknownContainerFormatError."""
+        e = errors.UnknownContainerFormatError('bad format string')
+        self.assertEqual(
+            "Unrecognised container format: 'bad format string'",
+            str(e))
+
+    def test_unexpected_end_of_container(self):
+        """Test the formatting of UnexpectedEndOfContainerError."""
+        e = errors.UnexpectedEndOfContainerError()
+        self.assertEqual(
+            "Unexpected end of container stream", str(e))
+
+    def test_unknown_record_type(self):
+        """Test the formatting of UnknownRecordTypeError."""
+        e = errors.UnknownRecordTypeError("X")
+        self.assertEqual(
+            "Unknown record type: 'X'",
+            str(e))
+
+    def test_invalid_record(self):
+        """Test the formatting of InvalidRecordError."""
+        e = errors.InvalidRecordError("xxx")
+        self.assertEqual(
+            "Invalid record: xxx",
+            str(e))
+
+    def test_container_has_excess_data(self):
+        """Test the formatting of ContainerHasExcessDataError."""
+        e = errors.ContainerHasExcessDataError("excess bytes")
+        self.assertEqual(
+            "Container has data after end marker: 'excess bytes'",
+            str(e))
+
+    def test_duplicate_record_name_error(self):
+        """Test the formatting of DuplicateRecordNameError."""
+        e = errors.DuplicateRecordNameError(u"n\xe5me".encode('utf-8'))
+        self.assertEqual(
+            "Container has multiple records with the same name: \"n\xc3\xa5me\"",
+            str(e))
+
class PassThroughError(errors.BzrError):
=== modified file 'doc/developers/container-format.txt'
--- a/doc/developers/container-format.txt 2007-06-08 02:47:19 +0000
+++ b/doc/developers/container-format.txt 2007-07-03 04:06:01 +0000
@@ -176,7 +176,7 @@
The format is:
- * a **container lead-in**, "``bzr pack format 1\n``",
+ * a **container lead-in**, "``Bazaar pack format 1 (introduced in 0.18)\n``",
* followed by one or more **records**.
A record is: