Rev 3916: Merge in the latest brisbane-core in http://bzr.arbash-meinel.com/branches/bzr/brisbane/vilajam

Fri Mar 27 19:10:44 GMT 2009

At http://bzr.arbash-meinel.com/branches/bzr/brisbane/vilajam

------------------------------------------------------------
revno: 3916
revision-id: john at arbash-meinel.com-20090327191021-7n8x88f002wu4k6b
parent: v.ladeuil+lp at free.fr-20090327150957-1ndu55rwn7j1wqug
parent: john at arbash-meinel.com-20090326201840-ddb2uqof335ysvnu
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: vilajam
timestamp: Fri 2009-03-27 14:10:21 -0500
message:
  Merge in the latest brisbane-core
modified:
  bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
  bzrlib/delta.h                 delta.h-20090227173129-qsu3u43vowf1q3ay-1
  bzrlib/groupcompress.py        groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
  bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
  bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
  bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
  bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
    ------------------------------------------------------------
    revno: 3907.1.7
    revision-id: john at arbash-meinel.com-20090326201840-ddb2uqof335ysvnu
    parent: john at arbash-meinel.com-20090326195952-w0qea66iw597ipza
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 15:18:40 -0500
    message:
      max() shows up under lsprof as more expensive than creating an object.
      timeit also says if x < y is faster than y = max(x, y).
      Small win, but I'll take it.
    modified:
      bzrlib/groupcompress.py        groupcompress.py-20080705181503-ccbxd6xuy1bdnrpu-8
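
The trade-off described in this message can be checked with a small timing sketch. The function names below are hypothetical and not part of bzrlib, and the code is written for Python 3, whereas the branch targets Python 2. Absolute timings depend on the interpreter and machine, so no figures are claimed here.

```python
import timeit


def last_byte_with_max(ends, last_byte=0):
    """Track the running maximum with the max() builtin."""
    for end in ends:
        last_byte = max(end, last_byte)
    return last_byte


def last_byte_with_compare(ends, last_byte=0):
    """Track the running maximum with a plain comparison."""
    for end in ends:
        if end > last_byte:
            last_byte = end
    return last_byte


def time_both(n=100000, repeat=5):
    """Return the best-of-`repeat` wall time in seconds for each variant."""
    ends = list(range(n))
    return {
        "max": min(timeit.repeat(
            lambda: last_byte_with_max(ends), number=1, repeat=repeat)),
        "compare": min(timeit.repeat(
            lambda: last_byte_with_compare(ends), number=1, repeat=repeat)),
    }


if __name__ == "__main__":
    print(time_both())
```

Both helpers return the same value for any input; only the per-iteration cost differs.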
    ------------------------------------------------------------
    revno: 3907.1.6
    revision-id: john at arbash-meinel.com-20090326195952-w0qea66iw597ipza
    parent: john at arbash-meinel.com-20090326191304-w52buxewrxumpgvo
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 14:59:52 -0500
    message:
      Add some direct tests for CHKInventory._entry_to_bytes
      and _bytes_to_entry.
      Also, add a new function _bytes_to_utf8name_key. I wanted to just add
      _bytes_to_key, but it seems we have code that uses the name field to
      check if this is a root key that should not be transmitted.
      Anyway, by having this function, item_keys_introduced_by avoids a
      bunch of .decode() calls, as well as not building up InventoryEntry
      objects.
      Also use this when gathering text_refs in GCPacker. Hopefully, we
      could turn it on all the time, if it got cheap enough.
      And it points us in the right direction for a StreamSource that
      sends CHK pages.
    modified:
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
      bzrlib/repofmt/groupcompress_repo.py repofmt.py-20080715094215-wp1qfvoo7093c8qr-1
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/tests/test_inv.py       testinv.py-20050722220913-1dc326138d1a5892
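
A minimal sketch of the cheaper parse described above, assuming the serialised layout exercised by the new tests (kind and file_id on the first line, then parent_id, UTF-8 name, and revision_id). The standalone function name is hypothetical, and the code uses Python 3 bytes literals where the branch uses Python 2 str.

```python
def bytes_to_utf8name_key(entry_bytes):
    """Return (utf8_name, file_id, revision_id) without decoding anything.

    Only the first four newline-separated fields are inspected, so no
    InventoryEntry object is built and the name stays as raw UTF-8 bytes.
    """
    sections = entry_bytes.split(b'\n')
    # First line looks like b'file: file-id' or b'dir: dir-id'.
    kind, file_id = sections[0].split(b': ')
    return (sections[2], file_id, sections[3])


def is_root_entry(entry_bytes):
    """Hypothetical helper: a root entry is the one with an empty name."""
    name_utf8, _, _ = bytes_to_utf8name_key(entry_bytes)
    return name_utf8 == b''
```

Callers that only need the (file_id, revision_id) pair can drop the first element; the name is kept so that the empty-named root can be filtered out in repositories without rich roots.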
    ------------------------------------------------------------
    revno: 3907.1.5
    revision-id: john at arbash-meinel.com-20090326191304-w52buxewrxumpgvo
    parent: john at arbash-meinel.com-20090326180307-yktd7ny3mees1v6t
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 14:13:04 -0500
    message:
      Shave a little bit of time by using itervalues() rather than casting through refs()
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
    ------------------------------------------------------------
    revno: 3907.1.4
    revision-id: john at arbash-meinel.com-20090326180307-yktd7ny3mees1v6t
    parent: john at arbash-meinel.com-20090326175542-qmb46mw1d8zt5k1l
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 13:03:07 -0500
    message:
      type(node) is InternalNode is supposedly better than isinstance(node, InternalNode)
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
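
Apart from any speed difference, the two checks are not interchangeable: `type(node) is InternalNode` rejects subclasses, while `isinstance` accepts them. The sketch below uses hypothetical stand-in classes, not the real bzrlib nodes, to show the difference.

```python
class InternalNode(object):
    """Hypothetical stand-in for a CHK map internal node."""


class LeafNode(object):
    """Hypothetical stand-in for a CHK map leaf node."""


class SpecialInternalNode(InternalNode):
    """Hypothetical subclass, used only to show the semantic difference."""


def is_internal_exact(node):
    # Exact type match: a single identity comparison, subclasses excluded.
    return type(node) is InternalNode


def is_internal_or_subclass(node):
    # isinstance also walks the class hierarchy, so subclasses match.
    return isinstance(node, InternalNode)
```

The swap is only safe when the code never needs to treat a subclass as an InternalNode.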
    ------------------------------------------------------------
    revno: 3907.1.3
    revision-id: john at arbash-meinel.com-20090326175542-qmb46mw1d8zt5k1l
    parent: john at arbash-meinel.com-20090326163500-os7lvdpsdxnxstd0
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 12:55:42 -0500
    message:
      Simple fix to avoid using small.difference_update(large)
      It seems the obvious thing to do, but Python's implementation scales poorly.
      small = small.difference(large) scales much better [O(small) rather than O(large)].
    modified:
      bzrlib/chk_map.py              chk_map.py-20081001014447-ue6kkuhofvdecvxa-1
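
The two spellings compared in this message produce the same set; only the cost differs. The sketch below uses hypothetical helper names, written for Python 3, and includes a timing helper for checking the scaling claim locally. The message describes the Python available in 2009; later CPython releases may have narrowed or removed the gap, so measure before relying on it.

```python
import timeit


def filter_with_difference_update(refs, uninteresting):
    """Build a set from refs, then remove uninteresting members in place."""
    chks = set(refs)
    chks.difference_update(uninteresting)
    return chks


def filter_with_difference(refs, uninteresting):
    """Build a set from refs and return a new set without uninteresting members."""
    return set(refs).difference(uninteresting)


def time_both(small_size=10, large_size=1000000, number=100):
    """Time both variants with a small refs list and a large exclusion set."""
    refs = list(range(small_size))
    uninteresting = set(range(5, large_size))
    return {
        "difference_update": timeit.timeit(
            lambda: filter_with_difference_update(refs, uninteresting),
            number=number),
        "difference": timeit.timeit(
            lambda: filter_with_difference(refs, uninteresting),
            number=number),
    }


if __name__ == "__main__":
    print(time_both())
```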
    ------------------------------------------------------------
    revno: 3907.1.2
    revision-id: john at arbash-meinel.com-20090326163500-os7lvdpsdxnxstd0
    parent: john at arbash-meinel.com-20090326162258-21e57rtpx47t6493
    parent: pqm at pqm.ubuntu.com-20090326131816-4nzmlssnd4huc5cu
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 11:35:00 -0500
    message:
      Merge bzr.dev 4208.
      
      This brings in some more smart-server improvements, 
      as well as the iter_files_bytes as chunked, and 
      multi-file and directory logging.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/counted_lock.py         counted_lock.py-20070502135927-7dk86io3ok7ctx6k-1
      bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/lockable_files.py       control_files.py-20051111201905-bb88546e799d669f
      bzrlib/log.py                  log.py-20050505065812-c40ce11702fe5fb1
      bzrlib/memorytree.py           memorytree.py-20060906023413-4wlkalbdpsxi2r4y-1
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repofmt/weaverepo.py    presplitout.py-20070125045333-wfav3tsh73oxu3zk-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/revisiontree.py         revisiontree.py-20060724012533-bg8xyryhxd0o0i0h-1
      bzrlib/smart/repository.py     repository.py-20061128022038-vr5wy5bubyb8xttk-1
      bzrlib/smart/request.py        request.py-20061108095550-gunadhxmzkdjfeek-1
      bzrlib/smtp_connection.py      smtp_connection.py-20070618204456-nu6wag1ste4biuk2-1
      bzrlib/tests/__init__.py       selftest.py-20050531073622-8d0e3c8845c97a64
      bzrlib/tests/blackbox/test_log.py test_log.py-20060112090212-78f6ea560c868e24
      bzrlib/tests/test_bundle.py    test.py-20050630184834-092aa401ab9f039c
      bzrlib/tests/test_counted_lock.py test_counted_lock.py-20070502135927-7dk86io3ok7ctx6k-2
      bzrlib/tests/test_graph.py     test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
      bzrlib/tests/test_remote.py    test_remote.py-20060720103555-yeeg2x51vn0rbtdp-2
      bzrlib/tests/test_smart.py     test_smart.py-20061122024551-ol0l0o0oofsu9b3t-2
      bzrlib/tests/test_smart_request.py test_smart_request.p-20090211070731-o38wayv3asm25d6a-1
      bzrlib/tests/test_smtp_connection.py test_smtp_connection-20070618204509-wuyxc0r0ztrecv7e-1
      bzrlib/tests/test_source.py    test_source.py-20051207061333-a58dea6abecc030d
      bzrlib/workingtree.py          workingtree.py-20050511021032-29b6ec0a681e02e3
      bzrlib/workingtree_4.py        workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
      doc/developers/index.txt       index.txt-20070508041241-qznziunkg0nffhiw-1
      doc/developers/performance-contributing.txt performancecontribut-20070621063612-ac4zhhagjzkr21qp-1
      doc/developers/planned-change-integration.txt plannedchangeintegra-20070619004702-i1b3ccamjtfaoq6w-1
      doc/developers/releasing.txt   releasing.txt-20080502015919-fnrcav8fwy8ccibu-1
      doc/developers/revision-properties.txt revisionproperties.t-20070807133526-w57m8zv5o7t5kugm-1
    ------------------------------------------------------------
    revno: 3907.1.1
    revision-id: john at arbash-meinel.com-20090326162258-21e57rtpx47t6493
    parent: ian.clatworthy at canonical.com-20090325121809-el4l5ie9ifqt5ur9
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: brisbane-core
    timestamp: Thu 2009-03-26 11:22:58 -0500
    message:
      Get rid of inline and const, to reduce warnings and errors.
      It seems compilers don't like it if you pass &(unsigned char *) to
      (const unsigned char **), and pyrex doesn't let you define 'const unsigned char*',
      (it doesn't like const at all), so for now, just remove it, because
      it doesn't hurt anything.
      
      Removing inline because MSVC doesn't understand it, and causes compile
      failures. It wasn't really important anyway.
    modified:
      bzrlib/delta.h                 delta.h-20090227173129-qsu3u43vowf1q3ay-1
-------------- next part --------------
=== modified file 'bzrlib/chk_map.py'

--- a/bzrlib/chk_map.py	2009-03-24 19:36:34 +0000
+++ b/bzrlib/chk_map.py	2009-03-26 19:13:04 +0000
@@ -172,7 +172,7 @@
                 key_str = ' None'
         result.append('%s%r %s%s' % (indent, prefix, node.__class__.__name__,
                                      key_str))
-        if isinstance(node, InternalNode):
+        if type(node) is InternalNode:
             # Trigger all child nodes to get loaded
             list(node._iter_nodes(self._store))
             for prefix, sub in sorted(node._items.iteritems()):
@@ -437,7 +437,7 @@
 
     def key(self):
         """Return the key for this map."""
-        if isinstance(self._root_node, tuple):
+        if type(self._root_node) is tuple:
             return self._root_node
         else:
             return self._root_node._key
@@ -471,7 +471,7 @@
     def unmap(self, key, check_remap=True):
         """remove key from the map."""
         self._ensure_root()
-        if isinstance(self._root_node, InternalNode):
+        if type(self._root_node) is InternalNode:
             unmapped = self._root_node.unmap(self._store, key,
                 check_remap=check_remap)
         else:
@@ -481,7 +481,7 @@
     def _check_remap(self):
         """Check if nodes can be collapsed."""
         self._ensure_root()
-        if isinstance(self._root_node, InternalNode):
+        if type(self._root_node) is InternalNode:
             self._root_node._check_remap(self._store)
 
     def _save(self):
@@ -1047,7 +1047,7 @@
             # new child needed:
             child = self._new_child(search_key, LeafNode)
         old_len = len(child)
-        if isinstance(child, LeafNode):
+        if type(child) is LeafNode:
             old_size = child._current_size()
         else:
             old_size = None
@@ -1059,7 +1059,7 @@
             self._items[search_key] = child
             self._key = None
             new_node = self
-            if isinstance(child, LeafNode):
+            if type(child) is LeafNode:
                 if old_size is None:
                     # The old node was an InternalNode which means it has now
                     # collapsed, so we need to check if it will chain to a
@@ -1213,7 +1213,7 @@
         if len(self._items) == 1:
             # this node is no longer needed:
             return self._items.values()[0]
-        if isinstance(unmapped, InternalNode):
+        if type(unmapped) is InternalNode:
             return self
         if check_remap:
             return self._check_remap(store)
@@ -1259,7 +1259,7 @@
         #   c) With 255-way fan out, we don't want to read all 255 and destroy
         #      the page cache, just to determine that we really don't need it.
         for node, _ in self._iter_nodes(store, batch_size=16):
-            if isinstance(node, InternalNode):
+            if type(node) is InternalNode:
                 # Without looking at any leaf nodes, we are sure
                 return self
             for key, value in node._items.iteritems():
@@ -1300,7 +1300,7 @@
         # care about external references.
         node = _deserialise(bytes, record.key, search_key_func=None)
         if record.key in uninteresting_keys:
-            if isinstance(node, InternalNode):
+            if type(node) is InternalNode:
                 next_uninteresting.update(node.refs())
             else:
                 # We know we are at a LeafNode, so we can pass None for the
@@ -1308,7 +1308,7 @@
                 uninteresting_items.update(node.iteritems(None))
         else:
             interesting_records.append(record)
-            if isinstance(node, InternalNode):
+            if type(node) is InternalNode:
                 next_interesting.update(node.refs())
             else:
                 interesting_items.update(node.iteritems(None))
@@ -1364,7 +1364,7 @@
             # We don't care about search_key_func for this code, because we
             # only care about external references.
             node = _deserialise(bytes, record.key, search_key_func=None)
-            if isinstance(node, InternalNode):
+            if type(node) is InternalNode:
                 # uninteresting_prefix_chks.update(node._items.iteritems())
                 chks = node._items.values()
                 # TODO: We remove the entries that are already in
@@ -1435,9 +1435,16 @@
             # We don't care about search_key_func for this code, because we
             # only care about external references.
             node = _deserialise(bytes, record.key, search_key_func=None)
-            if isinstance(node, InternalNode):
-                chks = set(node.refs())
-                chks.difference_update(all_uninteresting_chks)
+            if type(node) is InternalNode:
+                # all_uninteresting_chks grows large, as it lists all nodes we
+                # don't want to process (including already seen interesting
+                # nodes).
+                # small.difference_update(large) scales O(large), but
+                # small.difference(large) scales O(small).
+                # Also, we know we just _deserialised this node, so we can
+                # access the dict directly.
+                chks = set(node._items.itervalues()).difference(
+                            all_uninteresting_chks)
                 # Is set() and .difference_update better than:
                 # chks = [chk for chk in node.refs()
                 #              if chk not in all_uninteresting_chks]

=== modified file 'bzrlib/delta.h'
--- a/bzrlib/delta.h	2009-03-19 06:01:53 +0000
+++ b/bzrlib/delta.h	2009-03-26 16:22:58 +0000
@@ -84,10 +84,10 @@
  * This must be called twice on the delta data buffer, first to get the
  * expected source buffer size, and again to get the target buffer size.
  */
-static inline unsigned long get_delta_hdr_size(const unsigned char **datap,
-                                               const unsigned char *top)
+static unsigned long
+get_delta_hdr_size(unsigned char **datap, const unsigned char *top)
 {
-    const unsigned char *data = *datap;
+    unsigned char *data = *datap;
     unsigned char cmd;
     unsigned long size = 0;
     int i = 0;

=== modified file 'bzrlib/groupcompress.py'
--- a/bzrlib/groupcompress.py	2009-03-27 12:12:10 +0000
+++ b/bzrlib/groupcompress.py	2009-03-27 19:10:21 +0000
@@ -532,7 +532,11 @@
         # Note that this creates a reference cycle....
         factory = _LazyGroupCompressFactory(key, parents, self,
             start, end, first=first)
-        self._last_byte = max(end, self._last_byte)
+        # max() works here, but it costs a function call; a plain comparison
+        # is significantly faster: timeit reports 250ms for max() and 100ms
+        # for the comparison.
+        if end > self._last_byte:
+            self._last_byte = end
         self._factories.append(factory)
 
     def get_record_stream(self):

=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py	2009-03-25 17:29:07 +0000
+++ b/bzrlib/inventory.py	2009-03-27 19:10:21 +0000
@@ -1503,6 +1503,15 @@
         else:
             raise ValueError("unknown kind %r" % entry.kind)
 
+    @staticmethod
+    def _bytes_to_utf8name_key(bytes):
+        """Get the (utf8 name, file_id, revision_id) tuple out of bytes."""
+        # We don't normally care about name, except for times when we want
+        # to filter out empty names because of non rich-root...
+        sections = bytes.split('\n')
+        kind, file_id = sections[0].split(': ')
+        return (sections[2], file_id, sections[3])
+
     def _bytes_to_entry(self, bytes):
         """Deserialise a serialised entry."""
         sections = bytes.split('\n')
@@ -1521,7 +1530,7 @@
             result = InventoryLink(sections[0][9:],
                 sections[2].decode('utf8'),
                 sections[1])
-            result.symlink_target = sections[4]
+            result.symlink_target = sections[4].decode('utf8')
         elif sections[0].startswith("tree: "):
             result = TreeReference(sections[0][6:],
                 sections[2].decode('utf8'),

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- a/bzrlib/repofmt/groupcompress_repo.py	2009-03-24 19:36:34 +0000
+++ b/bzrlib/repofmt/groupcompress_repo.py	2009-03-26 19:59:52 +0000
@@ -242,9 +242,7 @@
         remaining_keys = set(keys)
         counter = [0]
         if self._gather_text_refs:
-            # Just to get _bytes_to_entry, so we don't care about the
-            # search_key_name
-            inv = inventory.CHKInventory(None)
+            bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key
             self._text_refs = set()
         def _get_referenced_stream(root_keys, parse_leaf_nodes=False):
             cur_keys = root_keys
@@ -271,8 +269,8 @@
                     # Store is None, because we know we have a LeafNode, and we
                     # just want its entries
                     for file_id, bytes in node.iteritems(None):
-                        entry = inv._bytes_to_entry(bytes)
-                        self._text_refs.add((entry.file_id, entry.revision))
+                        name_utf8, file_id, revision_id = bytes_to_info(bytes)
+                        self._text_refs.add((file_id, revision_id))
                 def next_stream():
                     stream = source_vf.get_record_stream(cur_keys,
                                                          'as-requested', True)

=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py	2009-03-25 17:29:07 +0000
+++ b/bzrlib/repofmt/pack_repo.py	2009-03-27 19:10:21 +0000
@@ -2468,23 +2468,23 @@
             interesting_root_keys.add(inv.id_to_entry.key())
         revision_ids = frozenset(revision_ids)
         file_id_revisions = {}
+        bytes_to_info = CHKInventory._bytes_to_utf8name_key
         for records, items in chk_map.iter_interesting_nodes(self.chk_bytes,
                     interesting_root_keys, uninteresting_root_keys,
                     pb=pb):
             # This is cheating a bit to use the last grabbed 'inv', but it
             # works
             for name, bytes in items:
-                # TODO: We should use something cheaper than _bytes_to_entry,
-                #       which has to .decode() the entry name, etc.
-                #       We only care about a couple of the fields in the bytes.
-                entry = inv._bytes_to_entry(bytes)
-                if entry.name == '' and not rich_root:
+                (name_utf8, file_id, revision_id) = bytes_to_info(bytes)
+                if not rich_root and name_utf8 == '':
                     continue
-                if entry.revision in revision_ids:
+                if revision_id in revision_ids:
                     # Would we rather build this up into file_id => revision
                     # maps?
-                    s = file_id_revisions.setdefault(entry.file_id, set())
-                    s.add(entry.revision)
+                    try:
+                        file_id_revisions[file_id].add(revision_id)
+                    except KeyError:
+                        file_id_revisions[file_id] = set([revision_id])
         for file_id, revisions in file_id_revisions.iteritems():
             yield ('file', file_id, revisions)
 

=== modified file 'bzrlib/tests/test_inv.py'
--- a/bzrlib/tests/test_inv.py	2009-03-25 17:29:07 +0000
+++ b/bzrlib/tests/test_inv.py	2009-03-27 19:10:21 +0000
@@ -531,3 +531,108 @@
         self.assertEqual(
             {('', ''): 'TREE_ROOT', ('TREE_ROOT', 'file'): 'fileid'},
             dict(chk_inv.parent_id_basename_to_file_id.iteritems()))
+
+    def test_file_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.InventoryFile('file-id', 'filename', 'parent-id')
+        ie.executable = True
+        ie.revision = 'file-rev-id'
+        ie.text_sha1 = 'abcdefgh'
+        ie.text_size = 100
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('file: file-id\nparent-id\nfilename\n'
+                         'file-rev-id\nabcdefgh\n100\nY', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertEqual(('filename', 'file-id', 'file-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_file2_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        # \u03a9 == 'omega'
+        ie = inventory.InventoryFile('file-id', u'\u03a9name', 'parent-id')
+        ie.executable = False
+        ie.revision = 'file-rev-id'
+        ie.text_sha1 = '123456'
+        ie.text_size = 25
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('file: file-id\nparent-id\n\xce\xa9name\n'
+                         'file-rev-id\n123456\n25\nN', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertEqual(('\xce\xa9name', 'file-id', 'file-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_dir_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.InventoryDirectory('dir-id', 'dirname', 'parent-id')
+        ie.revision = 'dir-rev-id'
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('dir: dir-id\nparent-id\ndirname\ndir-rev-id', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertEqual(('dirname', 'dir-id', 'dir-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_dir2_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.InventoryDirectory('dir-id', u'dir\u03a9name',
+                                          None)
+        ie.revision = 'dir-rev-id'
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('dir: dir-id\n\ndir\xce\xa9name\n'
+                         'dir-rev-id', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertIs(ie2.parent_id, None)
+        self.assertEqual(('dir\xce\xa9name', 'dir-id', 'dir-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_symlink_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.InventoryLink('link-id', 'linkname', 'parent-id')
+        ie.revision = 'link-rev-id'
+        ie.symlink_target = u'target/path'
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('symlink: link-id\nparent-id\nlinkname\n'
+                         'link-rev-id\ntarget/path', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertIsInstance(ie2.symlink_target, unicode)
+        self.assertEqual(('linkname', 'link-id', 'link-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_symlink2_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.InventoryLink('link-id', u'link\u03a9name', 'parent-id')
+        ie.revision = 'link-rev-id'
+        ie.symlink_target = u'target/\u03a9path'
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('symlink: link-id\nparent-id\nlink\xce\xa9name\n'
+                         'link-rev-id\ntarget/\xce\xa9path', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertIsInstance(ie2.symlink_target, unicode)
+        self.assertEqual(('link\xce\xa9name', 'link-id', 'link-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))
+
+    def test_tree_reference_entry_to_bytes(self):
+        inv = CHKInventory(None)
+        ie = inventory.TreeReference('tree-root-id', u'tree\u03a9name',
+                                     'parent-id')
+        ie.revision = 'tree-rev-id'
+        ie.reference_revision = 'ref-rev-id'
+        bytes = inv._entry_to_bytes(ie)
+        self.assertEqual('tree: tree-root-id\nparent-id\ntree\xce\xa9name\n'
+                         'tree-rev-id\nref-rev-id', bytes)
+        ie2 = inv._bytes_to_entry(bytes)
+        self.assertEqual(ie, ie2)
+        self.assertIsInstance(ie2.name, unicode)
+        self.assertEqual(('tree\xce\xa9name', 'tree-root-id', 'tree-rev-id'),
+                         inv._bytes_to_utf8name_key(bytes))