Rev 3480: Change how we handle unicode targets, and add a NEWS entry. in http://bzr.arbash-meinel.com/branches/bzr/1.6-dev/symlink_unicode_135320

John Arbash Meinel john at arbash-meinel.com
Thu Jun 5 22:47:58 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.6-dev/symlink_unicode_135320

------------------------------------------------------------
revno: 3480
revision-id: john at arbash-meinel.com-20080605214739-uf050pk6fdgm5xbp
parent: john at arbash-meinel.com-20080605211911-2xtal2ehwcqkcymy
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: symlink_unicode_135320
timestamp: Thu 2008-06-05 16:47:39 -0500
message:
  Change how we handle unicode targets, and add a NEWS entry.
-------------- next part --------------
=== modified file 'NEWS'
--- a/NEWS	2008-06-05 19:12:37 +0000
+++ b/NEWS	2008-06-05 21:47:39 +0000
@@ -22,6 +22,11 @@
     * Sanitize branch nick before using it as an attachment filename in
       ``bzr send``. (Lukáš Lalinský, #210218)
 
+    * Squash ``inv_entry.symlink_target`` to a plain string when
+      generating DirState details. This prevents a
+      ``UnicodeError`` when you have symlinks and non-ascii filenames.
+      (John Arbash Meinel, #135320)
+
   IMPROVEMENTS:
 
     * Added the 'alias' command to set/unset and display aliases. (Tim Penhey)

=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py	2008-06-05 21:19:11 +0000
+++ b/bzrlib/dirstate.py	2008-06-05 21:47:39 +0000
@@ -1842,13 +1842,8 @@
             size = 0
             executable = False
         elif kind == 'symlink':
-            fingerprint = inv_entry.symlink_target
-            if fingerprint is None:
-                fingerprint = ''
-            else:
-                assert isinstance(fingerprint, unicode)
-                # Do we need a 'isinstance(fingerprint, unicode)' check here?
-                fingerprint = fingerprint.encode('UTF-8')
+            # We don't support non-ascii targets for symlinks yet.
+            fingerprint = str(inv_entry.symlink_target or '')
             size = 0
             executable = False
         elif kind == 'file':

=== modified file 'bzrlib/tests/test_dirstate.py'
--- a/bzrlib/tests/test_dirstate.py	2008-06-05 21:19:11 +0000
+++ b/bzrlib/tests/test_dirstate.py	2008-06-05 21:47:39 +0000
@@ -2522,14 +2522,19 @@
     def assertDetails(self, expected, inv_entry):
         details = dirstate.DirState._inv_entry_to_details(inv_entry)
         self.assertEqual(expected, details)
+        # details should always allow join() and always be a plain str when
+        # finished
+        (minikind, fingerprint, size, executable, tree_data) = details
+        self.assertIsInstance(minikind, str)
+        self.assertIsInstance(fingerprint, str)
+        self.assertIsInstance(tree_data, str)
 
     def test_unicode_symlink(self):
-        uni_link_target = u'Non-\xe5scii'
-        utf8_link_target = 'Non-\xc3\xa5scii'
-        self.assertEqual(utf8_link_target, uni_link_target.encode('UTF-8'))
+        # In general, the code base doesn't support a target that contains
+        # non-ascii characters. So we just assert that an ascii target works.
         inv_entry = inventory.InventoryLink('link-file-id', 'name',
                                             'link-parent-id')
         inv_entry.revision = 'link-revision-id'
-        inv_entry.symlink_target = uni_link_target
-        self.assertDetails(('l', utf8_link_target, 0, False,
-                           'link-revision-id'), inv_entry)
+        inv_entry.symlink_target = u'link-target'
+        details = self.assertDetails(('l', 'link-target', 0, False,
+                                      'link-revision-id'), inv_entry)

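Note on the dirstate.py change: the new line relies on Python 2's str(),
which (assuming the default ASCII codec has not been changed) implicitly
encodes a unicode object as ASCII. The sketch below emulates that behaviour
with an explicit encode so it also runs on Python 3. The helper name
symlink_fingerprint is hypothetical and is not part of bzrlib; this is only
an illustration of the squashing rule, not the code in the commit.

```python
def symlink_fingerprint(symlink_target):
    """Hypothetical helper emulating Python 2's str(symlink_target or '')."""
    # None and the empty string both collapse to an empty fingerprint.
    if not symlink_target:
        return b''
    # Python 2's str() on a unicode object encodes with the ASCII codec
    # (assumption: default encoding unchanged), so a non-ascii target
    # raises UnicodeEncodeError here as well.
    return symlink_target.encode('ascii')


print(symlink_fingerprint(None))
print(symlink_fingerprint(u'link-target'))
try:
    symlink_fingerprint(u'Non-\xe5scii')
except UnicodeEncodeError as exc:
    print('rejected non-ascii target:', exc.reason)
```

An ascii target therefore comes out as a plain byte string that can be
joined with the other detail fields, while a non-ascii target fails loudly
rather than being silently encoded.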

