Rev 2548: Revert the accidental removal of the Unicode normalization check code, in http://bzr.arbash-meinel.com/branches/bzr/0.17-dev/dirstate_pyrex

John Arbash Meinel john at arbash-meinel.com
Fri Jul 20 19:26:44 BST 2007


At http://bzr.arbash-meinel.com/branches/bzr/0.17-dev/dirstate_pyrex

------------------------------------------------------------
revno: 2548
revision-id: john at arbash-meinel.com-20070720182620-948wu6weli9aupkq
parent: john at arbash-meinel.com-20070720173448-cn7og836bl8dovwv
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: dirstate_pyrex
timestamp: Fri 2007-07-20 13:26:20 -0500
message:
  Revert the accidental removal of the Unicode normalization check code.
  The check was taken out only to profile how much it was costing us; it was not meant to stay removed.
modified:
  bzrlib/dirstate.py             dirstate.py-20060728012006-d6mvoihjb3je9peu-1
-------------- next part --------------
=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py	2007-07-12 16:34:02 +0000
+++ b/bzrlib/dirstate.py	2007-07-20 18:26:20 +0000
@@ -366,11 +366,15 @@
         # add it.
         #------- copied from inventory.make_entry
         # --- normalized_filename wants a unicode basename only, so get one.
-        if path.__class__ == unicode:
-            utf8path = path.encode('utf8')
-        else:
-            utf8path = path
-        dirname, basename = osutils.split(utf8path)
+        dirname, basename = osutils.split(path)
+        # we dont import normalized_filename directly because we want to be
+        # able to change the implementation at runtime for tests.
+        norm_name, can_access = osutils.normalized_filename(basename)
+        if norm_name != basename:
+            if can_access:
+                basename = norm_name
+            else:
+                raise errors.InvalidNormalization(path)
         # you should never have files called . or ..; just add the directory
         # in the parent, or according to the special treatment for the root
         if basename == '.' or basename == '..':
@@ -378,6 +382,8 @@
         # now that we've normalised, we need the correct utf8 path and 
         # dirname and basename elements. This single encode and split should be
         # faster than three separate encodes.
+        utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
+        dirname, basename = osutils.split(utf8path)
         assert file_id.__class__ == str, \
             "must be a utf8 file_id not %s" % (type(file_id))
         # Make sure the file_id does not exist in this tree
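
For readers who want to try the restored logic outside bzrlib, the following is a minimal Python 3 sketch of the same steps: normalize the basename, fall back to an error when the normalized name is not accessible, then do a single UTF-8 encode and split. It is not the bzrlib implementation. The names normalized_filename, split_normalized, and the platform_normalizes flag are assumptions made for illustration, InvalidNormalization is a local stand-in for errors.InvalidNormalization, and NFC is assumed as the normalization form.

```python
import unicodedata


class InvalidNormalization(Exception):
    """Local stand-in for errors.InvalidNormalization (assumption)."""


def normalized_filename(basename, platform_normalizes=True):
    """Return (normalized_name, can_access) for a unicode basename.

    Assumption: NFC is the target form. When the platform does not
    normalize names itself, the normalized name is only treated as
    accessible if it is identical to the name we were given.
    """
    norm_name = unicodedata.normalize("NFC", basename)
    return norm_name, (platform_normalizes or norm_name == basename)


def split_normalized(path, platform_normalizes=True):
    """Normalize the basename of path, then encode and split once."""
    dirname, _, basename = path.rpartition("/")
    norm_name, can_access = normalized_filename(basename, platform_normalizes)
    if norm_name != basename:
        if can_access:
            basename = norm_name
        else:
            raise InvalidNormalization(path)
    # One encode of the rejoined path instead of separate encodes
    # for the path, the dirname, and the basename.
    utf8path = (dirname + "/" + basename).strip("/").encode("utf8")
    utf8dir, _, utf8base = utf8path.rpartition(b"/")
    return utf8dir, utf8base
```

Calling split_normalized("dir/cafe\u0301") takes the can_access branch and returns the directory and the NFC basename as UTF-8 bytes. Passing platform_normalizes=False for the same path takes the error branch instead, which mirrors the raise in the patch.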
