As far as I know, some code in bzr converts UTF-8 byte strings to unicode (e.g. dirstate?). There was an interesting discussion on comp.lang.python recently which shows that unicode(s, 'utf-8') is faster than s.decode('utf-8'). http://groups.google.com/group/comp.lang.python/browse_thread/thread/314a3043ea63319f/ This may be useful to know for bzrlib.
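
For anyone who wants to check the claim on their own machine, here is a rough timing sketch. It is not taken from bzrlib or from the thread: the sample string and the helper names (via_constructor, via_decode, compare) are made up for illustration, and the numbers will depend on the Python version and on the length of the strings being decoded. On Python 3 there is no unicode(), so the script falls back to str(s, 'utf-8') as the closest equivalent.

```python
import timeit

# On Python 2 the constructor is unicode(); on Python 3 the closest
# equivalent is str(). This fallback is only here so the sketch runs on both.
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3

# Hypothetical sample: a short UTF-8 encoded path with non-ASCII characters.
SAMPLE = u"d\u00e9p\u00f4t/fichier.txt".encode("utf-8")


def via_constructor(s):
    """Convert with the constructor form, i.e. unicode(s, 'utf-8')."""
    return text_type(s, "utf-8")


def via_decode(s):
    """Convert with the method form, i.e. s.decode('utf-8')."""
    return s.decode("utf-8")


def compare(number=100000):
    """Return (constructor_seconds, decode_seconds) for `number` calls each."""
    t_ctor = timeit.timeit(lambda: via_constructor(SAMPLE), number=number)
    t_dec = timeit.timeit(lambda: via_decode(SAMPLE), number=number)
    return t_ctor, t_dec


if __name__ == "__main__":
    t_ctor, t_dec = compare()
    print("constructor: %.3fs  decode: %.3fs" % (t_ctor, t_dec))
```

Both forms return the same result, so any difference is purely in call overhead. It would be worth measuring with strings that look like real dirstate entries before changing anything in bzrlib.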