str vs unicode

Toshio Kuratomi a.badger at gmail.com
Fri May 27 21:00:46 UTC 2011


On Fri, May 27, 2011 at 01:12:41PM +0200, John Arbash Meinel wrote:
> On 05/27/2011 12:04 PM, Andrew Bennetts wrote:
> > John Arbash Meinel wrote:
> >> On 05/27/2011 05:19 AM, Robert Collins wrote:
> > […]
> >>> Doesn't python3 delete __unicode__ ? In the spirit of having no
> >>> compatibility concerns, after all :)
> >>>
> >>> -Rob
> >>>
> >>
> >> It deleted __unicode__ because __str__ *is* __unicode__.
> > 
> > Right.  I think Rob's issue is how to write code that is simultaneously
> > compatible with 2 and 3 with no translation step.  In that case I
> > suppose __unicode__ == __str__ (or vice versa?) is probably close to
> > close enough.  IIRC you can actually return unicode from __str__ in
> > Python 2 (probably due to the pervasive str/unicode confusion there).
> > 
> > -Andrew.
> 
> If you return unicode from __str__() it will auto-cast it back to str, and
> cause the associated UnicodeEncodeErrors if you have anything that isn't
> ascii in the string.
> 
To be clear, you can return unicode from __str__() but, as John says, that
can yield an exception in certain (but not all) usages:

>>> class Test(object):
...     def __str__(self):
...         return u'caf\xe9'
...
>>> a = Test()
>>> a.__str__()
u'caf\xe9'
>>> '%s' % a
u'caf\xe9'
>>> str(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>>> print a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>>> print '%s' % a
café
>>> print u'%s' % a
café
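
For code that has to run unmodified on both 2 and 3, one way to sidestep the
implicit ASCII encode is to keep the text in __unicode__() and define
__str__() per interpreter. This is only a sketch, not something the thread
settled on: the PY2 flag and the choice of UTF-8 are assumptions, and the
u'' literal needs Python 2 or Python 3.3+.

```python
import sys

# Hypothetical helper flag: True when running under Python 2.
PY2 = sys.version_info[0] == 2


class Test(object):
    def __unicode__(self):
        # Always build the text form as unicode (plain str on Python 3).
        return u'caf\xe9'

    if PY2:
        def __str__(self):
            # Python 2: str() should return bytes, so encode explicitly
            # (UTF-8 is an assumption) rather than relying on the
            # implicit ASCII encode that raises UnicodeEncodeError.
            return self.__unicode__().encode('utf-8')
    else:
        # Python 3: str is already unicode, so reuse the same method.
        __str__ = __unicode__


a = Test()
text = str(a)
```

With this, str(a) returns UTF-8 encoded bytes on Python 2 instead of raising,
and returns the text unchanged on Python 3.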

-Toshio