str vs unicode
Toshio Kuratomi
a.badger at gmail.com
Fri May 27 21:00:46 UTC 2011
On Fri, May 27, 2011 at 01:12:41PM +0200, John Arbash Meinel wrote:
> On 05/27/2011 12:04 PM, Andrew Bennetts wrote:
> > John Arbash Meinel wrote:
> >> On 05/27/2011 05:19 AM, Robert Collins wrote:
> > […]
> >>> Doesn't python3 delete __unicode__ ? In the spirit of having no
> >>> compatibility concerns, after all :)
> >>>
> >>> -Rob
> >>>
> >>
> >> It deleted __unicode__ because __str__ *is* __unicode__.
> >
> > Right. I think Rob's issue is how to write code that is simultaneously
> > compatible with 2 and 3 with no translation step. In that case I
> > suppose __unicode__ == __str__ (or vice versa?) is probably close to
> > close enough. IIRC you can actually return unicode from __str__ in
> > Python 2 (probably due to the pervasive str/unicode confusion there).
> >
> > -Andrew.
>
> If you return unicode from str it will auto-cast it back to str, and
> cause the associated UnicodeEncodeErrors if you have anything that isn't
> ascii in the string.
>
To be clear, in Python 2 you can return unicode from __str__(), but, as John
says, that raises an exception in some (but not all) usages:
>>> class Test(object):
...     def __str__(self):
...         return u'caf\xe9'
...
>>> a = Test()
>>> a.__str__()
u'caf\xe9'
>>> '%s' % a
u'caf\xe9'
>>> str(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>>> print a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>>> print '%s' % a
café
>>> print u'%s' % a
café
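
For the "same source on 2 and 3" case discussed above, one way to sidestep
the exception is to keep the text in one place and only define __unicode__
on Python 2. This is a rough sketch, not something taken from bzr: the names
PY2, SafeTest and _text are made up for illustration, and encoding to UTF-8
in the Python 2 __str__ is an assumption (pick whatever encoding your
callers expect).

```python
import sys

# Hypothetical flag: True when running under Python 2.
PY2 = sys.version_info[0] == 2


class SafeTest(object):
    """Illustrative class whose str() works on both Python 2 and 3."""

    def _text(self):
        # Single source of truth; always a unicode (text) string.
        return u'caf\xe9'

    if PY2:
        def __unicode__(self):
            return self._text()

        def __str__(self):
            # Python 2 str() must return bytes; UTF-8 is an assumed choice.
            return self._text().encode('utf-8')
    else:
        def __str__(self):
            # On Python 3, __str__ returns text directly.
            return self._text()
```

With this, str(SafeTest()) no longer goes through the implicit ASCII encode
on Python 2, and unicode(SafeTest()) (Python 2) or str(SafeTest()) (Python 3)
gives back the text unchanged.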
-Toshio