[storm] Questions about inserts

Christopher Armstrong radix at twistedmatrix.com
Fri Jun 13 16:36:46 BST 2008


On Fri, Jun 13, 2008 at 4:24 PM, David Koblas <koblas at extra.com> wrote:
> I'll totally agree about the automatic encoding bit.  The design problem is
> that while I _never_ do any collation or sorting of strings in the database
> (eg. no ORDER BY employee_title), it however is handy to do from time to
> time when you're debugging and testing from the sql command line.

I don't understand. Is it that you can't run SQL queries from an
interactive SQL interpreter if the columns use unicode? I don't see
why that would be. Does MySQL not allow unicode literals at the
interactive prompt?

> [don't over generalize the examples, they're for example purposes]
>
> Thus, one is faced with this dilemma -- use VARBINARY for all of your columns
> which gives you RawStr and makes the python code simple:
>     find(..., employee_title == 'CEO')
> which is a bit more natural than saying employee_title is unicode thus:
>     find(..., employee_title == u'CEO')

Maybe you think it's annoying, but it's correct, and doing it with
automatic conversion is error-prone.
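
To make "error-prone" concrete, here is a minimal sketch in plain
Python, with nothing Storm-specific in it. The helper
implicit_ascii_decode is a hypothetical name that only mimics what an
automatic str-to-unicode promotion would do, assuming the implicit
codec is ASCII. The b'' and u'' prefixes keep the types explicit.

```python
# Text containing one non-ASCII character (e with acute accent).
title = u"Pr\u00e9sident"

# The same text becomes different bytes under different encodings,
# so a byte string alone does not say which text it represents.
utf8_bytes = title.encode("utf-8")
latin1_bytes = title.encode("latin-1")
bytes_differ = utf8_bytes != latin1_bytes


def implicit_ascii_decode(raw):
    """Hypothetical stand-in for an automatic promotion using ASCII."""
    return raw.decode("ascii")


# ASCII-only data round-trips, which is why the problem stays hidden
# during development.
ascii_ok = implicit_ascii_decode(b"CEO")

# The first non-ASCII value makes the same code path fail.
try:
    implicit_ascii_decode(utf8_bytes)
    failed = False
except UnicodeDecodeError:
    failed = True
```

The 'CEO' case works and the accented title does not, which is the
break-in-production scenario I mean.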

> Now python says:
>     >>> print u'CEO' == 'CEO'
>     True

> While the UnicodeVariable() requires type == unicode, which doesn't allow
> for the natural promotion of str to unicode.

Yes, and as we've said, it's intentional.
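
If the casting feels tedious, one option is to decode once, where the
data enters the application, so that everything passed to a query is
already unicode. A rough sketch follows; to_text is a hypothetical
helper, not part of Storm, and utf-8 is only an assumption about what
the incoming bytes are encoded as.

```python
def to_text(value, encoding):
    """Decode byte strings with an explicitly named encoding.

    Values that are already text are returned unchanged. There is
    deliberately no default encoding: the caller has to say what the
    transport uses.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes as they might arrive from a web form (assumed to be utf-8).
form_value = b"Pr\xc3\xa9sident"
title = to_text(form_value, "utf-8")

# Already-decoded text passes through untouched.
same = to_text(u"CEO", "utf-8")
```

After that, title is a unicode object and can be compared against a
unicode column without any implicit promotion.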


>
> --koblas
>
> Christopher Armstrong wrote:
>
> David Koblas <koblas at ...> writes:
>
>
> I'm finding it a bit tedious to constantly be casting from str(...) to
> unicode(...) to keep the database happy.  Is there really a best
> practice for how to have VARCHAR columns that should be treated as
> RawStr() for purposes of development?
>
>
> There is a best practice: It's to never rely on automatic encoding or
> decoding of str and unicode objects. On all I/O boundaries (web input
> / output, email input / output, file I/O, etc), you need to explicitly
> encode and decode with the appropriate encoding (utf-8, ascii, utf-16,
> etc) for the given transport.
>
> If you do end up using automatic encoding and decoding, then your app
> will most likely break some time when a user comes along and decides
> to enter characters that don't fit in ASCII.
>
>



-- 
Christopher Armstrong
International Man of Twistery
http://radix.twistedmatrix.com/
http://twistedmatrix.com/
http://canonical.com/


