[storm] Should storm have separate String and Blob datatypes?

Jason Baker jbaker at zeomega.com
Mon Jun 29 23:17:11 BST 2009


As of right now, it looks like storm provides a single datatype for
both VARCHAR2 and binary data.  This makes sense for databases like
MySQL and PostgreSQL, where the two can essentially be treated as the
same datatype.  For instance, this works under MySQL:

    mysql> CREATE TABLE t (c BINARY(3));
    Query OK, 0 rows affected (0.01 sec)

    mysql> INSERT INTO t SET c = 'z';
    Query OK, 1 row affected (0.01 sec)

However, Oracle expects values inserted into a RAW column to be hex
strings:

    SQL> CREATE TABLE t (c RAW(3));

    Table created.

    SQL> INSERT INTO t VALUES ('z')
      2  ;
    INSERT INTO t VALUES ('z')
                      *
    ERROR at line 1:
    ORA-01465: invalid hex number

Instead, we have to convert it to hex:

    SQL> INSERT INTO t VALUES (rawtohex('z'));

    1 row created.
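
For illustration, the same conversion done on the client side might
look like the Python sketch below.  It is not storm code; the helper
name to_oracle_hex is made up for this example, and it assumes the
value is already available as a byte string.

```python
import binascii


def to_oracle_hex(data: bytes) -> str:
    """Return the hex text for a byte string (hypothetical helper).

    The result is what a RAW column accepts as a character literal,
    e.g. the single byte 'z' (0x7A) becomes '7A'.
    """
    # hexlify returns bytes such as b'7a'; decode to text and uppercase it.
    return binascii.hexlify(data).decode("ascii").upper()


print(to_oracle_hex(b"z"))
```

The catch is that a helper like this must only ever be applied to
binary values; running it on ordinary character data would store the
hex digits instead of the text.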

This makes it problematic for the back end to accept one datatype for
both ASCII character strings and binary data, because there is no
reliable way to know whether a given value should be hex-encoded or
not.  Of course, the ideal solution is to use Unicode for all strings,
but that is a massive undertaking for an established codebase: there
are a lot of "devil's in the details" issues, along with potential
performance problems.

Would there be enough of a benefit for everybody to have separate
classes for binary and ASCII data, or is this something we should just
use internally?
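
To make the question concrete, here is a rough sketch of what such a
split could look like.  The class names AsciiValue and BinaryValue and
the method to_oracle_param are hypothetical, invented for this example;
they are not part of storm's API, and a real implementation would have
to fit into storm's existing variable machinery.

```python
import binascii


class AsciiValue:
    """Hypothetical wrapper for ASCII character data."""

    def __init__(self, text: str):
        if not isinstance(text, str):
            raise TypeError("AsciiValue expects str")
        # Raises UnicodeEncodeError if the text is not pure ASCII.
        text.encode("ascii")
        self.text = text

    def to_oracle_param(self) -> str:
        # Character data is passed through unchanged.
        return self.text


class BinaryValue:
    """Hypothetical wrapper for binary data."""

    def __init__(self, data: bytes):
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("BinaryValue expects bytes")
        self.data = bytes(data)

    def to_oracle_param(self) -> str:
        # Binary data is hex-encoded, since RAW columns expect hex text.
        return binascii.hexlify(self.data).decode("ascii")


# The same input 'z' is handled differently depending on its declared type.
print(AsciiValue("z").to_oracle_param())
print(BinaryValue(b"z").to_oracle_param())
```

With the type carried by the wrapper, the back end no longer has to
guess whether to hex-encode: the decision is made once, where the
column is declared.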
