[storm] some questions about Storm (from the perspective of Grok)

Stuart Bishop stuart at stuartbishop.net
Sun Mar 16 05:56:34 GMT 2008


Martijn Faassen wrote:

>>  Generated schemas are for toy applications:  real world applications
>>  need to have the schema designed.  Maybe someday the generators will
>>  have enough zen to do better than the humans (as with compiler-generated
>>  object code vs. hand-assembled), but that day is a *long* way out right now.
> 
> Are you saying that people who generate their schema from declarative
> information in their Python code aren't designing their schema? Are
> only people who write SQL "create table" statements capable of
> designing good schemas? Are all applications that use an object
> database like the ZODB toy applications, as the schema design is in
> Python code? Do you think that the generated schema SQL is so inferior
> that it will never be useful for any application whatsoever? After
> all, there is still a large class of useful applications that doesn't
> need to scale to vast amounts of users or massive quantities of data.
> Are all Ruby on Rails applications toy applications? Are traditional
> PHP applications, where people typically write their SQL schemas by
> hand, better? Do they typically have better schemas than RoR
> applications?

An application that doesn't need to scale to vast numbers of users or massive
quantities of data is a Noddy application by my definition. That doesn't
mean such applications are not useful (which is why I avoid the term "real
world" here). You don't care about the scalability issues with generated
schemas because scalability isn't an issue. You don't care about integrating
with existing data sources because there are none. You don't care about
maintainability or upgrades because the data model required by the
application is so simple it doesn't matter. Just don't try to retrofit
scalability later. In most cases people would be better off using a simpler
database rather than something with all the overheads of SQL.

If you extend the Python tools enough to express the richness of the data
model available in an enterprise-level RDB, you are going to end up with something
more complex, less standard, less well known and less well documented than
SQL. How do you express in the Python API that you need a partial index on
these three columns and that function result or your application will only
run in geological time? Or that index A needs to be created after data is
loaded into table B but before data is loaded into table C? Or triggers? Or
table partitioning? Constraints? Stored procedures? Replication sets?
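
To make the first of those questions concrete, here is a rough sketch of
such an index written directly in SQL and driven from Python's standard
sqlite3 module. The orders table, its columns and the index name are
invented for illustration, SQLite is only standing in for an enterprise
RDB, and the sketch assumes a SQLite build with partial and expression
index support (3.9 or later). The index is deliberately created after the
data is loaded, which is the kind of ordering decision asked about above.

```python
import sqlite3

# Hypothetical schema, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        region TEXT NOT NULL,
        status TEXT NOT NULL,
        email TEXT NOT NULL
    )
    """
)

# Load the data first; every tenth order is 'open', the rest are 'closed'.
conn.executemany(
    "INSERT INTO orders (customer_id, region, status, email) "
    "VALUES (?, ?, ?, ?)",
    [
        (i % 50, "eu", "open" if i % 10 == 0 else "closed",
         "User%d@Example.com" % i)
        for i in range(1000)
    ],
)

# Only now build the index: three columns plus a function result,
# restricted by a WHERE clause to the rows the application searches.
conn.execute(
    """
    CREATE INDEX orders_open_idx
        ON orders (customer_id, region, status, lower(email))
        WHERE status = 'open'
    """
)

query = (
    "SELECT id FROM orders "
    "WHERE status = 'open' AND customer_id = ? AND region = ?"
)
rows = conn.execute(query, (10, "eu")).fetchall()

# Show what the planner decided to do with the query.
for step in conn.execute("EXPLAIN QUERY PLAN " + query, (10, "eu")):
    print(step)

index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'"
    )
]
```

Expressing the same thing through a schema-generating Python API would
need some equivalent for the column list, the function call, the WHERE
clause and the point in the load sequence at which the index is built.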

> Do you think that SQL *queries* created by tools such as Storm are
> also so inferior to hand-written queries that they are only suitable
> for toy applications? If not, why is query generation already there
> while schema generation is not?

The SQL queries generated by some ORMs are vastly inferior. Storm and other
ORMs targeting non-Noddy applications are designed to allow formulation of
complex queries with enough control to have them perform efficiently. And if
the Python syntax turns out not to offer fine-grained enough control, or
access to that proprietary feature you need, then you can fall back to
using SQL fragments and still have the results retrieved into your object
model. Some ORMs only allow you to retrieve a set of results from one table
at a time, which is incredibly inefficient when you then need to retrieve
related rows in other tables one at a time. Being able to formulate a
complex and efficient query and retrieve all the results from arbitrary
tables involved in that query is a must.
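
The one-table-at-a-time problem can be sketched without any ORM at all.
The example below uses Python's standard sqlite3 module and two invented
tables (company and person); the run() helper and the counter are
hypothetical names used only to count round trips. It is meant as an
illustration of the access pattern, not of how Storm or any other ORM is
implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );
    INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO person VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
    """
)

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one statement, counting it as one round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# One table at a time: one query for the people, then one more query
# per person to fetch the related company row.
counter["queries"] = 0
one_at_a_time = []
for person_name, company_id in run(
    "SELECT name, company_id FROM person ORDER BY id"
):
    company_row = run(
        "SELECT name FROM company WHERE id = ?", (company_id,)
    )[0]
    one_at_a_time.append((person_name, company_row[0]))
n_plus_one_queries = counter["queries"]

# A single query that retrieves rows from both tables at once.
counter["queries"] = 0
joined = run(
    "SELECT person.name, company.name "
    "FROM person JOIN company ON company.id = person.company_id "
    "ORDER BY person.id"
)
join_queries = counter["queries"]

print(n_plus_one_queries, join_queries)
```

Both approaches return the same pairs, but the first issues one extra
query per person, so its cost grows with the size of the result set.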


-- 
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/
