[storm] Return subselect results

Shaun McCance shaunm at gnome.org
Mon Sep 20 15:11:02 BST 2010


Hi folks,

I'm trying to figure out how to return the result of a
subselect in a ResultSet. The Storm tutorial shows how
to use a subselect for an IS IN expression. I want to
return the result of a SELECT COUNT(*). Here's the SQL:

select Branch.ident,
  (select count(*) from Branch as Translation
    where Translation.type = "Translation"
    and Translation.parent_ident = Branch.ident)
  from Branch
  join SetModule on SetModule.pred_ident = Branch.parent_ident
  where SetModule.subj_ident = "/set/gnome-3-0-desktop"
  and Branch.type = "Document";

This SQL works fine, and returns the expected results.
Here's what I naively expected would work in Storm:

Translation = ClassAlias(Branch)
tables = store.using(Branch,
  Join(SetModule, SetModule.pred_ident == Branch.parent_ident))
tables.find((Branch.ident,
             Select(Count(Translation.ident),
               where=And(Translation.parent_ident == Branch.ident,
                         Translation.type == u'Translation'),
               tables=Translation)),
  SetModule.subj_ident == u'/set/gnome-3-0-desktop',
  Branch.type == u'Document')

But the resultant SQL looks like this:

SELECT Branch.ident, SELECT COUNT(`_1`.ident) FROM ...

It's very nearly correct, except the subselect needs
to have parentheses around it. I also tried this, at
the suggestion from somebody on IRC:

tables.find((Branch.ident,
             Count(Select(Translation.ident, ...)),
  ...)

The resultant SQL:

SELECT Branch.ident, COUNT((SELECT `_1`.ident FROM ...

This always returns nothing in Storm. If I run the SQL
Storm generates myself in MySQL, I get this:

ERROR 1242 (21000): Subquery returns more than 1 row

The first approach is what I think ought to work, and
it comes very close to working, in terms of the SQL it
generates. (I have no idea how much more effort it is
to figure out how to marshal that back into something
Pythonic in Storm.)

I found a similar problem on the mailing list in 2009.
I have something that works, thanks to James:

tables = store.using(Branch,
  Join(SetModule, SetModule.pred_ident == Branch.parent_ident),
  LeftJoin(Translation,
    Translation.parent_ident = Branch.ident,
    Translation.type == u'Translation'
  ))
tables.find((Branch.ident, Count(translation.ident)),
  SetModule.subj_ident == u'/set/gnome-3-0-desktop',
  Branch.type == u'Document'
).group_by(Branch.ident)

Same results, but a very different approach in terms
of the generated SQL. My tests against my database
show this approach to be about 60% slower than the
subselect approach.

So what are my options? Is the subselect approach
something that could work in Storm?

-- 
Shaun McCance
http://syllogist.net/




More information about the storm mailing list