[storm] Issue with find(), group by, and Single aggregates error

Wed Nov 10 12:20:00 GMT 2010

Hi again Tom

As per your email below, here's a summary of why I *think* I need to use
a derived ResultSet. The #1 reason is to work around the count() and
group_by() issue, for which, as you say, there is no alternative for now.
I agree it would be nice to fix it in storm, but until then...

The #2 reason is that I am doing a query to populate a view. The query
does a projection and I need to assemble the resulting data from each
result set record into objects for use in the view. This way of doing
things is quite common in Hibernate and other ORM solutions I have used
in the past. That was with Java though, not Python, so things could be
different in a different language.

I am overriding the _load_objects() method to do the work. Some pseudo code:

from storm.store import ResultSet

class MyResultSet(ResultSet):

    def _load_objects(self, result, values):
        # Let storm build the projected values first, then wrap them.
        values = super(MyResultSet, self)._load_objects(result, values)
        return self._make_result_object(*values)

    def _make_result_object(self, *values):
        # business logic to construct result object from attributes
        foo = self._store.get(Foo, values[0])
        bar = self._store.get(Bar, values[1])
        return FooBar(foo, bar, values[2], values[3])

The advantage of the above approach is that the data required to satisfy
the use case can be efficiently cherry picked from the database and
assembled into the required object structure quite transparently to the
invoking business logic. This makes it much easier, using the same
boilerplate as used everywhere else, to wire in iterable result sets,
batch navigators and so on, where the objects are instantiated only as
needed and in a way that is transparent to the view and controller.
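
To illustrate the "instantiated only as needed" part without touching
storm internals, here is a minimal storm-independent sketch. It is not
what my code does today; AssembledRows and the FooBar field names are
hypothetical, and the rows list stands in for whatever a projection query
would yield.

```python
from collections import namedtuple

# Hypothetical view object; the field names are assumptions for illustration.
FooBar = namedtuple("FooBar", "foo bar count label")


class AssembledRows(object):
    """Wrap an iterable of projected rows and build objects lazily."""

    def __init__(self, rows, make_object):
        self._rows = rows
        self._make_object = make_object

    def __iter__(self):
        # Each object is only constructed when the caller asks for it.
        for row in self._rows:
            yield self._make_object(*row)


# Stand-in for the tuples a projection query would return.
rows = [(1, 2, 10, "a"), (3, 4, 20, "b")]
results = AssembledRows(rows, FooBar)
first = next(iter(results))
```

The view only ever sees an iterable of FooBar objects, so it does not
need to know whether a derived ResultSet or a wrapper like this produced
them.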

Does that make sense? Hopefully I have explained it ok. The above
approach is working well for the feature I am coding right now, the
issue with specifying the derived ResultSet notwithstanding. Of course,
I am happy to be told there's another way :-)

> Hum, I'd be interested to see that :).
> 

See above :-)

> 
>> Plus there's still
>> places in the Store implementation which call ResultSet() directly
>> so any _result_set_factory override will be ignored.
> 
> Hum, the only place I see is ResultSet._set_expr, which should probably 
> use self.__class__ instead.
>

Yes, that was the example I found.

>> For my case, I am
>> using the store to only perform queries to populate a view so it all works.
> 
> Store._result_set_factory is definitely not a public interface, so I 
> don't encourage you to do that. Of course, there is no better way for now.
> 
>> What would perhaps be better is to allow the user to specify a ResultSet
>> implementation to be used whenever find() is called rather than using an
>> instance variable on the store object. The user specified ResultSet
>> would be tied to the specific find operation, not the Store instance.
> 
> Passing more arguments to find would be tricky from a compatibility 
> point of view. Maybe we could create a context manager to do that, with 
> a public API to customize _result_set_factory.
> 

I was thinking along the same lines: a context manager would be worth a
try.
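
A rough sketch of what I have in mind, purely as a starting point. The
result_set_factory() helper is hypothetical (nothing like it exists in
storm today), it still pokes at the private _result_set_factory
attribute, and FakeStore is only a stand-in so the example runs without
a database.

```python
from contextlib import contextmanager


@contextmanager
def result_set_factory(store, factory):
    """Temporarily use `factory` for result sets created by `store`.

    Hypothetical helper: it swaps the private _result_set_factory
    attribute and restores the previous value on exit.
    """
    previous = store._result_set_factory
    store._result_set_factory = factory
    try:
        yield store
    finally:
        # Restore even if the body raised.
        store._result_set_factory = previous


class FakeStore(object):
    """Stand-in for a storm Store, only for demonstrating the helper."""

    _result_set_factory = list

    def find(self, *args):
        return self._result_set_factory(args)


store = FakeStore()
with result_set_factory(store, tuple):
    inside = store.find(1, 2)
outside = store.find(1, 2)
```

With something like this the custom ResultSet is tied to the block of
code doing the find(), not to the Store for its whole lifetime, and the
find() signature stays as it is.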