[storm] why weakrefdict for cache?

Tue Sep 4 19:08:28 BST 2007

> so if I had:
> 
> class Blub(object):
>     def __init__(self, x, y):
>         self.x = x
>         self.y = y
> 
> and mapped it as a pickle type to Foober.blub:
> 
> f = Foober()
> f.blub = Blub(7, 8)
> 
> where's the event hook placed ?  when i say f.blub.y = 12 for example.

The Variable (Storm's wrapper) responsible for f.blub would hook
itself into two events: "flush" and "object-deleted".  When f dies,
or when the store is flushed, the variable is called back to check
for changes.  At that point the new blub.y is detected, the obj_info
for the object (even one that has just died) gets added to the dirty
set, and things go on normally.
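
To make that a bit more concrete, here's a rough sketch of the idea.
This is not Storm's actual code: EventSystem, PickleVariable, the
dirty set and owner_info are all hypothetical names made up for the
example, and comparing pickled snapshots is just one assumed way of
detecting that a mutable value changed.

```python
import pickle


class EventSystem:
    """Hypothetical, minimal event registry (not Storm's real class)."""

    def __init__(self):
        self._hooks = {}

    def hook(self, name, callback):
        # Remember the callback under the given event name.
        self._hooks.setdefault(name, []).append(callback)

    def emit(self, name):
        # Call every callback registered for this event.
        for callback in list(self._hooks.get(name, ())):
            callback()


class PickleVariable:
    """Hypothetical wrapper for a mutable value stored as a pickle."""

    def __init__(self, events, dirty, owner_info):
        self._events = events
        self._dirty = dirty            # set of obj_infos needing a flush
        self._owner_info = owner_info  # stands in for the owner's obj_info
        self._value = None
        self._snapshot = None
        self._hooked = False

    def set(self, value):
        self._value = value
        # Keep the pickled state seen at assignment time.
        self._snapshot = pickle.dumps(value)
        if not self._hooked:
            # Mutations like f.blub.y = 12 bypass the variable, so it
            # asks to be called back at the two points that matter.
            self._events.hook("flush", self._check_changed)
            self._events.hook("object-deleted", self._check_changed)
            self._hooked = True

    def get(self):
        return self._value

    def _check_changed(self):
        # Re-pickle and compare with the last known state.
        current = pickle.dumps(self._value)
        if current != self._snapshot:
            self._snapshot = current
            self._dirty.add(self._owner_info)
```

So nothing happens at the moment of f.blub.y = 12 itself.  The change
is only noticed the next time "flush" or "object-deleted" is emitted,
which is when the owner ends up in the dirty set.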

> one would think.  but no.  Psycopg2 buffers the whole result into  
> memory the moment you say cursor.execute().  fetchone()/fetchmany()/ 
> fetchall() all read from the buffer.

Really!?  Wow.. that's sad. :-(

> There *is* a way to override this, and that is to use a named
> cursor.   In that case psycopg2 maintains the cursor on the server
> side and each fetchXXX() call retrieves results over the wire.  I
> don't know why there isn't a simple flag "server_side=True", but on
> the psycopg2 list they seemed to hold the opinion that you shouldn't
> be selecting a result set larger than that which you can hold in
Well, considering that they put the whole result set in memory,
that's certainly true.

> memory, so they don't seem to view "server side cursors" as something
> anyone should really need (to them it's just a side effect of using
> named cursors...hence not really documented or anything).

Unfortunately, it's a bad limitation for anyone working with
very large data sets.  I understand their basic idea: a good
query restricts the result to a reasonable subset when a subset
is what's wanted, and otherwise the operation can often be moved
to the server.  Even then, not being able to iterate over the data
using a conservative amount of memory is certainly a restriction.
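
For reference, the named-cursor approach described above might look
roughly like this.  It's only a sketch: iter_rows, batch_size and the
"big_result" cursor name are made up for the example, and it assumes
conn is a psycopg2 connection, where passing name= to cursor() asks
for a named (server-side) cursor.

```python
def iter_rows(conn, query, params=None, batch_size=1000, name="big_result"):
    """Yield the rows of a SELECT in batches rather than all at once.

    Assumes `conn` is a psycopg2 connection.  With a named cursor,
    each fetchmany() call pulls the next batch from the server, so
    only about `batch_size` rows are held in memory at a time.
    """
    cursor = conn.cursor(name=name)
    try:
        # As noted in the quoted text, named cursors only work
        # for SELECT statements.
        cursor.execute(query, params)
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            for row in rows:
                yield row
    finally:
        # Always release the cursor, even if the caller stops early.
        cursor.close()
```

A caller would then just loop with "for row in iter_rows(conn, query)"
and never hold more than one batch of rows at once.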

> Anyway, somewhere in SA's 0.3 series we did an internal
> reorganization such that our postgres dialect *can* optionally use
> named cursors so that you get "server side" capability, however we
> had to jump through many hoops to make it work since Psycopg2's
> implementation makes it difficult (they will fail if used for
> anything other than a SELECT, and cursor.description is not available
> until the first row is fetched).

Nice.. I may get back to you about this at some point, if you're
fine with it.

-- 
Gustavo Niemeyer
http://niemeyer.net