[storm] Let's land support for Oracle!
James Henstridge
james at jamesh.id.au
Mon Nov 30 15:54:14 GMT 2009
On Mon, Nov 30, 2009 at 9:22 PM, Jason Baker <jbaker at zeomega.com> wrote:
> There's just *one* set of tests that isn't passing against trunk. I haven't
> had time to look in to it. James and I have discussed it in this thread:
> https://lists.ubuntu.com/archives/storm/2009-November/001198.html
> To answer James's question, I ran the test with something that looked like
> this:
> print "before select"
> rlist, wlist, xlist = select.select(readers, [], [], TIMEOUT)
> print "after select"
> After a few iterations, "before select" will print out but "after select"
> won't. There may be something I'm misunderstanding about Python's
> threading, but I believe that means that it's blocking on the select call.
> Of course, another possibility is that that's a coincidence and it's
> deadlocking somewhere else.
I had another thought about this: The body of the select() function
call will look something like this:
1. prepare arguments to select() system call
2. drop GIL
3. make the select() system call
4. acquire GIL
5. prepare result
I'd been assuming that it was blocking at (3), which didn't make sense
because of the use of the timeout. But it could also block at (4),
which would occur if the Oracle database adapter held on to the GIL
while talking to the database.
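
As a rough illustration of why blocking at (3) seemed unlikely, here is a
minimal sketch showing that select() with a timeout returns even when no
descriptor ever becomes readable. The timed_select() helper, the socket
pair, and the TIMEOUT value are assumptions made up for this example; they
are not taken from the Storm test suite.

```python
import select
import socket
import time

TIMEOUT = 0.2  # assumed value; the real test's TIMEOUT is not shown here


def timed_select(readers, timeout):
    """Call select() and report how long the call took (hypothetical helper)."""
    start = time.monotonic()
    rlist, _wlist, _xlist = select.select(readers, [], [], timeout)
    return rlist, time.monotonic() - start


# Nothing is ever written to `b`, so `a` never becomes readable.
a, b = socket.socketpair()
try:
    rlist, elapsed = timed_select([a], TIMEOUT)
    # select() comes back with no ready sockets once the timeout expires.
    print("ready:", rlist, "elapsed: %.2fs" % elapsed)
finally:
    a.close()
    b.close()
```

If the system call itself honours the timeout like this, a hang that lasts
much longer than TIMEOUT points at something outside step (3), such as
waiting to get the GIL back.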
This should be pretty easy to check for using gdb to see what the
interpreter is doing when you hit the deadlock. If that is the case,
the fix would be to either (a) make the Oracle adapter drop the GIL at
the appropriate points, or (b) adjust the test infrastructure so that
the TCP proxy runs in a subprocess instead of a thread (and hence gets
a different GIL).
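
Option (b) could look roughly like the sketch below. It is not Storm's
actual test infrastructure: PROXY_SOURCE and start_proxy() are hypothetical
names, and the proxy only forwards a single connection. The point is just
that the select() loop lives in a separate interpreter process, so a
database adapter holding the GIL in the test process cannot stall it.

```python
import subprocess
import sys

# Source of the child process (hypothetical): a single-connection TCP proxy.
PROXY_SOURCE = r'''
import select, socket, sys

target = (sys.argv[1], int(sys.argv[2]))
listener = socket.socket()
listener.bind(("127.0.0.1", 0))          # let the OS pick a free port
listener.listen(1)
print(listener.getsockname()[1], flush=True)  # report the port to the parent

client, _ = listener.accept()
upstream = socket.create_connection(target)
peers = {client: upstream, upstream: client}
while True:
    readable, _, _ = select.select(list(peers), [], [], 1.0)
    for sock in readable:
        data = sock.recv(4096)
        if not data:                     # either side closed: stop proxying
            sys.exit(0)
        peers[sock].sendall(data)
'''


def start_proxy(target_host, target_port):
    """Start the proxy in its own interpreter and return (process, port)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", PROXY_SOURCE, target_host, str(target_port)],
        stdout=subprocess.PIPE,
        text=True,
    )
    # The child prints the port it is listening on as its first line.
    port = int(proc.stdout.readline())
    return proc, port
```

A test would call start_proxy() with the database host and port, point the
connection at 127.0.0.1 and the returned port, and terminate the process
when it is done.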
I'd be happy to see the Oracle code merged with those tests disabled
until the problem gets resolved. Some manual testing would be
desirable in this case though.
James.