[storm] Profiling project
Jeroen Vermeulen
jtv at canonical.com
Mon Jun 20 15:51:15 UTC 2011
Hi folks,
I hope some of you will have time to help me with this. I'm reviving a
project that has been on the back burner for a while: access-pattern
profiling. The goal is to provide better information to drive
optimization of Storm-based applications.
Before I start asking questions, which I will do in separate threads,
let me introduce my Storm profiling prototype. If you see anything
wrong here, I'd be grateful if you could bring it up on the list so I
can improve it!
The bzr branch is here:
https://code.launchpad.net/~jtv/storm/profile-fetches
And now, a "quick" (ahem) run-through of the design:
= Profiles =
When you look at a Storm-backed object in memory, it got there in one of
three ways:
1. It's just been created by the application.
2. It was returned by a database query (think Store.find).
3. The application followed a reference from another object.
Case 3 is the big problem with ORMs: it's far too easy to end up with
lots of inefficient small queries. It'd be nice to optimize those away.
So as an application developer you want to know: what reference? Where
did that other object come from, all the way back to an object that came
from either 1 or 2? How many objects were loaded into memory from the
database at each step along that trail? What objects should I
pre-fetch, and where? The profiler tries to answer exactly these questions.
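To make that concrete, here's a minimal sketch of the three cases in
ordinary Storm code. The Person/Company classes and the in-memory
sqlite store are just illustrative, not anything from the branch:

    from storm.locals import Int, Unicode, Reference, Store, create_database

    class Company(object):
        __storm_table__ = "company"
        id = Int(primary=True)
        name = Unicode()

    class Person(object):
        __storm_table__ = "person"
        id = Int(primary=True)
        name = Unicode()
        company_id = Int()
        company = Reference(company_id, Company.id)

    store = Store(create_database("sqlite:"))
    store.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT)")
    store.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,"
                  " company_id INTEGER)")

    # 1. Just created by the application.
    company = Company()
    company.name = u"Canonical"
    store.add(company)

    person = Person()
    person.name = u"Jeroen"
    person.company = company
    store.add(person)
    store.flush()

    # 2. Returned by a database query.
    people = store.find(Person, Person.name == u"Jeroen")

    # 3. Reached by following a reference from another object.  In a
    #    fresh store each such access can trigger its own small SELECT;
    #    that trail of little queries is what the profiler records.
    for p in people:
        print(p.company.name)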
The design as it stands only cares when objects first come into the
cache. When the application accesses objects that are already in cache
— hopefully most of the time! — the profiler doesn't count anything; it
just stays out of the way. Maybe we'll want information about cached
objects later, but I'd like to see how far we can get with minimal
profiling overhead first.
= Contexts =
Profiles are grouped per "context." The application gets to define
these. Contexts are named and nestable, much like the functions in a
call stack, so if you have a function that may be used in different ways
from different places in the application, you can have separate contexts
for those uses. Each "call stack" of contexts is profiled separately,
so you'll be able to see whether the different uses of your function
need separate optimization or not.
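The branch has its own interface for this, so purely to illustrate the
idea, here's a hypothetical sketch (none of these names are the actual
API) of named, nestable contexts where the current stack of names forms
the profiling key:

    from contextlib import contextmanager

    _context_stack = []

    @contextmanager
    def profiled_context(name):
        # Push a named context; the full stack of names is the profile key.
        _context_stack.append(name)
        try:
            yield tuple(_context_stack)
        finally:
            _context_stack.pop()

    # The same helper gets a separate profile per calling context:
    with profiled_context("request:person_listing"):
        with profiled_context("render_table"):
            pass  # profiled as ("request:person_listing", "render_table")

    with profiled_context("nightly_report"):
        with profiled_context("render_table"):
            pass  # profiled as ("nightly_report", "render_table")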
Each "trail" of accesses is tracked in the context where its _original_
object was first loaded or created. It doesn't matter where the later
accesses happen; they are all tracked in the original context. So the
profile for a context shows you not the "past" of objects in that
context, but their "future" — and how you can improve that future by
pre-fetching.
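For example, reusing the hypothetical pieces from the two sketches
above, attribution would work like this:

    with profiled_context("build_listing"):
        people = list(store.find(Person))  # trail starts here

    with profiled_context("render_page"):
        for p in people:
            p.company  # follow-on load is charged to "build_listing",
                       # not to "render_page", where it happens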
There can be multiple free-form queries in one context, and that may be
a little confusing. I'd be happy to discuss better ways to do this; I
figured that Python call stacks or the file/line location of a query
would be too fine-grained to be very useful. Plus, I'd like to support
saving and reloading of profiles across program upgrades, to get
meaningful long-term profiling data.
= Future =
The profiler is really just the first step in what I hope will be a
longer-term project. Here's what I'm hoping we can do over time:
* Access profiling.
* Automated optimization suggestions.
* Long-term data management (load/save, scrubbing).
* Query-time accounting.
* Automated dynamic, profile-driven optimization.
* "Long tail" of optimization tuning.
The ideal end product is something very close to how we use ORMs today,
but without the headaches of manual optimization: lost time, abstraction
leaks, unreadable code, painful rewrites when circumstances change. I
believe we can win back a good portion of the performance we gave up for
convenience, and keep the convenience. And I hope you're willing to
join me on this adventure!
Jeroen