loggerhead takes a while to fetch revisions

John Arbash Meinel john at arbash-meinel.com
Fri Dec 15 16:17:53 GMT 2006


Robey Pointer wrote:
> (See previous post about what loggerhead is.)
> 
> One thing I ran into when writing a web front-end was that the process
> of fetching and processing a revision from a bazaar repository can take
> a while.  I don't think I'm doing that much post-processing on a
> revision (the relevant code is history.py:get_change) so I suspect
> something is going on within bzrlib.
> 
> I added a caching system that saves the pickled data structure that I
> create after fetching a revision and its delta (specifically: the files
> that are involved), and it helps IMMENSELY.  With a full cache, almost
> every operation can complete in <= 2 seconds, even annotating a file
> like bzr.dev's NEWS file.  Without it, some operations take 2 or more
> *minutes*.
> 
> Maybe this is normal behavior, but I thought it was worth
> investigating.  I did an lsprof of the get_cache() call and posted it here:
> 
>     http://www.lag.net/~robey/code/get_change.html


Looking at the performance graph, I'm pretty sure the bulk of time is
spent here:
http://www.lag.net/~robey/code/get_change.html#_..._bzrlib_knit.py_+1062(__init__)

That seems to be 654ms out of 937ms. And of those ~650ms, I'm guessing
the bulk is spent reading the inventory.knit header. Considering:

http://www.lag.net/~robey/code/get_change.html#_..._bzrlib_repository.py_+460(get_inventory_weave)
get_inventory_weave() is taking 395ms.

So there are a couple of things...

1) If you can maintain a read lock in between actions, then I believe
the inventory knit stays cached in memory and doesn't have to be parsed
again. So while a lock won't help much for a single action, across
actions it should help a lot.
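
As a rough sketch of what that could look like in loggerhead (assuming
the bzrlib API of this era; the path and the "last ten revisions" loop
are placeholders, not loggerhead's actual code):

from bzrlib.branch import Branch

b = Branch.open('/path/to/branch')  # placeholder path
b.lock_read()
try:
    # While the read lock is held, the parsed inventory knit should
    # stay cached in memory, so repeated lookups don't re-read and
    # re-parse the index for every action.
    for revid in b.revision_history()[-10:]:
        rev = b.repository.get_revision(revid)
        # ... build the page data from rev ...
finally:
    b.unlock()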

2) We need to work on making _KnitIndex.__init__() faster anyway.
Dmitry has proposed a couple of optimizations, which I have approved,
but nobody else has looked at them:
http://bundlebuggy.aaronbentley.com/request/%3C45700C04.6040706@hlabs.spb.ru%3E

In deference to Aaron, I want to ask that we avoid spurious whitespace
changes (changing whitespace just for cleanliness' sake), since they
can make future merges conflict more.

3) We really want to switch to keeping revision_ids as plain UTF-8
strings in memory, rather than encoding/decoding them every time they
are read or written. That means adding another encode() and decode()
step in the XML serializers, because cElementTree wants to work with
Unicode objects. So in the short term it won't help a whole lot, but it
will help while parsing the knit index files. And in the long run we
will probably switch away from cElementTree, and then it will help a
lot there too.
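
To illustrate the boundary I mean (a toy sketch with Python 2 string
semantics, not the actual serializer code; the id value is made up):

# Revision ids would stay as UTF-8 byte strings everywhere in memory.
revision_id = 'jdoe at example.com-20061215161753-0123456789abcdef'

# Decode once, only at the cElementTree boundary (it wants Unicode):
unicode_id = revision_id.decode('utf-8')
# ... serialize the XML using unicode_id ...

# And encode once when reading XML back into in-memory structures:
assert unicode_id.encode('utf-8') == revision_id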

4) We might also be able to greatly revise how the index files are
parsed. Right now we do a readlines() call and then iterate through
the lines. It might actually be a lot faster to do something like:

text = fp.read()
recs = [line.split() for line in text.split('\n')
        if line.endswith(':')]
for rec in recs:
  parents = []
  for value in rec[4:-1]:
    if value.startswith('.'):
      # A '.'-prefixed field is a literal revision id.
      parents.append(value[1:]) # No decode step, right?
    else:
      # Otherwise it is an integer index into self._history, which
      # has to have been filled in by the earlier records already.
      parents.append(self._history[int(value)])


Also, I honestly wonder if we are better off inlining the functions. I
know --lsprof makes it look bad, but in my experience it overestimates
the function call overhead.
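
One way to sanity-check that outside of lsprof is a plain wall-clock
micro-benchmark (a hypothetical sketch; decode_parent and the sample
data are invented for illustration, they are not bzrlib code):

import timeit

setup = '''
def decode_parent(value, history):
    if value.startswith('.'):
        return value[1:]
    return history[int(value)]

values = ['.rev-%d' % i for i in range(100)]
history = []
'''

call_stmt = '''
parents = []
for v in values:
    parents.append(decode_parent(v, history))
'''

inline_stmt = '''
parents = []
for v in values:
    if v.startswith('.'):
        parents.append(v[1:])
    else:
        parents.append(history[int(v)])
'''

# Wall-clock numbers, free of lsprof's per-call instrumentation cost.
print timeit.Timer(call_stmt, setup).timeit(number=10000)
print timeit.Timer(inline_stmt, setup).timeit(number=10000)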

It might also be possible to switch it around and split the work
across a couple of generators. So you would have:

from itertools import izip

def parse_parents(recs):
  # Assumed to live inside _KnitIndex, so 'self' is in scope. Parent
  # lists are yielded lazily, one record at a time, as izip pulls them
  # in the loop below.
  for rec in recs:
    parents = []
    for value in rec[4:-1]:
      if value.startswith('.'):
        parents.append(value[1:]) # No decode step, right?
      else:
        parents.append(self._history[int(value)])
    yield parents

recs = [line.split() for line in fp.read().split('\n')
        if line.endswith(':')]
for rec, parents in izip(recs, parse_parents(recs)):
  version_id = rec[0]
  if version_id not in self._cache:
    index = len(self._history)
    # Remember it, so later integer parent references can resolve.
    self._history.append(version_id)
  else:
    index = self._cache[version_id][5]
  self._cache[version_id] = (version_id,
                             rec[1].split(','), # options
                             int(rec[2]),       # pos
                             int(rec[3]),       # size
                             parents,
                             index,
                            )

I don't really know if it is faster, but if the penalty is function
call overhead, a generator avoids some of that overhead. Though I
would actually guess it is slower, because of the overhead of the
double loop...

Anyway, it would be nice to split _KnitIndex.__init__() into pieces
that can be profiled separately, to figure out where the time is
actually being spent.
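
Something with this shape, say (purely illustrative; the class name and
the simplified parsing are made up, not the real _KnitIndex code):

class _KnitIndexSketch(object):
    """Hypothetical split of _KnitIndex.__init__ into phases."""

    def __init__(self, fp):
        self._history = []
        self._cache = {}
        text = self._read_index(fp)       # I/O time shows up here
        recs = self._split_records(text)  # line splitting shows up here
        self._cache_records(recs)         # parent/cache work shows here

    def _read_index(self, fp):
        return fp.read()

    def _split_records(self, text):
        return [line.split() for line in text.split('\n')
                if line.endswith(':')]

    def _cache_records(self, recs):
        for rec in recs:
            version_id = rec[0]
            if version_id not in self._cache:
                index = len(self._history)
                self._history.append(version_id)
            else:
                index = self._cache[version_id][5]
            parents = []
            for value in rec[4:-1]:
                if value.startswith('.'):
                    parents.append(value[1:])
                else:
                    parents.append(self._history[int(value)])
            self._cache[version_id] = (version_id, rec[1].split(','),
                                       int(rec[2]), int(rec[3]),
                                       parents, index)

Then a single --lsprof run would show how the time divides between the
I/O, the line splitting, and the cache building.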

It may not even be the encode/decode steps, since those seem to
account for only 1000 calls, and about 8ms of the total time.

John
=:->
