regression? branching lp:branch into empty local shared repo is very slow

Andrew Bennetts andrew.bennetts at canonical.com
Mon Jan 11 01:39:10 GMT 2010


Martin Pool wrote:
> 2010/1/11 Andrew Bennetts <andrew.bennetts at canonical.com>:
> 
> > But if no results are found, it can't expand the results (if the heads()
> > were cheap to determine, I'd probably return those in this case, but
> > they aren't).
> >
> > But what is supposed to happen is that _walk_to_common_revisions in
> > InterRepository should be requesting 50 revisions at a time, not just
> > one.  If that's not occurring in this case, I think that is a regression
> > (and one we should add a ratchet test for to stop it regressing again).
> 
> It seems like in this case, you would want the server to send a packet
> of revision ids, distributed through the history back to
> null_revision.  Then the client can say if it has any of them.  This
> ought to let you fairly quickly identify that it should send the whole
> thing.

I'm not sure the server's behaviour is the problem here as much as the
client's.

I'm guessing what's happening in this case is a series of requests like
this (although see below, I think this guess may be wrong):

  Request: get_parent_map(r9999)
  Response: (no results)
  Request: get_parent_map(r9998)
  Response: (no results)
  ...
  Request: get_parent_map(r1)
  Response: (no results)

(9999 round trips to find out there's no common history)

What's supposed to happen in this case:

  Request: get_parent_map(r9999..r9950)
  Response: (no results)
  Request: get_parent_map(r9949..r9900)
  Response: (no results)
  ...

(9999/50 ~= 200 round trips to find out there's no common history)

What would be ideal is that the client sends a request like:

  Request: get_parent_map(r9999,r8999,r7999,...r1)
  Response: (no results)

(1 round trip to find out there's no common history)

Although I worry a little that defaulting to loading the entire
left-hand history locally might not be great for performance in other
common cases.
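
To make the three round-trip counts above concrete, here is a minimal
sketch that simulates them. CountingRepo, walk_in_batches and
probe_sampled are hypothetical names invented for this illustration, not
bzrlib code; each get_parent_map call stands for one network round trip,
and the sampled probe assumes that a repository holding any revision also
holds that revision's whole left-hand ancestry, so asking about r1 is
enough to detect shared history.

```python
class CountingRepo:
    """Hypothetical stand-in for the repository being queried; counts requests."""

    def __init__(self, known=()):
        self.known = set(known)
        self.requests = 0

    def get_parent_map(self, keys):
        # One call models one network round trip.
        self.requests += 1
        return {key: () for key in keys if key in self.known}


def walk_in_batches(repo, history, batch_size):
    """Ask about history (newest first) batch_size keys per request."""
    for start in range(0, len(history), batch_size):
        found = repo.get_parent_map(history[start:start + batch_size])
        if found:
            return found
    return {}


def probe_sampled(repo, history, stride):
    """Ask about every stride-th revision, plus the oldest, in one request."""
    sample = history[::stride]
    if history and sample[-1] != history[-1]:
        sample.append(history[-1])
    return repo.get_parent_map(sample)


def count_round_trips():
    """Return the number of requests each strategy needs against an empty repo."""
    history = ["r%d" % n for n in range(9999, 0, -1)]
    strategies = [
        ("one at a time", lambda repo: walk_in_batches(repo, history, 1)),
        ("50 at a time", lambda repo: walk_in_batches(repo, history, 50)),
        ("sampled", lambda repo: probe_sampled(repo, history, 1000)),
    ]
    counts = {}
    for name, query in strategies:
        repo = CountingRepo()  # empty: nothing in common
        query(repo)
        counts[name] = repo.requests
    return counts


if __name__ == "__main__":
    print(count_round_trips())
```

The sampled probe needs the whole left-hand history in memory before it
can pick its keys, which is the cost mentioned above.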

Hmm, I guess what's happening is that _walk_to_common_revisions is
probably working as expected for a push from a local branch to a remote
repo, but when the source is remote it queries the remote one-by-one...
but that should mean the server is returning results and thus should be
expanding those results to 64kB.  So there's a mystery here.

> > Perhaps a new variation of -Dhpss is called for (just like -Dhpssvfs)
> > that shows a traceback whenever a get_parent_map request is sent with
> > just one key?  Perhaps “-Dhpssevil”?
> 
> We've been a bit hit-and-miss with options like that in the past.
> What is the context in which it would be turned on?

Well, the question I have at the moment is “what's the traceback leading
to this inefficient request?”.  It's not too hard for a developer to
just hack the code temporarily to start investigating this case... I was
wondering aloud if this question occurs often enough to justify making
it more convenient for devs (and users) to get this information.  I'm
not at all sure it is, and it sounds like you don't think so either.
Fair enough :)
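
For what it's worth, the temporary hack could look something like the
sketch below. This is only an illustration under stated assumptions:
TracingParentMapClient and EmptyRepo are hypothetical names, not bzrlib
classes, and no “-Dhpssevil” flag exists. The wrapper reports the call
stack whenever get_parent_map is asked about exactly one key.

```python
import traceback


class EmptyRepo:
    """Hypothetical stub that never has any of the requested keys."""

    def get_parent_map(self, keys):
        return {}


class TracingParentMapClient:
    """Hypothetical wrapper that reports single-key get_parent_map calls."""

    def __init__(self, inner, report=print):
        self._inner = inner
        self._report = report

    def get_parent_map(self, keys):
        keys = list(keys)
        if len(keys) == 1:
            # Capture the callers so the source of the request is visible.
            stack = "".join(traceback.format_stack(limit=8))
            self._report(
                "single-key get_parent_map(%r) requested from:\n%s"
                % (keys[0], stack)
            )
        return self._inner.get_parent_map(keys)


if __name__ == "__main__":
    client = TracingParentMapClient(EmptyRepo())
    client.get_parent_map(["r9999"])          # reports a traceback
    client.get_parent_map(["r9998", "r9997"])  # stays silent
```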

-Andrew.


