[MERGE, 1.4] Re: [1.4 REGRESSION] Fetch brutally slow for packs <=>knits.

Mon Apr 14 23:53:09 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
| On Mon, 2008-04-14 at 20:59 +0300, John Arbash Meinel wrote:
|> ~ calls
|> ~   for rev_id in rev_ids:
|> ~     KnitRepository._revision_store.has_revision_id(rev_id,
|> transaction)
|> ~ calls
|> ~
|> KnitRevisionStore.get_revision_file(transaction).has_version(revision_id)
|>
|> However, because transactions are no longer caching any objects, this
|> means that
|> we re-open the revisions.kndx file for *every* revision_id which is
|> not in common.
|>
|> So if you have to push 100 revisions, we open revisions.kndx 100
|> times. If you
|> are pushing 900, you open it and parse it 900 times. AKA Brutally
|> slow.
|>
|> I think this is a *very* serious regression as it makes
|> inter-repository work
|> pretty much unusable.
|>
|> So the "simple" fixes I can think of are:
|>
|> 1) have KnitRevisionStore.get_revision_file() cache the object itself
|> (since
|> ~   transactions won't anymore). This is bad because transaction was
|> the only
|> ~   think that actually figured out the lifetime of caching.
|>
|> 2) Reinstate at least some knit object caching. Perhaps only for
|> Inventory and
|> ~   Revision objects. Though I have the feeling we would then run into
|> the same
|> ~   problem about Knit text objects having then .kndx read 100 times
|> during the
|> ~   fetch. The only real advantage here is that they are less likely
|> to have as
|> ~   many revisions to fetch.
|>
|> 3) Put big "THIS IS BROKEN, UPGRADE TO THE SAME FORMAT REPOSITORIES"
|> warnings in
|> ~   our generic fetching code.
|>
|> Even if you took out the for loop, we still generally only know about
|> a small
|> handful of revision ids we are interested in at a time. (Because if I
|> commit 100
|> linear revisions, you only know about the direct parent, and its
|> parent, etc.)
|>
|> There are some other possibilities I guess:
|>
|> 4) Change the generic fetching code to break the api abstraction and
|> call
|> target._get_revision_vf() directly. And then use that object for
|> "target.has_revision_id()" instead of going through the regular apis.
|>
|> 5) Slightly better than (4), implement a Knit<=>Pack fetcher which
|> does the same
|> thing. But at least you know that KnitRepo and KnitPackRepo will have
|> the function.
|
| 6) the attached bundle uses the repository.get_graph() api rather than
| repository.has_missing; this uses the Graph interface that for knits has
| the revision vf as an instance variable.
|
| Attached.
|
| -Rob
|
|

BB:approve

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgD4FUACgkQJdeBCYSNAAMZRgCdFv46PtozifutNSTxGpnc/gtP
z4gAoMlI/pPGRSi6RqOFZUsmOFd/1+ow
=it/b
-----END PGP SIGNATURE-----