Optimizing tree building

Aaron Bentley aaron.bentley at utoronto.ca
Fri Jun 8 02:13:18 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> As a data point, launchpad has a 3MB revisions knit index. But most
> merges will only need the last 64KB. I think as long as we do not start
> too small, we'll win overall because of the reduced data transmission.
> The breakeven point is clearly where you spend more time in round trips
> than save in reduced downloads. I think doing a 64KB read size would be
> fine for SFTP/FTP/HTTP. For the smart server the indexes should never be
> read directly anyhow - we should be doing something that sends points of
> the graph around rather than all the nodes; this allows avoiding large
> transfers of data in both directions for long graph-runs, while still
> being efficient in small cases. 

Okay.
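
To put rough numbers on that breakeven point, here is a minimal sketch. It
assumes a simple cost model (one latency charge per round trip plus bytes
divided by bandwidth); the function names, the 200 ms latency, and the
100 KB/s bandwidth are illustrative assumptions, not measurements. Only the
3MB index size and 64KB read size come from the figures quoted above.

```python
def transfer_time(total_bytes, round_trips, latency_s, bandwidth_bps):
    """Estimated seconds: one latency charge per round trip plus transfer time."""
    return round_trips * latency_s + total_bytes / bandwidth_bps


def breakeven_reads(index_bytes, read_bytes, latency_s, bandwidth_bps):
    """Number of partial reads at which they cost as much as one full download.

    Solves n * (latency + read/bw) == latency + index/bw for n.
    """
    full = transfer_time(index_bytes, 1, latency_s, bandwidth_bps)
    one_read = transfer_time(read_bytes, 1, latency_s, bandwidth_bps)
    return full / one_read


def partial_reads_win(index_bytes, read_bytes, reads, latency_s, bandwidth_bps):
    """True if `reads` partial reads are cheaper than fetching the whole index."""
    full = transfer_time(index_bytes, 1, latency_s, bandwidth_bps)
    partial = transfer_time(read_bytes * reads, reads, latency_s, bandwidth_bps)
    return partial < full


# Hypothetical link: 200 ms round trip, 100 KB/s (values are assumptions).
INDEX_BYTES = 3 * 1024 * 1024   # 3MB revisions knit index
READ_BYTES = 64 * 1024          # 64KB read size
LATENCY_S = 0.2
BANDWIDTH_BPS = 100_000

n = breakeven_reads(INDEX_BYTES, READ_BYTES, LATENCY_S, BANDWIDTH_BPS)
```

Under this model, a merge that is satisfied by the last one or two reads
wins easily, while reading the whole index in 64KB pieces loses to a single
full download because every piece pays its own round trip.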

>> I figure if we do get a container-based repository, we'll still need
>> indexes, no?  That makes me think it might be more productive to work on
>> a new way of handling inventories than to optimize the current one.
> 
> We'll still need indexes. What of, and how many, and what the keys are
> are still open questions. Changes to inventories - their content and
> representation - will also impact what indexing is needed, but I think
> the indexing layer should be implementable separately. One important
> thing is whether indexes contain graph data or not. Specifically, in
> knits the index and the graph are punned. This is good in some respects
> and bad in others; I hope we can address this during the design of
> indexing.

Oops.  I meant "indexes", not "inventories".

In a design like our current one, keeping the indexes small is vital,
so stripping out the graph data would be a win in that regard.  It
would still be nice to be able to quickly get a list of
build-dependencies, though, so perhaps the primary index could point at
a secondary index, which would give a list of build-dependencies (and
their locations), which could then be requested from the text storage
all at once.
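
A rough sketch of that two-level lookup, purely as illustration. None of
these names exist in bzrlib; `Location`, `PrimaryEntry`, `build_plan` and
`fetch_all` are hypothetical, and the text storage is modelled as a single
bytes object.

```python
from typing import Dict, List, NamedTuple, Tuple


class Location(NamedTuple):
    """Hypothetical byte range of one record in the text storage."""
    offset: int
    length: int


class PrimaryEntry(NamedTuple):
    """Primary index record: no graph data, just a location and a pointer."""
    location: Location
    deps_key: int  # key into the secondary index


# Secondary index: deps_key -> build-dependencies with their locations.
SecondaryIndex = Dict[int, List[Tuple[str, Location]]]


def build_plan(primary: Dict[str, PrimaryEntry],
               secondary: SecondaryIndex,
               revision_id: str) -> List[Tuple[str, Location]]:
    """List every record needed to build revision_id, dependencies first."""
    entry = primary[revision_id]
    plan = list(secondary[entry.deps_key])
    plan.append((revision_id, entry.location))
    return plan


def fetch_all(storage: bytes,
              plan: List[Tuple[str, Location]]) -> Dict[str, bytes]:
    """Stand-in for one batched request: read every planned range together."""
    return {rev: storage[loc.offset:loc.offset + loc.length]
            for rev, loc in plan}


# Tiny made-up repository: rev-2 is built on top of rev-1.
storage = b"fulltext-1delta-2"
primary = {
    "rev-1": PrimaryEntry(Location(0, 10), deps_key=0),
    "rev-2": PrimaryEntry(Location(10, 7), deps_key=1),
}
secondary = {0: [], 1: [("rev-1", Location(0, 10))]}

plan = build_plan(primary, secondary, "rev-2")
records = fetch_all(storage, plan)
```

The point of the split is that the primary index stays small, and the
secondary lookup already carries the dependencies' locations, so building a
text costs one extra index read and one batched request to the text storage.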

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGaK0t0F+nu1YWqI0RAoLrAJwK3Y+qbUA35d9pwUA7XysXmM0s8wCeN5fv
VYRul+L9idL5hWHSxXQUXTQ=
=Ts9e
-----END PGP SIGNATURE-----


