notes/plan for hpss performance work
Robert Collins
robertc at robertcollins.net
Thu May 1 23:18:47 BST 2008
On Thu, 2008-05-01 at 12:29 -0500, John Arbash Meinel wrote:
>
> | * Pushing up pack data is actually pretty fast, because we write
> | generally only one file and the 1MB buffer keeps even a fat/long pipe
> | quite full; however creating the indexes takes one round trip each,
> | and writing the pack names requires taking a lock (see below). When
> | pushing a large new branch, performance is quite good to the network
> | maximum (imposed by the tcp maximum window size). When there's less
> | real data to send, the unnecessary roundtrips really hurt.
>
> It seems like with HPSS you could send multiple indexes in one trip, and
> certainly taking a lock should be only 1 round trip to take it, and 1 to
> unlock.
> And even then, you only have to do that when the pack has been pushed and
> you want to update the pack-names file.
>
> Certainly updating the pack-names file could be an RPC, though I think we
> would rather have the pack creation, etc all done on the remote as a
> response to a "push_data_stream()" call.
I want a single stream insertion because:
- it takes fewer round trips
- we won't have a race where we can leave a physical lock in place *due
to network errors*
- we can avoid buffering logic on the client
- knit repositories and other formats will benefit massively
- less total data transfer is needed, because the inherent redundancy
between index and data will only be pushed once; the index will then be
created remotely.
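
To make the round-trip argument concrete, here is a rough sketch in Python. It is a toy model, not bzrlib code: FakeRemote, put_file, insert_stream and the two push functions are hypothetical names, and the default of four indexes per pack is an assumption for illustration. It only counts round trips for a file-by-file push against a single stream insertion.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRemote:
    """Hypothetical stand-in for a remote repository; every call is one round trip."""
    round_trips: int = 0
    lock_held: bool = False
    files: dict = field(default_factory=dict)

    def put_file(self, name, data):
        self.round_trips += 1
        self.files[name] = data

    def lock(self):
        self.round_trips += 1
        self.lock_held = True

    def unlock(self):
        self.round_trips += 1
        self.lock_held = False

    def insert_stream(self, pack_name, records):
        # One round trip: the server receives the records, builds the index
        # itself, and updates pack-names without the client holding a lock.
        self.round_trips += 1
        index, offset = {}, 0
        for key, data in records:
            index[key] = (offset, len(data))
            offset += len(data)
        self.files[pack_name + ".pack"] = b"".join(d for _, d in records)
        self.files[pack_name + ".index"] = index
        self.files["pack-names"] = pack_name


def push_file_by_file(remote, pack_name, records, index_count=4):
    # index_count=4 is an assumed number of index files per pack.
    remote.put_file(pack_name + ".pack", b"".join(d for _, d in records))
    for i in range(index_count):
        remote.put_file(f"{pack_name}.index{i}", b"client-built index")
    remote.lock()    # a network failure after this point strands the lock
    remote.put_file("pack-names", pack_name)
    remote.unlock()


def push_single_stream(remote, pack_name, records):
    remote.insert_stream(pack_name, records)


records = [(("rev-1",), b"revision text"), (("file-1", "rev-1"), b"file text")]
old, new = FakeRemote(), FakeRemote()
push_file_by_file(old, "abc123", records)
push_single_stream(new, "abc123", records)
print(old.round_trips, new.round_trips)
```

In this model the file-by-file push costs one trip for the pack, one per index, and three around pack-names, while the stream push costs one trip however many indexes there are, and the client never holds the lock across the network.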
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.