Another look at bzr network traffic

Andrew Cowie andrew at operationaldynamics.com
Tue Apr 6 04:28:46 BST 2010


On Sun, 2010-04-04 at 12:06 -0500, John Arbash Meinel wrote:
http://article.gmane.org/gmane.comp.version-control.bazaar-ng.general/67118
> 

So I was about to reply to that message, but we've got this thread here,
so here we are.

On Thu, 2010-03-18 at 08:51 -0500, John Arbash Meinel wrote: 
> I was sort of staying out of this as a 'known bug' of the 2a format, but
> since it wasn't clear in Andrew's head, I'll explain.
> ...

> What that means for the current smart protocol, is that all fetches will
> copy at least 1 fulltext, regardless of the delta size. So a 1 byte
> change to a 1MB file will transmit 1MB of data, though after
> autopack/pack it will probably get stored as a small handful of bytes
> again. (While it is in its own pack, it is still stored as 1MB on disk.)
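
To put rough numbers on the quoted claim, here is a toy sketch in Python.
It is not Bazaar's groupcompress format or anything from bzrlib; the
`make_delta` and `apply_delta` helpers are hypothetical and only strip the
common prefix and suffix. The point is how small a delta for a 1 byte
change to a 1MB file can be compared with shipping the fulltext.

```python
def make_delta(old: bytes, new: bytes):
    """Return (prefix_len, suffix_len, middle): a toy delta, not Bazaar's format."""
    limit = min(len(old), len(new))
    # Count the bytes shared at the start of both versions.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    # Count the bytes shared at the end, without overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old: bytes, delta):
    """Rebuild the new version from the old version plus the toy delta."""
    prefix, suffix, middle = delta
    return old[:prefix] + middle + old[len(old) - suffix:]


old = bytes(1024 * 1024)                    # a 1MB file of zero bytes
new = old[:500000] + b"x" + old[500001:]    # same file, one byte changed
delta = make_delta(old, new)

print("fulltext bytes:", len(new))
print("delta payload bytes:", len(delta[2]))
```

With a delta the receiver needs only the changed byte plus two offsets,
provided it already holds the old text; sending the fulltext costs the
whole megabyte regardless.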

Once upon a time when people were optimizing normal operations over
http:// it was mentioned that Bazaar is very good about using HTTP Range
requests to only grab the actual bytes that it cares about. All good.
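
For anyone who hasn't looked at one, a Range request is just an ordinary
GET with one extra header. A minimal sketch in Python follows; the URL and
the `ranged_request` helper are made up for illustration, and this is not
how bzrlib builds its requests. The request is only constructed here, not
sent.

```python
from urllib.request import Request


def ranged_request(url: str, offset: int, length: int) -> Request:
    """Build a GET that asks for `length` bytes starting at `offset`."""
    last = offset + length - 1          # HTTP byte ranges are inclusive
    return Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Hypothetical pack file URL; ask for 512 bytes starting at offset 4096.
req = ranged_request("http://example.com/repo/packs/example.pack", 4096, 512)
print(req.get_header("Range"))          # bytes=4096-4607
```

A server that honours the header answers 206 Partial Content with just
those bytes rather than the whole file.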

But this makes it sound as though, over the bzr{,+ssh}:// protocol,
requesting a small revision may end up shipping an entire pack file no
matter what. If true, then the obvious concern is "I just got repacked
into a single pack that's now 100 MB big, goodie! here it comes!"

Meanwhile, to the preceding message,

On Wed, 2010-03-17 at 12:24 +1100, Andrew Bennetts wrote:
> I'm not sure that this is worth worrying about, e.g. on my usual
> internet connection 200kB takes less than a second to receive.  Is
> this behaviour a problem for you?

That sounds ... cavalier. I mean, sure, sometimes we have nice fat
network connections all to ourselves, but just as often not (when I read
this email last week I was on a client site with 200 developers on less
bandwidth than you have at home). We all get that network bytes = time,
and given that we're trying to make Bazaar fast, unnecessary network
traffic seems ... unnecessary.

I would have thought this was all just a case of me not understanding
what's really going on except for the comment:

> Git specifically addresses this with 'skinny packs' sent over the
> wire.

So, um, maybe we're missing a bet too?

For what it's worth, I see "excessive" network traffic (as subjectively
judged by "long" wall clock times for `bzr pull`s against remote
repositories) quite frequently.

AfC
Sydney
