Some unscientific timing results (on the Python source tree)

Paul Moore p.f.moore at gmail.com
Sat Mar 29 13:01:41 GMT 2008


On 27/03/2008, John Arbash Meinel <john at arbash-meinel.com> wrote:
> Paul Moore wrote:
>  |
>  | bzr - 262Mb
>  | hg - 140Mb
>  | Subversion checkout - 333Mb
>
> Something seems fishy here. Specifically, if the SVN checkout is 333Mb, that
>  sounds like the size of your working tree is 333/2 = 160MB. (SVN creates 2
>  copies of every file so it has a pristine copy to 'diff' against.)

That looks like a typo. I redid the tests and got 133MB for Subversion
(the other figures are about the same).
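
(In case it matters, by "size" I just mean the total on-disk bytes of
each tree, metadata and all. A quick script along these lines gives
that kind of number - the bzr path is the one from the transcript
further down, while the hg and svn paths are only examples of where
the local copies might live:)

import os

def tree_size(root):
    # sum the size of every file under root, VCS metadata included
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total

for root in (r"C:\Data\size_tests\bzr-python",
             r"C:\Data\size_tests\hg-python",
             r"C:\Data\size_tests\svn-python"):
    print("%-40s %7.1f MB" % (root, tree_size(root) / (1024.0 * 1024.0)))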

>  I'm also a little surprised that we are that much larger than hg, since usually
>  our on-disk tests show packs as taking up less space. (hg has 1 or 2 files per
>  versioned file, which usually causes a lot of 'wasted' space because of block
>  sizes.)
>
>  I'm wondering if you don't have a lot of stuff in .bzr/repository/obsolete_packs
>  which will be cleaned up over time. (When we generate new packs we leave the old
>  ones around a bit to make sure that you can recover even if the OS decides to
>  process deletes before writes and crashes in the middle.)

It doesn't seem so:

>bzr info
Standalone tree (format: rich-root-pack)
Location:
  branch root: .

Related branches:
  parent branch: C:/BZR/python/trunk

12:53 C:\Data\size_tests\bzr-python
>dir .bzr\repository\obsolete_packs

 Volume in drive C is Windows        Serial number is b087:e240
 Directory of  C:\Data\size_tests\bzr-python\.bzr\repository\obsolete_packs\*

29/03/2008  12:04         <DIR>    .
29/03/2008  12:04         <DIR>    ..
              0 bytes in 0 files and 2 dirs
 77,801,676,800 bytes free

I'm using rich-root-pack because the source came from bzr-svn, but
otherwise there are no issues.

Also, I did a bzr pack, and that made no difference.

>  I'm also curious about your bandwidth/latency to both machines.

I'm not sure which machines you're talking about here. For the size
tests and the branching I was using local mirrors on my PC, so no
network protocols were involved at all. For the pulls keeping my
mirrors in sync I was using HTTP, but I'm not sure how best to measure
bandwidth and latency. I'm on an ADSL broadband link, and ping gives:

>ping code.python.org

Pinging dinsdale.python.org [82.94.237.218] with 32 bytes of data:

Reply from 82.94.237.218: bytes=32 time=38ms TTL=56
Reply from 82.94.237.218: bytes=32 time=36ms TTL=56
Reply from 82.94.237.218: bytes=32 time=39ms TTL=56
Reply from 82.94.237.218: bytes=32 time=36ms TTL=56

Ping statistics for 82.94.237.218:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 36ms, Maximum = 39ms, Average = 37ms

>ping dev.pitrou.net

Pinging dev.pitrou.net [88.191.33.32] with 32 bytes of data:

Reply from 88.191.33.32: bytes=32 time=42ms TTL=54
Reply from 88.191.33.32: bytes=32 time=42ms TTL=54
Reply from 88.191.33.32: bytes=32 time=42ms TTL=54
Reply from 88.191.33.32: bytes=32 time=42ms TTL=54

Ping statistics for 88.191.33.32:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 42ms, Maximum = 42ms, Average = 42ms

Does that help?
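
(If a bandwidth figure would be useful as well, the best I can suggest
is a rough-and-ready timing of an HTTP fetch, something like the
sketch below - the URL is just a placeholder for any reasonably large
file, not a real benchmark location:)

import time

try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

url = "http://example.org/some-large-file"   # placeholder only
start = time.time()
data = urlopen(url).read()
elapsed = time.time() - start
print("%d bytes in %.1fs -> %.0f KB/s"
      % (len(data), elapsed, len(data) / 1024.0 / elapsed))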

>  I would also be interested to know the load on both machines. Hg's cgi script
>  does (effectively) a zcat | bzip2 to stream out the data. Which is nice for
>  bandwidth, but puts a rather heavy load on the server. I'm not sure what happens
>  when you have 10 people branching at the same time. (Which *might* be an issue
>  for a project like python.org, probably isn't for others.)

I have no idea what the load is on the remote machines - I don't have
access to them. My PC was not idle (I was simulating "normal use" and
reading Gmail while the long commands were running), but nothing heavy
was running apart from the DVCS commands.
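
(If the local load matters for a future run, I could leave a small
sampler like this running alongside the commands - note that it
assumes the third-party psutil module is available, and the 5-second
interval is arbitrary:)

import time
import psutil   # third-party module - an assumption, not something installed here today

try:
    while True:
        cpu = psutil.cpu_percent(interval=5)   # average CPU use over the last 5 seconds
        mem = psutil.virtual_memory().percent  # current memory use
        print("%s  cpu %5.1f%%  mem %5.1f%%"
              % (time.strftime("%H:%M:%S"), cpu, mem))
except KeyboardInterrupt:
    pass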

>  I'm currently on vacation in an area with very limited internet connectivity. If
>  I ever get a chance, I'd like to figure out why these numbers are so different.

Enjoy your vacation, but I hope this is of use when you get back!

Paul.


