Some unscientific timing results (on the Python source tree)

Talden talden at gmail.com
Sat Mar 29 22:36:06 GMT 2008


On Sun, Mar 30, 2008 at 2:56 AM, Toshio Kuratomi <a.badger at gmail.com> wrote:
> Talden wrote:
>  >>  I would also be interested to know the load on both machines. Hg's cgi script
>  >>  does (effectively) a zcat | bzip2 to stream out the data. Which is nice for
>  >>  bandwidth, but puts a rather heavy load on the server. I'm not sure what happens
>  >>  when you have 10 people branching at the same time. (Which *might* be an issue
>  >>  for a project like python.org, probably isn't for others.)
>  >>
>  >>  I know we've considered doing that for bzr+http:// but we haven't decided if it
>  >>  is worth the server load yet. (We could probably make it configurable, but there
>  >>  is always the 'you should work the best you can "out of the box"' problem.)
>  >
>  > It seems that the most common case across the potential user-base of
>  > Bazaar is not 10+ people branching at once.
>  >
>  > For systems with potentially large numbers of concurrent branching
>  > operations, having a 'turn off stream compression' switch seems a
>  > reasonable tuning _option_ rather than a default.
>  >
>  For estimating server load this might be somewhat less rare than you
>  imagine at first.
>
>  In Fedora we have fedorahosted.org which hosts multiple projects on two
>  servers.  Simultaneous branching doesn't have to be of a single project
>  to increase CPU load; it can be of multiple projects on the same server.
>   I'd imagine that many organizations are going to have a centralized
>  server for all their repositories even though bzr can support a more
>  decentralized usage.
>
>  -Toshio

I don't doubt for a moment that compressed streaming needs to be
something that can be disabled - there are clearly cases where no
hardware of a practical cost will handle the computational load.  Of
course, many 'big-corporate' dedicated servers these days have cores
and RAM coming out of their ears and far, far fewer concurrent users.
Bandwidth over WANs, however, is pretty much always in short supply...

We have three offices in different time-zones (US, NZ and India) with
the team numbers spread across them fairly evenly.  Presently a
central CVS server is accessed over a VPN from each non-US office and
is very slow - latency and bandwidth torture us in equal measure.

Though I'm sure that Bazaar will be better than CVS even without
mirroring the official mainlines, I expect we will have a mirror in
each of the two non-central offices, updated automatically, that the
local devs pull from; they would still push to the central server.  Of
course in our case we're only talking 7-10 devs in each office
(fewer than 50 pulls a day).

NB: Devs in different offices often work together, using the rather
painful CVS branching or pair programming with Radmin. I would like to
see the CVS branching for features/fixes go away and instead devs
would simply 'bzr serve' their own branch to collaborate.  We use
video-conferencing and VOIP-conferencing extensively for design,
planning and reviews, working as though we're a single office - yet
the poor WAN performance of CVS makes two offices into second-class
(some would say third or fourth-class) citizens, something we're
anxious to remedy.

I still don't believe that the majority of uses to which Bazaar is put
would produce enough simultaneous activity to swamp servers, yet slow
streaming affects nearly everyone.  Making compressed streaming the
default and providing a tuning switch for the rest seems a better
choice if we're trying to work the best we can out of the box for most
users.
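
For anyone who wants to put rough numbers on that trade-off, here is a
minimal sketch using only the Python standard library.  It is not how
bzr or hg stream data; the payload, the 128 KB/s link speed and the
helper names (measure, transfer_seconds) are all assumptions made up
for illustration.  It simply compares the CPU time spent compressing
against the time the resulting bytes would spend on the wire.

```python
import bz2
import time
import zlib


def measure(compress, payload):
    """Return (output_size_in_bytes, cpu_seconds) for one compress call."""
    start = time.process_time()
    out = compress(payload)
    return len(out), time.process_time() - start


def transfer_seconds(num_bytes, link_bytes_per_sec):
    """Time needed to send num_bytes over a link of the given speed."""
    return num_bytes / link_bytes_per_sec


if __name__ == "__main__":
    # Hypothetical payload: repetitive text standing in for repository data.
    payload = b"def example(arg):\n    return arg + 1\n" * 50000
    link = 128 * 1024  # assumed WAN throughput in bytes per second

    # "none" passes the payload through unchanged as the baseline.
    for name, fn in (("none", bytes),
                     ("zlib", zlib.compress),
                     ("bz2", bz2.compress)):
        size, cpu = measure(fn, payload)
        print("%-4s %9d bytes  cpu %.3fs  wire %.1fs"
              % (name, size, cpu, transfer_seconds(size, link)))
```

Multiplying the cpu column by the number of simultaneous branch
operations you expect gives a crude feel for when a server would want
the switch turned off; real repository data will compress far less
neatly than this toy payload.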

--
Talden


