Feedback from evaluation in a corporate environment

Stephen J. Turnbull stephen at xemacs.org
Fri Jan 8 05:44:29 GMT 2010


Uri Moszkowicz writes:

 > I think you vastly underestimate the strength of the git architecture.

 > I considered that possibility and ran some benchmarks to determine if that
 > was the case. Sadly, it is not. At least, not with Bazaar,

What makes you think Bazaar's architecture is the same as git's?  In
fact, it's quite different.

 > which as I understand is comparable in performance to Mercurial and
 > GIT now.

Don't believe all the marketing fluff you read.  Bazaar is comparable
in performance to Mercurial or git *only* if you adapt your workflow
to use optimized Bazaar facilities, including smart servers, shared
repos, and lightweight checkouts.  In fact, it is easy to create
horribly performing Bazaar workflows.  It's also reasonably easy to
construct adapted workflows for the use cases I worry about.  Though I
can't make promises for yours, I would bet that performance could be
made acceptable with tuning of the workflow.

 > While I was writing the last email, I finally clocked in my last
 > branch at 47min and that's still optimistic for the real
 > deployment.

47 minutes for a repository (which is what a standalone branch is,
really) isn't too bad, IMO, because cloning repositories is something
that will happen rarely.  IIUC, in your use case you *never* want the
repository cloned if you can avoid it.  See below for some details of
a suggestion for how this might be done in your environment.

No promises from me that this suggestion will actually work well in
practice; I have no need for it (and no NetApps to try it on!), so it's
just hearsay.  But probably others here do have some experience and can
confirm.

 > >  This is a noop.  A commit that succeeds in one repository will succeed
 > > in *all* related repositories once it gets there ("related" meaning
 > > that the parent commit(s) is (are) available in all repositories).
 > 
 > Not if multiple people are committing at the same time.

Yes, if multiple people are committing at the same time, too.  You are
confounding "commit" with updating the public branch reference.  They
are different concepts in DVCS.  Commits are an implementation detail
you shouldn't worry about; the question is how to ensure certain
properties for the commit that the public branch refers to.  This
model, enabled by the DVCS, brings the tool *much* closer to the way
people actually work.
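
The distinction can be seen directly with a pair of throwaway git
repositories (a sketch only; it assumes git is on the path, and the
repository and user names are made up):

```shell
# Sketch of "commit" vs. "updating the public branch reference",
# using git and throwaway local repositories (names are invented).
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/central.git"               # holds the public branch
git clone -q "$tmp/central.git" "$tmp/alice" 2>/dev/null
cd "$tmp/alice"
git -c user.email=alice@example.com -c user.name=Alice \
    commit -q --allow-empty -m "local work"         # commit: purely local
before=$(git -C "$tmp/central.git" rev-list --count --all)  # public branch unmoved
git push -q origin HEAD 2>/dev/null                 # push: moves the public reference
after=$(git -C "$tmp/central.git" rev-list --count --all)
echo "central before push: $before, after push: $after"
```

The commit exists in Alice's repository the moment she records it; the
public branch only learns about it at push time.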

The management problem is to get the people to behave in a centralized
way when the software offers much more comfortable possibilities. :-)

 > The up-to-date check may pass on the first few repositories and
 > then fail at some point in the middle because someone else was
 > committing to them all at the same time but visited them in a
 > different order. That person will then run into the same problem
 > hitting the repositories that you've already hit.

This is a people-system management problem as well as a DVCS problem.

 > I'm not sure how a DVCS can keep up with our development process.

I suspect the real problem is the other way around.  The DVCS allows
people to commit without serialization, and your development process
will get snafu'd by that.  You just don't want a DVCS; you want a CVCS
with a distributed backend.

 > It seems to me that staffing changes would need to be made as well
 > as work on the software in the repository itself. Not an easy task for a
 > team as described in my last email and not clear that it would be worth the
 > investment. As I said earlier, I thought Bazaar's strength was that it could
 > support a variety of workflows so I'm somewhat surprised at the resistance
 > to supporting a previously unknown workflow, one that I suspect is fairly
 > common in corporate environments.

What resistance?  It doesn't support it, and that's a fact.  The
reason is pretty simple: DVCSes were created by people who *want*
DVCSes.  The Bazaar goal of supporting CVCS use cases came afterwards,
and implementation of support has followed demonstration of demand for
the use cases (and often contribution of prototypes by the users).

 > > Assuming by "proxy bound to master" you mean "bzr bind", I believe
 > > you're misunderstanding.  True, currently I don't think you can
 > > recursively bind to (or checkout from) a branch which is bound to yet
 > > another branch.  However, if you clone the proxy, AIUI a local commit
 > > in the clone does not update the proxy or the master, but a push to
 > > the proxy will update not only the proxy but also the bound master.
 > >
 > 
 > Ah maybe a push would work - I didn't try. But why would you want to
 > propagate a push and not a commit?

Because that's the way it happens to work at the moment.  In your
specified preferred workflow, you would indeed want the commits to
propagate without need for an explicit push.  However, AIUI there is a
restriction on cascades of pushes in current Bazaar.  I don't know how
hard it would be to remove that restriction.

Terminology note: in DVCS, there are two operations: *record* the
commit in "your" repository, and *push* a commit to the "other"
repository.  In a CVCS, there is no "your" repository, but we can
abuse terminology with the equation "commit = record + push".

Unfortunately, all of the DVCSes we have talked about have chosen to
name the record command "commit", and (IMO even more unfortunately)
Bazaar has chosen to overload the "commit" command with record + push
semantics in the case of a checkout.
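
As a toy sketch of the terminology (the function names here are
invented for illustration, not real bzr or git subcommands):

```shell
# Toy model: "record" stores the commit in your own repository,
# "push" sends it to the other one.  A CVCS-style "commit" is just
# the composition of the two.  All names are invented.
record() { echo "recorded in local repository"; }
push()   { echo "sent to other repository"; }
cvcs_commit() { record && push; }   # commit = record + push
cvcs_commit
```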

 > >  It had better be; that's what DVCS means.  Unless I don't understand
 > > what you mean by "blocking" here.
 > 
 > I meant that the only time that your prompt is blocked is the time that it
 > takes to transfer your local changes to your local server. The propagation
 > around the network of repositories would happen independently among the
 > machines involved in hosting those repositories.

In that case, none of the DVCSes would block.  However, elsewhere you
specify *blocking* semantics because of your synchronization
requirement.  You're going to have to pick one, blocking or
non-blocking; the law of the excluded middle applies here.  If you
pick blocking, then we can talk about how to minimize the blocked
period.  If you pick non-blocking, you'll have to adapt workflow.

 > >  Well, your whole set of requirements is incompatible with DVCS
 > > philosophy (but Bazaar intends to be more than "just DVCS", so that in
 > > itself is no problem).  However, given those requirements, update on
 > > reconnection seems like the obvious solution.  Coherency check on
 > > checkout/branch is insufficient, you'd really need a coherency check
 > > on every update, so I don't think that idea will work.
 > 
 > True, sort of. You could delay the coherency check until a push/commit since
 > it won't matter until then, though you might want to discover it earlier.

That's a software-centric view of things.  The developers will be
*very* pissed off if you impose such software on them, I suspect.

 > It's not expected to happen often so minimizing opportunities to pay the
 > cost is a good tradeoff.

Huh?  Synchronicity is going to fail at least briefly with every
commit.  In an active distributed project, that probably means that
people will fail synchronicity checks with annoying frequency.  Also,
from a people management perspective, you need to think of files
changed in a developer's workspace as "proto-committed".  One view of
what a DVCS does is allow the developer to offload certain
administrivia in dealing with "proto-commits" to the VCS by committing
locally, and merging up at her convenience (rather than having the
merge imposed on her at any arbitrary attempt to commit).

 > The trouble with update on reconnection is that there's no way to pass the
 > message to the repository that it needs to update as it may not
 > know that it was disconnected.

So what?  The agent that sent the message knows there was no ack, so
it keeps trying.  In the current Bazaar implementation what happens if
the branch's host is down is the commit fails, and the *user* has to
retry.  But it wouldn't be hard to have a "push queue" and
automatically resend.
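
A minimal sketch of such a resend loop (the retry_push helper and the
retry limit are hypothetical; a real queue would persist across
sessions and back off between attempts):

```shell
# Hypothetical "push queue" sketch: resend automatically instead of
# making the user retry by hand.  "$@" stands in for the real push
# command (e.g. bzr push URL); the name and limit are invented.
retry_push() {
  tries=0
  until "$@"; do
    tries=$((tries + 1))
    if [ "$tries" -ge 5 ]; then
      echo "giving up after $tries attempts" >&2
      return 1
    fi
    # a real implementation would sleep/back off here while the host is down
  done
  return 0
}
```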

Of course there's a significant probability of concurrent work while
the connection was lost, but this is a people-systems problem as well
as a DVCS problem anyway.  Cf. "proto-commits".

 > Storage space is only cheap on desktops - not on network appliances, which
 > are commonly found in the corporate environment.

But a network appliance appears to the workstation as a mounted
filesystem, does it not?  It seems to me that you just put the shared
repository (bzr terminology) or object storage (git) on the network
appliance and share it among the developers; the only need to replicate
it would arise if there are network contention or geographical latency
issues.  In git, object storage is append-only from the user's point of
view, so there is no consistency problem.  I'm pretty sure bzr works
the same way, conceptually, but I'm not familiar with the
implementation.

Check out the documentation for the --no-trees option to bzr
init-repo.  This makes branching in the shared repo quite lightweight,
and the developers' workspaces will then be created in personal
sandboxes with (presumably lightweight) checkouts.  Since the
sandboxes are actually on the network appliance (right?), this is
optimal in time, and very efficient (at worst! :-) in space.
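
Something like the following, perhaps (an untested command sketch --
the paths are invented, and I haven't run this against a real network
appliance, so check `bzr help init-repo` before relying on any of it):

```shell
# Untested sketch; paths are invented.
bzr init-repo --no-trees /netapp/shared          # shared repo: history only, no working trees
bzr branch bigproject /netapp/shared/trunk       # branches in the shared repo are cheap
bzr checkout --lightweight /netapp/shared/trunk ~/sandbox/trunk  # per-developer workspace
```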

You still have the workflow problems of "proto-commits" and "who owns the
public branch reference?", of course, but your storage and latency
requirements should be met.

 > And that's still ignoring the time that it takes to actually create
 > the repositories.

Ignore it; see next point re: fastimport.

 > Yes I think [some compression] is necessary. I could also use a
 > pipe but I was hoping to keep it around. I expect that the
 > conversion would still take a really long time though (maybe 1 week
 > at the rate it was going?).

So what?  Surely you've already spent a week (calendar time) working
on this.  Once you have *one* copy of a reasonably fresh repo, you're
done with this.  With an adapted workflow, all the rest will use bzr
protocol and be much faster (measurable in 10s of minutes at worst,
and probably a handful of seconds).  A project with 10GB repos and
1000 developers can surely afford the cost of one workstation with 1TB
of local storage, and a week of calendar time to do the conversion.
That is the extent of the cost of conversion itself (if it works -- so
it's a shame you didn't let it run for a week if necessary to find out).
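
For the record, the usual shape of such a conversion, keeping a
compressed dump around rather than a bare pipe (untested sketch; the
paths are invented, and `bzr fast-import` requires the fastimport
plugin to be installed):

```shell
# Untested sketch: convert once, keep the compressed dump for reruns.
cd /path/to/git-mirror
git fast-export --all | gzip -c > /big/disk/project.fi.gz   # keep the dump around
gunzip -c /big/disk/project.fi.gz | bzr fast-import - /big/disk/bzr-repo
```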

More information about the bazaar mailing list