Experience of centralized workflow with NFS-mounted storage?
Mikael Karlsson
mikael.k.karlsson at axis.com
Fri Nov 21 09:04:47 GMT 2008
John Arbash Meinel wrote:
>
> Mikael Karlsson wrote:
>> We're setting up a Bazaar environment with a centralized workflow and we
>> wonder if someone has experience of having a Bazaar server with a
>> mounted NFS-storage where the branch data is stored?
>>
>> Developers will check out data over SSH from the server, like this:
>>
>> Branch data <=== NFS ===> Bazaar server <=== SSH ===> Developer
>>
>> We could be talking about several hundred users checking out and
>> committing data and several gigabytes of branch data simultaneously.
>>
>> The actual question is how file locking will work when using NFS. I know
>> it doesn't work with CVS but it works with Subversion.
>>
>> I've tried to find information in the old mailing list archive but not
>> found anything that gives me answers.
>>
>> The obvious reasons to store everything on the SAN are availability
>> and security, and also being able to recover quickly if one
>> server fails, since there will be a backup server on standby in case of
>> failure.
>>
>> I would be pleased if the developers of Bazaar could give me a
>> recommendation on whether this is a good idea or not.
>> Also, if someone has experience of this, good or bad, I'd like to hear it.
>>
>>
>
> To describe how Bazaar would work with NFS and locking....
>
> 1) We only use OS locks for *working trees*. So if you have a repository
> + branches on NFS everything should work fine. What tends to break is
> when your home directory is an NFS directory, because we use an OS lock
> on one of the files there (.bzr/checkout/dirstate).
Home directories will be local.
>
> 2) For Branch and Repository, we use directory locking (create a
> directory, add a file with your lock code, rename the directory into
> place, check if your lock code is the one in that directory.) Generally
> this makes it safe for any filesystem.
>
> (Aside from old versions of Twisted's SFTP server which had a buggy
> implementation that caused
>
> rename(dir, existing)
> to be treated as
> rename(dir, existing/dir)
>
> Which matches what the "mv" command does, but not what the low level
> "rename()" os call does.)
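A minimal sketch, in Python, of the directory-locking steps described in
point 2 above. This is not Bazaar's actual code: the function names and
the "pending-", "held" and "info" names are assumptions chosen only for
illustration.

```python
import os
import tempfile
import uuid


def try_acquire(lock_root, holder="demo"):
    """Return a lock code if the lock was taken, or None if it is held."""
    os.makedirs(lock_root, exist_ok=True)
    nonce = uuid.uuid4().hex  # our private lock code

    # 1) Create a private directory and put the lock code in a file.
    pending = tempfile.mkdtemp(prefix="pending-", dir=lock_root)
    with open(os.path.join(pending, "info"), "w") as f:
        f.write(nonce + "\n" + holder + "\n")

    # 2) Rename the directory into place. rename() must fail when the
    #    target already exists and is not empty, i.e. the lock is held.
    held = os.path.join(lock_root, "held")
    try:
        os.rename(pending, held)
    except OSError:
        os.remove(os.path.join(pending, "info"))
        os.rmdir(pending)
        return None

    # 3) Check that the lock code now in place is really ours.
    with open(os.path.join(held, "info")) as f:
        if f.readline().strip() != nonce:
            return None
    return nonce


def release(lock_root, nonce):
    """Release the lock, but only if it carries our lock code."""
    held = os.path.join(lock_root, "held")
    with open(os.path.join(held, "info")) as f:
        if f.readline().strip() != nonce:
            raise RuntimeError("lock is held by someone else")
    # Rename away first, so the lock name is free before we clean up.
    releasing = os.path.join(lock_root, "releasing-" + nonce)
    os.rename(held, releasing)
    os.remove(os.path.join(releasing, "info"))
    os.rmdir(releasing)
```

The sketch depends on rename() refusing to replace an existing, non-empty
directory. A server that instead moves the directory inside the existing
one, as in the Twisted SFTP aside above, would let two clients both
believe they hold the lock.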
>
>
> 3) Things that are mounted into the local filesystem are generally
> accessed with different patterns than things that are accessed via
> bzr+ssh or sftp. For example, if you are accessing via bzr+ssh we buffer
> 64kB for all index reads, while if you access via file:/// we don't do
> any buffering.
>
> This may be better/worse for you. But local access is generally assumed
> to have very high bandwidth and very low latency. If that isn't true for
> your NFS mounts, then you might consider accessing via bzr+ssh:// which
> tries harder to hide latency.
>
> Also, bzr+ssh:// does have 2 processes running (the local and remote),
> which means you share the workload a little bit, but you also can cause
> us to (re-)serialize the data to send it over the wire.
bzr+ssh:// is the only acceptable option. We will also try bzr+https to
see what is best.
>
>
> 4) Why mount over NFS versus direct access to the Branch Data? Just
> because it is on a SAN which doesn't let you install anything? (This can
> be reason enough, I just want to make sure I understand what is going on.)
It's an enterprise SAN solution and we want to be able to take advantage
of its data replication and snapshots.
>
> As for a backup server, it is pretty easy to replicate a Bazaar
> repository with a cron script. It could be easier to provide 2 Bazaar
> servers rather than a SAN solution. (Except you probably already have
> invested the time, effort and money into the SAN solution. :)
We have, and we will still have two servers (one in standby). :) We don't
want to manually replicate data.
>
> Bazaar would probably handle multiple "master" repositories slightly
> better, though. (You still have problems if 2 people commit to the
> "same" branch on each repo, but the repository storage itself would be
> quite capable of figuring out what revisions need to be copied.)
>
> Perhaps to put it a different way. You could scale "up" in the number of
> Bazaar servers you have, as long as there was only one official master
> for each branch. Users could get read access from Branches from all
> servers, but Write access for branches 1-10 from master 1, branches
> 11-20 from master 2, etc.
>
> This may be more complexity than you want/need. Just mentioning it as a
> way to scale up.
Interesting :)
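The "one official master per branch" idea quoted above can be sketched
as a simple lookup. This is only an illustration: the function name and
the numbering of branches and masters are hypothetical, taken from the
"branches 1-10 from master 1" example, and nothing here is a Bazaar
feature.

```python
def master_for(branch_number, branches_per_master=10):
    """Return the number of the single master that accepts writes for a branch.

    Branches 1-10 map to master 1, branches 11-20 to master 2, and so on.
    Reads could still be served by any server.
    """
    if branch_number < 1:
        raise ValueError("branch numbers start at 1")
    return (branch_number - 1) // branches_per_master + 1
```

With the default of ten branches per master, branch 10 is written through
master 1 and branch 11 through master 2.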
>
>
> 5) I highly recommend using the latest repository format if you aren't
> concerned about backwards compatibility. (--1.9-rich-root if you can,
> --1.9 otherwise)
> I'll mention that we've been focusing a lot lately on how we scale to
> very large projects, so I would expect at least one more repository
> format update.
We will go for 1.9, unless there is a newer release by the time
everything is set up.
>
> As mentioned, you can do things like lightweight checkouts, stacked
> branches, local shared repositories, etc to help minimize the impact of
> very large histories. If you are doing strictly centralized development,
> then lightweight checkouts should work well for you. I'll admit that
> lightweight checkouts of network repositories haven't had as much
> optimization time as some of the other arrangements. Mostly because when
> you are as distributed as Open Source, you don't have much "local
> network" to rely on. :)
This will be forwarded to the developers since I'm not the expert on
Bazaar (that's why I'm asking questions), just one of those responsible
for the infrastructure.
> 6) Asking questions here, and giving us feedback about how things are
> working is a great way to help ensure that as Bazaar evolves we continue
> to fit your needs and make things better for you. We try to be
> responsive, especially if there are "hundreds of developers" being impacted.
We're glad to hear that and really appreciate all the information. We
just want to gather as much information as possible so we don't implement
something that we later have to change because it's not working. Also,
if Bazaar and NFS turn out to be no problem, the chances are good that
we will move other projects to Bazaar too. We really don't like having to
manually sync data between servers like we are doing today.
Regards
Mikael