Experience of centralized workflow with NFS-mounted storage?

Thu Nov 20 12:47:24 GMT 2008

Mikael Karlsson wrote:
> We're setting up a Bazaar environment with a centralized workflow and we
> wonder if anyone has experience of running a Bazaar server with
> NFS-mounted storage where the branch data is stored?
>
> Clients will check out data from the server over SSH, like this:
>
> Branch data <=== NFS ===> Bazaar server <=== SSH ===> Developer
>
> We could be talking about several hundred users checking out and
> committing data simultaneously, and several gigabytes of branch data.
>
> The actual question is how file locking will work when using NFS. I know
> it doesn't work with CVS, but it does work with Subversion.
>
> I've tried to find information in the old mailing list archive but have
> not found anything that gives me answers.
>
> The obvious reasons to store everything on the SAN are availability
> and security, and also being able to recover quickly if one
> server fails, since there will be a backup server on standby in case of
> failure.
>
> I would be pleased if the developers of Bazaar could give me a
> recommendation on whether this is a good idea or not.
> Also, if anyone has experience of this, good or bad, I'd like to hear it.
>
>
> Regards
>
> Mikael
>
>   
Mikael,
I am a user who did set up what you propose and had some problems with it.

When we set up our repositories they were stored on an NFS disk just as 
you described.
This was with bzr 1.3 or thereabouts, if I remember right. Our machines
are all Ubuntu Gutsy.

I did not observe any problems with file locking. However, your mileage
may vary, as file locking in NFS is a tricky issue; not all
implementations support it robustly. I believe that the bzr team have
done the very best they can in this regard, but any problems will lie on
the NFS side, which is out of their control.
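
One rough way to check whether a particular NFS mount honours advisory
locks at all is sketched below. It is only a sketch under stated
assumptions: the function name probe_advisory_lock is made up for
illustration, it uses Python's standard fcntl module rather than anything
from bzr, and it does not exercise Bazaar's own locking code. It also
runs in a single process, so it cannot show contention between two NFS
clients; a failure is informative, a success is only a first hint.

```python
import fcntl
import os
import tempfile


def probe_advisory_lock(directory):
    """Try to take and release an exclusive advisory lock in `directory`.

    Returns True if a non-blocking fcntl lock on a scratch file succeeds,
    False if the filesystem refuses it. The name is hypothetical and this
    is not part of bzr.
    """
    # Scratch file created inside the directory under test (e.g. the NFS mount).
    fd, path = tempfile.mkstemp(dir=directory, prefix="lockprobe-")
    try:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        # Always clean up the scratch file, whatever the outcome.
        os.close(fd)
        os.unlink(path)
```

Calling probe_advisory_lock() with the mount point of the NFS export, from
each machine that will touch the repositories, gives a quick yes/no
before any real data is put there.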

We experienced severe performance problems (e.g. bzr branch runs taking
several hours), and many times these ended in bzr failing with a
timeout. The problems were significantly relieved by moving the
repositories to a local disk on the Bazaar server, and also by asking
users, where possible, to put their branches on disks local to their own
desktop machines. You can see my mails about benchmarks in the Bazaar
mailing list archive.
We are therefore no longer using NFS to serve the repositories, and
performance is now acceptable: not blindingly fast, but good enough
not to annoy anyone.

However, things have moved on since I did the above, and I have not
changed our setup in the meantime:

    * The bzr team have done some great work and some serious
      performance problems with bzr have been resolved. It is very
      likely that these changes would have helped our problems.
    * I suspect that there is something wrong in the setup of our NFS
      file server/network, but our IT staff have not been able to
      identify it yet. If you have several hundred developers already
      accessing the SAN without problems, then your network must already
      be well tuned.

Some hints of things to watch out for:

    * If your developers are using home directories also mounted by NFS
      from the SAN then you have 2 streams of traffic involving the SAN
      when a developer does a branch:

      Branch data/SAN <== NFS ==> Bazaar server <== SSH ==> Developer <== NFS ==> SAN

    * Lightweight checkouts and stacked branches require much less
      network traffic than a full branch.
    * Access via file:// will nearly always be faster than bzr+ssh://.
    * If many developers are doing things concurrently with bzr+ssh://
      to the Bazaar server then this server will be very heavily loaded
      and you may be better off going straight to the SAN. Putting the
      Bazaar server in the loop adds 2 network hops and very little
      benefit that I can see. The repositories can still be protected by
      file access rights.
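
The hints above can be turned into a small timing comparison. The sketch
below is illustrative only: the helper names time_command and
candidate_commands, and the destination suffixes, are assumptions made up
for this example. It relies on the bzr commands "branch", "branch
--stacked" and "checkout --lightweight", and expects bzr to be on the
PATH. The source URL and destination prefix come from the command line,
so the same script can be pointed at a file:// path and then at a
bzr+ssh:// URL for the same branch.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv, discarding its output; return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def candidate_commands(source_url, dest_prefix):
    """Build the three bzr invocations to compare (names are illustrative)."""
    return {
        "full branch": ["bzr", "branch", source_url, dest_prefix + "-full"],
        "stacked branch": [
            "bzr", "branch", "--stacked", source_url, dest_prefix + "-stacked",
        ],
        "lightweight checkout": [
            "bzr", "checkout", "--lightweight", source_url,
            dest_prefix + "-light",
        ],
    }


if __name__ == "__main__":
    # Usage: python compare.py SOURCE_URL DEST_PREFIX
    source_url, dest_prefix = sys.argv[1], sys.argv[2]
    for label, argv in candidate_commands(source_url, dest_prefix).items():
        elapsed, code = time_command(argv)
        print("%-22s %8.1f s  (exit code %d)" % (label, elapsed, code))
```

Running it once with the destination on a local disk and once with the
destination in an NFS-mounted home directory should also show how much
the second stream of SAN traffic costs.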

David.