Work flow on large repositories

Scott Aubrey scottaubrey at capuk.org
Wed Jul 28 10:26:13 BST 2010


Of note, this kind of workflow is what the bzr-colo[1] plugin was created for. It adds commands for creating local feature branches, and a colo: prefix that makes sure you always select the local branch rather than the remote branch in a bound-branch situation. It also puts the shared repository inside the .bzr folder, so it is hidden but still sits in the same folder as your lightweight checkout. Using your example, the layout would look more like:

> .../                                <---- lightweight-checkout of ./.bzr/branches/f1 (not a normal checkout, but one specifically set up using bzr reconfigure --lightweight-checkout)
> .../.bzr/branches                   <---- shared repository, no-trees by default
> .../.bzr/branches/f1                <---- branch with no working tree of bzr+ssh://remotehost/path/to/branches/f1
> .../.bzr/branches/f2                <---- branch with no working tree of bzr+ssh://remotehost/path/to/branches/f2
> 
> .../.bzr/branches/origin/trunk      <---- checkout of svn mainline

You could then create a branch with (when your checkout is attached to trunk):

> bzr colo-branch f1

and switch between them with:

> bzr switch colo:f2
> bzr switch colo:origin/trunk

and updating the trunk (or any other origin branch) is as simple as:

> bzr colo-pull
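
A minimal sketch of how those three steps might be chained when starting a new feature. The function name colo_start_feature is hypothetical, and the order (switch to trunk, update it, then branch) is my assumption about a sensible sequence, not something the plugin prescribes:

```shell
# Hypothetical helper built only from the commands quoted above.
# Run it from inside the lightweight checkout.
colo_start_feature() {
    name="$1"
    if [ -z "$name" ]; then
        echo "usage: colo_start_feature NAME" >&2
        return 2
    fi
    # Attach the checkout to trunk, bring trunk up to date,
    # then create the new local feature branch from it.
    bzr switch colo:origin/trunk &&
    bzr colo-pull &&
    bzr colo-branch "$name"
}
```

Usage would be something like "colo_start_feature f3". Each step only runs if the previous one succeeded.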

We use that successfully with the same kind of setup, except all our branches are native bzr.

The colo plugin has basic workspace sync support too, though we don't use that.

- Scott

[1] http://doc.bazaar.canonical.com/plugins/en/colo-plugin.html

On 28 Jul 2010, at 08:26, Philip Peitsch wrote:

> Certainly :)
> 
> As local computers are not backed up, we wanted branch data to be actually saved in a remote location as well.  At the same time though, I needed to keep switching commands small and convenient.  What I discovered bzr can do is actually have a local cache of a remote branch... in its own folder.
> 
> Our current working structure looks like
> 
> .../                  <---- shared repository, no-trees by default
> .../features/f1  <---- branch with no working tree of bzr+ssh://remotehost/path/to/branches/f1
> .../features/f2  <---- branch with no working tree of bzr+ssh://remotehost/path/to/branches/f2
> .../working <---- lightweight-checkout of ../features/f1 (not a normal checkout, but one specifically set-up using bzr reconfigure --lightweight-checkout)
> .../svntrunk <--- checkout of svn mainline
> 
> Running bzr info in working shows:
> Lightweight checkout (format: 2a)
> Location:
>        light checkout root: .
>   repository checkout root: .../features/f1
>         checkout of branch: bzr+ssh://remotehost/path/to/branches/f1
>          shared repository: .../
> 
> Related branches:
>     push branch: .../features/f1
>   parent branch: .../svntrunk
> 
> Running bzr info in .../features/f1 shows:
> Repository bound branch (format: 2a)
> Location:
>   shared repository: .../
>   repository branch: .
> 
> Related branches:
>     push branch: bzr+ssh://remotehost/path/to/branches/f1
>   parent branch: .../svntrunk
> 
> Under this arrangement, commits made to the lightweight checkout are immediately pushed to the remote location.  The local folder allows easy switching and merging, as the filesystem auto-complete can assist.  So rather than needing to remember the full remote url, I can simply do "bzr switch ../features/f1". (I haven't yet gotten around to writing a custom plugin to remove the need for the remote url... one day maybe.)
> 
> The setup for a feature is essentially done using:
> bzr branch svntrunk features/f1
> bzr push --create-prefix -d features/f1 bzr+ssh://remotehost/path/to/branches/f1
> cd features/f1
> bzr bind bzr+ssh://remotehost/path/to/branches/f1
> 
> bzr switch working features/f1
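
Those steps could be wrapped in a small script. A rough sketch, assuming the layout above (svntrunk, features/ and working side by side in the shared repository). The names create_feature, run_in, REMOTE_BASE and DRY_RUN are made up for illustration, and the final switch is done from inside working using the relative "../features/NAME" form:

```shell
# Hypothetical defaults; adjust REMOTE_BASE to the real server path.
REMOTE_BASE="${REMOTE_BASE:-bzr+ssh://remotehost/path/to/branches}"

# Run a command inside a directory, or only print it when DRY_RUN=1.
run_in() {
    dir="$1"; shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Create features/NAME from svntrunk, mirror it remotely, bind it,
# and point the working checkout at it.
create_feature() {
    name="$1"
    if [ -z "$name" ]; then
        echo "usage: create_feature NAME" >&2
        return 2
    fi
    remote="$REMOTE_BASE/$name"
    run_in . bzr branch svntrunk "features/$name" &&
    run_in "features/$name" bzr push --create-prefix "$remote" &&
    run_in "features/$name" bzr bind "$remote" &&
    run_in working bzr switch "../features/$name"
}
```

Running it from the repository root with DRY_RUN=1 set prints the four commands without touching anything, which is a cheap way to check the paths before using it for real.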
> 
> Occasionally we have issues with devs switching before they commit their changes; fortunately bzr is pretty sensible about handling that, so no data has been lost yet.  The other thing that has happened once or twice is a dev will type "bzr switch f1" and find themselves bound to the remote url rather than the local repo... this can create confusion when they try to merge back into svntrunk doing "bzr merge ../features/f1", only for it to report no local changes.  A quick bzr pull [remote uri] in the offending feature directory rectifies this issue though.
> 
> We've been playing with this system for 3 months and find it works acceptably.  There seem to be a few issues with ghost revisions between dev machines that I haven't figured out how to avoid yet though.  I suspect our usage is a little off the beaten path and is potentially exposing some idiosyncrasies in how bzr manages pushing and pulling revs.  Still, it is much better than pure svn!
> 
> Cheers,
> 
> Philip
> 
> On Wed, Jul 28, 2010 at 5:06 PM, Chris Hecker <checker at d6.com> wrote:
> 
> If it is relevant for you, I also have more information about how we
> mirror our branches onto a shared network repository and other such
> magics that go to making the setup work well for us.
> 
> I'd be very interested to hear more details.
> 
> Chris
> 
> 
> On 2010/07/27 23:54, Philip Peitsch wrote:
> I am currently happily using bzr with a 1.2Gb branch through judicious
> use of shared repositories and switching.  Basically, I have one
> heavy-weight checkout of the 1.2Gb branch in a shared repository, where
> branches are created with no working trees by default.  Then I switch
> the heavy-weight checkout between "features", before committing to the
> mainline.  The obvious downside is that you can only work on X number of
> features at the same time (where X is the number of heavy-weight
> checkouts you have)... though switching this checkout between features
> is a very rapid operation (~10s on mine roughly).
> 
> Basic set up is:
> # Setup the shared bzr repo
> cd /some/repo/
> bzr init-repo --no-trees --default .
> 
> # Make a clean branch of mainline
> bzr branch /mainline/bzr/branch mainline
> 
> # Make a feature to play with
> bzr branch mainline some-new-feature
> 
> # Make a checkout to play with
> bzr co some-new-feature working_1
> *hack and play....*
> *bzr commit etc.*
> 
> # Make a new feature to play with
> bzr branch mainline some-other-feature
> 
> # Switch the heavy checkout to the new feature. DON'T FORGET TO COMMIT
> # FIRST (though bzr won't lose your changes... it's polite that way)
> cd working_1
> bzr switch ..\some-other-feature
> # Commits will now go to some-other-feature
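
Since forgetting to commit before switching is the main hazard here, a wrapper could refuse to switch while the tree has pending changes. A minimal sketch, assuming it is run from inside the checkout. The name safe_switch is hypothetical, and because bzr status also lists unknown files, this check is stricter than it strictly needs to be:

```shell
# Hypothetical wrapper: switch the checkout in the current directory
# only if "bzr status" reports nothing.
safe_switch() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: safe_switch BRANCH_LOCATION" >&2
        return 2
    fi
    # Capture the short status listing; empty output means a clean tree.
    pending="$(bzr status --short)" || return 1
    if [ -n "$pending" ]; then
        echo "pending changes; commit before switching:" >&2
        echo "$pending" >&2
        return 1
    fi
    bzr switch "$target"
}
```

For example, "safe_switch ../some-other-feature" from inside working_1 would switch only when nothing is outstanding.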
> 
> 
> To make this convenient on a day-to-day basis, I've wrapped up common
> operations (like creating the branch and then binding the heavy checkout
> to the new feature branch) up in script files.  So to start a new
> feature, I do "create-feature some-other-feature", and the scripts know
> my directory layout to take care of the branching etc.
> 
> Anywho... not sure if that is at all relevant to your use case, but it
> is working quite well here for ~8devs who have only just migrated from
> SVN... no data has been accidentally lost yet either which is a bonus :).
> If it is relevant for you, I also have more information about how we
> mirror our branches onto a shared network repository and other such
> magics that go to making the setup work well for us.
> 
> Philip
> 
> On Wed, Jul 28, 2010 at 3:29 PM, Chris Hecker <checker at d6.com> wrote:
> 
> 
>    Ah, sorry, I thought the whole .bzr directory was only 20mb and was
>    confused!
> 
>    Very interested to hear responses!
> 
>    Chris
> 
> 
> 
>    On 2010/07/27 21:01, Michael Hope wrote:
> 
>        'bzr revno 4.4' shows 93541 revisions.  The mirror .bzr directory is
>        549 MB.  The 4.4 branch directory is 20 MB.  The exported size is 559
>        MB over 63000 files.  It's reasonably big.
> 
>        -- Michael
> 
>        On Wed, Jul 28, 2010 at 3:57 PM, Chris Hecker <checker at d6.com> wrote:
> 
> 
> 
>            It seems like "large repository" means amount of history?  20mb
>            is not very big at all for the .bzr, I think.  How many revisions
>            are there?
> 
>            My repo is 1k revisions, but 55mb, and operations are
>            reasonably fast
>            (creating a new working copy is a little pokey, 1300 files,
>            but not too bad,
>            maybe 10 seconds).
> 
>            I'm very interested in this topic.  I have binary files in
>            my repo (maya
>            models and animations, photoshop psds, and audio files), and
>            I'm worried
>            about it blowing up in my face at some point, but so far
>            it's been fine and
>            I'm loving bzr relative to svn.
> 
>            Chris
> 
> 
>            On 2010/07/27 19:40, Michael Hope wrote:
> 
> 
>                Hi there.  I'm working on the gcc-linaro branch which is
>                stored under
>                bzr and hosted on Launchpad.  This is a fairly big
>                branch as it was
>                imported from upstream SVN and contains a large amount
>                of history.
> 
>                Most of the work is day-to-day changes on topic
>                branches.  I also want
>                to run a buildbot style program that continually updates
>                and builds
>                the latest.
> 
>                My issue is that the various operations are taking too
>                long.  Could
>                anyone suggest tricks or a different work flow to speed
>                things up?
> 
>                Some of the operations include:
> 
>                Creating a mirror branch by doing init-repo, branch
>                lp:gcc-linaro/4.4.  The finding revisions stage takes about
>                10 minutes at 1kB/s.  The download stage is much faster.
> 
>                Day-to-day work is done on topic branches.  Creating the
>                branch takes 46 s, 250 MB of RAM, and creates a 20 MB .bzr
>                directory.  Pushing this branch to LP for merging involves
>                pushing the full 20 MB, but this is acceptable.
> 
>                Doing a bzr pull on the 4.4 mirror directory may take more
>                than half an hour and more than 500 MB of memory.
> 
>                Doing a bzr checkout takes over 20 minutes and 800 MB of
>                memory on my
>                fastest machine.  On my netbook and ARM board this
>                causes significant
>                swapping.  I've yet to complete a checkout on either.
> 
>                I'd also like to share the mirror with other local
>                machines to skip
>                downloading the same 500 MB many times.  Running bzr
>                serve and then
>                checking out causes 100 % CPU usage for more than 10
>                minutes on the
>                host.
> 
>                These numbers were with 2.2b4.  2.2 is significantly
>                better than 2.1.
> 
>                -- Michael
> 
> 
> 
> 
> 
> 
> 
> 
> --
> Philip Peitsch
> Mob: 0439 810 260
