Work flow on large repositories

Philip Peitsch philip.peitsch at gmail.com
Wed Jul 28 08:26:45 BST 2010


Certainly :)

As local computers are not backed up, we wanted branch data to be actually
saved in a remote location as well.  At the same time though, I needed to
keep switching commands small and convenient.  What I discovered bzr can do
is actually have a local cache of a remote branch... in its own folder.

Our current working structure looks like

.../              <---- shared repository, no-trees by default
.../features/f1   <---- branch with no working tree of
                        bzr+ssh://remotehost/path/to/branches/f1
.../features/f2   <---- branch with no working tree of
                        bzr+ssh://remotehost/path/to/branches/f2
.../working       <---- lightweight checkout of ../features/f1 (not a normal
                        checkout, but one specifically set up using
                        bzr reconfigure --lightweight-checkout)
.../svntrunk      <---- checkout of svn mainline

Running bzr info in working shows:
Lightweight checkout (format: 2a)
Location:
       light checkout root: .
  repository checkout root: .../features/f1
        checkout of branch: bzr+ssh://remotehost/path/to/branches/f1
         shared repository: .../

Related branches:
    push branch: .../features/f1
  parent branch: .../svntrunk

Running bzr info in .../features/f1 shows:
Repository bound branch (format: 2a)
Location:
  shared repository: .../
  repository branch: .

Related branches:
    push branch: bzr+ssh://remotehost/path/to/branches/f1
  parent branch: .../svntrunk

Under this arrangement, commits made to the lightweight checkout are
immediately pushed to the remote location.  The local folder allows easy
switching and merging as the filesystem auto-complete can assist.  So rather
than needing to remember the full remote url, I can simply do "bzr switch
../features/f1" (I haven't yet gotten around to working out how to write a
custom plugin to remove the need for the remote url... one day maybe).
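
Short of a plugin, a small shell wrapper can cover the common case.  This is
only a sketch: the function name switch_feature and the REPO_ROOT and BZR
variables are hypothetical, and it assumes the features/ and working/ layout
shown above.  It refuses to switch unless the local branch directory exists.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of bzr.  Assumes the layout described above.
BZR="${BZR:-bzr}"                 # set BZR=echo to see the command without running it
REPO_ROOT="${REPO_ROOT:-$PWD}"    # assumed: top of the shared repository

switch_feature() {
    name="$1"
    # Only ever switch to a branch that exists under the local features/ folder.
    if [ ! -d "$REPO_ROOT/features/$name" ]; then
        echo "no local branch features/$name under $REPO_ROOT" >&2
        return 1
    fi
    # Same command as typed by hand: bzr switch ../features/NAME, run in working/.
    (cd "$REPO_ROOT/working" && $BZR switch "../features/$name")
}
```

Usage would be "switch_feature f1" from anywhere, once REPO_ROOT points at the
shared repository.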

The setup for a feature is essentially done using:
bzr branch svntrunk features/f1
cd features/f1
bzr push --create-prefix bzr+ssh://remotehost/path/to/branches/f1
bzr bind bzr+ssh://remotehost/path/to/branches/f1

cd ../../working
bzr switch ../features/f1
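
These steps lend themselves to a script.  The following is a minimal sketch
rather than the script actually in use: create_feature, REPO_ROOT, REMOTE_BASE
and BZR are hypothetical names, and the remote URL is the placeholder used in
this mail.

```shell
#!/bin/sh
# Hypothetical helper wrapping the feature setup steps.  All names are assumptions.
BZR="${BZR:-bzr}"                 # set BZR=echo for a dry run
REPO_ROOT="${REPO_ROOT:-$PWD}"    # assumed: top of the shared repository
REMOTE_BASE="${REMOTE_BASE:-bzr+ssh://remotehost/path/to/branches}"

# The body runs in a subshell so the caller's working directory is untouched.
create_feature() (
    name="$1"
    [ -n "$name" ] || { echo "usage: create_feature NAME" >&2; exit 2; }
    remote="$REMOTE_BASE/$name"
    cd "$REPO_ROOT" || exit 1
    # Branch from the svn mainline checkout into the tree-less shared repository.
    $BZR branch svntrunk "features/$name" || exit 1
    cd "features/$name" || exit 1
    # Publish the branch, then bind it so commits also go to the remote copy.
    $BZR push --create-prefix "$remote" || exit 1
    $BZR bind "$remote" || exit 1
    # Point the lightweight checkout at the local copy of the feature.
    cd "$REPO_ROOT/working" || exit 1
    $BZR switch "../features/$name"
)
```

Running it with BZR=echo prints the bzr arguments instead of executing them,
provided the features/NAME and working directories already exist, which is a
cheap way to check the paths before touching a real repository.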

Occasionally we have issues with devs switching before they commit their
changes; fortunately bzr is pretty sensible about handling that, so no data
has been lost yet.  The other thing that has happened once or twice is a dev
will type "bzr switch f1" and find themselves bound to the remote url,
rather than the local repo... this can create confusion when they try to
merge back into svntrunk doing "bzr merge ../features/f1" only for it to
report no local changes.  A quick bzr pull [remote uri] in the offending
feature directory rectifies this issue though.
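
As a sketch, the recovery could be wrapped up like this.  The names
resync_feature, REPO_ROOT, REMOTE_BASE and BZR are hypothetical, and the final
switch back to the local branch is an assumption about what a dev would want
next, not a step described above.

```shell
#!/bin/sh
# Hypothetical recovery helper.  All names are assumptions.
BZR="${BZR:-bzr}"                 # set BZR=echo for a dry run
REPO_ROOT="${REPO_ROOT:-$PWD}"    # assumed: top of the shared repository
REMOTE_BASE="${REMOTE_BASE:-bzr+ssh://remotehost/path/to/branches}"

# The body runs in a subshell so the caller's working directory is untouched.
resync_feature() (
    name="$1"
    [ -n "$name" ] || { echo "usage: resync_feature NAME" >&2; exit 2; }
    # Bring the local copy of the feature up to date with the remote branch.
    cd "$REPO_ROOT/features/$name" || exit 1
    $BZR pull "$REMOTE_BASE/$name" || exit 1
    # Assumption: re-point the checkout at the local copy afterwards.
    cd "$REPO_ROOT/working" || exit 1
    $BZR switch "../features/$name"
)
```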

We've been playing with this system for 3 months and find it works
acceptably.  There seem to be a few issues with ghost revisions between dev
machines that I haven't figured out how to avoid yet though.  I suspect our
usage is a little off the beaten path and is potentially exposing some
idiosyncrasies in how bzr manages pushing and pulling revs.  Still, it is
much better than pure svn!

Cheers,

Philip

On Wed, Jul 28, 2010 at 5:06 PM, Chris Hecker <checker at d6.com> wrote:

>> If it is relevant for you, I also have more information about how we
>> mirror our branches onto a shared network repository and other such
>> magics that go to making the setup work well for us.
>
> I'd be very interested to hear more details.
>
> Chris
>
>
> On 2010/07/27 23:54, Philip Peitsch wrote:
>
>> I am currently happily using bzr with a 1.2Gb branch through judicious
>> use of shared repositories and switching.  Basically, I have one
>> heavy-weight checkout of the 1.2Gb branch in a shared repository, where
>> branches are created with no working trees by default.  Then I switch
>> the heavy-weight checkout between "features", before committing to the
>> mainline.  The obvious downside is that you can only work on X number of
>> features at the same time (where X is the number of heavy-weight
>> checkouts you have)... though switching this checkout between features
>> is a very rapid operation (~10s on mine roughly).
>>
>> Basic set up is:
>> # Setup the shared bzr repo
>> cd /some/repo/
>> bzr init-repo --no-trees --default .
>>
>> # Make a clean branch of mainline
>> bzr branch /mainline/bzr/branch mainline
>>
>> # Make a feature to play with
>> bzr branch mainline some-new-feature
>>
>> # Make a checkout to play with
>> bzr co some-new-feature working_1
>> *hack and play....*
>> *bzr commit etc.*
>>
>> # Make a new feature to play with
>> bzr branch mainline some-other-feature
>>
>> # Switch the heavy checkout to the new feature.  DON'T FORGET TO COMMIT
>> # FIRST (though bzr won't lose your changes... it's polite that way)
>> cd working_1
>> bzr switch ../some-other-feature
>> # Commits will now go to some-other-feature
>>
>>
>> To make this convenient on a day-to-day basis, I've wrapped common
>> operations (like creating the branch and then binding the heavy checkout
>> to the new feature branch) up in script files.  So to start a new
>> feature, I do "create-feature some-other-feature", and the scripts know
>> my directory layout to take care of the branching etc.
>>
>> Anywho... not sure if that is at all relevant to your use case, but it
>> is working quite well here for ~8devs who have only just migrated from
>> SVN... no data has been accidentally lost yet either which is a bonus :).
>> If it is relevant for you, I also have more information about how we
>> mirror our branches onto a shared network repository and other such
>> magics that go to making the setup work well for us.
>>
>> Philip
>>
>> On Wed, Jul 28, 2010 at 3:29 PM, Chris Hecker <checker at d6.com> wrote:
>>
>>
>>    Ah, sorry, I thought the whole .bzr directory was only 20mb and was
>>    confused!
>>
>>    Very interested to hear responses!
>>
>>    Chris
>>
>>
>>
>>    On 2010/07/27 21:01, Michael Hope wrote:
>>
>>        'bzr revno 4.4' shows 93541 revisions.  The mirror .bzr directory is
>>        549 MB.  The 4.4 branch directory is 20 MB.  The exported size is 559
>>        MB over 63000 files.  It's reasonably big.
>>
>>        -- Michael
>>
>>        On Wed, Jul 28, 2010 at 3:57 PM, Chris Hecker <checker at d6.com> wrote:
>>
>>
>>
>>            It seems like "large repository" means amount of history?  20mb is
>>            not very big at all for the .bzr, I think.  How many revisions are
>>            there?
>>
>>            My repo is 1k revisions, but 55mb, and operations are reasonably
>>            fast (creating a new working copy is a little pokey, 1300 files,
>>            but not too bad, maybe 10 seconds).
>>
>>            I'm very interested in this topic.  I have binary files in my repo
>>            (maya models and animations, photoshop psds, and audio files), and
>>            I'm worried about it blowing up in my face at some point, but so
>>            far it's been fine and I'm loving bzr relative to svn.
>>
>>            Chris
>>
>>
>>            On 2010/07/27 19:40, Michael Hope wrote:
>>
>>
>>                Hi there.  I'm working on the gcc-linaro branch which is stored
>>                under bzr and hosted on Launchpad.  This is a fairly big branch
>>                as it was imported from upstream SVN and contains a large
>>                amount of history.
>>
>>                Most of the work is day-to-day changes on topic branches.  I
>>                also want to run a buildbot style program that continually
>>                updates and builds the latest.
>>
>>                My issue is that the various operations are taking too long.
>>                Could anyone suggest tricks or a different work flow to speed
>>                things up?
>>
>>                Some of the operations include:
>>
>>                Creating a mirror branch by doing init-repo, branch
>>                lp:gcc-linaro/4.4.  The finding revisions stage takes about 10
>>                minutes at 1kB/s.  The download stage is much faster.
>>
>>                Day-to-day work is done on topic branches.  Creating the branch
>>                takes 46 s, 250 MB of RAM, and creates a 20 MB .bzr directory.
>>                Pushing this branch to LP for merging involves pushing the full
>>                20 MB, but this is acceptable.
>>
>>                Doing a bzr pull on the 4.4 mirror directory may take more than
>>                half an hour and more than 500 MB of memory.
>>
>>                Doing a bzr checkout takes over 20 minutes and 800 MB of memory
>>                on my fastest machine.  On my netbook and ARM board this causes
>>                significant swapping.  I've yet to complete a checkout on
>>                either.
>>
>>                I'd also like to share the mirror with other local machines to
>>                skip downloading the same 500 MB many times.  Running bzr serve
>>                and then checking out causes 100 % CPU usage for more than 10
>>                minutes on the host.
>>
>>                These numbers were with 2.2b4.  2.2 is significantly better
>>                than 2.1.
>>
>>                -- Michael
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>> Philip Peitsch
>> Mob: 0439 810 260
>>
>


-- 
Philip Peitsch
Mob: 0439 810 260