[MERGE] Add support for `bzr pull --dry-run`

Fri Jan 2 03:42:39 GMT 2009

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Marius Kruger wrote:
> This patch adds support for `bzr pull --dry-run`, which prints out the
> same as `bzr pull`
> without actually pulling the changes.
> I added all the tests I could think of with my tired brain, let me know
> it I need more.

bb:resubmit

I think this is not the right approach.  It is an API break, because
cmd_pull will not work with other branch or tree implementations.  I
also worry that it will be an ongoing challenge to make sure it does not
write any changes to disk, or have any of the side-effects that pull is
normally supposed to have.

Instead of pushing the dry-run facilities into Branch.pull, I think it
would make more sense restructure cmd_pull so that you can perform the
output without actually
performing the pull.

> --- bzrlib/branch.py	2008-12-16 02:58:31 +0000
> +++ bzrlib/branch.py	2009-01-01 02:33:19 +0000
> @@ -492,7 +492,7 @@
>  
>      @needs_write_lock
>      def update_revisions(self, other, stop_revision=None, overwrite=False,
> -                         graph=None):
> +                         graph=None, dry_run=False):

Rather than having update_revisions support dry-run, you can factor the
calculation of stop_revno into another method, and then use that
directly instead of update_revisions.

> === modified file 'bzrlib/builtins.py'
> --- bzrlib/builtins.py	2008-12-23 22:34:38 +0000
> +++ bzrlib/builtins.py	2009-01-01 00:44:49 +0000
> @@ -727,13 +727,14 @@
>              short_name='d',
>              type=unicode,
>              ),
> +        Option('dry-run', help='Don\'t actually make changes.'),
>          ]
>      takes_args = ['location?']
>      encoding_type = 'replace'
>  
>      def run(self, location=None, remember=False, overwrite=False,
>              revision=None, verbose=False,
> -            directory=None):
> +            directory=None, dry_run=False):
>          # FIXME: too much stuff is in the command class
>          revision_id = None
>          mergeable = None
> @@ -794,24 +795,32 @@
>                  change_reporter = delta._ChangeReporter(
>                      unversioned_filter=tree_to.is_ignored)
>                  result = tree_to.pull(branch_from, overwrite, revision_id,
> -                                      change_reporter,
> -                                      possible_transports=possible_transports)
> -            else:
> -                result = branch_to.pull(branch_from, overwrite, revision_id)
> -
> -            result.report(self.outf)
> -            if verbose and result.old_revid != result.new_revid:
> +                    change_reporter, possible_transports=possible_transports,
> +                    dry_run=dry_run)
> +            else:
> +                result = branch_to.pull(branch_from, overwrite, revision_id,
> +                    dry_run=dry_run)
> +        finally:
> +            branch_to.unlock()

You should not unlock a branch if you are not finished with it.
Unlocking a branch prematurely can reduce efficiency or correctness.
This branch_to.unlock should happen after the verbose handling is finished.

> === modified file 'bzrlib/tag.py'
> --- bzrlib/tag.py	2008-09-08 13:35:44 +0000
> +++ bzrlib/tag.py	2009-01-01 02:33:19 +0000
> @@ -187,10 +187,10 @@
>              raise ValueError("failed to deserialize tag dictionary %r: %s"
>                  % (tag_content, e))
>  
> -    def merge_to(self, to_tags, overwrite=False):
> +    def merge_to(self, to_tags, overwrite=False, dry_run=False):
>          """Copy tags between repositories if necessary and possible.

As with Branch.update, it would be better to add a new method that
calculated the result of the merge, and then invoke that method directly
for a dry run.

> === modified file 'bzrlib/tests/branch_implementations/test_pull.py'
> --- bzrlib/tests/branch_implementations/test_pull.py	2008-06-09 15:37:14 +0000
> +++ bzrlib/tests/branch_implementations/test_pull.py	2009-01-01 02:56:00 +0000
> @@ -118,6 +118,39 @@
>          self.assertEqual(tree_b.branch.revision_history(),
>                           tree_a.branch.revision_history())
>  
> +    def test_pull_dry_run(self):
> +        tree_a = self.make_branch_and_tree('tree_a')
> +        tree_a.commit('message 1', rev_id='rev1a')
> +        tree_b = tree_a.bzrdir.sprout('tree_b').open_workingtree()
> +        tree_b.commit('message 2', rev_id='rev2b')
> +
> +        # check dry-run without conflicts
> +        result = tree_a.branch.pull(tree_b.branch, dry_run=True)
> +        self.assertEqual('rev1a', tree_a.branch.last_revision())
> +        self.assertFalse(tree_a.branch.repository.has_revision('rev2b'))
> +        self.assertEqual(1, result.old_revno)
> +        self.assertEqual('rev1a', result.old_revid)
> +        self.assertEqual(2, result.new_revno)
> +        self.assertEqual('rev2b', result.new_revid)
> +
> +        # check dry-runs with conflicts

It is not a conflict for branches to be diverged.  Conflicts refer to
discrepancies encountered while merging into a working tree.  (Since
Tree.pull does perform a merge, this is confusing.)

> === modified file 'bzrlib/tests/workingtree_implementations/test_pull.py'
> --- bzrlib/tests/workingtree_implementations/test_pull.py	2007-11-01 09:52:45 +0000
> +++ bzrlib/tests/workingtree_implementations/test_pull.py	2009-01-01 04:23:18 +0000
> @@ -56,3 +56,7 @@
>          tree_b.pull(tree_a.branch)
>          self.assertFileEqual('contents of from/file\n', 'to/file')
>  
> +    def test_pull_dry_run(self):
> +        tree_a, tree_b = self.get_pullable_trees()
> +        tree_b.pull(tree_a.branch, dry_run=True)
> +        self.assertFalse(tree_b.branch.repository.has_revision('A'))

You should also ensure that WorkingTree.pull does not change the files
when dry_run is used.

> === modified file 'bzrlib/workingtree.py'
> --- bzrlib/workingtree.py	2008-12-22 08:44:20 +0000
> +++ bzrlib/workingtree.py	2009-01-01 00:44:49 +0000
> @@ -1531,7 +1531,7 @@
>  
>      @needs_write_lock
>      def pull(self, source, overwrite=False, stop_revision=None,
> -             change_reporter=None, possible_transports=None):
> +             change_reporter=None, possible_transports=None, dry_run=False):
>          top_pb = bzrlib.ui.ui_factory.nested_progress_bar()
>          source.lock_read()
>          try:
> @@ -1540,7 +1540,7 @@
>              old_revision_info = self.branch.last_revision_info()
>              basis_tree = self.basis_tree()
>              count = self.branch.pull(source, overwrite, stop_revision,
> -                                     possible_transports=possible_transports)
> +                possible_transports=possible_transports, dry_run=dry_run)
>              new_revision_info = self.branch.last_revision_info()
>              if new_revision_info != old_revision_info:
>                  pp.next_phase()

IIUC, this will change the files, even though the branch has not been
changed.

> == some motivation/history/gripes ==
> At work we have a human gatekeeper who checks who changed what files
> before it gets deployed.
> So I have scripts that does the following tasks on each of the component
> branches (there are about 40 components and I use scmproj to manage them
> sanely. thanks Alexander):
> 1) tag the dev branches to PRE-QA
> 2) determine what is missing in qa up to dev:tag:PRE-QA (create change
> report for the human gatekeeper to see which files changed)
> 3) pull changes to up to dev:tag:PRE-QA
> 
> My problem is with step 2), normally it runs right after step 1) so I
> can use normal `missing -v`.

If I wanted a report of files changed, I would use diff -r ancestor: or
status -r ancestor:.  For me, this is more useful information about what
has actually changed.

Merge --preview would also work, and this allows you to specify a
particular tag.

> But if I need to run it later changes added after tag:PRE-QA are also shown.
> 
> I started to implement `bzr missing --theirs-only -v -r..tag:PRE-QA`,
> which I got working in the backend,
> but I got stuck on the user interface ..AARG!:
> Users could potentially want to filter on the local revisions AND the
> remote revisions.

I don't understand this.

> So we may want to have separate options eg:
> `bzr missing --revision=-2..-1 --remote-revision=..tag:PRE-QA`

I don't understand why you'd have --revision=-2..-1 and
- --remote-revision=..tag:PRE-QA.

What does that mean?  Show me all the revisions in the range
..tag:PRE-QA that are not present in the range -2..-1?  That would be
nearly all revisions.

We might well want to allow revision options, and if we did, I would
expect them to be of the form: [REMOTE REVISION]..[LOCAL REVISION].

> and
> `bzr missing -r-2..-1 -f=..tag:PRE-QA`  (-f for a lack of a better short
> option)

Again, I don't understand what this implies.

> on IRC asabil suggested I use `bzr pull --dry-run -v -r tag:PRE-QA`
> which simplifies
> the user interface a little (but the backend is more complex, because
> you have to
> make sure you don't apply any changes).

But pull -v shows different revisions than missing.  Missing shows
mainline revisions that are added or removed from the ancestry.  Pull -v
shows revisions that are added or removed from the lefthand history,
even if they are common ancestors.

> After implementing that I realise that `bzr pull -v` doesn't print the
> file names (which is sort of the point for me) ..AAARG!
> so much for 2008-31-12T10:00..2009-01-01T06:38
> 
> going forward I must choose between the following:
> a) implement `bzr missing *-r*-2..-1 *-f*=..tag:PRE-QA`
> or
> b) implement  `bzr pull -v *--show-files*` === `bzr pull -v *-f*`
> 
> suggestions would be appreciated. I lean towards a) since I'm 90% there..

I suggest c) add a revision spec that finds the common ancestor with a
revisionspec, so that you can do

diff -r revancestor:tag:PRE-QA or status -r revancestor:tag:PRE-QA

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkldjSwACgkQ0F+nu1YWqI2teACfU+SQxv8+aCTArJzLUV7GdzTG
UyIAn3FzARXJA92wiH5K2ZrmrBzhDb7/
=s38s
-----END PGP SIGNATURE-----