Rev 5790: merge bzr.dev 5786 to incorporate release-notes (aka NEWS) in http://bazaar.launchpad.net/~jameinel/bzr/2.4-revert-faster-759096

John Arbash Meinel john at arbash-meinel.com
Fri Apr 15 08:17:43 UTC 2011


At http://bazaar.launchpad.net/~jameinel/bzr/2.4-revert-faster-759096

------------------------------------------------------------
revno: 5790 [merge]
revision-id: john at arbash-meinel.com-20110415081732-ppkpaq416fgut5sd
parent: john at arbash-meinel.com-20110415081656-d3ipmkl0vujwpntl
parent: pqm at pqm.ubuntu.com-20110415081233-mqfd5six3sqmi1sn
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.4-revert-faster-759096
timestamp: Fri 2011-04-15 10:17:32 +0200
message:
  merge bzr.dev 5786 to incorporate release-notes (aka NEWS)
modified:
  bzrlib/config.py               config.py-20051011043216-070c74f4e9e338e8
  bzrlib/lsprof.py               lsprof.py-20051208071030-833790916798ceed
  bzrlib/tests/test_config.py    testconfig.py-20051011041908-742d0c15d8d8c8eb
  bzrlib/transform.py            transform.py-20060105172343-dd99e54394d91687
  doc/en/release-notes/bzr-2.4.txt bzr2.4.txt-20110114053217-k7ym9jfz243fddjm-1
-------------- next part --------------
=== modified file 'bzrlib/config.py'
--- a/bzrlib/config.py	2011-04-05 14:47:26 +0000
+++ b/bzrlib/config.py	2011-04-09 20:01:11 +0000
@@ -967,6 +967,61 @@
         super(LockableConfig, self).remove_user_option(option_name,
                                                        section_name)
 
+def _iter_for_location_by_parts(sections, location):
+    """Keep only the sections matching the specified location.
+
+    :param sections: An iterable of section names.
+
+    :param location: A URL or a local path to match against.
+
+    :returns: An iterator of (section, extra_path, nb_parts) where nb_parts is
+        the number of path components in the section name, section is the
+        section name and extra_path is the difference between location and the
+        section name.
+    """
+    location_parts = location.rstrip('/').split('/')
+
+    for section in sections:
+        # location is a local path if possible, so we need
+        # to convert 'file://' urls to local paths if necessary.
+
+        # FIXME: I don't think the above comment is still up to date,
+        # LocationConfig is always instantiated with an url -- vila 2011-04-07
+
+        # This also avoids having file:///path be a more exact
+        # match than '/path'.
+
+        # FIXME: Not sure about the above either, but since the path components
+        # are compared in sync, adding two empty components (//) is likely to
+        # trick the comparison and also trick the check on the number of
+        # components, so we *should* take only the relevant part of the url. On
+        # the other hand, this means 'file://' urls *can't* be used in sections
+        # so more work is probably needed -- vila 2011-04-07
+
+        if section.startswith('file://'):
+            section_path = urlutils.local_path_from_url(section)
+        else:
+            section_path = section
+        section_parts = section_path.rstrip('/').split('/')
+
+        matched = True
+        if len(section_parts) > len(location_parts):
+            # More path components in the section, they can't match
+            matched = False
+        else:
+            # Rely on zip truncating in length to the length of the shortest
+            # argument sequence.
+            names = zip(location_parts, section_parts)
+            for name in names:
+                if not fnmatch.fnmatch(name[0], name[1]):
+                    matched = False
+                    break
+        if not matched:
+            continue
+        # build the path difference between the section and the location
+        extra_path = '/'.join(location_parts[len(section_parts):])
+        yield section, extra_path, len(section_parts)
+
 
 class LocationConfig(LockableConfig):
     """A configuration object that gives the policy for a location."""
@@ -1001,49 +1056,20 @@
 
     def _get_matching_sections(self):
         """Return an ordered list of section names matching this location."""
-        sections = self._get_parser()
-        location_names = self.location.split('/')
-        if self.location.endswith('/'):
-            del location_names[-1]
-        matches=[]
-        for section in sections:
-            # location is a local path if possible, so we need
-            # to convert 'file://' urls to local paths if necessary.
-            # This also avoids having file:///path be a more exact
-            # match than '/path'.
-            if section.startswith('file://'):
-                section_path = urlutils.local_path_from_url(section)
-            else:
-                section_path = section
-            section_names = section_path.split('/')
-            if section.endswith('/'):
-                del section_names[-1]
-            names = zip(location_names, section_names)
-            matched = True
-            for name in names:
-                if not fnmatch.fnmatch(name[0], name[1]):
-                    matched = False
-                    break
-            if not matched:
-                continue
-            # so, for the common prefix they matched.
-            # if section is longer, no match.
-            if len(section_names) > len(location_names):
-                continue
-            matches.append((len(section_names), section,
-                            '/'.join(location_names[len(section_names):])))
+        matches = list(_iter_for_location_by_parts(self._get_parser(),
+                                                   self.location))
         # put the longest (aka more specific) locations first
-        matches.sort(reverse=True)
-        sections = []
-        for (length, section, extra_path) in matches:
-            sections.append((section, extra_path))
+        matches.sort(
+            key=lambda (section, extra_path, length): (length, section),
+            reverse=True)
+        for (section, extra_path, length) in matches:
+            yield section, extra_path
             # should we stop looking for parent configs here?
             try:
                 if self._get_parser()[section].as_bool('ignore_parents'):
                     break
             except KeyError:
                 pass
-        return sections
 
     def _get_sections(self, name=None):
         """See IniBasedConfig._get_sections()."""

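The matching and ordering done by the config.py change above can be sketched
standalone as follows. This is a simplified illustration, not the bzrlib code:
the names iter_matching_sections and ordered_matches are hypothetical, the
'file://' conversion and the 'ignore_parents' check are left out, and the
Python 2 tuple-parameter lambda is replaced by plain indexing so the sketch
also runs on Python 3.

```python
import fnmatch


def iter_matching_sections(sections, location):
    """Yield (section, extra_path, nb_parts) for sections matching location.

    Hypothetical stand-in for _iter_for_location_by_parts, without the
    'file://' handling of the original.
    """
    location_parts = location.rstrip('/').split('/')
    for section in sections:
        section_parts = section.rstrip('/').split('/')
        if len(section_parts) > len(location_parts):
            # A section with more components than the location cannot match.
            continue
        # zip() stops at the shorter sequence, here the section's parts.
        if all(fnmatch.fnmatch(loc, pat)
               for loc, pat in zip(location_parts, section_parts)):
            # Whatever the section did not consume is the extra path.
            extra_path = '/'.join(location_parts[len(section_parts):])
            yield section, extra_path, len(section_parts)


def ordered_matches(sections, location):
    """Return (section, extra_path) pairs, most specific section first."""
    matches = sorted(iter_matching_sections(sections, location),
                     key=lambda match: (match[2], match[0]),
                     reverse=True)
    return [(section, extra_path) for section, extra_path, _ in matches]
```

With the sections '/a/', '/a/*', '/a/c' and '/b/', the location '/a/c' gives
the same order that test__get_matching_sections_explicit_over_glob expects:
'/a/c' first, then '/a/*', then '/a/' with 'c' as the extra path.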
=== modified file 'bzrlib/lsprof.py'
--- a/bzrlib/lsprof.py	2010-09-19 09:07:42 +0000
+++ b/bzrlib/lsprof.py	2011-04-12 14:55:57 +0000
@@ -275,12 +275,13 @@
         code = subentry.code
         totaltime = int(subentry.totaltime * 1000)
         #out_file.write('cob=%s\n' % (code.co_filename,))
-        out_file.write('cfn=%s\n' % (label(code, True),))
         if isinstance(code, str):
             out_file.write('cfi=~\n')
+            out_file.write('cfn=%s\n' % (label(code, True),))
             out_file.write('calls=%d 0\n' % (subentry.callcount,))
         else:
             out_file.write('cfi=%s\n' % (code.co_filename,))
+            out_file.write('cfn=%s\n' % (label(code, True),))
             out_file.write('calls=%d %d\n' % (
                 subentry.callcount, code.co_firstlineno))
         out_file.write('%d %d\n' % (lineno, totaltime))

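The lsprof.py hunk above only reorders the writes so that 'cfi=' comes before
'cfn=' for each callee. A minimal sketch of the resulting record layout is
given below. The helper write_call_record and its arguments are hypothetical,
and the sample values are made up for illustration. The reason for the order
(the reader of the file associating the callee name with the file named just
before it) is an assumption, not something stated in the commit.

```python
import io


def write_call_record(out_file, called_file, called_name, callcount,
                      first_lineno, lineno, totaltime):
    """Write one callee record with the file line before the name line."""
    out_file.write('cfi=%s\n' % (called_file,))
    out_file.write('cfn=%s\n' % (called_name,))
    out_file.write('calls=%d %d\n' % (callcount, first_lineno))
    out_file.write('%d %d\n' % (lineno, totaltime))


# Illustrative values only; they do not come from a real profile.
buf = io.StringIO()
write_call_record(buf, 'bzrlib/config.py', '_get_matching_sections',
                  3, 1057, 12, 40)
```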
=== modified file 'bzrlib/tests/test_config.py'
--- a/bzrlib/tests/test_config.py	2011-04-05 12:11:26 +0000
+++ b/bzrlib/tests/test_config.py	2011-04-09 20:01:11 +0000
@@ -1264,55 +1264,52 @@
         self.failUnless(isinstance(global_config, config.GlobalConfig))
         self.failUnless(global_config is my_config._get_global_config())
 
+    def assertLocationMatching(self, expected):
+        self.assertEqual(expected,
+                         list(self.my_location_config._get_matching_sections()))
+
     def test__get_matching_sections_no_match(self):
         self.get_branch_config('/')
-        self.assertEqual([], self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([])
 
     def test__get_matching_sections_exact(self):
         self.get_branch_config('http://www.example.com')
-        self.assertEqual([('http://www.example.com', '')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('http://www.example.com', '')])
 
     def test__get_matching_sections_suffix_does_not(self):
         self.get_branch_config('http://www.example.com-com')
-        self.assertEqual([], self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([])
 
     def test__get_matching_sections_subdir_recursive(self):
         self.get_branch_config('http://www.example.com/com')
-        self.assertEqual([('http://www.example.com', 'com')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('http://www.example.com', 'com')])
 
     def test__get_matching_sections_ignoreparent(self):
         self.get_branch_config('http://www.example.com/ignoreparent')
-        self.assertEqual([('http://www.example.com/ignoreparent', '')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('http://www.example.com/ignoreparent',
+                                      '')])
 
     def test__get_matching_sections_ignoreparent_subdir(self):
         self.get_branch_config(
             'http://www.example.com/ignoreparent/childbranch')
-        self.assertEqual([('http://www.example.com/ignoreparent',
-                           'childbranch')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('http://www.example.com/ignoreparent',
+                                      'childbranch')])
 
     def test__get_matching_sections_subdir_trailing_slash(self):
         self.get_branch_config('/b')
-        self.assertEqual([('/b/', '')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('/b/', '')])
 
     def test__get_matching_sections_subdir_child(self):
         self.get_branch_config('/a/foo')
-        self.assertEqual([('/a/*', ''), ('/a/', 'foo')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('/a/*', ''), ('/a/', 'foo')])
 
     def test__get_matching_sections_subdir_child_child(self):
         self.get_branch_config('/a/foo/bar')
-        self.assertEqual([('/a/*', 'bar'), ('/a/', 'foo/bar')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('/a/*', 'bar'), ('/a/', 'foo/bar')])
 
     def test__get_matching_sections_trailing_slash_with_children(self):
         self.get_branch_config('/a/')
-        self.assertEqual([('/a/', '')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('/a/', '')])
 
     def test__get_matching_sections_explicit_over_glob(self):
         # XXX: 2006-09-08 jamesh
@@ -1320,8 +1317,7 @@
         # was a config section for '/a/?', it would get precedence
         # over '/a/c'.
         self.get_branch_config('/a/c')
-        self.assertEqual([('/a/c', ''), ('/a/*', ''), ('/a/', 'c')],
-                         self.my_location_config._get_matching_sections())
+        self.assertLocationMatching([('/a/c', ''), ('/a/*', ''), ('/a/', 'c')])
 
     def test__get_option_policy_normal(self):
         self.get_branch_config('http://www.example.com')

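The new assertLocationMatching helper wraps the result in list() because
_get_matching_sections now yields its results instead of returning a list. A
small sketch of that behaviour, including the early stop after a section that
ignores its parents, might look like this. The function iter_sections and the
ignore_parents set are hypothetical stand-ins for the method and for the
parser's as_bool('ignore_parents') lookup.

```python
def iter_sections(matches, ignore_parents):
    """Yield (section, extra_path), stopping after an ignore_parents section."""
    for section, extra_path in matches:
        yield section, extra_path
        if section in ignore_parents:
            # Less specific (parent) sections are not consulted any further.
            break


# Already sorted, most specific section first.
matches = [('http://www.example.com/ignoreparent', 'childbranch'),
           ('http://www.example.com', 'ignoreparent/childbranch')]
ignore_parents = {'http://www.example.com/ignoreparent'}

result = iter_sections(matches, ignore_parents)
# A generator object never compares equal to a list, hence the list() call
# in the test helper.
is_equal_unwrapped = (result == matches[:1])
as_list = list(result)
```

Comparing the generator directly with the expected list is always false, while
the list() form compares the yielded pairs.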
=== modified file 'bzrlib/transform.py'
--- a/bzrlib/transform.py	2011-04-14 14:59:10 +0000
+++ b/bzrlib/transform.py	2011-04-15 08:17:32 +0000
@@ -3049,7 +3049,8 @@
                         file_id = tt.final_file_id(trans_id)
                         if file_id is None:
                             file_id = tt.inactive_file_id(trans_id)
-                        entry = path_tree.inventory[file_id]
+                        _, entry = path_tree.iter_entries_by_dir(
+                            [file_id]).next()
                         # special-case the other tree root (move its
                         # children to current root)
                         if entry.parent_id is None:

=== modified file 'doc/en/release-notes/bzr-2.4.txt'
--- a/doc/en/release-notes/bzr-2.4.txt	2011-04-15 08:16:56 +0000
+++ b/doc/en/release-notes/bzr-2.4.txt	2011-04-15 08:17:32 +0000
@@ -26,11 +26,9 @@
 .. Improvements to existing commands, especially improved performance 
    or memory usage, or better results.
 
-* When building a new WorkingTree (such as during ``bzr co`` or
-  ``bzr branch``) we now properly store the stat and hash of files that
-  are old enough. This saves a fair amount of time on the first
-  ``bzr status`` (on a 500MB tree, it saves about 30+s).
-  (John Arbash Meinel, #740932)
+* ``bzr merge`` in large trees is now significantly faster. On a 70k entry
+  tree, the time went from ~3min down to 30s.
+  (John Arbash Meinel, #759091)
 
 * Resolve ``lp:FOO`` urls locally rather than doing an XMLRPC request if
   the user has done ``bzr launchpad-login``. The bzr+ssh URLs were already
@@ -42,6 +40,13 @@
   call as much as 2s from Sydney. You can test the local logic by using
   ``-Dlaunchpad``.  (John Arbash Meinel, #397739)
 
+* When building a new WorkingTree (such as during ``bzr co`` or
+  ``bzr branch``) we now properly store the stat and hash of files that
+  are old enough. This saves a fair amount of time on the first
+  ``bzr status`` (on a 500MB tree, it saves about 30+s).
+  (John Arbash Meinel, #740932)
+
+
 Bug Fixes
 *********
 
@@ -58,6 +63,15 @@
 
 * Lazy hooks are now reset between test runs. (Jelmer Vernooij, #745566)
 
+* ``bzrlib.merge.Merge`` now calls ``iter_changes`` without
+  ``include_unversioned=True``. This makes it significantly faster in many
+  cases, because it only looks at modified files, rather than building
+  information about all files. This can cause failures in other
+  TreeTransform code, because it had been expecting to know the names of
+  things which had not changed (such as parent directories). All cases we
+  know about so far have been fixed, but there may be fallout for edge
+  cases that we are missing. (John Arbash Meinel, #759091)
+
 * Standalone bzr.exe installation on Windows: user can put additional python 
   libraries into ``site-packages`` subdirectory of the installation directory,
   this might be required for "installing" extra dependencies for some plugins.


