[PATCH] Speed improvement

Goffredo Baroncelli kreijack at alice.it
Fri Nov 25 19:02:40 GMT 2005


Hi all,

the purpose of the attached patch is to improve the performance of the
_copy_one_weave() function. This function merges a remote weave into the
corresponding weave of the current branch.

The patch adds 2 optimizations:

1) The patch adds a dictionary (file_ids_names) that caches the revisions
contained in each weave already processed, so the weave no longer has to be
reloaded just to check whether a revision is present.

2) Moreover, when a remote weave has no corresponding weave in the local
branch, the merge phase is skipped and the remote weave is copied into the
repository as is.
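
The two optimizations can be sketched outside bzrlib as follows. This is
only an illustration under stated assumptions: FakeWeave, WeaveStore and
Fetcher are hypothetical stand-ins written for this example, not bzrlib
classes, and the "join" here is a plain list merge rather than a real weave
join. Only the name file_ids_names is taken from the patch.

```python
# Minimal sketch of the two optimizations. FakeWeave, WeaveStore and Fetcher
# are hypothetical stand-ins for illustration, not the bzrlib classes.

class FakeWeave:
    """Stand-in for a weave: just an ordered list of revision names."""

    def __init__(self, names=()):
        self._names = list(names)

    def names(self):
        return list(self._names)

    def __contains__(self, name):
        return name in self._names

    def join(self, other):
        # Merge: append every revision of `other` that is missing here.
        for name in other.names():
            if name not in self._names:
                self._names.append(name)

    def copy(self):
        return FakeWeave(self._names)


class WeaveStore:
    """Stand-in for a weave store that counts how often weaves are loaded."""

    def __init__(self, weaves=None):
        self._weaves = dict(weaves or {})
        self.loads = 0

    def get_weave_or_empty(self, file_id):
        self.loads += 1
        weave = self._weaves.get(file_id)
        return weave.copy() if weave is not None else FakeWeave()

    def get_weave(self, file_id):
        self.loads += 1
        return self._weaves[file_id].copy()

    def put_weave(self, file_id, weave):
        self._weaves[file_id] = weave


class Fetcher:
    def __init__(self, from_weaves, to_weaves):
        self.from_weaves = from_weaves
        self.to_weaves = to_weaves
        self.file_ids_names = {}  # file_id -> revision names known locally

    def copy_one_weave(self, file_id, text_revision):
        # Optimization 1: answer from the cache without loading the weave.
        if text_revision in self.file_ids_names.get(file_id, ()):
            return
        to_weave = self.to_weaves.get_weave_or_empty(file_id)
        self.file_ids_names.setdefault(file_id, to_weave.names())
        if text_revision in to_weave:
            return
        from_weave = self.from_weaves.get_weave(file_id)
        if text_revision not in from_weave:
            raise KeyError(text_revision)
        if to_weave.names():
            to_weave.join(from_weave)
        else:
            # Optimization 2: nothing local to merge with, so just copy.
            to_weave = from_weave.copy()
        self.to_weaves.put_weave(file_id, to_weave)
        self.file_ids_names[file_id] = to_weave.names()
```

In this sketch, a second request for another revision of the same file is
answered from file_ids_names, so neither store is touched again.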

The test below shows a gain of about 3x in user time against a repository
with a history of 2548 revisions and 350 files. The repository is on the
same machine, a Duron 700MHz w/ 384MB of RAM.

$ # w/ patch
$ time bzr branch http://kreijack.homelinux.net:8077/bazaar/hgweb_devel/
Branched 1279 revision(s).

real    59m9.511s
user    40m14.090s
sys     1m57.430s

$ # w/o patch
$ time bzr branch http://kreijack.homelinux.net:8077/bazaar/hgweb_devel/
Branched 1279 revision(s).

real    147m34.191s
user    122m10.273s
sys     5m43.247s
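
As a rough check of the figures above, the ratios can be computed directly
from the two `time` outputs. The helper to_seconds is a hypothetical name
introduced for this example only.

```python
import re


def to_seconds(stamp):
    """Convert a `time` value such as '40m14.090s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the two runs quoted above.
with_patch = {"real": "59m9.511s", "user": "40m14.090s", "sys": "1m57.430s"}
without_patch = {"real": "147m34.191s", "user": "122m10.273s", "sys": "5m43.247s"}

# Speedup = time without the patch / time with the patch.
speedup = {
    key: to_seconds(without_patch[key]) / to_seconds(with_patch[key])
    for key in with_patch
}

for key, ratio in speedup.items():
    print("%s: %.2fx" % (key, ratio))
```

This gives roughly 3.04x for user time, 2.49x for real time and 2.92x for
sys time.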



Goffredo

---------------------


=== modified file 'bzrlib/fetch.py'
--- bzrlib/fetch.py	
+++ bzrlib/fetch.py	
@@ -101,6 +101,7 @@
         self.count_total = 0
         self.count_weaves = 0
         self.copied_file_ids = set()
+        self.file_ids_names = {}
         if pb is None:
             self.pb = bzrlib.ui.ui_factory.progress_bar()
         else:
@@ -217,8 +218,14 @@
 
     def _copy_one_weave(self, rev_id, file_id, text_revision):
         """Copy one file weave, esuring the result contains text_revision."""
+        # check if the revision is already there
+        if file_id in self.file_ids_names.keys( ) and \
+            text_revision in self.file_ids_names[file_id]:
+                return        
         to_weave = self.to_weaves.get_weave_or_empty(file_id,
             self.to_branch.get_transaction())
+        if not file_id in self.file_ids_names.keys( ):
+            self.file_ids_names[file_id] = to_weave.names( )
         if text_revision in to_weave:
             return
         from_weave = self.from_weaves.get_weave(file_id,
@@ -226,14 +233,19 @@
         if text_revision not in from_weave:
             raise MissingText(self.from_branch, text_revision, file_id)
         mutter('copy file {%s} modified in {%s}', file_id, rev_id)
-        try:
-            to_weave.join(from_weave)
-        except errors.WeaveParentMismatch:
-            to_weave.reweave(from_weave)
+
+        if len(to_weave.names( )):
+            try:
+                to_weave.join(from_weave)
+            except errors.WeaveParentMismatch:
+                to_weave.reweave(from_weave)
+        else:
+            to_weave = from_weave.copy( )
         self.to_weaves.put_weave(file_id, to_weave,
             self.to_branch.get_transaction())
         self.count_weaves += 1
         self.copied_file_ids.add(file_id)
+        self.file_ids_names[file_id] = to_weave.names()
         mutter('copied file {%s}', file_id)
 
 

-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijackATinwind.it>
Key fingerprint = CE3C 7E01 6782 30A3 5B87  87C0 BB86 505C 6B2A CFF9


