[RFC][PATCH 1/4] Speed improvement in fetch/clone: file_involved( ) function

Robert Collins robertc at robertcollins.net
Thu Dec 15 04:27:04 GMT 2005


On Sat, 2005-12-10 at 19:15 +0100, Goffredo Baroncelli wrote:
> This patch add the function file involved( )
> 
> === modified file 'bzrlib/branch.py'
> --- bzrlib/branch.py
> +++ bzrlib/branch.py
> @@ -1090,6 +1090,55 @@
>          self.revision_store.add(StringIO(gpg_strategy.sign(plaintext)),
>                                  revision_id, "sig")
> 
> +    def file_involved(self, arg1=None, arg2=None):

This seems like something generally useful to "Branch" - so the function
should be defined on Branch as a 'raise NotImplemented' stub, and then
the BzrBranch implementation in the BzrBranch class.

> +        """ This function returns the file_id(s) involved in the
> +            changese between two revisions, or in the changes
> +
> +            The revisions are expressed as revision_id
> +
> +            if two args are passed,the changes are searched between
> +                'rev-arg1'..'rev-arg2'
> +            if one arg is passed, the changes are searched up to rev-arg1 or
> +                if it is a set, inside this set
> +            if no args is passed, all files_id are returned
> +        """
> +
> +        w = self._get_inventory_weave( )
> +        file_id = set( )
> +
> +        if arg2:
> +            from_set = set(w.inclusions([w.lookup(arg1)]))
> +            to_set = set(w.inclusions([w.lookup(arg2)]))
> +            changed = to_set.difference(from_set)
> +        elif arg1:
> +            if isinstance(arg1, set):
> +                changed = map(w.lookup, arg1 )
> +            else:
> +                changed = w.inclusions([w.lookup(arg1)])
> +        else:
> +            changed = set(w.inclusions([w.numversions( )-1]))
> +
> +        for lineno, insert, deleteset, line in w._walk():
> +            if insert in changed:
> +
> +                start = line.find('file_id="')+9
> +                if start < 9: continue
> +                end = line.find('"',start)
> +                assert end>= 0
> +                fid = line[start:end]
> +

fid?

Longer variable names please. And more clear ones:
arg1, arg2 are meaningless to me. 

Lastly, I note the 'if isinstance' calls - that makes apis hard to use -
can we please not do this unless we really need to.

I'd rather see one of the following variations:
def file_involved(self, revisions=None, require_ancestor=None,
require_descendant=None):
   ...
   revisions is a set of revisions to consider or None 
   for no constraint.
 
   require_ancestor: a revision_id to require as an ancestor in
   the found revisions.

   require_descendant: a revision_id that must be a descendant
   in the found revisions.

   These three constraints are combined with AND clauses - all
   specified constraints must be met for a revision to be returned.
   """

This is more flexible than the arg1, arg2 approach, and IMO clearer and
easier to use and understand when someone sees this code elsewhere.

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20051215/2ca15c0c/attachment.pgp 


More information about the bazaar mailing list