[MERGE] filtered-deltas - basis for log DIR

Ian Clatworthy ian.clatworthy at internode.on.net
Sat Mar 14 01:13:37 GMT 2009


Aaron Bentley wrote:

>         if self.root is None:
>> +            return Inventory(root_id=None)
>> +        other = Inventory(entries.next()[1].file_id)
> 
> 
>> +        other.root.revision = self.root.revision
>> +        other.revision_id = self.revision_id
>> +        directories_to_expand = set()
>> +        for path, entry in entries:
>> +            file_id = entry.file_id
>> +            if (file_id in specific_fileids
>> +                or entry.parent_id in directories_to_expand):
>> +                if entry.kind == 'directory':
>> +                    directories_to_expand.add(file_id)
> 
> I don't think this is correct.  How are you ensuring that grandparents
> and great-grandparents, etc. are being retained?
> 
> bb:comment until this issue is resolved.

I collect those in the interesting_parents set before starting
the iteration. The code immediately after the snippet above is ...

>> +            elif file_id not in interesting_parents:
>> +                continue
>> +            other.add(entry.copy())

which retains them.

> It seems strange for an optimization to be doing a copy.

Non-chk inventories are mutable so I'm doing the copying to
be safe. We could change the semantics of the method though so
that it returned a 'view' rather than something safe to mutate.
That would be fine given the way I'm using it in the layer above.
What do you think? Maybe do that and use a more explicit method
name?

>> -    def get_deltas_for_revisions(self, revisions):
>> +    def get_deltas_for_revisions(self, revisions, specific_fileids=None):
>>          """Produce a generator of revision deltas.
>>  
>>          Note that the input is a sequence of REVISIONS, not revision_ids.
>>          Trees will be held in memory until the generator exits.
>>          Each delta is relative to the revision's lefthand predecessor.
>> +
>> +        :param specific_fileids: if not None, the result is filtered
>> +          so that only those file-ids, their parents and their
>> +          children are included.

>> +        # Get the matching filtered trees. Note that it's more
>> +        # efficient to pass filtered trees to changes_from() rather
>> +        # than doing the filtering afterwards. changes_from() could
>> +        # arguably do the filtering itself but it's path-based, not
>> +        # file-id based, so filtering before or afterwards is
>> +        # currently easier.
> 
> Filtering in changes_from will be more efficient, and I think that's
> ultimately the right approach.  But a direct implementation of
> get_deltas_for_revisions directly on CHK repositories would make a lot
> of sense.

Right. This patch is explicitly focussed as doing the minimum possible
to get something working on bzr.dev. I'll then focus my optimisation
energy on making it efficient over a chk format.

Ian C.



More information about the bazaar mailing list