[MERGE] Inventory-free checkout

John Arbash Meinel john at arbash-meinel.com
Thu Jan 3 21:46:59 GMT 2008


Aaron Bentley wrote:
> John Arbash Meinel wrote:
>> John Arbash Meinel has voted tweak.
>> Status is now: Conditionally approved
>> Comment:
>> +    def _apply_removals(self, removals):
>> +        for path in sorted(removals, reverse=True):
>> +            dirname, basename = osutils.split(path)
>> +            block_i, entry_i, d_present, f_present = \
>> +                self._get_block_entry_index(dirname, basename, 0)
>> +            entry = self._dirblocks[block_i][1][entry_i]
>> +            entry[1][0] = ('a', '', 0, False, '')
> 
>> ^- I think apply_removals should be using "DirState._make_absent". As it
>> handles when the removal is "permanent". (For example, renaming a file 2
>> times, deleting a file which was just added, etc.)
> 
> Thanks for the pointer.  The DirState code is still very unfamiliar.
> 
>> -        active_tree_ids = set((f for f in self._tree.inventory if
>> +        active_tree_ids = set((f for f in
>> self._tree.iter_all_file_ids() if
>>                                 f not in removed_tree_ids))
> 
>> ^- This seems better written as:
>> active_tree_ids = set(self._tree.iter_all_file_ids())
>> active_tree_ids.difference_update(removed_tree_ids)
> 
>> or even
>> active_tree_ids = set(self._tree.iter_all_file_ids()) - removed_tree_ids
> 
> Okay, will update.
> 
>> I do wonder if "iter_all_file_ids()" is the best interface, given our
>> recent concerns about 'iter' performance.
> 
> Oh, I wasn't aware there were big concerns.  I would be happy with an
> API that returned a set or list, too.  But we do need a way to get the
> exhaustive list.

I don't know that it is "big concerns". Just that we should be aware of
when we really want to use iterators/generators rather than just using
them everywhere.

This is the sort of place where a set or list would seem to be
appropriate. Probably a set() since something like "all_file_ids()"
doesn't have an intrinsic order, and you would kind of expect to use it
as a set.

> 
>> -    if len(wt.inventory) > 1:  # more than just a root
>> +    if len(list(wt)) > 1:  # more than just a root
> 
>> ^- It seems like there should be a better way to find out that we have
>>> =2 entries than building up a list of all 55k entries.
> 
> There are more efficient ways, but if there are >=2 entries, we'll error
> out, so I thought efficiency wasn't very important, and I'd go for clarity.

Seems okay. Though I'm not sure that "list(wt)" is terribly clear.

I also see this comment:
    # should be deprecated - this is slow and in any case treating them as a
    # container is (we now know) bad style -- mbp 20070302
    ## @deprecated_method(zero_fifteen)
    def __iter__(self):

So list(wt) should actually be raising a DeprecationError.

Actually, looking at the implementation, it has to go to
"self._inventory" and then stat every file in the inventory to see if it
really does exist.

So unless list(wt) does something else than list(wt.__iter__()), we
really need to do something different here.

> 
>> The biggest thing is using _make_absent to make sure all references are
>> properly cleaned up, rather than just setting the field to
>> NULL_PARENT_DETAILS
> 
> Will do.  Do you also want me to change _iter_all_file_ids to something
> else?  Set?  List?  And if you want me to change the len(list(wt)) > 2,
> I'll do that too.
> 
> Aaron

At first I thought neither was terribly important. Looking closer, I
think I'd like to see the set of file ids, and something different than
list(wt).

John
=:->



More information about the bazaar mailing list