[PATCH][MERGE] Improvements to is_ignored

John A Meinel john at arbash-meinel.com
Wed Jan 11 04:00:55 GMT 2006


Jan Hudec wrote:
> Hello All,
> 
> I refactored WorkingTree.is_ignored to compile all patterns to one big regexp
> and match against that.
> 
> The method was split in two, is_ignored and is_ignored_by. is_ignored now
> returns only a boolean value (the match object actually, but it's mostly
> useless) and uses non-capturing groups in the pattern to avoid the
> bookkeeping of captures. is_ignored_by returns the (last) matching
> pattern for the cases where it is used.
> 
> Due to how match objects work, the _last_ matching pattern is returned from
> is_ignored_by.
> 
> The _glob_to_regex function still uses the same logic as fnmatch. However, it
> should now be easy to replace with a better converter (which is, however,
> pretty hard to write).
> 
> Brief measurements on the bzr tree itself show that it is faster by some 10%.
> If you can, please compare the performance on some large tree with many
> ignore patterns. I don't have any such tree lying around here.
> 
> The branch is on http://www.ucw.cz/~bulb/bzr/bzr.ignore
> (revision 1562; pushed, so no working tree)
> 

Here are my results:
time bzr status
real    0m8.374s
user    0m7.840s
sys     0m0.404s

time bzr.ignore/bzr status
real    0m5.497s
user    0m4.988s
sys     0m0.412s

So it shaved roughly a third off the wall-clock time (8.4s down to 5.5s).
This is with 71 ignore patterns and 805 ignored files.
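
For anyone following along, the single-regexp idea can be sketched roughly
like this. It is only an illustration, not the code from Jan's branch:
compile_ignore_patterns and this standalone is_ignored are hypothetical
names, and I'm assuming fnmatch.translate() for the glob-to-regex step.

```python
import fnmatch
import re


def compile_ignore_patterns(patterns):
    """Combine glob patterns into one regexp (hypothetical helper).

    Each translated glob is wrapped in a non-capturing group "(?:...)"
    so the wrapper itself adds no numbered groups.
    """
    if not patterns:
        return None
    combined = "|".join("(?:%s)" % fnmatch.translate(p) for p in patterns)
    return re.compile(combined)


def is_ignored(compiled, filename):
    """Return True if filename matches any of the combined patterns."""
    return compiled is not None and compiled.match(filename) is not None


ignore_re = compile_ignore_patterns(["*.pyc", "*~", "build"])
print(is_ignored(ignore_re, "foo.pyc"))
print(is_ignored(ignore_re, "foo.py"))
```

The point is that each filename is tested with one match() call instead of
one call per pattern.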

I also found an interesting problem if you don't use (?:), specifically:
bzr: ERROR: exceptions.AssertionError: sorry, but this version only
supports 100 named groups
  at /usr/lib/python2.4/sre_compile.py line 506
  in compile

However, that probably means that is_ignored_by fails:
$ time ~/bzr/mirrors/hudec/bzr.ignore/bzr add
bzr: ERROR: exceptions.AssertionError: sorry, but this version only
supports 100 named groups
  at /usr/lib/python2.4/sre_compile.py line 506
  in compile

real    0m2.554s
user    0m2.360s
sys     0m0.172s
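
To show where the groups come from, here is a small sketch. The regex
fragments are hand-written for illustration (they are not the output of
_glob_to_regex), and the variable names are made up:

```python
import re

# Hand-written regex fragments, for illustration only.
fragments = [r".*\.pyc", r".*~", r"build"]

# One capturing group per pattern: needed if you want to ask the match
# object which alternative matched.
capturing = re.compile("|".join("(%s)" % f for f in fragments))

# Non-capturing groups: same alternation, no numbered groups at all.
non_capturing = re.compile("|".join("(?:%s)" % f for f in fragments))

print(capturing.groups)      # 3
print(non_capturing.groups)  # 0

# With capturing groups, lastindex identifies the group that matched.
m = capturing.match("foo.pyc")
print(m.lastindex)           # 1
```

So any approach that reports the matching pattern through capture groups
spends at least one group per pattern, and that is what runs into the
group limit in the traceback above.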

So we probably need to do is_ignored_by differently. It is called
infrequently enough that it might be fine to just iterate over the patterns.
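
Something along these lines, as a rough sketch. The function signature is my
assumption (the real method lives on WorkingTree), and I'm using
fnmatch.fnmatchcase() as a stand-in for whatever matching the branch does.
It keeps the "last matching pattern wins" behaviour Jan described:

```python
import fnmatch


def is_ignored_by(patterns, filename):
    """Return the last pattern that matches filename, or None.

    Plain loop over the patterns, so no capture groups are needed
    and there is no limit on how many patterns can be used.
    """
    matched = None
    for pat in patterns:
        if fnmatch.fnmatchcase(filename, pat):
            matched = pat  # keep going so a later match overrides
    return matched


patterns = ["*.pyc", "foo.*", "*~"]
print(is_ignored_by(patterns, "foo.pyc"))
print(is_ignored_by(patterns, "README"))
```

This is linear in the number of patterns, but only for the callers that
actually need to know which pattern matched.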

John
=:->
