[MERGE] is_ignored improvements...

John Arbash Meinel john at arbash-meinel.com
Sat May 20 17:57:15 BST 2006


Jan Hudec wrote:
> On Sat, May 20, 2006 at 09:35:28 -0500, John A Meinel wrote:


...

>> Well, I think the #1 pattern format is *.foo, where we are just looking
>> for some sort of extension. And looking in all directories, etc. I
>> almost wonder if we wouldn't be better off with some sort of translation
>> that changes all of the *.foo into 'path.endswith()' calls.
> 
> You can try to time it, but I don't believe it will be faster. I have more
> faith in stripping the *. from them, converting to regexps, ORing them
> together, and then wrapping the result with r'.*\.(?:%s)'. That would make
> a fourth case in the conversion switch.

Yeah, as I saw, endswith() was actually slower than the separate regexes.
regex.match() with a compiled regex is actually faster than endswith().
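
For reference, the conversion Jan describes could look roughly like the sketch below. The helper name compile_extension_patterns is hypothetical, and the trailing '$' anchor is an assumption added so that match() does not accept names like foo.pyc.txt. It also assumes that everything after the leading '*.' is literal text, with no further glob characters.

```python
import re


def compile_extension_patterns(patterns):
    """Combine simple '*.ext' globs into one compiled regex.

    Hypothetical helper: the name and the trailing '$' anchor are
    assumptions, not taken from the patch under discussion.
    """
    extensions = []
    for pattern in patterns:
        if not pattern.startswith('*.'):
            raise ValueError('not a simple extension glob: %r' % pattern)
        # Strip the leading '*.' and escape the rest so it matches literally.
        extensions.append(re.escape(pattern[2:]))
    # OR the extensions together behind a single shared '.*\.' prefix.
    return re.compile(r'.*\.(?:%s)$' % '|'.join(extensions))


ignore_re = compile_extension_patterns(['*.pyc', '*.o', '*.so'])
print(bool(ignore_re.match('build/lib/foo.pyc')))  # True
print(bool(ignore_re.match('src/foo.py')))         # False
```

Patterns that are not of the plain '*.ext' form would still have to go through the other cases of the conversion switch.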

...

>> I also tried this:
>> compPrefix = [re.compile('.*\\.(?:' +
>>         '|'.join(['(?:%s$)' % i for i in range(0, max)])
>>         + ')' )]
>> (Factoring out the '.*\.' prefix), and I found that it doubles the
>> performance:
>>
>> # For foo.19
>> $ python ,time-matches.py
>> NoMatch: 0.572
>> NoMatchSplit: 0.568
>> MatchSplit: 0.740
>> Separate: 2.792
>> Prefix: 0.292
>> PrefixSplit: 0.288
>> Endswith: 3.509
>>
>> So I think there is stuff worth looking into.
> 
> Yes. Seems to make quite a difference.
> 
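
The ,time-matches.py script itself is not included in this thread, so the following is only a rough reconstruction of three of the variants (Separate, Prefix, Endswith) under stated assumptions: COUNT, PATH and the function names are illustrative, and the numeric "extensions" follow the quoted snippet. The NoMatch and *Split variants are not reproduced because their definitions do not appear here. Absolute timings will depend on the Python version and machine, so no output is shown.

```python
import re
import timeit

# Assumed stand-in for the benchmark: numeric "extensions" 0..COUNT-1,
# as in the quoted snippet. COUNT and PATH are illustrative values.
COUNT = 100
EXTENSIONS = [str(i) for i in range(COUNT)]
PATH = 'foo.19'

# One compiled regex per extension, tried in turn.
separate = [re.compile(r'.*\.%s$' % ext) for ext in EXTENSIONS]
# A single regex with the '.*\.' prefix factored out of the alternation.
prefix = re.compile(r'.*\.(?:%s)$' % '|'.join(EXTENSIONS))
# Plain string suffixes for the endswith() variant.
suffixes = ['.' + ext for ext in EXTENSIONS]


def match_separate(path):
    for regex in separate:
        if regex.match(path):
            return True
    return False


def match_prefix(path):
    return prefix.match(path) is not None


def match_endswith(path):
    for suffix in suffixes:
        if path.endswith(suffix):
            return True
    return False


if __name__ == '__main__':
    for name, func in [('Separate', match_separate),
                       ('Prefix', match_prefix),
                       ('Endswith', match_endswith)]:
        seconds = timeit.timeit(lambda: func(PATH), number=10000)
        print('%-9s %.3f' % (name + ':', seconds))
```

Changing PATH to a name that matches a late pattern, or to one that matches nothing, gives the other rows of such a comparison.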

Yep. I would also say that it would be nice if we put commonly hit
patterns early, since I also saw a big difference between foo.19 and
foo.999 (more of a difference than even the difference between 100, 1k
and 2k patterns).
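
A minimal sketch of that ordering idea, assuming we had per-extension hit counts available (the hit_counts mapping and the function name are hypothetical; nothing in the current code collects such counts). Python's re module generally tries the alternatives left to right, so putting the most frequently hit extension first should let common names match sooner, though that is worth timing rather than assuming.

```python
import re


def build_ordered_regex(hit_counts):
    """Build one prefix-factored regex, most frequently hit extension first.

    `hit_counts` maps extension -> number of hits; it is a hypothetical
    input used only for illustration.
    """
    # Sort by descending hit count, breaking ties alphabetically so the
    # result is deterministic.
    ordered = sorted(hit_counts, key=lambda ext: (-hit_counts[ext], ext))
    alternation = '|'.join(re.escape(ext) for ext in ordered)
    return ordered, re.compile(r'.*\.(?:%s)$' % alternation)


ordered, regex = build_ordered_regex({'bak': 3, 'pyc': 120, 'o': 45, 'swp': 3})
print(ordered)  # ['pyc', 'o', 'bak', 'swp']
```

Reordering only changes how quickly a match is found, not which names match, because every alternative is anchored with the same trailing '$'.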

John
=:->

