Performance testing of ignore changes?

John Arbash Meinel john at arbash-meinel.com
Wed Jan 6 21:52:26 GMT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


...

>> So make the list ordered, and all will be well, without needing
>> to add a double-negative operator to the syntax.

There is also the problem that there is a global ignore file
(~/.bazaar/ignore): when do those ignores take effect relative to local
ignores? (Possibly they all just take precedence...)


> 
> An ordered style, allowing interleaving would lead to a list of regexes,
> rather than (with exclusions) at most 2 regexes. We apply ignore logic
> inside the inner loop of status, and we don't currently cache the result
> of that. I'm extremely concerned about the potential for regressions on
> 100000 file trees - take a 100K file tree, with 50K ignored artifacts:
> adding one regex takes us from 50K to 100K regex matches; adding an
> arbitrary list of regexes will permit users to create slow performing
> environments unless they understand all the internals and know exactly
> what is going on: users would be able to make bzr perform badly while
> doing a normal operation.
> 

Well, we already split at 99 sub-groups, because the regex engine
requires it. Also, we use separate regexes for "*.ext" rules versus
rules with a '/' versus rules without. So it is actually 3, or 3x2=6,
regexes at a minimum, though that is still bounded, versus N regexes.
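
To make the chunking concrete, here is a rough sketch (not the actual
bzrlib code) of combining many globs into a few alternation regexes with
at most 99 capture groups each. The names MAX_GROUPS, glob_to_regex,
compile_chunks and first_match are made up for illustration, and the
glob translation is deliberately simplified ('*' matches anything,
including '/').

```python
import re

MAX_GROUPS = 99  # cap described in the message; the constant name is an assumption


def glob_to_regex(glob):
    """Simplified glob translation: '*' -> '.*', '?' -> '.'."""
    return re.escape(glob).replace(r"\*", ".*").replace(r"\?", ".")


def compile_chunks(globs, max_groups=MAX_GROUPS):
    """Compile globs into alternation regexes of at most max_groups groups."""
    chunks = []
    for start in range(0, len(globs), max_groups):
        part = globs[start:start + max_groups]
        # One capture group per glob, so lastindex identifies the rule that hit.
        source = "|".join("(%s)" % glob_to_regex(g) for g in part)
        chunks.append((re.compile(source), part))
    return chunks


def first_match(chunks, path):
    """Return the first glob that matches the whole path, or None."""
    for regex, part in chunks:
        m = regex.fullmatch(path)
        if m is not None:
            return part[m.lastindex - 1]
    return None
```

With 250 globs this yields 3 compiled regexes, so the number of regex
calls per path grows with len(globs)/99 rather than with len(globs).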

There is another option. Namely, use a single regex to see if this is
interesting at all, and then switch to a finer-grained step-by-step
matching.

This would at least help the "bzr init; bzr add" case, though it
probably wouldn't help the "bzr status" case with 50k ignored files.
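
A minimal sketch of that two-stage idea, assuming an ordered rule list
where the last matching rule wins. RULES, ANY_RULE, COMPILED and
is_ignored are hypothetical names, not anything in bzrlib.

```python
import re

# Hypothetical ordered rules: (regex source, ignored?) pairs; later rules win.
RULES = [
    (r".*\.o", True),
    (r"build/.*", True),
    (r"build/keep\.o", False),
]

# Stage 1: one combined regex answers "does any rule care about this path?"
ANY_RULE = re.compile("|".join("(?:%s)" % src for src, _ in RULES))
# Stage 2: the individual rules, kept in their original order.
COMPILED = [(re.compile(src), ignored) for src, ignored in RULES]


def is_ignored(path):
    if ANY_RULE.fullmatch(path) is None:
        return False  # fast path: a single regex call, no rule is interested
    verdict = False
    for regex, ignored in COMPILED:  # slow path: ordered, last match wins
        if regex.fullmatch(path) is not None:
            verdict = ignored
    return verdict
```

Paths that no rule mentions cost one regex call, which is the common
case right after "bzr init". Every path that some rule does mention
still pays for the full ordered walk, which is why a tree with 50k
ignored files would see little benefit.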


> In the absence of detailed data, I'd like to suggest that the simple two
> sets (ignore, exclude-from-ignores) patch that John is working on is a
> sane middle ground: it adds the feature, doesn't block on needing a new
> cache of ignored files, and will scale well.
> 
> -Rob

I generally agree here. Though we should note that he is potentially
adding a third "exclude-from-excluded-ignores". He has a reasonable use
case for it, though I don't know if in practice it is worth the effort.
You *could* just write your excluded form in a regex that doesn't match
the files you want to keep ignored. Though that is often horrible to do.
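
For illustration only, a sketch of the two approaches with made-up
patterns: three fixed sets (ignore, exclude-from-ignores,
exclude-from-excluded-ignores) versus folding the third level into the
exclusion regex with a negative lookahead. All names and patterns here
are assumptions, not the syntax of the actual patch.

```python
import re

# Hypothetical rule sets, one regex per level.
IGNORE = re.compile(r".*\.log")
EXCLUDE = re.compile(r"keep/.*")        # exclude-from-ignores
RE_IGNORE = re.compile(r"keep/tmp/.*")  # exclude-from-excluded-ignores

# The same policy with only two levels: the exclusion regex itself
# refuses to match the paths that should stay ignored.
EXCLUDE_ONLY = re.compile(r"keep/(?!tmp/).*")


def is_ignored_three_level(path):
    """At most three regex calls per path, regardless of rule count."""
    if IGNORE.fullmatch(path) is None:
        return False
    if EXCLUDE.fullmatch(path) is None:
        return True
    return RE_IGNORE.fullmatch(path) is not None


def is_ignored_two_level(path):
    """At most two regex calls, at the cost of a harder-to-read exclusion."""
    if IGNORE.fullmatch(path) is None:
        return False
    return EXCLUDE_ONLY.fullmatch(path) is None
```

Both functions give the same answers for these patterns; the lookahead
form just gets harder to write as the exclusions become less regular.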

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktFBhoACgkQJdeBCYSNAAMVWACgj3y+vZ8Z+T892+BI/6UOYBfe
RpIAoJs14U9T05ZOcorbavYk7aWYJqxL
=zPtq
-----END PGP SIGNATURE-----


