Alternate glob matcher for .bzrignore
John Arbash Meinel
john at arbash-meinel.com
Sat Jan 7 19:40:28 GMT 2006
John Arbash Meinel wrote:
> Now with the actual attachments :)
>
> John
> =:->
>
>
> John Arbash Meinel wrote:
>
>>In my encodings branch, I found that fnmatch doesn't match unicode
>>characters.
>>
>>So if you do:
>>$ echo 'test' > Bågfors.txt
>>$ bzr unknowns
>>"Bågfors.txt"
>>$ bzr ignore ./Bågfors.txt
>>$ cat .bzrignore
>>./Bågfors.txt
>>$ bzr unknowns  # This is what fails
>>"Bågfors.txt"
>>
>>We had discussed in the past changing the matcher so that it would
>>create one big pattern, and then from that, we would check all paths one
>>time, instead of checking each file many times. (This should help with
>>paths with a large number of ignored files and patterns).
>>
>>I did some work to implement it, basically creating a new translator
>>from glob patterns into regular expressions. I also changed the
>>behavior so that "*" doesn't match directory separators. (It would be
>>nice if we didn't have to worry about backslash being a directory
>>separator.)
>>
>>Anyway, attached is the glob_matcher, and the test suite I wrote for it.
>>They are present in my encoding branch.
>>
>>To replace our current "is_ignored" check, we would have to do:
>>
In doing some more testing, ('**/' + pat) may not work, because it
probably wants at least one directory separator to exist. In regular
expression terms we want '(.*/)?'.
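
To make the difference concrete, here is a minimal sketch in Python. The
two regexes are written by hand for illustration and are an assumption;
they are not what glob_matcher actually generates.

```python
import re

# Illustrative regexes only; not the output of glob_matcher.
# A '**/' prefix translated literally requires at least one '/'.
requires_dir = re.compile(r'^.*/foo\.txt$')
# Making the directory prefix optional also matches a top-level file.
optional_dir = re.compile(r'^(.*/)?foo\.txt$')


def matches(regex, path):
    """Return True if the whole path matches the compiled regex."""
    return regex.match(path) is not None


for path in ('foo.txt', 'sub/foo.txt', 'a/b/foo.txt', 'sub/notfoo.txt'):
    print(path, matches(requires_dir, path), matches(optional_dir, path))
```

The top-level 'foo.txt' is matched only by the second regex, which is the
case a bare pattern in .bzrignore needs to cover.
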
I could write another globs_to_matcher() which would understand that if
there is no '/' in the pattern, it needs to prepend the above to the
regular expression.
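
Roughly, that first option might look like the sketch below. The names
glob_to_regex and globs_to_single_matcher are hypothetical, and the
translation only handles '*' and '?' (neither matching '/'), so treat it
as an outline under those assumptions rather than the attached code.

```python
import re


def glob_to_regex(pat):
    """Translate a simple glob to a regex string (hypothetical helper).

    Only '*' and '?' are handled, and neither matches '/'.
    """
    parts = []
    for ch in pat:
        if ch == '*':
            parts.append('[^/]*')
        elif ch == '?':
            parts.append('[^/]')
        else:
            parts.append(re.escape(ch))
    body = ''.join(parts)
    if '/' not in pat:
        # No directory component: allow the pattern in any directory.
        body = '(?:.*/)?' + body
    return body


def globs_to_single_matcher(patterns):
    """Combine all patterns into one regex, checked once per path."""
    combined = '|'.join('(?:%s)' % glob_to_regex(p) for p in patterns)
    regex = re.compile('^(?:%s)$' % combined)
    return lambda path: regex.match(path) is not None


matcher = globs_to_single_matcher(['*.pyc', 'build/*.o'])
print(matcher('a/b/foo.pyc'), matcher('src/build/x.o'))
```

Here '*.pyc' matches in any directory, while 'build/*.o' stays anchored
to the full path.
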
Or we could split the patterns into two styles, those with paths and
those without, and then just check two regular expressions: one against
only the trailing part of the path, and the other against the full path.
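
A sketch of that second option, again with hypothetical names
(_translate, _compile, make_two_style_matcher) and the same simplified
glob translation as an assumption:

```python
import posixpath
import re


def _translate(pat):
    # Hypothetical minimal translation: '*' and '?' never match '/'.
    out = []
    for ch in pat:
        if ch == '*':
            out.append('[^/]*')
        elif ch == '?':
            out.append('[^/]')
        else:
            out.append(re.escape(ch))
    return ''.join(out)


def _compile(patterns):
    # Return one anchored regex for all patterns, or None if there are none.
    if not patterns:
        return None
    alternatives = '|'.join('(?:%s)' % _translate(p) for p in patterns)
    return re.compile('^(?:%s)$' % alternatives)


def make_two_style_matcher(patterns):
    """Split patterns by whether they contain '/', then check two regexes."""
    name_re = _compile([p for p in patterns if '/' not in p])
    path_re = _compile([p for p in patterns if '/' in p])

    def is_ignored(path):
        # Path-less patterns only see the last component of the path.
        if name_re is not None and name_re.match(posixpath.basename(path)):
            return True
        # Patterns with a '/' are checked against the full path.
        return path_re is not None and path_re.match(path) is not None

    return is_ignored


is_ignored = make_two_style_matcher(['*.pyc', 'build/*.o'])
print(is_ignored('a/b/foo.pyc'), is_ignored('src/build/x.o'))
```

This avoids the optional-prefix trick entirely, at the cost of two regex
checks per path instead of one.
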
John
=:->
>>def get_ignore_matcher(self):
>>    if not self._ignore_matcher:
>>        patterns = []
>>        for pat in self.get_ignore_list():
>>            if '/' in pat or '\\' in pat:
>>                patterns.append(pat)
>>            else:
>>                # Patterns without a slash match in any
>>                # directory, so prefix them with '**/'.
>>                patterns.append('**/' + pat)
>>        self._ignore_matcher = globs_to_matcher(patterns)
>>    return self._ignore_matcher
>>
>>def is_ignored(self, filename):
>>    matcher = self.get_ignore_matcher()
>>    return matcher(filename)
>>
>>We would then want another function like:
>>def ignored_by(self, filename):
>>    ...
>>
>>Which could be used to figure out which pattern matched in our list of
>>ignores. I think it is worth giving up that information in the general
>>case, because operations like commit and status don't need it.
>>
>>John
>>=:->
>>