glob-semantics on win32: windows or unix semantics?

Kuno Meyer kuno.meyer at gmx.ch
Thu Jul 12 22:46:22 BST 2007


On 12.07.2007 08:43, Martin Pool wrote:
> On 7/12/07, Kuno Meyer <kuno.meyer at gmx.ch> wrote:
> 
>> > There do not seem to be any explicit tests already.  I think it would
>> > be good to add two different tests.
>> >
>> > 1- When glob_expand_for_win32 is called, it has the right effect:
>> > expanding things that are globs and can match something.  The easiest
>> > thing is probably to make it a TestCaseInTempDir and then use
>> > build_tree to make some files to test against.
>> >
>> > 2- When we do a blackbox test on bzr add, this method does get
>> > correctly invoked.  This is a bit tricky as it's only supposed to be
>> > active on Windows.
>> > I can see a couple of options:
>> >
>> > 2a - use run_bzr_subprocess, which will give the shell a chance to do
>> > the expansion on Unix (i think).
>> >
>> > 2b - change run_bzr and rearrange the layering so that if you try to
>> > run bzr in-process with wildcards it does the expansion through this
>> > method, even on unix.
>>
>> Ok. I will provide some tests at least for case 1-, but in a separate
>> patch. Thank you for a hint how to implement it.
> 
> Thanks, that would be great.
> 

<skip>

After writing some initial tests, I think the current implementation of
win32utils.glob_expand, which tries to imitate Unix shell expansion, is
not correct on Windows. It does not handle:

- case-insensitivity
   (win32: pattern 'a' is expected to match with filename 'A')
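
To illustrate the case-insensitivity point, here is a minimal Python
sketch. The helper name matches_ignoring_case is made up for this
example and is not part of win32utils; it only assumes that lowercasing
both sides is an acceptable approximation of how Windows compares
filenames.

```python
import fnmatch


def matches_ignoring_case(name, pattern):
    """Match `name` against `pattern`, ignoring case.

    Hypothetical helper, not bzrlib code. fnmatchcase() itself is always
    case-sensitive, so both sides are lowercased before comparing.
    """
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())
```

With this, pattern 'a' matches filename 'A', whereas a plain
fnmatch.fnmatchcase('A', 'a') does not.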


In the cases of the '.' (extension separator) and the '?' wildcard I am 
not sure whether implementing the Windows semantics is the right thing, 
because this behaviour is very strange and users might be surprised:

- '?' matches zero or one character, but never '.'
   (win32: pattern 'a?' matches 'a' and 'a1')
   (unix: pattern 'a?' matches only 'a1')

   (win32: pattern 'a??' matches 'a', 'a1' and 'a11', but not 'a.1')
   (unix: pattern 'a??' matches 'a11' *and* 'a.1')

- '*.*' is equivalent to '*'
   (win32: pattern '*.*' matches 'a')
   (unix: pattern '*.*' does not match 'a')

- '*.' matches any name without an extension
   (win32: pattern '*.' matches 'a',
           as 'a' and 'a.' are identical names)
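
The Unix side of the comparison above can be checked with Python's
fnmatch module, which implements shell-style wildcards. This is only a
sketch: fnmatch is not the shell itself (for instance, it gives no
special treatment to leading dots), and the names UNIX_EXPECTATIONS and
check_unix_semantics are invented for this example. The last table entry
('*.' against 'a') is not stated in the list above and is my own
addition.

```python
import fnmatch

# (pattern, filename, expected result under Unix wildcard rules)
UNIX_EXPECTATIONS = [
    ("a?", "a", False),    # '?' needs exactly one character
    ("a?", "a1", True),
    ("a??", "a11", True),
    ("a??", "a.1", True),  # '?' may match '.'
    ("*.*", "a", False),   # a literal '.' is required
    ("*.", "a", False),    # added here: 'a' and 'a.' are different names
]


def check_unix_semantics():
    """Return the (pattern, filename) pairs where fnmatchcase disagrees."""
    return [
        (pattern, name)
        for pattern, name, expected in UNIX_EXPECTATIONS
        if fnmatch.fnmatchcase(name, pattern) != expected
    ]
```

If fnmatch behaves as described, check_unix_semantics() returns an
empty list.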


In my opinion:
1) We have to care about the case-insensitivity (try "touch a; bzr add
A; bzr st" on Windows, and you will see what I mean).
2) Patterns ending with "." seem to be treated correctly even in the
current code base.
3) For all other cases, implementing the Unix semantics seems to be the
better solution.
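
A rough sketch of what points 1) and 3) could look like together: Unix
wildcard rules, compared case-insensitively. This is not the code in
win32utils.glob_expand. The function expand_args and its parameters are
hypothetical, and it works on a plain list of names (for example the
result of os.listdir) rather than on the filesystem. Passing an
unmatched pattern through unchanged is also an assumption, modelled on
what a Unix shell usually does.

```python
import fnmatch
import re


def expand_args(args, names):
    """Expand wildcard arguments against a list of existing names.

    Hypothetical helper, not bzrlib's win32utils.glob_expand.
    """
    expanded = []
    for arg in args:
        if not any(ch in arg for ch in "*?["):
            # Not a glob: pass the argument through untouched.
            expanded.append(arg)
            continue
        # Unix wildcard rules via fnmatch.translate, case-insensitive.
        regex = re.compile(fnmatch.translate(arg), re.IGNORECASE)
        matches = sorted(name for name in names if regex.match(name))
        # Assumption: a glob that matches nothing is kept as typed.
        expanded.extend(matches or [arg])
    return expanded
```

For a directory containing 'A1', 'a.1' and 'b', the pattern 'a?' expands
to 'A1' only, 'a??' expands to 'a.1', and '*.*' no longer picks up 'b'.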

What is your opinion?

Kuno


