[Bug 172861] Re: add: accepts aliases due to case insensitivity

Alexander Belchenko bialix at ukr.net
Thu Nov 29 19:11:54 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> Alexander Belchenko wrote:
>> John A Meinel wrote:
>>> The other problem, is that I don't know of any way to get the real name of a file. You could use "os.listdir(os.path.dirname(path))" and then search through it for possible names, but you don't know if foO is there because of a rename, or if it is actually a different file.
>>> It would be nice if you could do:
>>> st = os.stat('foo')
>>> and have "st" have a st_name or some other property that gives you the exact name on disk.
>>> Does anyone know if that is possible?
>> I know. I recently dug through MSDN and found a way to determine the real filename without
>> using os.listdir (which is too expensive in Python). You need the pywin32 library for this, or
>> a C extension. Using ctypes is also possible, but it is too verbose in Python.
> 
>> Here is the actual code:
> 
>> import win32file
>> names_list = win32file.FindFilesW(path_in_question)  # returns list of WIN32_FIND_DATA structs
>> # if path_in_question contains the wildcard characters * or ?, we get a list of all matching files,
>> # as with the glob function
>> real_name = names_list[0][8]  # item 8 of each result is the file name as stored on disk
> 
>> Here is an example run in the bzr.dev tree:
> 
>> In [4]: import win32file
> 
>> In [5]: names_list = win32file.FindFilesW('BzR')
> 
>> In [6]: names_list[0][8]
>> Out[6]: u'bzr'
> 
>> In [7]: len(names_list)
>> Out[7]: 1
> 
> 
> We might think about using something like this. But it pains me to think about
> making 50k calls to FindFilesW just to make sure that when we stat to see if
> 'X' exists that it isn't really called 'x'....

Looking at the internal implementation of os.listdir in CPython, I'm sure that one day we should
rewrite the win32 walkdirs code and throw away os.listdir *completely*.
Because:
1) os.listdir uses the FindFiles API internally.
2) WIN32_FIND_DATA contains *all* the stat info, so the additional os.lstat in the walkdirs code
is simply redundant! On win32 we are able to produce os.listdir and os.lstat for each
item in the os.listdir output in one pass! If you remember, the additional os.lstat costs
too much on FAT32, as I discovered with my fake symlinks code.
3) IMO the walkdirs generator on win32 should emit a pair of filenames: the real name on disk
and the normalized name (lowercased for win32). Again, we will be able to produce both
in one pass and therefore get a performance win.
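
To make points 1-3 concrete, here is a rough sketch of what such a one-pass generator could look like. It is not bzr code: walk_one_pass is a hypothetical name, and it uses os.scandir from later Python versions as a stand-in for calling the FindFiles API directly. As far as I know, on Windows os.scandir fills in the stat data from the directory listing itself, so for ordinary files the entry.stat() call should not need a separate system call.

```python
import os
import tempfile


def walk_one_pass(top):
    """Yield (real_name, normalized_name, stat_result) for each entry in top.

    Hypothetical helper: the name and the tuple layout are assumptions
    made for illustration, not part of bzr's walkdirs API.
    """
    with os.scandir(top) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            # follow_symlinks=False gives lstat-like results, matching the
            # os.lstat call that walkdirs currently makes per item.
            st = entry.stat(follow_symlinks=False)
            # Real on-disk name plus the lowercased name for comparisons.
            yield entry.name, entry.name.lower(), st


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "ReadMe.TXT"), "w").close()
        for real, normalized, st in walk_one_pass(d):
            print(real, normalized, st.st_size)
```

The caller gets the real name, the normalized name and the stat result together, so it never has to call os.lstat itself.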

> 
> Maybe once we get commit to use an _iter_changes() style api, it will detect
> these in a more straightforward manner.
> 
> And then we could use something like FindFilesW for things like misses, to see
> if they really missed, or if it is just a case-insensitivity thing.
> 
> We certainly could use it for sanitizing user parameters (for things like bzr
> add, or maybe even bzr status).
> 
> Because those are usually limited (obviously if someone is scripting you may
> get a bunch, but there is still the limit of number of characters on the
> command line).
> 
> Do we have any similar function for Mac OS X? Especially since Mac likes to
> rename files based on unicode normalization. (And as someone commented it isn't
> pure NFC, so you really need to ask Mac what the name is.)
> 
> John
> =:->
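
For sanitizing a handful of command-line arguments, even the os.listdir search John mentioned earlier may be cheap enough where pywin32 is not available. A minimal portable sketch, assuming we only need to fix the last path component (on_disk_name is a hypothetical helper, not an existing bzr function, and it does nothing about the Mac unicode normalization problem):

```python
import os


def on_disk_name(directory, name):
    """Return the entry of directory that matches name, ignoring case.

    An exact match wins, so on a case-sensitive filesystem 'foo' is never
    replaced by 'foO'. Returns None when nothing matches.
    """
    entries = os.listdir(directory)
    if name in entries:
        return name
    wanted = name.lower()
    for entry in entries:
        if entry.lower() == wanted:
            return entry
    return None
```

This costs one directory listing per argument, which is fine for user-supplied paths but is exactly what we want to avoid for the 50k-file case.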
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHTw76zYr338mxwCURAkyVAJ4s0jdxXyruwBHb6pC2YiHxUCSdiACeLl9E
+KYVd+OfFAvdks1J0GccdFs=
=ilT0
-----END PGP SIGNATURE-----


