[MERGE] Fast _walkdirs for win32

Alexander Belchenko bialix at ukr.net
Thu Jul 17 19:24:16 BST 2008


John Arbash Meinel пишет:
> Alexander Belchenko wrote:
> 
> |
> | 2) But anyway it failed because linker cannot find Win32 API functions:
> |
> 
> ....
> 
> | bzrlib\_walkdirs_win32.pyd : fatal error LNK1120: 4 unresolved externals
> | Building of "bzrlib._walkdirs_win32" extension failed, will use the
> | Python version instead
> |
> |
> | I found that if I compile _walkdirs_win32.c with custom setup.py
> | (without involving Pyrex at all)
> | I get the same error. When I looked at the generated C code I saw this
> | definition:
> |
> | #ifndef WIN32
> |   #define __stdcall
> |   #define __cdecl
> | #endif
> |
> | If I manually add
> |
> | #define WIN32_LEAN_AND_MEAN
> | #include <windows.h>
> |
> | before mentioned lines *then* module compiled successfully without any
> | warning or error.
> | I don't know is it Pyrex bug or not.

As I explain in another mail I think it's probably problems with my compiler,
but this problems reveals bug in the Pyrex. I sent the patch that allows me to
finally successfully compile your pyx.

> 
> ....
> 
> | +    cdef int GetLastError()
> | +
> | +    # Wide character functions
> | +    DWORD wcslen(WCHAR *)
> |
> | ^-- You don't need to use cdef's in this block, because you're using
> | "cdef extern from" construct at the top. They are don't seem to harm
> | though.
> 
> I'm not sure which block you mean, but I'm directly accessing the
> members, so I have to tell Pyrex what their types are.

I mean you could write this:

+cdef extern from "_walkdirs_win32.h":
+    struct _HANDLE:
+        pass
+    ctypedef _HANDLE *HANDLE
+    ctypedef unsigned long DWORD
+    ctypedef long long __int64
+    ctypedef unsigned short WCHAR
+    struct _FILETIME:
+        DWORD dwHighDateTime
+        DWORD dwLowDateTime
+    ctypedef _FILETIME FILETIME
+
+    struct _WIN32_FIND_DATAW:
+        DWORD dwFileAttributes
+        FILETIME ftCreationTime
+        FILETIME ftLastAccessTime
+        FILETIME ftLastWriteTime
+        DWORD nFileSizeHigh
+        DWORD nFileSizeLow
+        # Some reserved stuff here
+        WCHAR cFileName[260] # MAX_PATH
+        WCHAR cAlternateFilename[14]
+
+    # We have to use the typedef trick, otherwise pyrex uses:
+    #  struct WIN32_FIND_DATAW
+    # which fails due to 'incomplete type'
+    ctypedef _WIN32_FIND_DATAW WIN32_FIND_DATAW
+
+    HANDLE INVALID_HANDLE_VALUE
+    HANDLE FindFirstFileW(WCHAR *path, WIN32_FIND_DATAW *data)
+    int FindNextFileW(HANDLE search, WIN32_FIND_DATAW *data)
+    int FindClose(HANDLE search)

According to Pyrex documentation "cdef extern from..." imply
"cdef" keyword for all definitions in this block. But this is not big deal.
And please, don't call me nitpicking, I'm just boring.

> | === modified file 'bzrlib/osutils.py'
> | --- bzrlib/osutils.py    2008-06-11 03:56:46 +0000
> | +++ bzrlib/osutils.py    2008-07-17 04:49:57 +0000
> | @@ -1190,8 +1191,14 @@
> |          pass to os functions to affect the file in question. (such as
> | os.lstat)
> |      """
> |      fs_encoding = _fs_enc.upper()
> | -    if (sys.platform == 'win32' or
> | -        fs_encoding not in ('UTF-8', 'US-ASCII', 'ANSI_X3.4-1968')): #
> | ascii
> | +    if (sys.platform == 'win32'):
> | +        try:
> | +            from bzrlib._walkdirs_win32 import
> | _walkdirs_utf8_win32_find_file
> | +        except ImportError:
> | +            return _walkdirs_unicode_to_utf8(top, prefix=prefix)
> | +        else:
> | +            return _walkdirs_utf8_win32_find_file(top, prefix=prefix)
> | +    if (fs_encoding not in ('UTF-8', 'US-ASCII', 'ANSI_X3.4-1968')): #
> | ascii
> |          return _walkdirs_unicode_to_utf8(top, prefix=prefix)
> |      else:
> |          return _walkdirs_fs_utf8(top, prefix=prefix)
> |
> | ^-- Windows 98 does not have Unicode API for FindFirstFileW and Co.
> | Although I remember clearly that bzr does not support Windows 98
> | explicitly,
> | you could change your check for platform so it will be win98-friendly:
> |
> | +    if win32utils.winver == 'Windows NT':
> | +        try:
> | +            from bzrlib._walkdirs_win32 import
> | _walkdirs_utf8_win32_find_file
> | +        except ImportError:
> | +            return _walkdirs_unicode_to_utf8(top, prefix=prefix)
> | +        else:
> | +            return _walkdirs_utf8_win32_find_file(top, prefix=prefix)
> |
> | win32utils.winver == 'Windows NT' on non-windows platforms will be False
> | as well.
> |
> 
> Do we strictly depend on win32utils?

Yes. Although win32utils is lazy loaded in osutils, but actually directly used
regardless of underlying platform. There is top-level code:

if win32utils.winver == 'Windows 98':
     _win32_abspath = _win98_abspath

Probably it's wrong usage and should be fixed somehow.

> Also, do we get 'Windows NT' for stuff like Vista and beyond?

If you look at the win32utils.pt then you'll see this:

# Windows version
if sys.platform == 'win32':
     _major,_minor,_build,_platform,_text = sys.getwindowsversion()
     # from MSDN:
     # dwPlatformId
     #   The operating system platform.
     #   This member can be one of the following values.
     #   ==========================  ======================================
     #   Value                       Meaning
     #   --------------------------  --------------------------------------
     #   VER_PLATFORM_WIN32_NT       The operating system is Windows Vista,
     #   2                           Windows Server "Longhorn",
     #                               Windows Server 2003, Windows XP,
     #                               Windows 2000, or Windows NT.
     #
     #   VER_PLATFORM_WIN32_WINDOWS  The operating system is Windows Me,
     #   1                           Windows 98, or Windows 95.
     #   ==========================  ======================================
     if _platform == 2:
         winver = 'Windows NT'
     else:
         # don't care about real Windows name, just to force safe operations
         winver = 'Windows 98'
else:
     winver = None

According to MSDN we already supported Windows Vista, although I don't have it to test.

> There is the variable "os.name" which gives me just 'nt' on Vista. Can
> you confirm what it is on other platforms?

On Windows 98 this variable is 'nt' too. This value is hardcoded in python.dll.
See http://docs.python.org/lib/module-os.html#l2h-2579
That's why I'm used such complicated way to detect either bzr runs on win98 or not.

> | But of course such change is entirely for your taste.
> |
> 
> I'm happy to bring in small changes that allow supporting Win98. I
> wasn't going to base the whole work around it.

Explicit win98 support don't worth of any effort. But IMO proposed change is really small
and should not cost much for future support. But you guys know better.

> | And something indirectly related to your pyrex code.
> |
> | About a year(?) ago I was experimenting on bringing my code for fake
> | cygwin-like
> | symlinks (win32symlink plugin) into bzrlib. I was failed in the past
> | because my code required another stat call for every file, and I found
> that
> | this slowdown walkdir operations for ~80% on FAT32 and ~15% on NTFS.
> |
> | Now with your very carefully optimized pyrex extension fake symlinks
> | have another
> | chance for life. To differentiate fake symlinks from regular files
> | SYSTEM file
> | attribute is used. So in your pyrex code only one function
> | _get_mode_bits needs
> | to be tweaked. (And also providing support in some osutils functions 
> too).
> |
> 
> That would certainly be possible. If you want to look into it a bit, I
> would be interested in reviewing the code.

No chance for me, sorry.




More information about the bazaar mailing list