non-recursive status of a directory?
Mark Hammond
mhammond at skippinet.com.au
Sat Jun 7 08:05:29 BST 2008
> I should mention that this was a clean tree, on decent hardware with a
> hot cache.
...
> Also, 'iter_changes()' is certainly capable of doing just
> subdirectories, so if you are only viewing a 5,000 file
> subtree of the 55,000, you're still good.
The common use-case for tortoise is browsing in from the root of the drive. Thus the *parent* of your repository will be seen first, and that parent may have a large number of repositories under it (e.g., the '\src' directory I mentioned in my recent post).
As a result, the caches are not likely to be hot, and we really don't want to assume excellent hardware. I know that even on my fast disks, a stat() of every file in a Mozilla tree is going to take more than a few seconds.
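
For anyone who wants to put a number on the stat() cost for their own tree, a minimal sketch follows. The function name time_stat_walk is made up for illustration, it is written for current Python 3 rather than the Python of this thread, and a second run will mostly measure a hot cache, so only the first run after a reboot says much about the cold case.

```python
import os
import time


def time_stat_walk(root):
    """Stat every file under root; return (file_count, elapsed_seconds).

    Hypothetical helper for illustration only; it is not part of bzr
    or of any Tortoise code.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    files, seconds = time_stat_walk(".")
    print("stat()ed %d files in %.2f seconds" % (files, seconds))
```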
> Now.... we probably have a Unicode encoding performance issue on Windows.
> Specifically, when we 'os.listdir()' we get back Unicode names. Which is
> probably slower than it could be (I haven't benchmarked it on win32),
If I understand correctly, the file system is likely to be natively Unicode, so calling the wide API is probably the fastest way to get that information from Windows. If you call the ASCII version of the API, I expect Windows is just calling WideCharToMultiByte() for you, so it's unlikely to be faster (indeed, an off-the-cuff benchmark shows os.listdir('') 25% slower than os.listdir(u'') on a dir with ~500 items). The Python unicode object already seems to have a C-implemented utf8 encoding with optimizations for ascii and latin1, so I (helpfully :) can't see anything obvious to exploit here...
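
A rough sketch of how such an off-the-cuff comparison could be repeated is below. It is not the benchmark I ran: it assumes current Python 3, where the closest analogue of '' versus u'' is a bytes path versus a str path, and compare_listdir is a made-up name. Results will vary with the platform, the Python version and the directory, so treat any numbers as indicative only.

```python
import os
import timeit


def compare_listdir(path=".", repeat=200):
    """Time os.listdir() with a str path and with a bytes path.

    Returns (str_seconds, bytes_seconds) for `repeat` calls each.
    Hypothetical helper for illustration only.
    """
    str_path = os.fsdecode(path)      # listdir() returns str names
    bytes_path = os.fsencode(path)    # listdir() returns bytes names
    t_str = timeit.timeit(lambda: os.listdir(str_path), number=repeat)
    t_bytes = timeit.timeit(lambda: os.listdir(bytes_path), number=repeat)
    return t_str, t_bytes


if __name__ == "__main__":
    t_str, t_bytes = compare_listdir(".")
    print("str path:   %.4f s" % t_str)
    print("bytes path: %.4f s" % t_bytes)
```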
Cheers,
Mark