[MERGE] Streamline _walkdirs_utf8 for utf8 file systems, reducing time to traverse a mozilla tree from 1s to .6 seconds. (Robert Collins)

Wed Sep 10 20:41:05 BST 2008

On Wed, 2008-09-10 at 12:13 -0500, John Arbash Meinel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Robert Collins wrote:
> > So, this is an updated patch of the one John tweaked to work on windows.
> > The windows stuff has not been changed further, so it should still work
> > fine.
> > 
> > Its faster :)
> > 
> > Before, I see 1000ms to walkdirs a mozilla tree, with it, I see 612ms.
> > In 'st' terms, thats 3.84 seconds down to 3.34. Or half a second saved,
> > ~14% total time reduction.
> > 
> > Enjoy.
> > 
> > -Rob
> > 
> > 
> 
> You are doing a bunch of PyOS_snprintf calls, but not actually using the
> pathstart or abspath strings.
> 
> For each directory you go into, you end up with a 'chdir' in, and then a
> 'chdir' back to the original directory. That seems *really* inefficient. Can
> you instead change the outer loop so that either:
> 
> a) Pass in a file handle to 'fchdir()' back to
> b) Save the location at the beginning of looping, and chdir back at the end of
> looping?

Its nicely encapsulated at the moment, and threadsafe enough that I'd
rather not do either of these things for now. I'm saving 40% of the time
we used to spend doing walkdirs -> no need to aim for perfection in one
step. Doing a) requires a leaky abstraction I feel, and can leak fd's,
b) is what I do basically, just with a chdir at the end of each dir.

> As you mention, this is only 14% reduction, and how much of that is directly
> because of the chdir() and how much of it is other tuning? If you feel it is
> good and import, I can accept it. It just feels... unclean, to change the
> global state. If it is 10/14% then I'm fine with it, if it is 2/14% then I'm
> not so sure. It seems like something we should come back to, rather than
> something we should depend explicitly on now.

Its a 40% reduction in the altered layer. I think that is a lot! Its
preserving global state because it chdirs back before control is
returned to the caller. The 14% reduction is because this layer was only
30%ish of total st time.

> Anyway, given you probably feel strongly towards it, I'll review the rest:
> 
> +cdef class _Stat:
> +    """Represent a 'stat' result."""
> +
> +    cdef readonly int st_mode
> +    # nanosecond time definitions use MACROS, fucking up te ability to have a
> +    # variable called st_mtime. Yay Ulrich, thanks a lot.
> +    cdef readonly time_t _ctime
> 
> ^- I believe we try to avoid profanity, and regardless "the" is misspelled.

My bad, I'm normally better than this. I was just flabbergasted at the
time.

> Also, why not just add:
> 
> cdef extern "readdir.h":
>   pass
> 
> And in 'readdir.h':
> #undef st_mtime

Won't work, we can't access the st_mtime field of struct stat if we do
that because it doesn't exist anymore. (Look at __USE_MISC in
bits/stat.h).

> Also, regardless I think it would be good to unify our extension "_Stat" objects.

Agreed.

> A few other things...
> 
> You tend to use "int" in a lot of places where you should be using Py_ssize_t.
> And for most cases "st_size" is actually a 64-bit integer (or PyLong as you
> prefer).

True.

> You are still using:
> 
> dot = ord('.')
> 
> instead of c'.', not a huge issue, but I feel like:

I forgot the spelling in pyrex of c'.' :).

> 1) Clean up the snprintf stuff if you aren't going to be using it.

agreed,

> 2) Clean up types. Use Py_ssize_t for python API's, 'off_t' for stat objects, etc.

Well, I used the same types you did for stat objects on windows, if
we're unifying them it seems strange to change. As for python api's, the
C code is correct, the types just hint what magic pyrex should invoke to
get values from python on auto conversion. Happy to change specific
issues, but I can't tell a-priori which ones you spotted.

> 3) Make sure we really need to do chdir(). If you are confident, then we can
> do so.

Yes, we do.

> 4) Consider finding a way to not have to continually reset the cwd using a
> path. Such as using 'fchdir()'. Also, we know the inode for the directory, as
> we found it in the previous pass. Is there any way we could use that to speed
> up the initial chdir(dir_to_look_in)?
> 
> I also wonder if using a relative chdir() would be faster than using an
> "absolute" path each time.

We can look at passing state in-out to help with this, I think it may
give a better result, but this code isn't worse than what we have :).

> 5) Alternatively, perhaps we make the api for DirReaders require that callers
> use a 'finalize()' style statement. And *that* would reset the current working
> directory.

Such apis are more fragile, if we can avoid them I'd like to.

I'm off today and tomorrow, so if someone wants to just whip through and
do the things we've agreed on here, that would be grand.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080911/5d123a1a/attachment.pgp