[RFC/MERGE] Updates to decorators

Sun Jan 14 05:09:52 GMT 2007

On 11 Jan 2007, John Arbash Meinel <john at arbash-meinel.com> wrote:
> I have 2 updates to our current decorators, one follows on the other
> one, but is a little more controversial.
> 
> Basically, I've been doing lsprof testing, and I've been getting
> frustrated with the 'read_locked()' functions. Because in a big test,
> you can have 10 or more @needs_read_lock functions. So when you are
> tracking up from the bottom, or down from the top, you frequently hit a
> barrier which makes it difficult to tell who is calling what. (10
> functions call "read_locked()" and read_locked calls 10 other functions,
> so which is being called without going to the source?)
> 
> The first pass changes the needs_read_lock and needs_write_lock
> decorators so that they use 'exec' to create a function with the a
> mostly correct name (plus a _read_locked suffix). In testing, it was
> enough to make sure that functions could be uniquely determined. In
> theory if we have Branch.get() and Repository.get() they would end up
> having the same function signatures.
> 
> As a small blurb this is a snippet from the old 'bzr st --lsprof':
> 
>     2   0 0.2330 0.0337   bzrlib.tree:461(_iter_changes)
>  +528   0 0.0576 0.0168   +bzrlib.workingtree:1307(_comparison_data)
>  +482   0 0.1104 0.0092   +bzrlib.decorators:35(read_locked)
> +1058   0 0.0122 0.0091   +bzrlib.inventory:947(iter_entries_by_dir)
>  +482   0 0.0076 0.0054   +bzrlib.revisiontree:84(get_file_sha1)
> 
>   500 488 0.4910 0.0098   bzrlib.decorators:35(read_locked)
>    +1   0 0.1445 0.0389   +bzrlib.branch:1104(revision_history)
>  +488   0 0.0316 0.0103   +bzrlib.workingtree:1846(unlock)
>  +488   0 0.0318 0.0098   +bzrlib.workingtree:1348(lock_read)
>  +482   0 0.0393 0.0049   +bzrlib.workingtree:588(get_file_sha1)
>    +5   0 0.0018 0.0002   +bzrlib.lockable_files:181(get_utf8)
>    +3   0 0.0017 0.0001   +bzrlib.workingtree:1801(_last_revision)
>    +1   0 0.1069 0.0001   +bzrlib.workingtree:1447(read_working_in...
>    +1   0 0.0006 0.0001   +bzrlib.workingtree:1833(conflicts)
>    +4   0 0.0013 0.0001   +bzrlib.lockable_files:175(get)
>    +9   0 0.0001 0.0001   +bzrlib.lockable_files:247(unlock)
>    +9   0 0.0001 0.0001   +bzrlib.lockable_files:231(lock_read)
>    +1   0 0.2337 0.0000   +bzrlib.tree:429(compare)
>    +1   0 0.0001 0.0000   +bzrlib.branch:1055(unlock)
>    +1   0 0.0000 0.0000   +bzrlib.branch:1047(lock_read)
>    +1   0 0.0002 0.0000   +bzrlib.inter:103(unlock)
>    +1   0 0.0002 0.0000   +bzrlib.repository:902(is_shared)
>    +1   0 0.0001 0.0000   +bzrlib.inter:82(lock_read)
>    +1   0 0.0001 0.0000   +bzrlib.repository:236(lock_read)
>    +1   0 0.0001 0.0000   +bzrlib.repository:295(unlock)
> 
> 
> and after the changes:
> 
>     2 0 0.2365 0.0329   bzrlib.tree:461(_iter_changes)
>  +528 0 0.0588 0.0165   +bzrlib.workingtree:1307(_comparison_data)
>  +482 0 0.1119 0.0097   +<<string>>:1(get_file_sha1_read_locked)
> +1058 0 0.0120 0.0090   +bzrlib.inventory:947(iter_entries_by_dir)
>  +482 0 0.0089 0.0065   +bzrlib.revisiontree:84(get_file_sha1)
> 
>   482 0 0.1119 0.0097   <<string>>:1(get_file_sha1_read_locked)
>  +482 0 0.0309 0.0098   +bzrlib.workingtree:1348(lock_read)
>  +482 0 0.0311 0.0096   +bzrlib.workingtree:1846(unlock)
>  +482 0 0.0402 0.0051   +bzrlib.workingtree:588(get_file_sha1)
> 
> 
> I hope that you can see why the new form is easier for me to read and
> follow the trace.

Yes, that's definitely better.  Clever hack.

> I'm not 100% happy with having to recompile everything on startup, as it
> does have some overhead. It does help documentation and lsprof results,
> though. So it might be worth having around somehow.
> 
> I did some performance timing, and the results are:
> 	      bzr.dev  patch1  patch2
> bzr rocks     162ms    168ms   176ms
>  --no-plugins  95ms     96ms    96ms
> bzr status    443ms    462ms   476ms
>  --no-plugins 415ms    439ms   447ms

Since the point is to speed things up, I don't think this should be on
by default.

> Basically, the performance impact is large enough that I'm not sure we
> want to do it. But the improvements to documentation and lsprof seem
> like it would be worth keeping around *somehow*.
> 
> One other possibility, is that we could change the definition of
> 'needs_read_lock' based on whether --lsprof is supplied (or some other
> flag in the case of generating documentation). We could do something
> like what we do right now in 'bzr' with 'bzrlib.inspect_for_copy'. And
> have the 'bzr' front-end switch the definitions to the faster form while
> it is loading bzrlib.
> 
> This would mean that users importing bzrlib directly would load a little
> slower, but with better documentation (which is just fine from the
> interactive interpreter, or when generating epydocs), but 'bzr' the
> command line program would not have to pay the penalty.

Yes I think that would be the best thing.  There is already precedent in 

profiling = False
if '--profile-imports' in sys.argv:
    sys.argv.remove('--profile-imports')
    import profile_imports
    profile_imports.install()
    profiling = True

(Actually both cases could be a bit more conservative by looking for the global
options only up to the first non-option argument...)

-- 
Martin