[RFC/MERGE] Updates to decorators
Martin Pool
mbp at canonical.com
Sun Jan 14 05:09:52 GMT 2007
On 11 Jan 2007, John Arbash Meinel <john at arbash-meinel.com> wrote:
> I have 2 updates to our current decorators, one follows on the other
> one, but is a little more controversial.
>
> Basically, I've been doing lsprof testing, and I've been getting
> frustrated with the 'read_locked()' functions. Because in a big test,
> you can have 10 or more @needs_read_lock functions. So when you are
> tracking up from the bottom, or down from the top, you frequently hit a
> barrier which makes it difficult to tell who is calling what. (10
> functions call "read_locked()" and read_locked calls 10 other functions,
> so which is being called without going to the source?)
>
> The first pass changes the needs_read_lock and needs_write_lock
> decorators so that they use 'exec' to create a function with the a
> mostly correct name (plus a _read_locked suffix). In testing, it was
> enough to make sure that functions could be uniquely determined. In
> theory if we have Branch.get() and Repository.get() they would end up
> having the same function signatures.
>
> As a small blurb this is a snippet from the old 'bzr st --lsprof':
>
> 2 0 0.2330 0.0337 bzrlib.tree:461(_iter_changes)
> +528 0 0.0576 0.0168 +bzrlib.workingtree:1307(_comparison_data)
> +482 0 0.1104 0.0092 +bzrlib.decorators:35(read_locked)
> +1058 0 0.0122 0.0091 +bzrlib.inventory:947(iter_entries_by_dir)
> +482 0 0.0076 0.0054 +bzrlib.revisiontree:84(get_file_sha1)
>
> 500 488 0.4910 0.0098 bzrlib.decorators:35(read_locked)
> +1 0 0.1445 0.0389 +bzrlib.branch:1104(revision_history)
> +488 0 0.0316 0.0103 +bzrlib.workingtree:1846(unlock)
> +488 0 0.0318 0.0098 +bzrlib.workingtree:1348(lock_read)
> +482 0 0.0393 0.0049 +bzrlib.workingtree:588(get_file_sha1)
> +5 0 0.0018 0.0002 +bzrlib.lockable_files:181(get_utf8)
> +3 0 0.0017 0.0001 +bzrlib.workingtree:1801(_last_revision)
> +1 0 0.1069 0.0001 +bzrlib.workingtree:1447(read_working_in...
> +1 0 0.0006 0.0001 +bzrlib.workingtree:1833(conflicts)
> +4 0 0.0013 0.0001 +bzrlib.lockable_files:175(get)
> +9 0 0.0001 0.0001 +bzrlib.lockable_files:247(unlock)
> +9 0 0.0001 0.0001 +bzrlib.lockable_files:231(lock_read)
> +1 0 0.2337 0.0000 +bzrlib.tree:429(compare)
> +1 0 0.0001 0.0000 +bzrlib.branch:1055(unlock)
> +1 0 0.0000 0.0000 +bzrlib.branch:1047(lock_read)
> +1 0 0.0002 0.0000 +bzrlib.inter:103(unlock)
> +1 0 0.0002 0.0000 +bzrlib.repository:902(is_shared)
> +1 0 0.0001 0.0000 +bzrlib.inter:82(lock_read)
> +1 0 0.0001 0.0000 +bzrlib.repository:236(lock_read)
> +1 0 0.0001 0.0000 +bzrlib.repository:295(unlock)
>
>
> and after the changes:
>
> 2 0 0.2365 0.0329 bzrlib.tree:461(_iter_changes)
> +528 0 0.0588 0.0165 +bzrlib.workingtree:1307(_comparison_data)
> +482 0 0.1119 0.0097 +<<string>>:1(get_file_sha1_read_locked)
> +1058 0 0.0120 0.0090 +bzrlib.inventory:947(iter_entries_by_dir)
> +482 0 0.0089 0.0065 +bzrlib.revisiontree:84(get_file_sha1)
>
> 482 0 0.1119 0.0097 <<string>>:1(get_file_sha1_read_locked)
> +482 0 0.0309 0.0098 +bzrlib.workingtree:1348(lock_read)
> +482 0 0.0311 0.0096 +bzrlib.workingtree:1846(unlock)
> +482 0 0.0402 0.0051 +bzrlib.workingtree:588(get_file_sha1)
>
>
> I hope that you can see why the new form is easier for me to read and
> follow the trace.
Yes, that's definitely better. Clever hack.
> I'm not 100% happy with having to recompile everything on startup, as it
> does have some overhead. It does help documentation and lsprof results,
> though. So it might be worth having around somehow.
>
> I did some performance timing, and the results are:
> bzr.dev patch1 patch2
> bzr rocks 162ms 168ms 176ms
> --no-plugins 95ms 96ms 96ms
> bzr status 443ms 462ms 476ms
> --no-plugins 415ms 439ms 447ms
Since the point is to speed things up, I don't think this should be on
by default.
> Basically, the performance impact is large enough that I'm not sure we
> want to do it. But the improvements to documentation and lsprof seem
> like it would be worth keeping around *somehow*.
>
> One other possibility, is that we could change the definition of
> 'needs_read_lock' based on whether --lsprof is supplied (or some other
> flag in the case of generating documentation). We could do something
> like what we do right now in 'bzr' with 'bzrlib.inspect_for_copy'. And
> have the 'bzr' front-end switch the definitions to the faster form while
> it is loading bzrlib.
>
> This would mean that users importing bzrlib directly would load a little
> slower, but with better documentation (which is just fine from the
> interactive interpreter, or when generating epydocs), but 'bzr' the
> command line program would not have to pay the penalty.
Yes I think that would be the best thing. There is already precedent in
profiling = False
if '--profile-imports' in sys.argv:
sys.argv.remove('--profile-imports')
import profile_imports
profile_imports.install()
profiling = True
(Actually both cases could be a bit more conservative by looking for the global
options only up to the first non-option argument...)
--
Martin
More information about the bazaar
mailing list