[MERGE] benchmark display

Tue Aug 15 03:02:58 BST 2006

Hi, heres a review of this patch. Its not quite ready for merging IMO.

There are some general stylistic issues with the patch:
 - please use self.assertTHING rather than 'assert' in the test suite.
We like our tests to run with -O [in principal]. More importantly is the
consistency with the rest of bzrlib and the improved diagnostics we get.
 - please put docstrings or comments in your tests explaining what and
why you are testing. They are rather bare at the moment.
 - please put docstrings on your functions outside the test suite too!
 - you have some duplicate code fragments in the tests that could well
be factored out - making the tests easier to read.
 - please put docstrings on classes. These are really essential when
trying to understand how you are intending to tie the various components
together.

I think that the benchmark page generator could well be a hidden
command, that way theres no new binary to install when bzr is packaged,
and there would be less boilerplate needed.

Also, as I mention below, I think doing an external css would make
sense, and make it cleaner in terms of code.

I haven't looked at the sample output yet, but I'll do that once the
current issues are worked through.

On Sat, 2006-08-12 at 21:12 +0200, Jan Balster wrote:
> === added file 'bzrlib/tests/test_benchmark_report.py'
> --- bzrlib/tests/test_benchmark_report.py       1970-01-01 00:00:00
> +0000
> +++ bzrlib/tests/test_benchmark_report.py       2006-08-12 18:27:29
> +0000
> @@ -0,0 +1,2090 @@
> +from bzrlib.tests import TestCase, TestCaseInTempDir
> +from tools import pyxml
> +import unittest
> +from tools.benchmark_report import (PerfResult, PerfTable,
> PerfResultSample,
> +                                PerfResultDelta, Page, NullResult)
> +
> +testlines = """\
> +--date 1152625530.0
> hacker at canonical.com-20060705122759-0a3481a4647b16dc
> + 1906ms
> bzrlib.benchmarks.bench_add.AddBenchmark.test_one_add_kernel_like_tree
> + 3534ms
> bzrlib.benchmarks.bench_bench.MakeKernelLikeTreeBenchmark.test_make_kernel_like_tree
> + 13595ms
> bzrlib.benchmarks.bench_checkout.CheckoutBenchmark.test_build_kernel_like_tree
> + 29846ms
> bzrlib.benchmarks.bench_commit.CommitBenchmark.test_commit_kernel_like_tree
> +
> + 156ms
> bzrlib.benchmarks.bench_inventory.InvBenchmark.test_make_10824_inv_entries
> +""".splitlines()

I think that this could be clearer with shorter sample data - 
"""package.module.test_name 100ms
...
"""
would be enough to test with, but a lot easier to read.

> +class TestPerfTable(TestCase):
> +
> +    def setUp(self):
> +        self.lines = testlines[:]
> +        self.num_test_ids = 5
> +
> +    def test_parse(self):

A docstring here saying what you are testing please.

> +        lines = self.lines[:2]
> +        t = PerfTable()
> +        data = list(t.parse(lines))
> +        assert len(data) == 1
> +        
> +        assert isinstance(data[0], PerfResult)
> +        pr = data[0]
> +        expected = PerfResult(
> +            date=1152625530.0,
> +
> revision_id='hacker at canonical.com-20060705122759-0a3481a4647b16dc',

likewise here - the sample data need not be so long.

> +            elapsed_time=1906,

...

> +    def test_multiline_parse(self):

docstring please
...

> +    def test_add_lines(self):
> +        pass

This seems either incomplete or redundant!

...

> +        file('header.html',
> 'w').write(unicode(page).encode('latin1'))  

This construct is unsafe - please be sure to use 
outfile = file('header.html', 'w')
try:
    file.write(unicode(page).encode('latin1'))
finally:
    outfile.close()

We run on windows where open files cause serious grief with test suite
cleanups.

> +    def test_table(self):
> +        p1 = PerfResultSample(
> +            [PerfResult(
> +                    elapsed_time=2, 
> +                    revision_date=124.8, 
> +                    revision=100, 
> +
> test_id=('bzrlib.benchmarks.bench_add.AddBenchmark.'
> +                                'test_one_add_kernel_like_tree'),
> +                )
> +            ]
> +        )
> +        p2 = PerfResultSample(
> +            [PerfResult(
> +                    elapsed_time=3, 
> +                    revision_date=12456.3, 
> +                    revision=200,
> +
> test_id=('bzrlib.benchmarks.bench_add.AddBenchmark.'
> +                                'test_one_add_kernel_like_tree'),
> +                ),
> +            ]
> +        )

There seems to be quite a bit of duplicate work here - I suggest a
helper routine to make this more compact. (This is a common theme you
could apply throughout your patch).

> === added file 'tools/benchmark_report.py'
> --- tools/benchmark_report.py   1970-01-01 00:00:00 +0000
> +++ tools/benchmark_report.py   2006-08-12 18:27:29 +0000
> @@ -0,0 +1,709 @@
> +#!/usr/bin/env python
> +
> +from pyxml import html as pyhtml
> +import Image
> +import ImageDraw
> +import urllib
> +import StringIO
> +import math
> +import sys
> +
> +from bzrlib.branch import Branch
> +from bzrlib.log import show_log, LongLogFormatter
> +from bzrlib.osutils import format_date
> +from bzrlib.errors import NotBranchError
> +
> +class PerfTable:
> +
> +    def __init__(self, iterlines = [], path_to_branch=None):
> +        self.path_to_branch = path_to_branch
> +        self.branch = None
> +        self._revision_cache = {}

If you take a lock_read() out on the branch, you should find that you
dont need this revision cache any more. It will make things -much-
faster.

...

> +class PerfResultDelta:
> +    stud_num = 100
> +    stude_t = [[1.282, 1.645, 1.960, 2.326, 2.576, 3.090],  # inf
> +               [3.078, 6.314, 12.706, 31.821, 63.657, 318.313], #1.
...
> +               [1.290, 1.660, 1.984, 2.364, 2.626, 3.174] # 100.
> +               ]

Whats this table for ? Do we need it embedded in the source, or can we
generate it ?
...

> +        page = pyhtml.html( 
> +            pyhtml.head(
> +                pyhtml.meta(
> +                    name="Content-Type", 
> +                    value="text/html; charset=latin1",
> +                ),
> +                pyhtml.style(
> +                    ("""

I think that the style sheet should be separate - can we refer to an
external style sheet (i.e. bzr-performance.css), include one in the bzr
source, but allow people to replace it on disk if desired ?

> ...+    def header(self, start, end):
> +        """
> +        benchmarks on bzr.dev
> +        from r1231 2006-04-01
> +        to r1888 2006-07-01
> +        """

Is this the expected output ? Or a sample output ?

> === added file 'tools/pyxml.py'
> --- tools/pyxml.py      1970-01-01 00:00:00 +0000
> +++ tools/pyxml.py      2006-08-12 18:01:19 +0000
> @@ -0,0 +1,214 @@
> +
> +"""
> +generic (and pythonic :-) xml tag and namespace objects
> +
> +the following code is extracted from the MIT licensed py library
> +(http://codespeak.net/py). 
> +""" 

I'm not sure what this is for? We are outputting html not parsing it -
could you enlarge on why its needed ?

Cheers,
Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 191 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060815/b5e94439/attachment.pgp