[RFC] Performance test framework

Alexandros Frantzis alexandros.frantzis at canonical.com
Fri Jun 19 10:22:05 UTC 2015


Hello all,

there have recently been a few occasions where we wanted to experiment with
performance improvements, but lacked a way to easily measure their effect.
There have also been a few occasions where different implementations of the
(supposedly) same performance test came up with different results. In light of
these issues, it would be helpful to have a common performance test framework,
to make it easy to create and share performance test scenarios.

As our codebase is C++, it would be natural to provide such a framework
in C++ as well, offering maximum flexibility and customizability.

On the other hand, sharing C++ code is not convenient, especially if we want more
casual users to be able to try out the scenarios. As an alternative, we could
provide this test framework as a python library, possibly backed by a glue C++
library as needed (note that for some scenarios a glue layer is not needed,
since we could use our demo servers and clients, or write new ones).

Here are some pros and cons of the solutions (pros of one are cons of the other
in this case):

C++:
 + Easy integration with code base
 + Customizability of server/client behavior

python (possibly backed by C++ glue layer):
 + Easy to write
 + Convenience in sharing test scenarios
 + A wealth of existing libraries for the tasks we need (e.g. python libs for
   babeltrace, uinput, statistics; see the sketch below)
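
To illustrate the leverage we would get from the Python ecosystem, here is a
rough sketch of how an Input helper (like the one in the example further down)
could be built on the python-uinput bindings. The device capabilities and
helper names are assumptions for illustration, not a proposed API:

    import time
    import uinput

    class Input:
        def __init__(self):
            # Advertise a virtual device capable of relative pointer motion
            # (capabilities chosen for illustration only)
            self.device = uinput.Device([uinput.REL_X, uinput.REL_Y])
            time.sleep(1)  # give the system a moment to notice the device

        def inject_motion(self, dx, dy):
            # syn=False batches the X movement with the Y movement that
            # follows, so they arrive as a single synthesized motion
            self.device.emit(uinput.REL_X, dx, syn=False)
            self.device.emit(uinput.REL_Y, dy)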

This is how I imagine a performance test script could look in Python (the C++
version wouldn't be very different in terms of abstractions):

    import time
    from mir import PerformanceTest, Server, Client, Input
    
    host = Server(reports=["input","compositor"])  # host server with reporting enabled
    nested = Server(host=host)                     # nested server attached to the host
    client = Client(server=nested)                 # client connected to the nested server
    
    test = PerformanceTest([host,nested,client])
    test.start()
    
    # Send an input event every 50ms
    input = Input()
    for i in range(100):
        input.inject(...)
        time.sleep(0.050)
    
    test.stop()
    
    trace = test.babeltrace()
    ... process trace and print results ...

[Note: This example is for illustration only. The specifics of the
performance framework API are not the main point of this post; we will have a
chance to discuss them further after the high-level decisions have been made.]
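
To give a flavor of what the elided processing step could look like, here is a
sketch using the babeltrace bindings and the standard statistics module. The
trace path, event name and the metric computed are made up for the sake of the
example:

    import statistics
    from babeltrace import TraceCollection

    tc = TraceCollection()
    tc.add_trace('/tmp/mir_test_trace', 'ctf')  # path is illustrative

    # Timestamps (ns) of a hypothetical per-frame tracepoint
    frames = [e.timestamp for e in tc.events
              if e.name == 'mir_server:frame_composited']

    # Frame-to-frame intervals in milliseconds
    intervals = [(b - a) / 1e6 for a, b in zip(frames, frames[1:])]
    print('mean frame interval: %.2f ms' % statistics.mean(intervals))
    print('stdev: %.2f ms' % statistics.stdev(intervals))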

The main drawback of the script-based approach (vs a pure C++ approach) is
that it's not clear how to provide custom behavior for servers and, more
importantly, clients.

If we find that our performance tests need only a small collection of behaviors,
we could make them available as configuration options:

    Client(behavior="swap_on_input")
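
In the glue layer this could boil down to a lookup table mapping behavior
names to canned client loops, along these lines (the behavior body and the
client methods are hypothetical):

    def swap_on_input(client):
        # Canned behavior: schedule a new frame for every input event
        # (client.events() and client.swap_buffers() are assumed glue APIs)
        for event in client.events():
            client.swap_buffers()

    BEHAVIORS = {'swap_on_input': swap_on_input}

Client(behavior="swap_on_input") would then simply look up and run the
matching entry.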

Alternatively, we could provide more fine-grained customization points that run
python code (which will of course increase the complexity of the glue layer):

    def on_input_received(event):  # signature illustrative
        ...  # do stuff, e.g. request a buffer swap

    Client(on_input_received=on_input_received)

So, what I would like to hear from you is:

1. Your general preference for python vs C++ for the test scripts
2. Any particular performance tests that you would like to see implemented,
   so we can get a first idea of what kinds of custom behaviors we may need
3. Any other comments, of course!

Thank you,
Alexandros


