brainstorming for UDS-N - Performance
Matthew Tippett
matthew at phoronix.com
Sat Oct 2 14:33:55 BST 2010
On 10/1/10 2:19 PM, Kees Cook wrote:
>> Have you asked them why that is? Maybe they don't know how to automate
>> the measurement, where to host it, or who to tell about it.
> In discussions at the last UDS, it seems that most teams could not agree
> on what would be valuable to measure. For the teams that did have things
> they wanted to measure (e.g. a specific firefox rendering speed test),
> no one stepped up to automate it.
>
> In a test-driven development style, it really seems like these measurements
> must be defined and automated before work on performance can be done. The
> trouble is that the performance work is rarely being done in the same team
> that will feel the impact, so it's non-trivial to understand the effect on
> another team's performance numbers.
>
> Oddly, I want these things not to measure how awesome upcoming performance
> improvements are, but to justify security/performance trade-offs. :P
> "It's only 10% slower, but you'll never have this class of high-risk
> security vulnerability again!" :)
>
> -Kees
>
These are some critical points. I recently (yesterday) reviewed the
data that the automation around the Phoronix Test Suite (Phoromatic) has
been gathering for the last six or so months. Phoromatic tracks two
types of systems: a kernel tracker, which pulls down the daily kernel
from the repository, and a full Ubuntu distribution tracker, which
updates to the newer daily image each day.
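For anyone wanting to replicate something similar by hand, a rough
sketch of what the daily distribution tracker boils down to is below.
The test profiles, the apt-based update step and the TEST_RESULTS_NAME
batch-mode variable are my assumptions for illustration, not the actual
Phoromatic configuration:

#!/usr/bin/env python3
# Rough sketch of a daily "distribution tracker" loop. The test profiles,
# the apt update step and the batch-mode environment variable are
# illustrative assumptions, not the real Phoromatic setup.
import datetime
import os
import subprocess

# Placeholder set of Phoronix Test Suite profiles to run each day.
TESTS = ["pts/compress-gzip", "pts/apache", "pts/postmark"]

def update_system():
    # Distribution tracker case: pull in the newer daily packages.
    subprocess.run(["sudo", "apt-get", "update"], check=True)
    subprocess.run(["sudo", "apt-get", "-y", "dist-upgrade"], check=True)

def run_tests(result_name):
    env = dict(os.environ, TEST_RESULTS_NAME=result_name)
    for test in TESTS:
        # batch-benchmark runs non-interactively once batch mode has been
        # configured with `phoronix-test-suite batch-setup`.
        subprocess.run(["phoronix-test-suite", "batch-benchmark", test],
                       env=env, check=True)

if __name__ == "__main__":
    stamp = datetime.date.today().isoformat()
    update_system()
    run_tests("daily-tracker-" + stamp)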
I have thrown together (i.e. not heavily reviewed by myself or others) a
summary of some interesting data at
http://www.phoromatic.com/resources/long-term-study/ .
The tests used are a more or less random selection of those that ship
with the Phoronix Test Suite. The results themselves show very
interesting regressions all over the place.
As engineers and developers, we tend to operate on the assumption that
we need highly tuned tests that target specific subsystems and provide a
strong correlation. My view is that we shouldn't obsess over the tests
themselves, but rather focus on ensuring that we have broad and
continuous testing in place at any cost. Once you start getting the data
and conducting the analysis, it becomes clear where tests are missing,
and a byproduct of the analysis is usually more tests anyway.
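To make the analysis side concrete, here is a minimal sketch of the kind
of thing broad, continuous data lets you do: flag any test whose result
drops by more than some threshold between consecutive daily runs. The
data layout, the 5% threshold and the assumption that higher numbers are
better are all mine for the sake of the example:

# Flag day-over-day drops larger than THRESHOLD in a series of results.
# Assumes {test_name: [(date, value), ...]} ordered by date, higher = better.
THRESHOLD = 0.05

def find_regressions(results):
    regressions = []
    for test, series in results.items():
        for (prev_date, prev), (cur_date, cur) in zip(series, series[1:]):
            if prev > 0 and (prev - cur) / prev > THRESHOLD:
                drop = round(100 * (prev - cur) / prev, 1)
                regressions.append((test, prev_date, cur_date, drop))
    return regressions

sample = {
    "pts/compress-gzip": [("2010-09-01", 42.0), ("2010-09-02", 41.8),
                          ("2010-09-03", 37.5)],  # ~10% drop on the 3rd
}
for test, before, after, pct in find_regressions(sample):
    print(f"{test}: {pct}% regression between {before} and {after}")

Nothing about that is clever, which is the point: once the data exists,
even a crude pass over it shows where to look next.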
The subtlety in Kees' words is very important. Ubuntu has the base
capability, but not yet a built-out mechanism, to identify the regressing
package. Even so, a test that catches a regression in a potentially
unrelated subsystem can still be used to identify the point at which
things broke. Getting it down to the packaging change that caused a
regression shaves a considerable amount from the analysis effort,
allowing teams to dive even deeper and earlier within their own packages
to resolve the regression.
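As an illustration of what getting it down to the packaging change could
look like mechanically, the sketch below diffs the package manifests of
the last-good and first-bad daily images. The tab-separated
"package<TAB>version" manifest format (as in the .manifest files shipped
alongside Ubuntu images) and the file names are assumptions on my part:

# Diff two daily-image package manifests to list what changed in between.
def load_manifest(path):
    packages = {}
    with open(path) as handle:
        for line in handle:
            name, _, version = line.strip().partition("\t")
            if name:
                packages[name] = version
    return packages

def diff_manifests(good_path, bad_path):
    good, bad = load_manifest(good_path), load_manifest(bad_path)
    changed = {p: (good[p], bad[p])
               for p in good if p in bad and good[p] != bad[p]}
    added = {p: bad[p] for p in bad if p not in good}
    removed = {p: good[p] for p in good if p not in bad}
    return changed, added, removed

changed, added, removed = diff_manifests("good.manifest", "bad.manifest")
for pkg, (old, new) in sorted(changed.items()):
    print(f"changed: {pkg} {old} -> {new}")
for pkg, ver in sorted(added.items()):
    print(f"added:   {pkg} {ver}")
for pkg, ver in sorted(removed.items()):
    print(f"removed: {pkg} {ver}")

In principle that narrows a regression from "somewhere in the archive"
to the set of uploads that landed between the two runs.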
Regards,
Matthew