[RFC] tool to generate a repository for performance analysis
Goffredo Baroncelli
kreijack at tiscalinet.it
Fri Jun 6 18:48:17 BST 2008
Hi all,
in order to improve Bazaar's performance, I think that our
analysis should be performed on a "known" repository layout.
In many threads we have cited the "Emacs" or "mozilla" repos; the
problem is that these repos are very big, and they are neither easily
reproducible nor easily transportable.
Moreover, sometimes we are interested in history browsing (so the file sizes
don't matter and should be 0 in order to keep the repository small).
Sometimes we are interested in the repository's size efficiency, so the file
sizes and the number of files do matter!
My idea is to develop a tool which is able to create a repository on the
basis of a set of parameters, like:
- history depth
- number of files
- file sizes
- number of files added/deleted/removed per revision
- mainline branch frequency
- branch branch frequency
- merge on mainline frequency
- merge on branch frequency
- others ?
We can define the parameters above in terms of "average" and "standard
deviation".
We can use the standard Python random generator with a fixed initial seed,
so that the generated repositories are reproducible.
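
To make the idea concrete, here is a minimal sketch of how parameters given
as an average and a standard deviation could be sampled from a seeded
generator. The names Param and plan_revisions are hypothetical, and the normal
distribution (clamped at zero) is only an assumption; another distribution
may fit real repositories better.

```python
import random
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical parameter described by an average and a standard deviation."""
    mean: float
    stddev: float

    def sample(self, rng: random.Random) -> int:
        # Draw from a normal distribution, round to an integer,
        # and clamp at zero (a count or a size cannot be negative).
        return max(0, round(rng.gauss(self.mean, self.stddev)))


def plan_revisions(seed, depth, files_changed, file_size):
    """Return a list of (revision number, list of file sizes) tuples."""
    # A private generator: the same seed always yields the same plan,
    # independently of any other use of the random module.
    rng = random.Random(seed)
    plan = []
    for rev in range(depth):
        n_files = files_changed.sample(rng)
        sizes = [file_size.sample(rng) for _ in range(n_files)]
        plan.append((rev, sizes))
    return plan


if __name__ == "__main__":
    for rev, sizes in plan_revisions(42, 5, Param(3, 1), Param(100, 20)):
        print(rev, sizes)
```

Running it twice with the same seed gives the same plan; changing the seed
gives a different, but equally repeatable, one.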
The output should be a *-fast-export like stream, so that we can use this
tool with different DVCSs.
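
As an illustration, a generator of this kind could write such a stream
roughly as follows. This is only a sketch, assuming the git fast-import
stream format (blob, commit, mark, data, from and M commands) and a linear
history on a single branch; the function names, the committer identity, the
file names and the base timestamp are all made up.

```python
import io
import random


def write_data(out, payload: bytes):
    # A "data" command: exact byte count, the raw payload, then a newline.
    out.write(b"data %d\n" % len(payload))
    out.write(payload)
    out.write(b"\n")


def emit_stream(out, seed, revisions, files_per_rev, file_size):
    """Write a linear history to the binary stream 'out'."""
    rng = random.Random(seed)  # fixed seed, so the stream is repeatable
    mark = 0
    prev_commit = None
    for rev in range(revisions):
        blobs = []
        for i in range(files_per_rev):
            mark += 1
            out.write(b"blob\nmark :%d\n" % mark)
            content = bytes(rng.getrandbits(8) for _ in range(file_size))
            write_data(out, content)
            blobs.append((mark, b"file%d.txt" % i))
        mark += 1
        out.write(b"commit refs/heads/master\nmark :%d\n" % mark)
        # Made-up committer and timestamps, one second apart.
        out.write(b"committer Generator <gen@example.com> %d +0000\n"
                  % (1200000000 + rev))
        write_data(out, b"revision %d" % rev)
        if prev_commit is not None:
            out.write(b"from :%d\n" % prev_commit)
        for blob_mark, path in blobs:
            # Each revision rewrites the same files with new content.
            out.write(b"M 100644 :%d %s\n" % (blob_mark, path))
        out.write(b"\n")
        prev_commit = mark


if __name__ == "__main__":
    buf = io.BytesIO()
    emit_stream(buf, seed=1, revisions=3, files_per_rev=2, file_size=0)
    print(buf.getvalue().decode("ascii"))
```

With file_size=0 the stream stays tiny, which matches the history browsing
case above; a larger file_size exercises the storage side instead.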
After defining some typical "repository layouts", we can develop automated
performance [regression] tests.
Moreover, these repository layouts can be used for performance comparisons
with every DVCS compatible with the fast-export protocol.
Thoughts and comments are welcome.
BR
Goffredo
--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo)
<kreijack at inwind_DOT_it>
Key fingerprint = CE3C 7E01 6782 30A3 5B87 87C0 BB86 505C 6B2A CFF9