[RFC] Compact origin information for knit data files
lists at hlabs.spb.ru
Wed Nov 22 12:43:48 GMT 2006
Adjacent lines in annotated knit data files often contain the same origin
information, so it would be useful to compact the information in such cases.
I propose to skip origin information for all lines except the first line
in a block of adjacent lines with the same origin. So instead of:
the content will be:
When the knit file parser gets a line without any origin information, the
information will be taken from the nearest previous line within the same
block of adjacent lines that does contain it.
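
A minimal sketch of the idea in Python, not actual bzrlib code. It assumes
annotated lines are already available as (origin, text) pairs and uses None
to stand for "origin skipped"; the names compact() and expand() are made up
for illustration, and the on-disk representation is left open here.

```python
from typing import Iterable, List, Optional, Tuple

AnnotatedLine = Tuple[str, str]            # (origin, text)
CompactLine = Tuple[Optional[str], str]    # origin is None when it is skipped


def compact(lines: Iterable[AnnotatedLine]) -> List[CompactLine]:
    """Keep the origin only on the first line of each run of equal origins."""
    result: List[CompactLine] = []
    previous: Optional[str] = None
    for origin, text in lines:
        result.append((None if origin == previous else origin, text))
        previous = origin
    return result


def expand(lines: Iterable[CompactLine]) -> List[AnnotatedLine]:
    """Restore skipped origins by inheriting from the previous line."""
    result: List[AnnotatedLine] = []
    current: Optional[str] = None
    for origin, text in lines:
        if origin is None:
            if current is None:
                # A block cannot start with a line that has no origin.
                raise ValueError("first line of a block must carry an origin")
            origin = current
        current = origin
        result.append((origin, text))
    return result
```

With this scheme expand(compact(lines)) gives back the original pairs, so the
compaction loses no annotation data.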
I expect not only a smaller revision store but also some speedup (smaller
data files will be processed faster, and there is no need for UTF-8
encoding/decoding on every data line).
Maybe instead of just skipping the origin information it would be better to
place a one-character marker at the start of the line? That would be useful if
we want different markers for different line flavors. For example: a '=' marker
could be used if the origin is the same as the version id of the block of
changes, and a '+' marker if the origin is the same as for the adjacent lines.
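
The marker variant could look roughly like the sketch below. Again this is
only an illustration under assumptions: lines are (origin, text) pairs, the
first element of an encoded pair is either '=', '+' or an explicit origin,
revision ids are assumed never to be literally '=' or '+', and '=' is given
priority when both markers would apply. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Line = Tuple[str, str]  # (origin or marker, text)


def encode_with_markers(version_id: str, lines: Iterable[Line]) -> List[Line]:
    """Replace origins with '=' (same as version id) or '+' (same as previous)."""
    result: List[Line] = []
    previous: Optional[str] = None
    for origin, text in lines:
        if origin == version_id:
            result.append(("=", text))
        elif origin == previous:
            result.append(("+", text))
        else:
            result.append((origin, text))  # explicit origin starts a new block
        previous = origin
    return result


def decode_with_markers(version_id: str, lines: Iterable[Line]) -> List[Line]:
    """Turn markers back into full origins."""
    result: List[Line] = []
    previous: Optional[str] = None
    for tag, text in lines:
        if tag == "=":
            origin = version_id
        elif tag == "+":
            if previous is None:
                raise ValueError("'+' marker without a preceding line")
            origin = previous
        else:
            origin = tag
        result.append((origin, text))
        previous = origin
    return result
```

A marker also makes it unambiguous whether a line starts with an origin or
with content, which plain skipping leaves to the parser to work out.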
It seems a new repository format version number would have to be introduced.
How could a repository be converted to the new format (bzr upgrade?)?
Dmitry Vasiliev (dima at hlabs.spb.ru)