Excess data size for a single revision

John Arbash Meinel john at arbash-meinel.com
Mon Jan 23 14:24:35 UTC 2012

On 1/23/2012 12:55 PM, Eli Zaretskii wrote:
>> Date: Mon, 23 Jan 2012 12:22:37 +0100 From: John Arbash Meinel
>> <john at arbash-meinel.com> CC: mbp at sourcefrog.net,
>> bazaar at lists.canonical.com
>>> What am I looking for, though?  E.g., the .rix index
>>> corresponding to the above pack has 35 revisions, while the
>>> corresponding .tix file has 4088 texts.  Is the latter
>>> unusually large?
>> Averaged across a lot of histories (bzr, mysql, linux kernel,
>> emacs I think), a good heuristic is <10 texts changed per commit.
>> Above is averaging 100 texts changed per commit, or about 10x
>> normal. So yes, it is larger than expected.
> As I wrote earlier, only 7 files were changed and the diffs are
> less than 200 lines.

bzr log -n0 -r 106891 shows:

    99634.2.1005 Glenn Morris	2012-01-10
                Update short copyright year to 2012 (do not merge to

    99634.2.1006 Glenn Morris	2012-01-10
                Add 2012 to FSF copyright years for Emacs files (do not merge to trunk)

Which together modify about 2000 files, and
    99634.21.7 Kenichi Handa	2012-01-13 [merge]

Which also has about 2000 entries (though those may not have been
modified vs trunk).

Now, it is possible that the changes introduced by 99634.2.1005 and
99634.2.1006 were reverted when they were merged to trunk. However,
that history is still part of the ancestry, so those roughly 2000
texts are still copied around.
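
As a rough illustration of why a small mainline diff can still drag in
thousands of texts, here is a toy model in Python. It is only a sketch
under stated assumptions: it does not use bzrlib or reflect how bzr
actually fetches, the revision names are hypothetical, and the counts
are rounded from the numbers above (the 1000/1000 split between the two
copyright commits is an assumption).

```python
# Toy model, NOT bzr's real fetch logic or API.
# Each hypothetical revision maps to (parent ids, texts it introduced).
# Counts are rounded from the thread; the 1000/1000 split is assumed.
revisions = {
    "trunk-tip":     (["trunk-prev", "merged-branch"], 7),
    "merged-branch": (["copyright-b"], 2000),
    "copyright-b":   (["copyright-a"], 1000),
    "copyright-a":   (["trunk-prev"], 1000),
    "trunk-prev":    ([], 0),
}


def texts_to_fetch(tip, already_present):
    """Sum the texts of every revision in tip's ancestry that is missing."""
    seen = set(already_present)
    stack = [tip]
    total = 0
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue  # already have this revision (or already counted it)
        seen.add(rev)
        parents, n_texts = revisions[rev]
        total += n_texts
        stack.extend(parents)
    return total


# Only the old trunk is present locally: the merged ancestry comes along.
total = texts_to_fetch("trunk-tip", {"trunk-prev"})
print(total, "texts fetched for a tip that changes 7 files itself")
```

In this model the tip revision changes only 7 files, but every revision
in the merged ancestry contributes its own texts, whether or not those
changes survive in the final tree.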

