Diffing commits of big files is slow
Martin Pool
mbp at sourcefrog.net
Mon Jun 22 01:50:34 BST 2009
2009/6/22 Robert Collins <robert.collins at canonical.com>:
> On Mon, 2009-06-22 at 01:51 +0200, Daniel Clemente wrote:
>> In a pack-0.92 branch with latest Bazaar (1.17dev), I replaced a 40 MB video with a newer version (30 MB).
>>
>> I am intrigued as to why the following operations are so slow:
>
> bzr doesn't know that the files are binary until it extracts them and
> examines the content. It knows they are different before extracting, but
> that doesn't help a lot.
I guess it could shortcut this case by streaming the file out and
looking at just the start of it. If the working-copy file is obviously
binary, it in theory shouldn't need to read the repository's copy at
all.
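
A minimal sketch of that shortcut in Python, not bzr's actual code. The names looks_binary, should_skip_text_diff and open_repo_copy are hypothetical, and treating a NUL byte in the first 8 KiB as "obviously binary" is an assumed heuristic:

```python
import io

SNIFF_BYTES = 8192  # assumed size of the "start of the file" worth inspecting


def looks_binary(stream, sniff_bytes=SNIFF_BYTES):
    """Guess binary-ness from the head of a byte stream (NUL-byte heuristic)."""
    return b"\x00" in stream.read(sniff_bytes)


def should_skip_text_diff(working_stream, open_repo_copy):
    """Return True if a text diff is pointless for this file.

    open_repo_copy is a hypothetical callable that extracts the stored
    version; it is only called when the working copy looks like text.
    """
    if looks_binary(working_stream):
        return True  # the repository copy is never extracted
    with open_repo_copy() as repo_stream:
        return looks_binary(repo_stream)


# Usage with in-memory stand-ins for the two versions of a file:
working = io.BytesIO(b"OggS\x00\x02 pretend video data")
skip = should_skip_text_diff(working, lambda: io.BytesIO(b"old version"))
```

The point is only the ordering: the cheap check on the working copy comes first, so the expensive extraction can be skipped.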
Two ways to avoid the diff:
1- matching a user rule saying *.ogg (or this file-id or whatever) is
binary and shouldn't be diffed;
<https://bugs.edge.launchpad.net/bzr/+bug/218128> -- commonly duped
2- streaming extraction of the content and detecting from the first
part of it that it's binary and shouldn't be diffed
<https://bugs.edge.launchpad.net/bzr/+bug/390418> -- would be nice
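
Roughly how the two checks could combine, as a Python sketch under stated assumptions rather than anything bzr implements: the rule list, the function names, the NUL-byte test and the 8 KiB limit are all hypothetical, and "chunks" stands in for whatever a streaming extraction would yield.

```python
from fnmatch import fnmatch

BINARY_PATTERNS = ["*.ogg"]  # hypothetical user rules
SNIFF_BYTES = 8192           # assumed limit on how much content to inspect


def matches_binary_rule(path, patterns=BINARY_PATTERNS):
    """Option 1: a user rule declares the file binary; no content is read."""
    return any(fnmatch(path, pattern) for pattern in patterns)


def stream_looks_binary(chunks, sniff_bytes=SNIFF_BYTES):
    """Option 2: inspect streamed chunks, stopping after sniff_bytes bytes."""
    seen = 0
    for chunk in chunks:
        remaining = sniff_bytes - seen
        if b"\x00" in chunk[:remaining]:
            return True
        seen += len(chunk)
        if seen >= sniff_bytes:
            break  # stop pulling chunks; the rest is never extracted
    return False


def should_diff(path, chunks):
    """Return False when either check says the file should not be diffed."""
    if matches_binary_rule(path):
        return False
    return not stream_looks_binary(chunks)
```

Checking the rule first means a matching file costs nothing to classify; the streaming check is the fallback for files no rule covers.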
--
Martin <http://launchpad.net/~mbp/>