Diffing commits of big files is slow
Alexander Belchenko
bialix at ukr.net
Mon Jun 22 05:06:05 BST 2009
Aaron Bentley writes:
> Martin Pool wrote:
>> 2009/6/22 Robert Collins <robert.collins at canonical.com>:
>>> On Mon, 2009-06-22 at 01:51 +0200, Daniel Clemente wrote:
>>>> In a pack-0.92 branch with latest Bazaar (1.17dev), I replaced a 40 Mb video with a newer version (30Mb).
>>>>
>>>> I am intrigued as to why the following operations are so slow:
>>> bzr doesn't know that the files are binary until it extracts them and
>>> examines the content. It knows they are different before extracting, but
>>> that doesn't help a lot.
>> I guess it could shortcut this case by streaming the file out and
>> looking at just the start of it.
>
> I could have sworn we did that already. We certainly have the code, in
> textfile.text_file.
I found that this code is not enough in the case of PDF files.
textfile.text_file() checks only the first 1K bytes, but in PDF files the first null byte
may occur much later.
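
To illustrate, here is a minimal sketch of a null-byte check of that kind. It is not the actual bzrlib code: looks_binary and probe_size are hypothetical names, and the sample data is a synthetic stand-in for a PDF, not a real file.

```python
def looks_binary(data, probe_size=1024):
    """Return True if a null byte occurs within the first probe_size bytes.

    Hypothetical helper for illustration; not part of bzrlib.
    """
    return b"\x00" in data[:probe_size]


# Synthetic stand-in for a PDF: 2000 bytes of plain text first,
# so the first null byte sits at offset 2000, past a 1K probe.
pdf_like = b"%PDF-1.4\n" + b"x" * 1991 + b"\x00\x01\x02"

# With a 1K probe the null byte is never seen.
print(looks_binary(pdf_like))
# A larger probe window reaches it.
print(looks_binary(pdf_like, probe_size=4096))
```

A larger probe window only moves the limit: any fixed size can be defeated by a file whose first null byte comes later, so this stays a heuristic.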