Diffing commits of big files is slow

Alexander Belchenko bialix at ukr.net
Mon Jun 22 05:06:05 BST 2009


Aaron Bentley wrote:
> Martin Pool wrote:
>> 2009/6/22 Robert Collins <robert.collins at canonical.com>:
>>> On Mon, 2009-06-22 at 01:51 +0200, Daniel Clemente wrote:
>>>> In a pack-0.92 branch with latest Bazaar (1.17dev), I replaced a 40 MB video with a newer version (30 MB).
>>>>
>>>>   I am intrigued as to why the following operations are so slow:
>>> bzr doesn't know that the files are binary until it extracts them and
>>> examines the content. It knows they are different before extracting, but
>>> that doesn't help a lot.
>> I guess it could shortcut this case by streaming the file out and
>> looking at just the start of it.
> 
> I could have sworn we did that already.  We certainly have the code, in
> textfile.text_file.

I found that this code is not enough in the case of PDF files.
textfile.text_file() checks only the first 1K bytes, but in a PDF file the first
null byte may appear much later, so the file is treated as text.
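
A rough illustration of the problem (a sketch only, not the actual bzrlib
code: the function name looks_binary, its probe_size parameter and the
sample data are made up for this example). It assumes the check amounts to
searching for a null byte in the first 1K of the content:

```python
import io


def looks_binary(stream, probe_size=1024):
    """Hypothetical check: True if a NUL byte occurs in the first probe_size bytes."""
    head = stream.read(probe_size)
    return b"\x00" in head


# PDF-like sample data (invented): a long ASCII header, so the first
# NUL byte sits well past the first 1K of the file.
pdf_like = b"%PDF-1.4\n" + b"% plain ASCII header\n" * 100 + b"\x00binary stream"

# Probing only the first 1K misses the NUL byte.
small_probe = looks_binary(io.BytesIO(pdf_like))

# A larger probe window (64K here, an arbitrary choice) finds it.
large_probe = looks_binary(io.BytesIO(pdf_like), probe_size=64 * 1024)

print(small_probe, large_probe)
```

With a 1K window the sample is classified as text; only the larger window
catches it. A bigger window costs more reading per file, so this is a
trade-off rather than a complete fix.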



