Introduction to history deltas

David Allouche david at allouche.net
Wed Dec 7 12:08:06 GMT 2005


On Wed, 2005-12-07 at 17:14 +1100, Martin Pool wrote:
> Operating on many small files is slower than getting all the data you
> want from one large file[*].
[...]
> [*] I did a mini benchmark of writing one line to N files vs writing N
> lines to one file; for small N the speed is comparable but for large N
> it can be a hundred times slower.  This may be because opening each file
> uses up a fixed amount of memory in both the kernel and in Python -- for
> example we are using 4kB of disk cache for each line.

I am glad you brought that issue up; I have wanted to comment on it
since I began reading this thread.

I expect that grouping hunks by commits would make building the weave
for a text-revision marginally more expensive in the hot disk cache case
(because of open overhead) and very significantly more expensive in the
cold disk cache case, because it defeats read-ahead in the kernel.
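
A minimal sketch of the kind of mini benchmark quoted above, extended to
time reading the data back as well. It only exercises the hot disk cache
case; the cold cache case needs the system caches dropped between runs,
which is not shown. The file names ("hunk-N", "one") and the helper
names are made up for illustration and are not bzr's storage layout.

```python
import os
import tempfile
import time


def write_many(directory, n):
    """Write one line to each of n separate files."""
    for i in range(n):
        with open(os.path.join(directory, "hunk-%d" % i), "w") as f:
            f.write("line %d\n" % i)


def write_one(path, n):
    """Write n lines to a single file."""
    with open(path, "w") as f:
        for i in range(n):
            f.write("line %d\n" % i)


def read_many(directory, n):
    """Read the n one-line files back, paying one open() per line."""
    lines = []
    for i in range(n):
        with open(os.path.join(directory, "hunk-%d" % i)) as f:
            lines.append(f.read())
    return lines


def read_one(path):
    """Read all lines from the single file with one open()."""
    with open(path) as f:
        return f.readlines()


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run(n=1000):
    """Time both layouts in a throwaway directory and return the timings."""
    with tempfile.TemporaryDirectory() as tmp:
        many_dir = os.path.join(tmp, "many")
        os.mkdir(many_dir)
        one_path = os.path.join(tmp, "one")
        _, write_many_s = timed(write_many, many_dir, n)
        _, write_one_s = timed(write_one, one_path, n)
        many, read_many_s = timed(read_many, many_dir, n)
        one, read_one_s = timed(read_one, one_path)
        # Both layouts must hold exactly the same data.
        assert many == one
        return {
            "lines": len(one),
            "write_many_s": write_many_s,
            "write_one_s": write_one_s,
            "read_many_s": read_many_s,
            "read_one_s": read_one_s,
        }


if __name__ == "__main__":
    for key, value in run().items():
        print(key, value)
```

The absolute numbers depend heavily on the filesystem and on N, so treat
the output as a rough comparison rather than a measurement.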

Be nice to your system caches, they will pay you back for it.
-- 
                                                            -- ddaa