brisbane:CHKMap.iteritems() tweaks
Robert Collins
robert.collins at canonical.com
Tue Mar 24 22:11:06 GMT 2009
On Tue, 2009-03-24 at 16:43 -0500, John Arbash Meinel wrote:
>
> Ian Clatworthy wrote:
> > John Arbash Meinel wrote:
> >>> 1) Change 'bzr pack' so that it creates a separate set of groups for
> >>> items that are available in the newest XX (say 100) revisions. Or
> >>> possibly group everything into 100 rev chunks.
> >> This was easy to implement for CHK streams. And it changes "time bzr ls
> >> -r-1" from 4.4s down to 1.6s. (I implemented as splitting at 10
> >> revisions). And the time without the patch is 2.2s up from 1.6s. So the
> >> patch makes a bigger difference when we aren't swamped with extract time.
> >
> > Nice.
> >
> >> The total size on disk after packing is barely noticeable:
> >> 125666
> >> 125981
> >>
> >> I guess that is 300KiB. But out of 125MiB, that is only 0.2%.
> >
> > Well worth it IMO.
>
> So I've done a bit of playing with this. Specifically, I felt that if
> you want to pull out the most recent 100 revs of chk pages, then you
> probably also want to pull out the most recent texts.
...
> So we save ~2s during extracting the texts time. (This is with my fix to
> TT.create_file(string) to use f.write() rather than f.writelines())
>
> I'm not as convinced that this is worthwhile yet. Considering that we
> spend 4.7s in 'get_build_details', making get_bytes_as() 2.5=>1.0s
> doesn't seem really worth the 10% increase in repository size.
>
> I guess I can say "maybe", but it isn't as clear-cut as the benefit to
> changing the chk pages.
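
As a quick check of the size figures quoted above, the arithmetic can be sketched as follows. This assumes both numbers are in KiB and that the first is the size before the chk split and the second the size after; the thread does not state either explicitly. The 10% figure for splitting texts is applied to the same repository purely for comparison.

```python
# Sizes quoted in the thread; unit (KiB) and before/after order are assumptions.
size_before_kib = 125666
size_after_kib = 125981

# Growth caused by splitting the chk pages into separate groups.
chk_growth_kib = size_after_kib - size_before_kib
chk_growth_pct = 100.0 * chk_growth_kib / size_before_kib

# Hypothetical comparison: a 10% increase applied to the same repository.
text_growth_kib = 0.10 * size_before_kib

print(f"chk split:  {chk_growth_kib} KiB ({chk_growth_pct:.2f}%)")
print(f"text split: about {text_growth_kib:.0f} KiB (10%)")
```

On these assumptions the chk split costs roughly a quarter of a percent, while the text split costs about forty times as much disk space.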
I think we should address the penalty of having many texts in a group
rather than splitting the groups up. When fully packed, git has a single
pack with the newest content at the front, and we should be getting
similar locality of reference.

I don't think the gain is worth a 10% increase in repository size.
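
To illustrate the locality argument, here is a minimal sketch of why newest-first ordering inside a single group helps. It is not bzrlib or git code: the function names and the (key, revno) tuples are hypothetical, and "cost" is simplified to the number of leading entries that must be read before every wanted item has been seen.

```python
def order_newest_first(items):
    """Return keys sorted so the highest revision number comes first.

    items is a list of (key, revno) pairs (a hypothetical stand-in for
    texts or chk pages and the revision that introduced them).
    """
    return [key for key, revno in sorted(items, key=lambda kr: kr[1], reverse=True)]


def prefix_needed(ordered_keys, wanted):
    """Count how many leading entries must be read to cover all wanted keys."""
    wanted = set(wanted)
    last_index = -1
    for index, key in enumerate(ordered_keys):
        if key in wanted:
            last_index = index
    return last_index + 1


items = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
recent = ["c", "d"]  # keys needed for an operation on the newest revisions

oldest_first = [key for key, revno in sorted(items, key=lambda kr: kr[1])]
newest_first = order_newest_first(items)

print(prefix_needed(oldest_first, recent))
print(prefix_needed(newest_first, recent))
```

With the newest items at the front, only the first two entries are read for the recent revisions; with the oldest first, the whole group has to be read.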
-Rob