[MERGE][bug #242510] Move all pack ops into a single pack
John Arbash Meinel
john at arbash-meinel.com
Fri Sep 19 18:26:37 BST 2008
This is a follow-up to my earlier fix. I wrote the patch as an incremental
change, mostly so that people can approve the first one without approving
the second, rather than having BB mark it superseded.
John Arbash Meinel wrote:
> This is a fairly simple fix. Basically, when the autopack logic decides how it
> wants to lay out the new set of packs, it can decide that it wants to put an
> existing pack into a new pack, but not put anything else with it. When it hits
> that condition, it ends up creating an identical pack, and then has a name
> collision.
>
> The patch is fairly straightforward. I'm thinking to put together a different
> patch which changes the logic, but that takes a bit more effort, and I'd like
> to have a bugfix available.
>
> The existing code says something like:
>
> Given packs of size 55, 25, 21, 9 (total of 110 revisions), it ends up
> creating packs of size 101 and 9.
>
> The new code just notices that the last pack is a 'no-op' pack, and removes it.
So this version instead says "to get optimal packing, I want to rewrite all of
55, 25, 21, 9 into one pack of size 110".
The idea is: Say we have 4 pack files that are "sub-optimal". 55, 45, 7, 3. At
present, we read all of that content, and write it out, but we split the write
into 2 packs, so we end up with a size 100, and a size 10. However, if we just
consider the I/O involved, that is identical to writing out a single pack of
size 110.
Further, by not having the size 10 pack lying around, we delay the next
autopack even longer.
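The I/O argument can be checked with a few lines of Python. This is only a sketch: the groupings are the ones from the 55, 45, 7, 3 example, and `revisions_written` and `resulting_sizes` are hypothetical helper names, not functions from bzrlib or from the attached patch.

```python
def revisions_written(plan):
    """Total revisions read and rewritten by a plan.

    A plan is a list of groups; each group lists the revision counts of
    the existing packs that get combined into one new pack.
    """
    return sum(sum(group) for group in plan)


def resulting_sizes(plan):
    """Sizes of the new packs that a plan produces, one per group."""
    return [sum(group) for group in plan]


# Groupings taken from the 55, 45, 7, 3 example (assumed representation).
split_plan = [[55, 45], [7, 3]]   # current behaviour: two output packs
single_plan = [[55, 45, 7, 3]]    # proposed behaviour: one output pack

for name, plan in [("split", split_plan), ("single", single_plan)]:
    print(name, revisions_written(plan), resulting_sizes(plan))
```

Both plans rewrite the same 110 revisions; the only difference is whether one or two packs are left behind afterwards.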
It turns out this strategy is also immune to the original bug. That bug can
only be provoked when creating more than 1 pack. (If we only have 1 pack to
move, we are smart enough to do nothing.)
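To make the difference between the two fixes concrete, here is a rough sketch using the 55, 25, 21, 9 example quoted earlier. The list-of-groups representation and both function names are assumptions made for illustration; the real patch works on bzrlib's own data structures.

```python
def drop_noop_groups(plan):
    """First fix: discard groups that would rewrite a single pack unchanged."""
    return [group for group in plan if len(group) > 1]


def merge_groups(plan):
    """Follow-up: put every pack scheduled for rewriting into one group.

    Returns an empty plan when at most one pack is involved, since
    rewriting a single pack on its own changes nothing.
    """
    merged = [size for group in plan for size in group]
    return [merged] if len(merged) > 1 else []


# Assumed shape of the old plan that produced packs of size 101 and 9:
# the trailing [9] group is the 'no-op' that caused the name collision.
old_plan = [[55, 25, 21], [9]]

print(drop_noop_groups(old_plan))
print(merge_groups(old_plan))
print(merge_groups([[9]]))
```

With a single output group there is never a second, single-pack group left over, which is why the merged strategy cannot hit the original collision.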
I wrote a script to let me experiment with the two styles of packing, and I'm
attaching it in case someone else wants to play with it.
I basically used it by doing stuff like:
add(1)
add(10)
or even
add_random()
What I found was:
1) The new logic generates slightly larger pack files, and packs slightly less
often. However, the new logic never really "skips" a pack. Meaning it waits to
pack a 3rd time, but it will pack a 3rd time before the old logic packs a 4th
time.
2) They both use the same "should I trigger" logic, which means that they
almost always trigger when we get a big rollover. Most of the packs happen
because of the 'number' of pack files desired, not because of the specific
arrangement. For example:
we currently have 1234 revisions, we add 76 to get 1300 and that drops us
from wanting 10 pack files to only 4. This is actually the same arrangement
as if you went from 3421 revisions to 4000 (add 579 revisions).
3) When you add a random number of revisions at a time (which I think is the
common case when you are doing "push/pull/update" in a branch more than you
are doing "commit"), the auto-pack tends *not* to trigger, because it just
skips between big numbers. Meaning you tend to go from "1234" to "1243" or
"1254". In these cases, you never pass through "1240" and decide to pack.
4) On average, I think the new logic preserves slightly tighter packing, with
less effort. However, whichever logic just issued a pack always has the
short-term advantage.
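The numbers in points 2 and 3 are consistent with the desired pack count being the sum of the decimal digits of the revision count. That rule is inferred from the examples in this mail rather than quoted from the patch, and `desired_pack_count` is a hypothetical name, so treat the following as a sketch only.

```python
def desired_pack_count(total_revisions):
    """Sum of the decimal digits, e.g. 1234 -> 1 + 2 + 3 + 4 = 10."""
    return sum(int(digit) for digit in str(total_revisions))


# (before, after) revision counts taken from points 2 and 3.
transitions = [(1234, 1300), (3421, 4000), (1234, 1243), (1234, 1254)]

for before, after in transitions:
    print(before, desired_pack_count(before),
          "->", after, desired_pack_count(after))
```

Under that assumption, 1234 to 1300 and 3421 to 4000 both drop the target from 10 packs to 4, while a jump from 1234 to 1243 or 1254 leaves the target at 10 or raises it to 12, so nothing prompts a repack. Stopping at 1240 would have lowered the target to 7.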
In the end, I like the new logic. I think it is fairly universally better than
the old. It doesn't leave obvious orders-of-magnitude delineations around (you
tend to get packs of size 2000, rather than 2 of size ~1000). Though I think
that is actually a good thing. (With the old code, you could do "ls -s packs/"
and have a rather darn good idea of how many revisions each pack file held,
because they would be ~1MB, ~10MB, ~100MB, etc.)
Anyway, I think we want at least one of these implementations, so feedback is
welcome.
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: autopack_bug_242510-2.patch
Type: text/x-diff
Size: 9119 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080919/627338a4/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_autopacking.py
Type: text/x-python
Size: 5615 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080919/627338a4/attachment.py