[REGRESSION][1.6] fetching knits => packs

John Arbash Meinel john at arbash-meinel.com
Mon Aug 18 19:15:01 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

https://bugs.edge.launchpad.net/bzr/+bug/256757

This bug is an example of a regression in fetching between knit repositories
and pack repositories. It is a combination of the API friction around
'get_record_stream' and Robert's recent change that lets the target repository
determine the fetch ordering.

Specifically, the bug starts with a knit repository that is 70MB on disk
(35MB --apparent), and it ends up at 170MB after copying into a pack
repository. Further, the copy uses 700+MB of memory.

I haven't sorted out all of the details, but I believe the issue is in using
'unsorted' as the fetch order.

What is happening is that the fetch logic is downgrading to the 'fulltext =>
fulltext' logic because the repositories have different formats. In doing so,
it ends up inserting the texts in "arbitrary" order, as fulltexts, one-by-one.
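
To give a rough feel for why storing every text as a fulltext costs so much
more than storing deltas, here is a toy model. It is only a sketch and is not
bzrlib code: the helper names (make_history, fulltext_size, delta_size) are
made up for illustration, and it assumes zlib compression with unified diffs
as the delta format, which is not how knits or packs actually encode their
data.

```python
import difflib
import random
import zlib


def make_history(n_revisions=50, n_lines=400, seed=0):
    """Build a toy history: each revision changes one line of its parent."""
    rng = random.Random(seed)
    lines = ["line %d: %032x\n" % (i, rng.getrandbits(128))
             for i in range(n_lines)]
    history = [list(lines)]
    for _ in range(n_revisions - 1):
        lines = list(lines)
        idx = rng.randrange(n_lines)
        lines[idx] = "line %d: %032x\n" % (idx, rng.getrandbits(128))
        history.append(lines)
    return history


def fulltext_size(history):
    """Bytes used when every revision is stored as a compressed fulltext."""
    return sum(len(zlib.compress("".join(text).encode()))
               for text in history)


def delta_size(history):
    """Bytes used with one fulltext plus a compressed line delta per child."""
    total = len(zlib.compress("".join(history[0]).encode()))
    for parent, child in zip(history, history[1:]):
        # Hypothetical delta format: a unified diff with no context lines.
        delta = "".join(difflib.unified_diff(parent, child, n=0))
        total += len(zlib.compress(delta.encode()))
    return total


history = make_history()
full = fulltext_size(history)
delta = delta_size(history)
print("fulltexts: %d bytes, deltas: %d bytes" % (full, delta))
```

In this model the fulltext total grows with the size of every revision, while
the delta total grows only with the size of each change, so inserting texts
one-by-one as fulltexts loses most of the sharing between revisions.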

This wouldn't be a huge problem if it only affected fetch, but the 'upgrade'
logic uses the same fetch code path. That means upgrading from knits => packs
has the potential to bloat your repositories *tremendously*. And since 1.6 now
almost forces you to upgrade (because we now warn when the source format is
Knit), it is going to cause a lot of grief for a lot of people.

So I'm going to bump the 1.6-final release until a solution has been found for
this. I'm looking into it now, but it is rather serious.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIqbwlJdeBCYSNAAMRAi6kAJ9nOJrmRWWrzS+R8uH+RZo15Q6WFQCdFudi
4DUJGwLHSrG1gC6YlUq+agk=
=5HLZ
-----END PGP SIGNATURE-----


