Data migration to Bazaar 2.x status update
Ian Clatworthy
ian.clatworthy at canonical.com
Mon Aug 31 01:06:31 BST 2009
Thanks to those who helped with the data migration testing last week ...
I made more progress on bzr-fastimport late last week and over the
weekend. Building on the discussions with the fastimport guys from other
VCS projects, "round-tripping" of basic Bazaar repositories now mostly
works, i.e.
    bzr fast-export --no-plain my-branch my.fi.gz
    bzr fast-import my.fi.gz my-repo
will give you back a Bazaar repository in which the history of
my-repo/trunk is (almost) equivalent to that of my-branch. To be
specific, on the small repositories I tried (e.g. bzr-fastimport itself,
with 600+ revisions):
    bzr log -v --forward --include-merges my-branch > source.log
    bzr log -v --forward --include-merges my-repo/trunk > dest.log
    diff source.log dest.log
showed no differences. Hooray! Note that -v shows the history including
file operations: adds, modifies, renames, deletes, etc. If -v is
replaced with -v -p, then the diffs are shown as well. With -p, I'm
seeing some timestamp differences in the diff headers but that's all.
(I'm yet to track down why.)
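Until the timestamp differences are explained, one way to keep the -p comparison useful is to blank out the timestamps before diffing. What follows is only a rough sketch, not part of bzr or bzr-fastimport: the function names strip_diff_timestamps and logs_match are made up for illustration, and it assumes the diff headers look like "--- path<TAB>YYYY-MM-DD hh:mm:ss +zzzz" (likewise for "+++"), possibly indented for merged revisions.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two "bzr log -v -p" dumps while
# ignoring the timestamps in diff headers.

# strip_diff_timestamps FILE
# Print FILE with the tab-separated timestamp removed from lines that
# look like diff headers ("--- path<TAB>date ..." or "+++ path<TAB>date ...").
# Assumption: the timestamp starts with a tab followed by YYYY-MM-DD.
strip_diff_timestamps() {
    awk '{
        if ($0 ~ /^[ \t]*(---|\+\+\+) /)
            sub(/\t[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] .*$/, "")
        print
    }' "$1"
}

# logs_match SOURCE_LOG DEST_LOG
# Diff the two logs after stripping header timestamps.
# Returns 0 if they match, non-zero otherwise (diff output goes to stdout).
logs_match() {
    left=$(mktemp) || return 2
    right=$(mktemp) || { rm -f "$left"; return 2; }
    strip_diff_timestamps "$1" > "$left"
    strip_diff_timestamps "$2" > "$right"
    status=0
    diff "$left" "$right" || status=$?
    rm -f "$left" "$right"
    return $status
}

# Intended use, with the logs produced as described above but with -v -p:
#   bzr log -v -p --forward --include-merges my-branch > source.log
#   bzr log -v -p --forward --include-merges my-repo/trunk > dest.log
#   logs_match source.log dest.log && echo "round trip looks OK"
```

The timestamps are stripped rather than the whole header line being dropped, so a path that differs between the two logs would still show up in the diff.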
On larger repositories, things aren't acceptable yet. On the bright
side, bzr-fastimport can export and import bzr.dev (at long last)
without falling over. On the downside, the revision total in the
generated trunk is off by a few so the conversion obviously failed in
some way. I'm yet to look into why.
I also tried converting:
* Firefox 3.5 from hg (26K revisions)
* Linux 2.6.30 from git (146K revisions)
Firefox took 2 hours to import and 30 minutes to export. The Linux
kernel took 13 hours to import. The export of the kernel from the
generated bzr repository fell over after 3.5 hours (112K revisions)
because a file-id was missing, i.e. because of a bug in the import. Damn.
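To put those timings on a common scale, here is a quick back-of-the-envelope calculation of revisions per second, using only the rounded figures quoted above (so treat the results as rough). The helper name rate is made up for illustration.

```shell
#!/bin/sh
# rate REVISIONS HOURS
# Print the average number of revisions processed per second,
# rounded to one decimal place.
rate() {
    awk -v revs="$1" -v hours="$2" \
        'BEGIN { printf "%.1f\n", revs / (hours * 3600) }'
}

rate 26000 2      # Firefox import: 3.6 revisions/s
rate 146000 13    # Linux import:   3.1 revisions/s
rate 26000 0.5    # Firefox export: 14.4 revisions/s
rate 112000 3.5   # Linux export, up to the point of failure: 8.9 revisions/s
```

On these numbers, import runs at roughly 3 to 4 revisions per second on both repositories, and export is several times faster.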
To complete the migration stress test, I'll throw in MySQL 5.1 (58K
revisions) this week.
In summary, data migration isn't there yet. For small repositories
(fewer than 10K revisions, say), I suspect it's close enough in practice
not to matter much. For larger repositories, smooth, reliable migration
is still several days of engineering time (a few weeks elapsed time) away.
Ian C.
PS: Note that fast-export --no-plain uses experimental "features" yet to
be reviewed by the vcs-fastimport-devs. I'll discuss those features with
them real soon now once I'm sure they cover what we need. If you're
curious, the fast-export help in the latest Bazaar Data Migration Guide
(http://doc.bazaar-vcs.org/migration/en/data-migration/fast-export.html)
outlines the proposed features: "multiple-authors", "commit-properties"
and "empty-directories".