cvs to bazaar migration

Michael Haggerty mhagger at alum.mit.edu
Tue May 6 14:20:34 BST 2008


Stefan Monnier wrote:
>> Is there any chance we could get a cvs2bzr command added to cvs2svn?
>> Initially, it could do exactly the same as cvs2git code wise, but the
>> web page could explain how to import into Bazaar.  I'm happy to help with
>> web page content if you want/need assistance there.
> 
> IIUC one of the problems with cvs2svn when used for fastimport is that
> it doesn't infer file renames, probably because Git does its own
> file-rename inference.
> 
> Then again, maybe it does guess file renames, and I just hit an unlucky
> failure of the guesswork when I tried it,

cvs2svn does not infer file renames at all, even for Subversion, where
that information would be useful.  Indeed, cvs2svn issue #1 covers the
detection of server-side copies and renames:

http://cvs2svn.tigris.org/issues/show_bug.cgi?id=1

File copy and rename information is not available from CVS, so detecting
them would be a matter of heuristics.  Moreover, different people handle
renames in different ways:

1. Delete the old file via "cvs rm" and create a completely new file
with the same contents (losing history)

2. Rename the old *,v file within the CVS repository (breaking all older
revisions)

3. Copy the *,v file to the new name then mark the old copy deleted via
"cvs rm"

  3a. Leave the old tags in the new file copy (breaking old revisions
and old tags)

  3b. Remove old tags from the new file copy (breaking old revisions but
not old tags)

Arguably only (1) is justifiable in CVS's childlike view of the world,
but all of them are quite common in the wild.

The cvs2svn bug report gives a lot of ideas for heuristics that could be
used to detect various types of server-side copies and renames.  But
implementing them would be a vast amount of work.

The hash-based approach suggested in the bug report would be limited,
because it would only detect moves and renames where the file contents
are identical.  It would be better to estimate the similarity of file
contents, as git does.  This would be expensive, but possibly doable if
one limits oneself to detecting where a file-delete and a file-add in
the same commit can plausibly be considered a rename (and then, of
course, only if the changeset is reconstructed correctly).
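
To make the two approaches concrete, here is a rough sketch of how
deletes and adds within one reconstructed changeset might be paired up.
None of this exists in cvs2svn; the function name detect_renames, the
dict-based inputs, and the 0.6 threshold are all assumptions made up for
illustration, and the line-based similarity from Python's difflib is
only a stand-in for whatever measure git actually uses.

```python
import difflib
import hashlib


def _digest(text):
    """Content hash used for the exact-match pass."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def detect_renames(deleted, added, threshold=0.6):
    """Pair deleted paths with added paths within a single changeset.

    `deleted` and `added` map path -> file contents (str).  Returns a
    list of (old_path, new_path, score) tuples.  The threshold is an
    arbitrary illustrative value, not a tuned one.
    """
    renames = []
    remaining = dict(added)

    # Pass 1: exact matches by content hash.  Cheap, but it only finds
    # renames where the contents did not change at all.
    by_hash = {}
    for path, text in remaining.items():
        by_hash.setdefault(_digest(text), []).append(path)

    unmatched = {}
    for old_path, text in deleted.items():
        candidates = by_hash.get(_digest(text))
        if candidates:
            new_path = candidates.pop(0)
            del remaining[new_path]
            renames.append((old_path, new_path, 1.0))
        else:
            unmatched[old_path] = text

    # Pass 2: line-based similarity for whatever is left.  This compares
    # every unmatched delete against every remaining add, which is why
    # it only seems affordable within one changeset.
    for old_path, text in unmatched.items():
        best_path, best_score = None, threshold
        for new_path, new_text in remaining.items():
            matcher = difflib.SequenceMatcher(
                None, text.splitlines(), new_text.splitlines())
            score = matcher.ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            del remaining[best_path]
            renames.append((old_path, best_path, best_score))

    return renames
```

Restricting the comparison to a single changeset keeps the pairwise pass
small, but it also means that a wrongly reconstructed changeset would
yield missed or spurious renames.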

Absent a very ambitious volunteer or sponsorship, this feature is
unlikely to be implemented.

I wonder whether converting CVS -> git -> bzr would give you the rename
information for type (1) renames courtesy of git's heuristics?

Michael


