A few thoughts on foreign integration...

Tue Oct 19 16:46:12 BST 2010

On Tue, Oct 19, 2010 at 10:34 AM, Jelmer Vernooij <jelmer at samba.org> wrote:
[snip]
> The draft is indeed quite vague. Rob might have a somewhat clearer idea
> what they are, but I don't know if that's written down anywhere. As I
> understand it path tokens might also be useful to support file copy
> tracking in the future.

That's what I understood as well... along with the fact that they'd be
able to say these files were joined in history, which I think is
useful for more than just file copying.

> I don't think dropping file ids completely makes much sense by itself.
> It would mean we'd lose the ability to do proper rename tracking, which
> is a very useful feature. Dropping just file ids also doesn't help with
> e.g. parallel imports since we would still have different revision ids
> for revisions that have (roughly) the same contents. The issue you're
> running into has more to do with lack of cherrypicking support imho.

I agree...

> Perhaps dropping file ids in combination with other model changes might
> make sense, though.

I was actually thinking more along this line.  I do think you're right
though, better cherrypicking support would have helped here.

>> Also, what are file id aliases?  How would that help solve this?
> File id aliases are basically a per-repository map of file ids that
> should be considered equivalent to each other. This has some performance
> implications and code complexity implications - finding out if two file
> ids are referring to the same file requires more than just a call to
> str.__eq__ but probably a lookup in some sort of index.
>
> Personally I (now) think we shouldn't try to work around the existing
> limitations of file ids by adding file id aliases but rather investigate
> the alternatives, such as path tokens.

Given the above description, I'm inclined to agree.  File id aliases
sound a bit nasty.
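
To make the cost concrete, here is a rough sketch of how I picture an
alias lookup.  It is only an illustration: the FileIdAliases class, its
method names and the file ids are all made up, and none of this is
bzrlib API.  It uses a plain union-find structure as the "index", so an
equivalence check has to walk the index rather than just compare two
strings.

```python
class FileIdAliases:
    """Hypothetical per-repository alias index (not a bzrlib API)."""

    def __init__(self):
        # Maps each known file id to its parent in the union-find forest.
        self._parent = {}

    def _find(self, file_id):
        # A file id we have never seen is its own representative.
        self._parent.setdefault(file_id, file_id)
        root = file_id
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every id on the walk straight at the root.
        while self._parent[file_id] != root:
            self._parent[file_id], file_id = root, self._parent[file_id]
        return root

    def add_alias(self, a, b):
        """Record that file ids a and b refer to the same file."""
        self._parent[self._find(a)] = self._find(b)

    def same_file(self, a, b):
        """Equivalence check: more than a string comparison."""
        return a == b or self._find(a) == self._find(b)


aliases = FileIdAliases()
aliases.add_alias("readme-ours-1", "readme-theirs-1")    # made-up ids
aliases.add_alias("readme-theirs-1", "readme-import-1")
print(aliases.same_file("readme-ours-1", "readme-import-1"))
print(aliases.same_file("readme-ours-1", "setup-ours-1"))
```

Every place that compares file ids today would have to go through
something like same_file() instead, which is where the complexity
comes from.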

> Btw, I don't think we'll be able to cope with the situation you're
> facing properly (without any conflicts) until cherrypick tracking
> support lands in bzr and bzr-svn. We can however make your life a bit
> easier by adding an option to allow path-based merges, and to cope with
> the fact that two files that have different file ids are actually the
> same for subsequent merges.

In my most recent scenario, I'm not sure that even path-based merging
would have helped (but it most definitely would in several other
scenarios we face often).  The number of conflicts I got was enormous,
so I investigated it further and found a couple of things.  First, the
branch was created from ours, so, in this case, bzr-svn did know the
file-ids of the tree.  However, two mistakes were made.  On our end
the dev was re-organizing the tree and instead of using 'bzr mv', he
nuked the tree, and then re-added the files in their new location
(naughty, naughty, naughty).  Next, I think the other team had a bit
of trouble merging our refactoring (svn has traditionally sucked at
this), so I'm sure some interesting compromises were made there too.
So when we finally go back to remedy all of this, we see that the old
tree was deleted, and there is a new "unrelated" tree.  The result is a
tremendous number of conflicts.
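
For anyone following along, this is roughly how I picture the
difference between the two ways of moving a file, as a toy Python
model.  The inventories are just path -> file id dicts, and the
classify_changes helper and the ids are made up for illustration; this
is not how bzrlib represents or compares trees.

```python
def classify_changes(old_inv, new_inv):
    """Compare two toy inventories (dicts of path -> file id).

    Returns (renamed, removed, added), matching entries by file id
    rather than by path.
    """
    old_by_id = {fid: path for path, fid in old_inv.items()}
    new_by_id = {fid: path for path, fid in new_inv.items()}

    # Same file id on both sides but a different path: a tracked rename.
    renamed = sorted(
        (old_by_id[fid], new_by_id[fid])
        for fid in old_by_id.keys() & new_by_id.keys()
        if old_by_id[fid] != new_by_id[fid]
    )
    # File id only on the old side: the file looks deleted.
    removed = sorted(old_by_id[fid] for fid in old_by_id.keys() - new_by_id.keys())
    # File id only on the new side: the file looks brand new.
    added = sorted(new_by_id[fid] for fid in new_by_id.keys() - old_by_id.keys())
    return renamed, removed, added


before = {"src/util.py": "util-id-1"}

# Moved with 'bzr mv': the file id travels with the file.
after_mv = {"lib/util.py": "util-id-1"}
print(classify_changes(before, after_mv))

# Deleted and re-added: same path and contents, but a fresh file id.
after_readd = {"lib/util.py": "util-id-2"}
print(classify_changes(before, after_readd))
```

In the first case the model reports one rename; in the second it
reports one removal and one unrelated addition, which is what the
other side's changes to the old file then conflict with.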

The good news is that, now that I understand what happened, we can
work around this better on several fronts.  But it took considerable time
to figure it all out. :-(  Like you said though, support for
cherrypicking would have really made this much easier.  I hope that
eventually gets on the books for development.  Is there a game plan
for that somewhere (in terms of implementation)?

-John