Conflicts due to independently creating identical files

Andrew S. Townley ast at atownley.org
Wed Dec 22 23:56:31 GMT 2010


Hi Vincent,

Thanks for picking up this thread.  Comments/answers below.

On 9 Dec 2010, at 5:51 AM, vila wrote:

>>>>>> Andrew S Townley <ast at atownley.org> writes:
> 
> <snip/>
> 
>> In my case, I synchronize the entire bzr trees.  However, what
>> I've also noticed recently is that occasionally bzr "forgets" who
>> I am.
> 
> This seems weird, are you setting 'email' in 'bazaar.conf' or
> 'branch.conf' ?
> 
> If you do it in the latter and then sync your tree from a host where you
> used the former, then that would explain it.

Possibly.  I don't know at this stage, but I do know that I never touched either file until bzr complained about not knowing my identity.  It could well be down to the sync overwriting the configuration, as you suggest.

>> I never used to use a .bzr file, but now I have to.  I don't know
>> if they're related or not.
> 
> What .bzr "file" are you referring to here ?

Sorry for not being precise.  I was trying to find the exact location where this is stored, but I couldn't.  I thought I found something in the .bzr directory, but maybe I was mistaken.  I don't seem to have this on my machine (laptop), but I might have it in the office on the desktop.  I can't get there from here at the moment, but I'll check.

>> I regularly do development across a few machines for various
>> reasons, and this is becoming a very annoying problem for me.  Is
>> it possible in this scenario to do a diff on the file and if they
>> are identical to not issue this conflict situation?
> 
> This is worked on as a work around to the so-called 'parallel import'
> problem.
> 
> When a file is added to bzr, it receives a file-id which is then used to
> track the renames and ensure that the modifications made on the file
> after various renamings are still recognized as being about this precise
> file.
> 
> When a file is added in two (or more) different branches, bzr generates
> a conflict (a 'Content Conflict') to indicate that there is no automated
> way to decide which file-id should be kept (and the associated history).

While I understand the theory here (and I do the same type of thing in another system I've written), what I don't understand is this: if bzr detects the situation and has both copies of the file (or at least can store some metadata about them), why can't it compute an md5/sha hash of each to determine that the contents are identical, and then merge or replace the IDs?
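
To make the suggestion concrete, here is a rough shell sketch of the kind of check I have in mind.  It is not how bzr works internally, and the function names (file_digest, same_content, prune_identical_moved) are made up for this mail.  It assumes the conflicting copy sits next to the original with a '.moved' suffix, that either sha256sum or shasum is installed, and that no file names contain newlines.

```sh
#!/bin/sh
# Sketch only: these helper names are hypothetical and nothing here is
# part of bzr itself.

# Print a SHA-256 digest of the file given as $1.
# Assumes sha256sum (GNU coreutils) or, failing that, shasum is on the PATH.
file_digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum < "$1" | cut -d' ' -f1
    else
        shasum -a 256 < "$1" | cut -d' ' -f1
    fi
}

# Succeed (exit status 0) only if both files exist and hash to the same value.
same_content() {
    [ -f "$1" ] && [ -f "$2" ] || return 1
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}

# Walk the tree under $1, skipping .bzr, and delete each 'name.moved' whose
# contents match the 'name' sitting next to it.  Anything that differs, or
# that has no counterpart, is left alone for manual inspection.
prune_identical_moved() {
    find "$1" -name .bzr -prune -o -type f -name '*.moved' -print |
    while IFS= read -r moved; do
        original=${moved%.moved}
        if same_content "$moved" "$original"; then
            rm -- "$moved"
            echo "removed $moved"
        else
            echo "kept $moved"
        fi
    done
}
```

Run from the top of a working tree as `prune_identical_moved .`, this would only replace the blanket rm step of my current cleanup.  The conflicts still have to be marked as resolved in bzr afterwards, and it does nothing about merging the two file-ids, which is the part I'd like bzr to handle.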

It happens to me with such regularity now that I've actually changed my VCS habits to hardly ever commit changes because of all the pain involved.  I really like bzr and have been using it for a long time, but this is starting to get really annoying.  I can accept that perhaps I'm in the minority here, so maybe I just need to find a different tool.

>> My standard resolution process is to do a find . -name '*.moved'
>> -exec rm {} \; and then resolve the conflicts.  I know the files
>> are identical because they're synchronized with unison.
> 
> If you maintain the trees yourself, you may want to resync their
> histories too, that would get rid of the conflicts entirely. Also using
> Unison (based on rsync right ?) should not give better results than 'bzr
> pull' (especially if you use the recommended stable 2a format).

I'm not quite sure what you mean here.  I maintain all the trees on all the machines.  I run 'bzr update' when it tells me the branch is out of date, and then that's normally when I get the conflicts.

As someone else said, unison isn't based on rsync; it is bidirectional.  I have a lot of repositories that have been around a while.  I remember seeing that some of them have been upgraded over time, but I have never explicitly specified a repository format.

My issue is that my bzr tree(s) live inside larger directory trees that aren't managed by bzr, and I need to maintain those across multiple machines.  Part of what I liked about bzr and DVCS in general is the "it's just a directory tree" approach with very little magic, so you should be able to copy things around.  Unison keeps things in sync and manages conflicts on the whole set of directory trees, and then bzr provides VCS for projects in some of the subtrees.  It would be totally impractical to manage the entirety of this with bzr.

>> I would really be interested in hearing why the above proposal of
>> doing a check for identical files isn't a valid way to avoid the
>> problem.
> 
> Well, for the edge case where the files have been created independently
> and are then tracked by bzr, syncing the histories after the first
> conflict resolution is the cleanest way.
> 
> What makes the workaround you propose less reliable is when the files
> keep being modified under their original file-id. This requires more
> care and more work and hasn't been addressed (yet, but hopefully will,
> soon).

Thanks for letting me know.  Again, I'm not modifying them intentionally under the original file ids, but that seems likely to be the main issue here.  Can you tell me what I need to do in this case?  I couldn't see anything obvious in the first few pages of Google search results. ;)

Thanks again for your help.

ast
--
Andrew S. Townley <ast at atownley.org>
http://atownley.org



