Bzr development stopped

Tue Nov 27 19:54:10 UTC 2012

On Wed, Nov 28, 2012 at 7:01 AM, Chris Hecker <checker at d6.com> wrote:
>
>> He implied that this strategy could be implemented in a way that is
>> transparent to the user
>
> But isn't this exactly analogous to the old svn argument that
> copy+delete would work for rename and be transparent to the user?
> Except, in practice, this stuff is subtle and it's really hard to get
> all the edge cases, and I broke it constantly.  I haven't used svn in a
> while, so maybe they've finally spackled over all the trouble spots, but
> renames were a disaster for the many years I used it.

Indeed! There is a general thing here where you can try to capture
semantic information, or not - and arguments on both sides. For
instance, when lines within a file move, one can:
 - treat it as delete+add
 - additionally, when doing an imperfect application, apply a heuristic
to guess whether it's a move or a delete+add
 - or capture extra data saying it moved at the time.
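
As a rough illustration of the second option, here is a minimal Python sketch of a move-guessing heuristic layered on a plain line diff. The function name `classify_changes` is hypothetical, and exact block matching is an assumption made to keep it short; a real tool would need fuzzier matching. It is not how bzr or arch do it.

```python
import difflib


def classify_changes(old, new):
    """Guess which deleted blocks were really moves.

    Returns (moves, deletes, adds), each a list of tuples of lines.
    A deleted block counts as a move only if an identical block was added.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    deleted, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append(tuple(old[i1:i2]))
        if tag in ("insert", "replace"):
            added.append(tuple(new[j1:j2]))

    moves = []
    for block in list(deleted):  # iterate over a copy, we mutate below
        if block in added:
            moves.append(block)
            deleted.remove(block)
            added.remove(block)
    return moves, deleted, added
```

The diff itself still only knows delete+add; the guess is made afterwards, so it can be wrong in both directions (an edited block that moved is missed, and two unrelated identical blocks are paired).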

arch applied this 'capture extra data' approach to files and
directories using a fairly large hammer - an inode-like concept
'fileid'. But for in-file changes it uses the simple 'delete+add'
approach, with a little bit of contextual data stored to permit fuzzy
patches.

bzr copied the fileid approach, but we didn't - and I now think we
should have - review its overheads and failure modes. The places where
fileids fail are:
 - parallel imports [files with the same name and content appear
different and are hard to merge]
 - file copies [patches merged into a tree where the file has been
split in two won't follow the split]
 - file joins [the inverse of copies]
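
To make the parallel-import case concrete, here is a toy Python model of the fileid idea. Everything in it (`Tree`, `branch`, `path_clashes`, random hex ids) is hypothetical and chosen for illustration; it is not bzr's inventory format or API.

```python
import copy
import uuid


class Tree:
    """Toy inventory: maps a permanent file id to the file's current path."""

    def __init__(self):
        self.paths = {}

    def add(self, path):
        # The id is minted once, at add time, and never changes afterwards.
        file_id = uuid.uuid4().hex
        self.paths[file_id] = path
        return file_id

    def rename(self, file_id, new_path):
        self.paths[file_id] = new_path

    def branch(self):
        # A branch starts with the same ids as its parent.
        return copy.deepcopy(self)


def path_clashes(a, b):
    """Paths that both trees use, but for files with different ids."""
    by_path = {path: fid for fid, path in a.paths.items()}
    return sorted(
        path for fid, path in b.paths.items()
        if path in by_path and by_path[path] != fid
    )


# Rename on a branch: the id still identifies the same file.
trunk = Tree()
util_id = trunk.add("util.py")
feature = trunk.branch()
feature.rename(util_id, "lib/util.py")

# Parallel import: the same path added independently gets unrelated ids.
mine, theirs = Tree(), Tree()
mine.add("README")
theirs.add("README")
```

In this model a change addressed to `util_id` lands on `lib/util.py` in `feature` with no guessing, while `path_clashes(mine, theirs)` reports `README` as two distinct files even if their contents are identical. Copies and joins fail for a similar reason: one id can only ever point at one path.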

I think capturing extra data has worked very well on balance, but
we've paid for it - we store ~40 bytes more per changed file per
commit, and we have odd and hard-to-explain behaviours due to
fileids. It would be nice, I think, to explore two options:
 - capture the extra data in an alongside data structure, rather than
as a core part of the model. Consult it when it may be applicable but
work gracefully when it's not there. I think this would make 'oops I
moved without telling bzr' much easier to recover from - even several
commits later, while still preserving 'renames handle arbitrary
directory rearrangements'.
 - generalise fileids into a structure that can handle joins and
copies - and use that to handle parallel imports.

These two things are complementary, I think.
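
A minimal sketch of what the first option could look like, assuming trees are plain dicts of path to content and the side data is an optional dict of old path to new path. The names `find_renames` and `hints`, and the exact-content fallback, are illustrative assumptions, not a proposed bzr design.

```python
import hashlib


def _digest(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_renames(old_tree, new_tree, hints=None):
    """Pair removed paths with added paths.

    Recorded hints are consulted first; when none apply, fall back to
    matching files whose content is byte-for-byte identical.
    """
    hints = hints or {}
    removed = set(old_tree) - set(new_tree)
    added = set(new_tree) - set(old_tree)
    renames = {}

    # 1. Use the side data wherever it fits what actually changed.
    for old_path, new_path in hints.items():
        if old_path in removed and new_path in added:
            renames[old_path] = new_path
            removed.discard(old_path)
            added.discard(new_path)

    # 2. Work gracefully without it: unambiguous identical content.
    by_digest = {}
    for path in sorted(added):
        by_digest.setdefault(_digest(new_tree[path]), []).append(path)
    for old_path in sorted(removed):
        candidates = by_digest.get(_digest(old_tree[old_path]), [])
        if len(candidates) == 1:
            renames[old_path] = candidates.pop()
    return renames
```

Because the hints live outside the trees, a forgotten rename can be recorded several commits later by adding an entry to `hints`, without rewriting any history in this model.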

> bzr's rename support is so refreshingly solid, I never worry about
> renaming files, changing them to get everything working, renaming a
> directory after I renamed and/or changed some files, renaming them again
> when I change my mind in the middle of a fix, it all Just Works, which
> is what you want from a sccs feature.

Thanks, that's lovely to hear - I mean, we know, but it's also nice when
folk tell us :)

-Rob
-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services