computing the resulting inventory from the merge changeset

Wed Dec 14 23:09:37 GMT 2005

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

duchier at ps.uni-sb.de wrote:
> Thank you for your comments.  I believe I understand the details
> rather better now.  I revised my code and I attach the new version
> below (not a patch), but just for the function
> apply_changeset_to_inventory.

Yes, that's a nice way to read it.

> Please, let me know if this one looks correct 

It does not look correct, because you're still doing things in more than
two phases.  In my view, the form used in the filesystem manipulation is
the simplest way to ensure it is correct.

Here are the two phases as I see them:
1. remove the old names, in children-to-parents order
2. insert the new names, in parents-to-children order

By "remove", I mean, either "delete", or "move out of the way" depending
on whether your ultimate purpose is to delete or rename the entry.
Similarly, by "insert", I mean "create" or "move into place", depending
on whether or not the entry already exists.

The reason for doing removal first is because the Inventory maintains
the invariant that no two entries may have the same name, just as a
POSIX filesystem does.  Removing the old names first means that there is
no time when an old name and a new name that have the same value will
coexist.

The reason for doing the removal in children-to-parents order is that
child entries become unreachable (or, in POSIX, the removal fails) if
you operate on their parents first.

The reason for doing the insertion in parents-to-children order is
because both Inventory and a POSIX filesystem require that the parent
exist before a child can be inserted into it.

(it does pass the entire
> test suite, plus the 2 new tests that you suggested).

Thanks for your work on improving the test cases.  I actually was
suggesting another set of cases.  I've provided details below.

One thing I should note is that when I say "this case requires", it is
strictly true for filesystem manipulation, when you cannot operate in
terms of inventory IDs.  For your case, more aggressive use of the
"saved" dict might work, but I still feel my two-phase approach is
simpler and more obviously correct.  You'll note that cases 1 and 4 are
inversions of each other, as are 2 and 3.  This is one of the reasons I
feel the process should be symmetrical.

Case #1:
$ touch foo
$ bzr add foo
$ bzr commit
$ bzr mv foo bar
$ touch foo
$ bzr add foo
$ bzr commit

This case requires that you must not do creates before move-into-place

Case #2:
$ touch foo
$ bzr add foo
$ bzr commit
$ bzr mkdir bar
$ bzr add bar
$ bzr mv foo foo/bar
$ bzr commit

This case requires that you must not do move-into-place before creates,
and that you must not do children after parents.

Therefore, you must do phase 1. above

Case #3:
$ mkdir foo
$ touch foo/bar
$ bzr add foo/bar
$ bzr commit
$ bzr mv foo/bar bar
$ rmdir foo
$ bzr commit

This case requires that you must not do deletes before
move-out-of-the-way, and that you must not do children after parents.

Case #4:
$ touch foo
$ touch bar
$ bzr add foo bar
$ bzr commit
$ rm foo
$ bzr mv bar foo
$ bzr commit

This case requires that you must not do move-out-of-the-way before deletes.

Therefore, you must do phase 2, above.

Since you are not doing phase 2, I would expect this case to fail with
your current code.  To work around it, you'd need ensure that all
descendants of entries in "saved" had their own entries in "saved", then
do deletions on both "new_inv" and "saved", depending where the entries
were.

Hope that explanation's neither too obscure nor too simple.  It is a
tricky line to walk.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDoKYx0F+nu1YWqI0RAsiBAJ9JosG7ANhBVyB8PwpmqCwQhQ/tHwCfWEJb
NgUvEiPqoh6+UEvuCOla3LU=
=v5j2
-----END PGP SIGNATURE-----