[brisbane-core/MERGE] fix create_by_apply_delta() when rename+modify in same delta
John Arbash Meinel
john at arbash-meinel.com
Thu Mar 5 15:02:05 GMT 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
ian.clatworthy at internode.on.net wrote:
> Yet another create_by_apply_delta() fix. A good inventory delta was generating
> a bad chkmap delta for the parent_id_basename_to_file_id CHKMap. I think
> it's worth refactoring create_by_apply_delta() so that the conversion from
> an inventory delta to chkmap deltas can be separately tested. That can come
> later though, as can a pile of tests for the many scenarios.
>
> Together with the earlier brisbane-core fixes I posted, this lets my
> yet-to-be-released version of fastimport work on repository formats
> supporting CHKInventories, at least for simple import streams. Hooray.
> I wonder if they have also fixed the upgrade issues I was having?
> I'll test that tomorrow.
>
> Ian C.
>
I'm trying to understand this change. I can sort of see where it is
coming from, but it feels like it is covering up a different bug.
Specifically, an 'old' entry should never be modified. so given the
entries in an inventory delta:
for old_path, new_path, file_id, entry in inventory_delta:
...
I'm 75% sure that self[file_id].basename must
== osutils.basename(old_path)
old_entry = self[file_id]
# If a file is renamed then modified all in the one
# delta, then using the basename from the entry is
# wrong and leads to an incorrect delta being
# passed to the CHKMap(). We need to use the basename
# from the old_path instead ...
old_basename = osutils.basename(old_path)
old_key = self._parent_id_basename_key(old_entry,
basename=old_basename)
Or is the 'inventory_delta' for an item which is renamed and then
modified not being collapsed into a single 'inventory_delta' entry?
To put it a different way, I would make the check:
old_entry = self[file_id]
old_basename == osutils.basename(old_path)
assert old_entry.name == old_basename
So for now, I'm going to hold off on merging this change until we can
sort out whether my understanding is wrong, or if there is something
wrong with the inventory delta.
- From what it sounds like, your inventory delta is looking something like:
('foo/bar', 'foo/did', 'bar-id', IFile(sha1='1234', p_id='foo-id'))
('foo/did', 'foo/did', 'bar-id', IFile(sha1='5678', p_id='foo-id'))
Because there were 2 entries for the same file_id, that originally was
getting turned into a double parent_id_to_basename_map delta:
(('foo-id', 'bar'), ('foo-id', 'did'), 'bar-id')
(('foo-id', 'bar'), ('foo-id', 'did'), 'bar-id')
And with your change that becomes:
(('foo-id', 'bar'), ('foo-id', 'did'), 'bar-id')
(('foo-id', 'did'), ('foo-id', 'did'), 'bar-id')
And the latter one is stripped because it doesn't change anything.
Consider, though, what happens if you have a rename into a new
directory, rather than just a rename within a directory:
('foo/bar', 'bla/did', 'bar-id', IFile(sha1='1234', p_id='bla-id'))
('bla/did', 'bla/did', 'bar-id', IFile(sha1='5678', p_id='bla-id'))
In that case, with your new code it would be:
(('foo-id', 'bar'), ('bla-id', 'did'), 'bar-id')
(('foo-id', 'did'), ('bla-id', 'did'), 'bar-id')
^^^^^^^^ bad mojo, as ('foo-id', 'did') never existed.
So my gut feeling is that an inventory-delta needs to only talk about a
given file_id 1 time. If that isn't how it is defined, then we need to
be more careful. Because in that case 'old_path' is no longer the 'path
in the previous inventory', it is now the 'path in the inventory just
until this change occurs'. So we actually need to be looking things up
in the *new* inventory, not the old one.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkmv6W0ACgkQJdeBCYSNAANsAQCeOVuLHckNaKSBm46nyP22AgZl
K8IAnAoh7Zll0HwaimMEWtcKsNhXFtjO
=Ym2W
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list