Changesets feature complete

John Arbash Meinel john at arbash-meinel.com
Thu May 25 14:17:09 BST 2006


Aaron Bentley wrote:
> John Arbash Meinel wrote:
> | Aaron Bentley wrote:
> |
> |>Validating the entire changeset before installing any of it would
> |>require creating ChangesetTrees from other ChangesetTrees, created from
> |>other ChangesetTrees.  While it's a fun exercise in recursion, I think
> |>it would probably be slow.
> |
> |
> | No, I'm fine with just installing what you can. As long as we don't
> | install bogus stuff, we can always install a little bit extra.
> 
> That's fine with me, but it does leave me perplexed about the
> verify-changeset command, because I'm pretty sure "verify" doesn't imply
> "install".

Verify-changeset was just a test command which would read in a changeset
and make sure that it parsed properly. In the past, I kept a sha hash
for the parent revisions, which actually meant that you could
reconstruct the entire Revision xml, and verify its hash. I *think* you
could do some minor inventory checks as well.

Now that we are using Testaments, I'm not planning on trying as hard to
be able to pre-validate the changeset before we install stuff. As long
as we add in signatures, and check them before we pull a new revision,
we can be reasonably guaranteed that the changeset is valid.

> 
> |>| I would default 'executable' to no, since that is the common case. Then
> |>| you don't need to add that information to every line 'added' line.
> |>| This is a simple patch:
> |>| === modified file 'bzrlib/changeset/read_changeset.py'
> |>| --- bzrlib/changeset/read_changeset.py
> |>|
> |>| +++ bzrlib/changeset/read_changeset.py
> |>| @@ -858,7 +858,7 @@
> |>|          if path in self._executable:
> |>|              return self._executable[path]
> |>|          else:
> |>| -            return self.base_tree.inventory[file_id].executable
> |>| +            return False
> |>
> |>The semantic in ChangesetTree is that if a value is not supplied, it is
> |>unchanged from the base revision.  This patch changes it so that, for
> |>this item alone, the semantic is that if it's not supplied, it's false.
> |>~ That means that we must supply it for every case where it is true, that
> |>is, we must note the executable bit for all unmodified executable files.
> |>~ I think this is the wrong place to do this.
> |
> |
> | That is what the semantic used to be. You changed it so that it always
> | set 'executable'.
> 
> No, do_meta was only invoked when the tree delta showed that the execute
> bit had changed.
> 
> | (I had a whole lot of lines with // executable:no). I
> | prefer the 'set it only if it changes', but second best is 'set it only
> | if true'.
> 
> Your two statements "I would default 'executable' to no, since that is
> the common case" and "I prefer the 'set it only if it changes'" seem
> inconsistent to me.
> 
> Perhaps you're not considering that 'add' is a change, but it is.  So if
> we're setting it for every change, not defaulting it to 'no', then it
> will be set for every added file.

Well, I guess I would like to say "set it for every change, but consider
the default to be 'no' for new files".
I think it is a very reasonable default, and in my head, when the file
didn't exist, it certainly wasn't executable. :)

> 
> |>I think a better approach is to do handle this default in processing the
> |>the action lines.  So when processing adds, modifies, and renames, if
> |>the execute bit is not supplied, we set it to False in the
> |>ChangesetTree, and if it is supplied, we set it according to the
> |>supplied value.
> |
> |
> | That is what my patch did.
> 
> No, your patch changed the implementation of ChangesetTree.  I was
> talking about changing only ChangesetReader._update_tree
> 
> |>Yes.  Once I realized what was going on, I decided to focus on getting
> |>it working and tested, before doing a lot of refactoring.
> |>
> |
> |
> | Sure. It is nice to have working code to refactor, rather than something
> | broken. :) Now we just need to get it to add revision signatures.
> 
> Yeah, I guess we should, so that mailing a changeset supplies all the
> info that publishing a branch does.  I'm not very enthusiastic about
> signing, especially when there's no verification.

Well, I have my 'signing' plugin, which adds the command 'bzr
verify-sigs'. So we do have a little bit of verification. But I agree,
we need to get it more into core bzr, so that it checks signatures
before accepting a new revision.

> 
> But I'd like to encourage you to focus on getting it into a mergeable
> state, rather than making it perfect.  I do want to get it into the core
> so it doesn't bitrot again.  Once it's in, presumably we can update it.
> 

Well, in my mind, the text format won't change much after the
refactoring. It is just changes to the internal code. So how about we
merge it, and just make sure that we are clear it needs to be cleaned up.


> |>That's fine with me, although I don't think the ChangesetReader easily
> |>supports the API described for the serializer.
> |
> |
> | I think for 0.7, I'm just going to have the 0.7 serializer create a
> | ChangesetReader. But I don't think it should be part of the public API.
> | So stuff like merge_changeset() shouldn't take a ChangesetReader object.
> 
> I don't really understand.  I guess I'll have to see what you come up with.
> 
> Aaron

The point of the serializer was just to have a factory which could be
used to read and write changesets. The api for a ChangesetSerializer was
*very* simple, just a read() and write() function.
Where when reading, the appropriate format is determined by reading the
first line of the changeset. And then a Serializer is instantiated, and
read() is called with the rest of the file.

As far as what 'read' would return, I wasn't fully clear on that myself.
At one point, I was thinking that it should be a set of ChangesetTrees,
where each tree was equivalent to a RevisionTree (it has the revision
info, and the inventory info). The implementation would be similar to
what we already have, where you have some basis_tree, and override a few
parts of it based on the patch.

However, I could also see returning a more complicated object than a
list, (similar to ChangesetReader/ChangesetInfo). This object would have
the information to create a ChangesetTree on request by revision_id.
This makes it easier to do late parsing of the changeset, which helps in
the case that you already have some of the changeset revisions.

In my refactoring, I was working on moving non-parsing stuff out of
ChangsetReader and into ChangesetInfo, and making that the return value
of read().

(I also changed the upfront merge command, so that it could grab a
changeset over a Transport).

John
=:->

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060525/de4badb3/attachment.pgp 


More information about the bazaar mailing list