Signing snapshots

John A Meinel john at
Tue Jun 21 17:22:39 BST 2005

Aaron Bentley wrote:

> Martin Pool wrote:
> >On 21 Jun 2005, Aaron Bentley wrote:
> >>Hi all,
> >>
> >>Part of the plan for signing in bzr was to sign the snapshot, not the
> >>data generated from it (i.e. the revision store gzips or whatever).
> >As we discussed a little while ago, we primarily plan to actually sign
> >the revision, which includes by reference the inventory.  That doesn't
> >make any difference to asuffield's points though.
> Yes.  I think that needs to be done very carefully, though.  We want to
> be able to upgrade the signatures without invalidating old signatures.
> For example, if you sign a hash of
> mbp at, and the hash
> algorithm is later broken, it should be possible to re-sign that
> revision using a later hash, yet still be able to verify it using the old
> hash.  And it would also be nice to be able to remove the old hash
> without disturbing the new hash.
> So I guess what I'm saying is, when generating hashes, you should not
> pay any attention to hashes generated using a different algorithm.  If
> you're generating an SHA-1, you should only look at SHA-1 hashes, not
> MD5 or SHA-160.  If you're generating a SHA-160, you should only look at
> SHA-160 hashes for the file/inventory/etc.
> In light of this, I don't know what to make of the recently-added
> "revision_sha1" attribute for parent revisions.  I thought the notion
> was that we would sign the entire revision history.  This means that
> creating a sha-160 signature for a revision requires adding sha-160s to
> every ancestor revision.  I think this makes merge horizons impossible.
As soon as you modify the text, you invalidate the signature. Since we
might actually be signing the gzipped form, you need the actual bytes,
not something that is newly generated (because you might use a slightly
different compression level, or the algorithm has been tweaked between
versions).

For new revisions you can switch from using "revision_sha1" to
"revision_sha160". You still validate older revisions using sha1
*because you have to*: that is what was signed. Your new 160 hash does
not have a sha1 signature.
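
The point about needing the actual stored bytes can be illustrated with a
small sketch. It assumes Python's standard gzip and hashlib modules; the
sample text is made up and is not a real bzr revision. The same text
compressed at two levels decompresses identically, yet the stored bytes
(and so any hash or signature over them) need not match.

```python
import gzip
import hashlib

# Made-up stand-in for a stored revision text.
text = b"<revision id='example'>some revision text</revision>\n" * 50

# mtime=0 keeps the timestamp out of the header, so only the level varies.
fast = gzip.compress(text, compresslevel=1, mtime=0)
best = gzip.compress(text, compresslevel=9, mtime=0)

# The uncompressed content is identical either way.
same_text = gzip.decompress(fast) == gzip.decompress(best)

# Hashes over the compressed bytes are what a signature would cover.
sha_fast = hashlib.sha1(fast).hexdigest()
sha_best = hashlib.sha1(best).hexdigest()

print("same text after decompression:", same_text)
print("sha1 of level-1 bytes:", sha_fast)
print("sha1 of level-9 bytes:", sha_best)
```

So if the gzipped form is what gets signed, regenerating the gzip from
the text is not a safe substitute for keeping the original bytes.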

You don't have to update the old revisions if you don't want to, in
fact, once we start doing signatures, updating old revisions would
require all new signatures.
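
A minimal sketch of keeping hashes from different algorithms independent,
so that one can be added or removed without disturbing the other. The
record layout (a dict keyed by algorithm name) and the function names are
hypothetical, not bzr's actual format, and sha256 stands in for whatever
later hash might be chosen.

```python
import hashlib

def digest(algorithm, data):
    """Hex digest of data under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

def verify(data, recorded, algorithm):
    """Check data against the digest recorded for one algorithm only.

    Digests recorded under other algorithms are deliberately ignored.
    """
    if algorithm not in recorded:
        raise KeyError("no %s digest recorded" % algorithm)
    return digest(algorithm, data) == recorded[algorithm]

revision_bytes = b"example revision text\n"  # made-up content

# Hypothetical record: digests keyed by algorithm name.
recorded = {"sha1": digest("sha1", revision_bytes)}

# Later, a newer hash is added alongside; the sha1 entry is untouched.
recorded["sha256"] = digest("sha256", revision_bytes)
old_ok = verify(revision_bytes, recorded, "sha1")
new_ok = verify(revision_bytes, recorded, "sha256")

# Dropping the old hash does not disturb the new one.
del recorded["sha1"]
still_ok = verify(revision_bytes, recorded, "sha256")
```

Each digest here is computed from the data alone, never from another
digest, which is what lets the two coexist or be retired separately.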

> Also, I think signing snapshots makes sense because not every snapshot
> is a revision.  (Or is it?)  Requiring people to commit in order to
> produce changesets seems onerous.

What are you considering to be a snapshot? Are you thinking that a
changeset can be produced by comparing to the local working tree? I
agree with that, but I don't think it needs a permanently assigned
signature if it doesn't have a permanently assigned revision id.
If you want to send it as an email, just sign the changeset as it
goes into the email:
  bzr diff | gpg --clearsign
In my changeset plugin, I don't support working tree yet, as there are
fields I would like to get from the revision info. But you could just as
easily do the basic work for commit so that you have an effective
Revision entry, just one that is not committed to the repository. If you
don't associate a Revision id with it, then it is just free-floating.

> >>00:30 < asuffield> abentley: I would expect to find DoS attacks against
> >>           the inventory process and ways to slip files past it
> >>           which never appear in the inventory, and that's
> >>           without even thinking about it
> >I think that is less plausible with bzr than with arch; files which
> >aren't in the inventory simply don't exist from bzr's point of view,
> >and won't be considered for merging.
> Hmm.  True.  The files may not even be stored in a temporary directory,
> for ChangesetTrees or when merge is better-integrated.
> >>00:31 < asuffield> I would also expect to find implementation bugs that
> >>           were exploitable, probably suitable for remote
> >>           arbitrary code execution
> >This is certainly a good point; the verification should be done as
> >early as possible in the pipe, so that untrusted data gets to touch
> >the least code.
> >From this perspective the tla approach of writing the hash of the
> >files then signing the hashes is rather nice.
> Yes, this is what asuffield was pushing as the only sane option.  His
> case was that signing anything more abstract would always lead to holes.
> >All we need to do with
> >untrusted data is calculate its hash, and we can be reasonably sure
> >that there won't be vulnerabilities in the SHA-1 calculator.  There
> >might be some in the code that parses the checksum file or the gpg
> >signature.  On the other hand this approach flakes out of the more
> >important problem of evaluating whether the code is signed by a
> >meaningful key.
> Sorry, didn't parse that.

The idea with a detached signature is that you don't actually have to
assign any semantic meaning to the bytes. Just read in some bytes and
compute the sha hash, then check the signature. If you have a
--clearsign style signature, you have to at least read the file and look
for where the -----BEGIN and -----END lines are.
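
The "hash first, parse later" order can be sketched as follows. This is
only an illustration: load_verified is a hypothetical helper, the XML
content is made up, and the expected digest is assumed to come from a
checksum list whose signature has already been checked elsewhere.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

def load_verified(raw, expected_sha1):
    """Parse raw bytes only after their sha1 matches the trusted digest.

    expected_sha1 is assumed to come from an already-verified source.
    """
    actual = hashlib.sha1(raw).hexdigest()
    if not hmac.compare_digest(actual, expected_sha1):
        # Untrusted bytes never reach the parser.
        raise ValueError("sha1 mismatch, refusing to parse")
    return ET.fromstring(raw)

raw = b"<revision committer='someone'/>"      # made-up revision file
trusted_sha1 = hashlib.sha1(raw).hexdigest()  # stands in for a signed digest
root = load_verified(raw, trusted_sha1)
```

The only code that touches unverified input is the hash function, which
is the property being argued for above.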

> >One approach is to just put a GPG signature next to every revision
> >file, and verify that before reading the revision.  In that case the
> >only exploitable code is GPG itself.
> >  gpg --detach-sign .bzr/revision-store/thingthing
> I wonder whether there's a useful difference between trusted and
> authoritative?  E.g., I will trust John Meinel's signature to prove that
> data is not malicious, but I will only trust your signature to prove
> that the revision produced is actually
> mbp at

gpg distinguishes between an "UNTRUSTED GOOD" signature, a trusted good
one, and a bad one. So if you set your gpg keyring to trust me, you will
get trusted good on my signatures; if you leave out mpool, you will get
untrusted good on his signatures.
Are you asking for more than this?
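
As a rough sketch of how a front end might map gpg's answer onto those
three outcomes, the function below classifies the machine-readable lines
that gpg writes when run with --status-fd (for example
"gpg --status-fd 1 --verify SIGFILE DATAFILE"). The classify function and
the decision to treat only TRUST_FULLY and TRUST_ULTIMATE as "trusted"
are assumptions made for illustration; the key id and name in the usage
lines are made up.

```python
def classify(status_output):
    """Map gpg --status-fd output to a coarse verdict string."""
    keywords = set()
    for line in status_output.splitlines():
        # Status lines look like "[GNUPG:] KEYWORD args..."
        if line.startswith("[GNUPG:] "):
            parts = line.split()
            if len(parts) >= 2:
                keywords.add(parts[1])
    if "BADSIG" in keywords:
        return "bad"
    if "GOODSIG" not in keywords:
        return "unverified"
    # Assumption: only full or ultimate trust counts as "trusted".
    if keywords & {"TRUST_FULLY", "TRUST_ULTIMATE"}:
        return "trusted good"
    return "untrusted good"

# Made-up sample status output for each case.
trusted = ("[GNUPG:] GOODSIG 0123456789ABCDEF Example Signer\n"
           "[GNUPG:] TRUST_FULLY 0 pgp\n")
untrusted = ("[GNUPG:] GOODSIG 0123456789ABCDEF Example Signer\n"
             "[GNUPG:] TRUST_UNDEFINED 0 pgp\n")
bad = "[GNUPG:] BADSIG 0123456789ABCDEF Example Signer\n"

for sample in (trusted, untrusted, bad):
    print(classify(sample))
```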

It is arguable that the trust levels should be built into bzr rather
than using gpg-keyrings (I know that was a complaint against baz's trust
model). But gpg has done all of the hard work of implementing trust
networks, why not use it?

> >Perhaps the most interesting attack method is to mail someone a
> >malicious changeset, because this avoids the need to convince the
> >targeted user to access a malicious server.
> Sure.  The risk is limited to compromising the implementation of
> ChangesetTree, though.
> >Processing untrusted data is always a risk.  I propose a defence in
> >several lines:
> ...
> > - Authenticate data as soon as possible in processing it; make this
> >   give a reasonable level of security by default.
> I'd suggest that we retain that authentication data as well, so that we
> can determine later that data is signed by a compromised key.
Retain it in what form? Are you saying that when I pull your patch, I
should retain your signature? That is a good idea if possible.
With the multiple-ancestor code going in, it seems we might be pulling
other people's histories, and I certainly think we should also pull the
signatures if we have them available.
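
One way to picture retaining that data is an index from signing key to
the revisions it signed, so that a key later found to be compromised can
be looked up. SignatureLog, its methods, and the ids below are all
hypothetical; nothing here reflects how bzr actually stores signatures.

```python
from collections import defaultdict

class SignatureLog:
    """Remember which key signed which revision (illustrative only)."""

    def __init__(self):
        self._by_key = defaultdict(set)

    def record(self, revision_id, key_fingerprint):
        """Note that revision_id arrived signed by key_fingerprint."""
        self._by_key[key_fingerprint].add(revision_id)

    def signed_by(self, key_fingerprint):
        """Return revision ids signed by the given key, sorted."""
        return sorted(self._by_key.get(key_fingerprint, ()))

log = SignatureLog()
log.record("rev-1", "KEY-A")  # made-up ids and fingerprints
log.record("rev-2", "KEY-B")
log.record("rev-3", "KEY-A")

# If KEY-A is later reported compromised, these need a second look.
suspect = log.signed_by("KEY-A")
```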

> >Regardless of what signing method we use, it's possible that people
> >will create malicious changesets signed by trusted keys.  Then it just
> >comes down to whether the program has any vulnerabilities throughout.
> >We can aim for that but historically it's rarely achieved.
> Yeah, though the impossibility of certain kinds of overflows in Python
> does work to our advantage.  But bug-free code is not an attainable ideal.
> Often systems that need to handle untrusted data will have a way to drop
> privileges and/or use a sandbox.  Paranoia might lead to a StreamTree
> class that communicated with a chrooted bzr over a pipe.  Then it's just
> a few short steps to a smart server...
Interesting idea. I'm wondering if the effort to do a proper chrooted
bzr would be better spent on just validating user input properly.

> Aaron

