[RFC] history editing vs history presentation

Wed Jun 10 13:25:35 BST 2009

On Tue, 2009-06-09 at 17:25 -0700, Maritza Mendez wrote:

> This is a most articulate and comprehensive summary of the case for
> "history editing."   I am sure this will help keep the complicated
> discussion in focus.
> 
> Disclosure: I am one of those people who prefer not to use history
> editing.  What I mean by this is the potential abuses of history
> editing outweigh the benefits for my team.  I say this even though I
> have been greatly tempted to edit commit messages for profanity or
> misleading info.  But I resist the temptation because it is a slippery
> slope.  Yet I recognize that there are other legitimate uses for
> history editing and I respect the opinions of people who want to see
> this in bzr.  
> 
> I would ask that the following ideas be considered in the discussion
> if possible:
> 
> 1. Allow bzr-init to set a non-revocable policy (property of a format)
> when a branch is created about whether (and what kinds of) history
> editing will ever be allowed on that branch.  This sounds simple but
> might not be.  Checkouts would have to inherit from their parents I
> think, and successively unbinding and rebinding might cause ugly edge
> cases.  

We have this today, though it's not irrevocable: the
append_revisions_only setting.
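
As a rough illustration of what an append-only policy amounts to (a
sketch under assumptions, not bzr's actual implementation): the branch
tip may only move to a revision whose left-hand (mainline) ancestry
still contains the old tip. The `parents` map, the revision ids and the
function names below are made up for the example.

```python
def mainline(parents, tip):
    """Yield tip and its left-hand ancestors, newest first."""
    while tip is not None:
        yield tip
        lefts = parents.get(tip, ())
        tip = lefts[0] if lefts else None


def is_append_only_move(parents, old_tip, new_tip):
    """True if moving the tip old_tip -> new_tip only appends revisions."""
    if old_tip is None:  # empty branch: any first revision is an append
        return True
    return old_tip in mainline(parents, new_tip)


# Hypothetical graph: a -> b -> c on the mainline; b2/c2 are a rewritten b/c.
parents = {"a": (), "b": ("a",), "c": ("b",), "b2": ("a",), "c2": ("b2",)}

print(is_append_only_move(parents, "b", "c"))   # True: plain append
print(is_append_only_move(parents, "c", "c2"))  # False: b and c were replaced
```

A check like this is only as strong as the permissions on whatever
stores the setting, which is the point of the paragraphs that follow.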

To make an irrevocable setting I think it would have to do a commit
(otherwise just restoring from backup would be enough to undo the
setting). And if it's in history, how can user branches tell that the
setting shouldn't apply to them (because they are just the scratch area
a dev works on at first)? As Stephen says, making proofs about code
seems to devolve to digital signatures on commits.

For a proof, consider that any logic we put in bzr can be trivially
bypassed by simply disabling the checks in a copy of bzr, doing the
history edit on another machine, and finally doing a copy to overwrite
the database. If you prohibit copy and only permit semantic bzr commands
then you limit the people that can do this in your organisation.

Diminishing returns apply here. I think it is sufficient to have a
setting which bzr honours, and to limit the users who can toggle it
with regular system permissions. That lets the group of folk that
can fiddle things be sharply restricted. If you're on a unix that allows
mandatory auditing you can then also get an audit log of changes to the
config file that an auditor would ask about.

This isn't to say that what you are asking about isn't desirable in some
circumstances; rather, it is about choosing what parts are feasible and
reasonable to do in bzr, and what should be done outside of bzr.

> 2. Treat the history editing as a kind of meta-history which is 100%
> auditable.  The default behavior of all bzr commands could be to show
> the history-as-edited but history-aware commands would have new
> options to display the "true" meta-history along with the edited
> history.

There are two basic ways around this sort of thing with different
tradeoffs:
 - history preserving history edits: create commits that derive from
        existing commits and keep the old ones as a sort of 'shadow
        history'. We'd probably want log to not show such shadow history
        by default, and for merge not to traverse links in a like
        manner. This is the sort of thing loom does. It costs in storage
        size, particularly if used a lot. And the old history is hard to
        discard if the edit was one where discarding is important (for
        size or privacy concerns).
 - keep a record outside of the history that the edit took place.
        One way to do this is via a journal of the value of pointers
        such as tags and branch heads. This lets you recover the old
        history by providing some way to talk about an old version of
        such a pointer. This is complicated by some changes being more
        important than others, and by not wanting very long logs.
        (Consider a history 10K items long: that took 10K changes to
        get to its current value. A single history edit 5K items in is
        lost in noise.) Old history costs the same as, or more than, in
        the first case (it may be harder to make it compress as well),
        but is easier to discard (remove the item from the journal).
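
To make the second option concrete, here is a minimal sketch of such a
pointer journal, assuming an in-memory list and an explicit `is_edit`
flag so that a history edit is not lost among thousands of ordinary
appends. The class, its methods and the revision ids are hypothetical;
nothing here is an existing bzr API.

```python
import time


class TipJournal:
    """Append-only record of every value a named pointer has taken."""

    def __init__(self):
        # Each entry: (timestamp, name, old_value, new_value, is_edit)
        self._entries = []

    def current(self, name):
        """Latest recorded value of the pointer, or None if never set."""
        for _, n, _, new, _ in reversed(self._entries):
            if n == name:
                return new
        return None

    def record(self, name, new_value, is_edit=False, now=None):
        """Log a pointer change; is_edit marks a history edit."""
        stamp = time.time() if now is None else now
        old_value = self.current(name)
        self._entries.append((stamp, name, old_value, new_value, is_edit))

    def value_at(self, name, index):
        """Value of the pointer after its index-th change (0-based)."""
        values = [new for _, n, _, new, _ in self._entries if n == name]
        return values[index]

    def edits(self, name):
        """Only the changes flagged as history edits, as (old, new) pairs."""
        return [(old, new) for _, n, old, new, e in self._entries
                if n == name and e]


journal = TipJournal()
journal.record("trunk", "rev-1")
journal.record("trunk", "rev-2")
journal.record("trunk", "rev-2-reworded", is_edit=True)  # a history edit
journal.record("trunk", "rev-3")

print(journal.edits("trunk"))
print(journal.value_at("trunk", 1))
```

Filtering on the flag is one way to address the "lost in noise"
problem; discarding old history would then mean dropping the matching
entry and letting the unreferenced revisions be collected.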

I'd be inclined to provide the same set of editing tools with a flag to
control how the old history is referenced (not referenced; meta, e.g. a
journal entry in a log; or history, e.g. directly from each altered
commit).
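
A sketch of how one editing tool might take such a flag, purely for
illustration: commits are plain dicts, and the names `OldHistoryRef`,
`reword` and the `supersedes` field are all invented here rather than
taken from bzr.

```python
from enum import Enum


class OldHistoryRef(Enum):
    NONE = "none"        # old revision is not referenced anywhere
    META = "meta"        # old revision id goes into an external journal
    HISTORY = "history"  # new commit points at the revision it replaces


def reword(commit, new_message, mode, journal):
    """Return an edited copy of commit; record the old one per mode."""
    new_commit = dict(commit, message=new_message, id=commit["id"] + "-edited")
    if mode is OldHistoryRef.META:
        journal.append((commit["id"], new_commit["id"]))
    elif mode is OldHistoryRef.HISTORY:
        new_commit["supersedes"] = commit["id"]
    return new_commit


journal = []
old = {"id": "rev-7", "message": "fix teh bug"}

via_meta = reword(old, "fix the bug", OldHistoryRef.META, journal)
via_history = reword(old, "fix the bug", OldHistoryRef.HISTORY, journal)
print(via_meta, journal)
print(via_history)
```

The editing logic is the same in every mode; only the bookkeeping about
the replaced revision differs.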

> Finally, I would simply remind that some of us work in environments
> where auditability is a serious requirement.  No system is perfect in
> this way.  But if it can be shown that a system can be easily subverted
> to conceal or misrepresent actual events -- even if the intent is
> honorable -- then some organizations may not be able to use it for
> some jobs.  Either of the above suggestions (or maybe a better idea)
> which gives users the *option* of preserving auditability would be
> very helpful.

I suspect that a simple log of every value that branch tips take,
enforced by the server, with VFS operations, local access and SFTP
disabled, would be sufficient.

-Rob