[RFC] history editing vs history presentation

Thu Jun 11 02:28:40 BST 2009

Points well taken.  Your idea of "shadow history" matches my second
thought quite well.  Namely no history is ever lost.  History edits
are purely additive and history which has been "overwritten" is simply
marked as superseded.  This works for all forms of history, not just
commit messages.
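
To make the additive model concrete, here is a minimal sketch of what I
have in mind.  It is not bzr code and does not use any bzr API; the
names ShadowLog, Revision, edit_message, presented and audit are all
hypothetical, chosen only for illustration.  An edit appends a new
revision that points at the one it supersedes, the default view shows
the replacement in place of the original, and the audit view shows
everything.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Revision:
    rev_id: str
    message: str
    # Id of the revision this one replaces, or None for an ordinary commit.
    supersedes: Optional[str] = None


@dataclass
class ShadowLog:
    """Hypothetical append-only store: edits add revisions, none are deleted."""

    revisions: List[Revision] = field(default_factory=list)

    def _ids(self) -> Set[str]:
        return {r.rev_id for r in self.revisions}

    def commit(self, rev_id: str, message: str) -> None:
        if rev_id in self._ids():
            raise ValueError("duplicate revision id: " + rev_id)
        self.revisions.append(Revision(rev_id, message))

    def edit_message(self, old_id: str, new_id: str, message: str) -> None:
        """Record an edit by appending a replacement; the original stays."""
        if old_id not in self._ids():
            raise KeyError(old_id)
        if new_id in self._ids():
            raise ValueError("duplicate revision id: " + new_id)
        self.revisions.append(Revision(new_id, message, supersedes=old_id))

    def presented(self) -> List[Revision]:
        """History-as-edited: each original is shown as its latest replacement."""
        replaced_by: Dict[str, Revision] = {
            r.supersedes: r for r in self.revisions if r.supersedes is not None
        }
        view = []
        for rev in self.revisions:
            if rev.supersedes is not None:
                continue  # replacements appear in place of their originals
            while rev.rev_id in replaced_by:
                rev = replaced_by[rev.rev_id]
            view.append(rev)
        return view

    def audit(self) -> List[Revision]:
        """The full record, including every superseded revision."""
        return list(self.revisions)
```

The point of the sketch is only that presentation and storage are
separate: presented() is what a default log would show, while audit()
never loses anything.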

On the other hand, I reject arguments based on the size of the repo.
It is perfectly acceptable for extreme use cases to carry matching
consequences.  If that becomes unbearable, then just as you pointed
out in a different context, the user can branch to a new location,
leaving behind the "shadow" history.

The idea here from my point of view is to make it easy for groups to
*choose* to work auditably, not to force it on anyone.  If that
requires me to sign certain operations, so be it.  It's my choice.

Thanks for reading.

-M

On 6/10/09, Robert Collins <robert.collins at canonical.com> wrote:
> On Tue, 2009-06-09 at 17:25 -0700, Maritza Mendez wrote:
>
>> This is a most articulate and comprehensive summary of the case for
>> "history editing."   I am sure this will help keep the complicated
>> discussion in focus.
>>
>> Disclosure: I am one of those people who prefer not to use history
>> editing.  What I mean by this is the potential abuses of history
>> editing outweigh the benefits for my team.  I say this even though I
>> have been greatly tempted to edit commit messages for profanity or
>> misleading info.  But I resist the temptation because it is a slippy
>> slope.  Yet I recognize that there are other legitimate uses for
>> history editing and I respect the opinions of people who want to see
>> this in bzr.
>>
>> I would ask that the following ideas be considered in the discussion
>> if possible:
>>
>> 1. Allow bzr-init to set a non-revocable policy (property of a format)
>> when a branch is created about whether (and what kinds) of history
>> editing will ever be allowed on that branch.  This sounds simple but
>> might not be.  Checkouts would have to inherit from their parents I
>> think, and successively unbinding and rebinding might cause ugly edge
>> cases.
>
> We have this today, though it's not irrevocable - the
> append_revisions_only setting.
>
> To make an irrevocable setting I think it would have to do a commit
> (otherwise just restoring from backup would be enough to undo the
> setting). And if it's in history, how can user branches tell that the
> setting shouldn't apply to them (because they are just the scratch area
> a dev works on at first). As Stephen says, making proofs about code
> seems to devolve to digital signatures on commits.
>
> For a proof, consider that any logic we put in bzr can be trivially
> bypassed by simply disabling the checks in a copy of bzr and then doing the
> history edit on another machine, and finally doing a copy to overwrite
> the database. If you prohibit copy and only permit semantic bzr commands
> then you limit the people that can do this in your organisation.
>
> Diminishing returns apply here. I think it is sufficient to have a
> setting which bzr honours, and which the users that can toggle it can be
> limited by regular system permissions. That lets the group of folk that
> can fiddle things be sharply restricted. If you're on a unix that allows
> mandatory auditing you can then also get an audit log of changes to the
> config file that an auditor would ask about.
>
> This isn't to say that what you are asking about isn't desirable in some
> circumstances, rather about choosing what parts are feasible and
> reasonable to do in bzr, and what should be done outside of bzr.
>
>> 2. Treat the history editing as a kind of meta-history which is 100%
>> auditable.  The default behavior of all bzr commands could be to show
>> the history-as-edited but history-aware commands would have new
>> options to display the "true" meta-history along with the edited
>> history.
>
> There are two basic ways around this sort of thing with different
> tradeoffs:
>  - history preserving history edits: create commits that derive from
>         existing commits and keep the old ones as a sort of 'shadow
>         history'. We'd probably want log to not show such shadow history
>         by default, and for merge not to traverse links in a like
>         manner. This is the sort of thing loom does. It costs in storage
>         size, particularly if used a lot. And the old history is hard to
>         discard if the edit was one where discarding is important (for
>         size or privacy concerns).
>  - keep a record outside of the history that the edit took place.
>         One way to do this is via a journal of the value of pointers such
>         as tags and branch heads. This lets you recover the old history
>         by providing some way to talk about an old version of such a
>         point. This is complicated by some changes being more important,
>         and not wanting very long logs. (Consider a history 10K items
>         long: that took 10K changes to get to its current value. A
>         single history edit 5K items in is lost in noise.) Old history
>         costs the same, or more than in the first case (it may be harder
>         to make it compress as well), but is easier to discard (remove
>         the item from the journal).
>
> I'd be inclined to provide the same set of editing tools with a flag to
> control how the old history is referenced (Not referenced, meta- e.g. a
> journal entry in a log, history- e.g. directly from each altered
> commit).
>
>> Finally, I would simply remind that some of us work in environments
>> where auditability is a serious requirement.  No system is perfect in
>> this way.  But if it can be shown that a system can be easily subverted
>> to conceal or misrepresent actual events -- even if the intent is
>> honorable -- then some organizations may not be able to use it for
>> some jobs.  Either of the above suggestions (or maybe a better idea)
>> which gives users the *option* of preserving auditability would be
>> very helpful.
>
> I suspect that a simple log of every value branch tips take, enforced by
> the server with VFS operations, local access and SFTP disabled would be
> sufficient.
>
> -Rob
>
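
Rob's suggestion of a simple log of every value a branch tip takes
could be sketched roughly as follows.  This is an illustration only,
assuming a server that is the sole writer of tips; TipJournal,
TipChange and set_tip are hypothetical names and not part of bzr.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TipChange:
    branch: str
    old_tip: Optional[str]  # None when the branch had no tip yet
    new_tip: str
    timestamp: float


class TipJournal:
    """Hypothetical append-only record of every value a branch tip takes."""

    def __init__(self) -> None:
        self._entries: List[TipChange] = []
        self._tips: Dict[str, str] = {}

    def set_tip(self, branch: str, new_tip: str,
                now: Optional[float] = None) -> TipChange:
        """Move a branch tip and journal the change; entries are never removed."""
        stamp = time.time() if now is None else now
        entry = TipChange(branch, self._tips.get(branch), new_tip, stamp)
        self._entries.append(entry)
        self._tips[branch] = new_tip
        return entry

    def history(self, branch: str) -> List[str]:
        """Every tip the branch has pointed at, oldest first."""
        return [e.new_tip for e in self._entries if e.branch == branch]
```

An old tip that a history edit replaced stays recoverable from
history(), which is all an auditor would need in order to ask about it.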
