Automatic commit squashing

James Tunnicliffe james.tunnicliffe at canonical.com
Thu Jun 16 16:16:09 UTC 2016


TLDR: Having guidelines rather than rules is good. Having a tool
mindlessly squashing commits can throw away valuable information.

I am a little confused as to why we need to squash stuff at all. Git
history isn't flat so if you don't want to see every commit in a
branch that has landed then you don't have to. I realise that I am a
scummy GUI user and I haven't looked at how to use the git CLI to do
this. I am not against squashing commits to provide a nice logical
history without the 'fix the fact that I am dumb and missed that
rename' noise, but I don't think squashing to a single commit is
always the right thing to do.

Once code is up for review I want the history to remain from the start
to the end of the review loop so if I ask someone to change something
I can actually see that change. I have no problem with those commits
being squashed pre-merge if they are minor changes to the originally
proposed code.

James

On 16 June 2016 at 08:25, Marco Ceppi <marco.ceppi at canonical.com> wrote:
> This is purely anecdotal, but on the ecosystem side for a lot of our
> projects I've tried to psuedo-enforce the "one commit", or really, a
> change/fix/feature per commit. Thereby allowing me to cherrypick for patch
> releases to stable (or revert a commit) with confidence and without a lot of
> hunting for the right grouping.
>
> With the advent of squashing in github I've dropped this push and use this
> unless the author has already done the logical grouping of commits, in which
> case I'll merge them myself, out of github, to avoid merge messages but
> retain their grouping (and potentially modify commit messages, to make it
> easier to identify the PR number and the bug number it fixes).
>
> I don't think the Juju core project can just carte blanche squash every pull
> request, but I do think it's up to the code authors to put an effort in to
> squashing/rewriting/managing their commits prior to submittion to make the
> code's history more observable and manageable overtime. Much in the same way
> you would document or comment blocks of code, commits are a window into what
> this patch does, if you want to keep your history, for reference, branching
> is cheap in git and you absolutely can.
>
> Happy to share more of the latter mentioned workflow for those interested,
> but otherwise just some 2¢
>
> Marco
>
> On Thu, Jun 16, 2016 at 10:12 AM John Meinel <john at arbash-meinel.com> wrote:
>>
>> Note that if github is squashing the commits when it lands into Master, I
>> believe that this breaks the ancestry with your local branch. So it isn't a
>> matter of "the history just isn't present in the master branch", but "it
>> looks like a completely independent commit revision, and you have no obvious
>> way to associate it with the branch that you have at all".
>>
>> It may be that git adds information to the commit ('this commit is a
>> rollup of hash deadbeef") in which case the git tool could look it up.
>>
>> I don't know the github UI around this. If I do "git merge --squash" then
>> it leaves me with an uncommitted tree with the file contents updated, and
>> then I can do "git commit -m new-content"
>>
>> And then if I try to do:
>> $ git branch -d test-branch
>> error: The branch 'test-branch' is not fully merged.
>> If you are sure you want to delete it, run 'git branch -D test-branch'
>>
>> Which indicates to me that it intentionally forgets everything about all
>> of your commits, which mean you need to know when it got merged so that you
>> can prune your branches, because the tool isn't going to track what has and
>> hasn't been merged.
>>
>> (I don't know about other people, but because of the delays of waiting for
>> reviews and merge bot bouncing things, it can take a while for something to
>> actually land. I often have branches that sit for a while, and it is easy
>> for me to not be 100% sure if that quick bugfix I did last week actually
>> made it through to master, and having 'git branch -d ' as a short hand was
>> quite useful.)
>>
>> Note that if we are going to go with "only 1 commit for each thing
>> landed", then I do think that using github's squash feature is probably
>> better than rebasing your branches. Because if we just rebase your branch,
>> then you end up with 2 revisions that represent your commit (the one you
>> proposed, and the merge revision), vs just having the "revision of master
>> that represents your changes rebased onto master". We could 'fast forward
>> when possible' but that just means there is a window where sometimes you
>> rebased your branch and it landed fast enough to be only 1 commit, vs
>> someone else landed a change just before you and now you have a merge
>> commit. I would like us to be consistent.
>>
>> For people who do want to throw away history with a rebase, what's your
>> feeling on whether there should be a merge commit (the change as I proposed
>> it) separate from the change-as-it-landed-on-master. I mean, if you're
>> getting rid of the history, the the change-as-I-proposed-it (and personally
>> tested it) doesn't really matter, right? And the bot tests the
>> change-as-it-landed.
>>
>> John
>> =:->
>>
>>
>> On Thu, Jun 16, 2016 at 4:20 PM, Ian Booth <ian.booth at canonical.com>
>> wrote:
>>>
>>>
>>>
>>> On 16/06/16 19:04, David Cheney wrote:
>>> > Counter suggestion: the bot refuses to accept PR's that contain more
>>> > than one commit, then it's up to the submitter to prepare it in any
>>> > way that they feel appropriate.
>>> >
>>>
>>> Please no. I do not want to be forced to alter my local history.
>>>
>>> I was hopeful that having the landing bot / github squash commits would
>>> satisfy
>>> those people what did not want to use git log --first-parent to present a
>>> sanitised view of commits, but allow me to retain the history in my
>>> branches
>>> locally so I could look back on what I did and when and how (if needed).
>>>
>>> If the default github behaviour is not sufficient, perhaps we can add
>>> some
>>> tooling in the merge bot to do the squashing prior to merging.
>>>
>>>
>>> > On Thu, Jun 16, 2016 at 6:44 PM, roger peppe
>>> > <roger.peppe at canonical.com> wrote:
>>> >> Squashed commits are nice, but there's something worth watching
>>> >> out for: currently the merge commit is committed with the text
>>> >> that's in the github PR, but when a squashed commit is made, this
>>> >> text is ignored and only the text in the actual proposed commit ends
>>> >> up
>>> >> in the history. This surprised me (I often edit the PR description
>>> >> as the review continues) so worth being aware of, I think.
>>> >>
>>> >>   cheers,
>>> >>     rog.
>>> >>
>>> >> On 16 June 2016 at 02:12, Menno Smits <menno.smits at canonical.com>
>>> >> wrote:
>>> >>> Hi everyone,
>>> >>>
>>> >>> Following on from the recent thread about commit squashing and commit
>>> >>> message quality, the idea of automatically squashing commit at merge
>>> >>> time
>>> >>> has been raised. The idea is that the merge bot would automatically
>>> >>> squash
>>> >>> commits for a pull request into a single commit, using the PR
>>> >>> description as
>>> >>> the commit message.
>>> >>>
>>> >>> With this in place, developers can commit locally using any approach
>>> >>> they
>>> >>> prefer. The smaller commits they make as they work won't be part of
>>> >>> the
>>> >>> history the team interacts with in master.
>>> >>>
>>> >>> When using autosquashing the quality of pull request descriptions
>>> >>> should get
>>> >>> even more scrutiny during reviews. The quality of PR descriptions is
>>> >>> already
>>> >>> important as they are used for merge commits but with autosquashing
>>> >>> in place
>>> >>> they will be the *only* commit message.
>>> >>>
>>> >>> Autosquashing can be achieved technically by either having the merge
>>> >>> bot do
>>> >>> the squashing itself, or by taking advantage of Github's feature to
>>> >>> do this
>>> >>> (currently in preview mode):
>>> >>>
>>> >>> https://developer.github.com/changes/2016-04-01-squash-api-preview/
>>> >>>
>>> >>> We need to ensure that the squashed commits are attributed to the
>>> >>> correct
>>> >>> author (i.e. not jujubot). I'm not sure what we do with pull requests
>>> >>> which
>>> >>> contain work from multiple authors. There doesn't seem to be an
>>> >>> established
>>> >>> approach for this.
>>> >>>
>>> >>> Thoughts?
>>> >>>
>>> >>> - Menno
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Juju-dev mailing list
>>> >>> Juju-dev at lists.ubuntu.com
>>> >>> Modify settings or unsubscribe at:
>>> >>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>> >>>
>>> >>
>>> >> --
>>> >> Juju-dev mailing list
>>> >> Juju-dev at lists.ubuntu.com
>>> >> Modify settings or unsubscribe at:
>>> >> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>> >
>>>
>>> --
>>> Juju-dev mailing list
>>> Juju-dev at lists.ubuntu.com
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>>
>>
>> --
>> Juju-dev mailing list
>> Juju-dev at lists.ubuntu.com
>> Modify settings or unsubscribe at:
>> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>
>
> --
> Juju-dev mailing list
> Juju-dev at lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju-dev
>



More information about the Juju-dev mailing list