Strawman: eliminating debdiffs

Colin Watson cjwatson at ubuntu.com
Thu Oct 9 13:59:25 BST 2008


I'm going to reply to a number of points made throughout this thread,
but am replying to the topmost article because I'd like to do it all at
once.

On Wed, Oct 01, 2008 at 10:27:49AM +0100, James Westby wrote:
> It's my opinion that our current process encourages our interaction
> with upstream projects on pushing bug fixes to be the wrong way
> round.
> 
> Currently the typical route is for a contributor to come up with a
> fix and seek sponsorship for a debdiff. Often they won't forward
> the patch first, and the sponsor will encourage them to do this.
> Either the sponsor blocks on doing that, which gives more delays
> and can lead to dropped fixes, or the sponsor encourages the contributor
> to do it in their comment when uploading, which can be easily ignored.
> 
> My proposal would be to discourage debdiffs for this sort of fix.

This is perhaps going off on a slight tangent, and anyone who knows me
will probably have heard versions of this rant before, but I have
noticed roughly the following dysfunctions with the current process:

  * People are encouraged to prepare debdiffs for trivial changes
    (description adjustments, typos, etc.). This encourages
    trivial-change uploads when they aren't really necessary, rather
    than batching up a number of changes at once, and it reinforces the
    bogus idea that the form of a debdiff is important even when the
    change is manifestly trivial ("hey guys, could you change 'foo' to
    'bar' please?"), thus wasting everyone's time.

  * Many sponsors go back and forth with the contributor quite a lot
    over trivial adjustments, such as changelog style. This is done with
    noble intentions, namely training up contributors, but the end
    result is often to put contributors off because they have to jump
    through hoops that aren't really all that important. Furthermore,
    this produces a skewed idea among contributors about what really is
    important. We should be punting things back to the contributor when
    the changes required are substantive, and just making minor
    adjustments ourselves for trivial things and informing the
    contributor about what we did. This is what I do myself.

  * Asking people to produce a fully-formed debdiff, including
    changelog, that can be applied and uploaded directly is often a net
    loss: if the package is maintained in revision control and there are
    already changes outstanding then you have to merge the changelog,
    which is usually no easier than just writing the dratted changelog
    entry yourself. Most upstream projects are far more sensible than
    this: if you supply a proposed changelog entry in the mail or bug
    report along with the patch then they'll include that, and otherwise
    they'll just write one. After all, the upstream maintainer has to
    understand the patch in order to merge it anyway!

  * Sometimes people send a perfectly good patch in an Ubuntu bug report
    and then somebody says "in order to get this included you should
    attach a debdiff instead". This is *a complete and utter waste of
    time*. A debdiff is just a kind of patch and any Ubuntu developer
    who can't figure out how to apply some slightly
    differently-formatted patch in a matter of seconds shouldn't be an
    Ubuntu developer. What bug triage processes recommend that people
    say this and how can we get them fixed? I know people who are
    perfectly competent developers who think we're mired in useless
    bureaucracy due to this and have been put off contributing.

  * As you say, there is little incentive for anyone to submit patches
    upstream because that's not the main thing they're measured on when
    attempting to join the Ubuntu development team.

  * Quite a number of people seem to spend considerable time preparing
    debdiffs for simple changes. If that really is something that gains
    credit in the MOTU application process then we should review the
    process. We should focus on quality over quantity; people who are
    capable of doing more complicated bug-fixing work will not have any
    trouble with simple things.

In short, based on my own experience and what I see on a day-to-day
basis in the bug tracking system, I don't see any benefit to the regime
of asking for debdiffs rather than just asking for a patch plus a
changelog entry. I think we would attract more contributors if we
weakened this requirement, without a substantial change in workload for
sponsors.


I really can't agree with Michael Bienia's comment [1] that sponsors
don't have time to "package" patches (e.g. dpatch/quilt/..., changelog,
etc.). The contributor should supply something that's properly explained
and that applies to the current version, certainly; but just blatting it
into the source package hardly takes any time at all. If the patch is at
all complicated then you'll have to spend more time thinking about
whether it's correct than you would applying it to the source package.

Furthermore, I think that anyone who spends a reasonable amount of time
reading other people's changes (as you naturally do when working on
packages produced by other people) will pick up reasonable changelog
habits. If they don't, we can always follow our usual practice of
mailing them and ubuntu-devel(-discuss) with commentary on how it could
be improved, and perhaps improving documentation. I don't think we need
to hamper the whole contribution process just for this. I don't remember
anyone ever teaching me how to write changelog entries and it seems to
have worked out OK just by osmosis ...

[1] https://lists.ubuntu.com/archives/ubuntu-devel/2008-October/026626.html


So that's the dead horse of debdiffs thoroughly beaten. Let's move on to
the upstream contribution workflow.

Scott Kitterman writes [2] that he feels that this will be more
intimidating for new contributors. While I can see his point, our
process is actually considerably more complicated than the process of
contributing to most upstream projects (once you invest a little capital
in figuring out how to do so; I acknowledge Kees Cook's point that this
is sometimes difficult but I don't think that's the case for the vast
majority of packages in Ubuntu).

Mostly, you mail somebody a patch or stick it in a bug tracking system
somewhere, and either they say "thanks, applied"; or you have a bit of
an argument about whether it's the right thing or suggestions for
improvements; or maybe they reject it and say they aren't interested; or
you get ignored. Not all of these possibilities are good outcomes but at
least they're relatively simple. In Ubuntu right now you get bounced
back and forward among different teams, you have to supply your patch in
an extremely specific format, you might have to get freeze clearance,
all sorts of stuff that essentially amounts to safeguards against us
doing the wrong thing.

If the patch is not fundamentally Ubuntu-specific in some way, I think
contributors will actually often be better served by sending patches
upstream first, and then documenting that they've done so in the
corresponding Ubuntu bug report and perhaps going through whatever
rigmarole is needed to get the patch applied in Ubuntu. For anything
non-trivial related to a package where we don't have significant
expertise in Ubuntu, upstream is the place where your patch is going to
get serious review and commentary. It's usually so much more pleasant to
work with developers who know the code you're working on really well.

Finally, if the contributor doesn't forward the patch upstream, then who
will? Generally the sponsor or somebody else in a similar position, and
in my experience forwarding patches written by other people often just
doesn't work. If the patch is at all difficult or contentious, then you
can end up in a position of explaining or even defending a patch that
you may not even be sure you agree with yourself. This can work if you
have a good, established relationship with upstream (like the patch flow
in the kernel, where Linus has a number of lieutenants he trusts to
accept patches for various subsystems; or in the case of some Debian
developers), but for the most part the best person to forward a patch is
the person who wrote it; and therefore, while it is clearly more work
for the contributor to do so, it's *right* to give them this extra work.

[2] https://lists.ubuntu.com/archives/ubuntu-devel/2008-October/026632.html


> Instead I propose the following:
> 
>   * The contributor finds a fix to a problem, and forwards the
>     patch upstream. They follow the progress of the bug and
>     work with upstream to get it committed.

Daniel Holbach mentioned https://wiki.ubuntu.com/Bugs/Upstream, which I
found interesting. It seems to me that if several of us got into the
habit of extending that with information about new upstream bug tracking
systems when using them, we'd quickly cover nearly everything of
interest (remembering that most small projects just accept patches by
e-mail to the maintainer). I just spent five minutes adding OpenSSH to
the list there.

Based on my own experiences, when I don't forward patches upstream it's
usually because I am being too laser-focused on Ubuntu rather than
because it's too difficult. Honestly, I regard this as a bug in myself,
and it's certainly something I'm going to try to fix. I can entirely
understand that others might find it too difficult, though, and it makes
sense for us to record as much helpful information as we can so that our
fellow developers don't have to rediscover the same information.

>   * For small fixes the process will stop there.
>   * Once the fix is committed, or some time has passed with no 
>     comment from upstream, if the contributor deems the fix
>     important enough to warrant an upload before the new upstream
>     is packaged they seek a sponsor.
>   * A sponsorship request is a description of the problem and the
>     fix and a pointer to the bug report and/or commit upstream.
>   * The sponsor grabs the patch and reviews it, with more scrutiny
>     if there has been no comment upstream. They drop it in the
>     package and add a changelog entry, which will be easy because
>     contributors will be encouraged to provide a lot of information
>     about the fix.

I generally like the patch tagging guidelines
(https://wiki.ubuntu.com/UbuntuDevelopment/PatchTaggingGuidelines); they
make a lot of sense when you're using a patch system. Of course, not
every package uses one (I avoid patch systems like the plague wherever
possible, and in native packages they're just silly). In that case I
think the changelog is quite adequate, and it doesn't have to get very
long. A simple change I committed while writing this mail was pretty
much the worst case:

  * ssh-copy-id: Strip trailing colons from hostname (closes: #226172,
    LP: #249706; thanks to Karl Goetz for nudging this along; forwarded
    upstream as https://bugzilla.mindrot.org/show_bug.cgi?id=1530).

(You could often reasonably abbreviate the upstream bug link in
changelogs too; in practice I think "GNOME #nnnnnn" would do just as
well as "http://bugzilla.gnome.org/show_bug.cgi?id=nnnnnn".)
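
To illustrate that abbreviation, here is a minimal sketch in Python of
how a script might shorten Bugzilla links in changelog text. It is not
an existing tool: the function name, the TRACKER_LABELS table, and the
"OpenSSH" label are assumptions made up for this example (only the
"GNOME #nnnnnn" form is suggested above), and it only recognises
Bugzilla-style show_bug.cgi URLs.

```python
import re

# Hypothetical mapping from upstream bug tracker hosts to short labels.
# The hosts come from the URLs in this mail; the "OpenSSH" label is an
# illustrative assumption, not an agreed convention.
TRACKER_LABELS = {
    "bugzilla.gnome.org": "GNOME",
    "bugzilla.mindrot.org": "OpenSSH",
}

# Matches Bugzilla-style bug URLs such as
# http://bugzilla.gnome.org/show_bug.cgi?id=12345
BUGZILLA_URL = re.compile(
    r"https?://(?P<host>[^/\s]+)/show_bug\.cgi\?id=(?P<id>\d+)"
)


def abbreviate_bug_links(text):
    """Replace known Bugzilla URLs in text with 'LABEL #id'."""

    def shorten(match):
        label = TRACKER_LABELS.get(match.group("host"))
        if label is None:
            # Unknown tracker: leave the full URL untouched.
            return match.group(0)
        return "%s #%s" % (label, match.group("id"))

    return BUGZILLA_URL.sub(shorten, text)
```

Links to trackers that are not in the table are left as full URLs, so
nothing is lost if the abbreviation would be ambiguous.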

Aside from issues where patch systems aren't used, I haven't seen any
real contention around the patch tagging guidelines. Regardless of the
rest of this thread, I think we should make that standard practice. I've
added it to https://wiki.ubuntu.com/SponsorshipProcess, and unless there
are well-founded objections over the next week I'll send a note to
ubuntu-devel-announce about that.

> One issue is that the contributor won't necessarily get record of their
> work on their /+packages page. The sponsor could use their details in
> the changelog footer, but some sponsors may not like that as they did
> the work of pulling the fix in to the package.

This issue is only going to become more extensive as it becomes more
common to have multiple changes in a single upload (which is already
prevalent in main; it's rare for me to integrate a change from a
contributor and upload it without also including something else),
particularly with packages in revision control. We're going to have to
wean ourselves off measuring people according to +packages sooner or
later.

> There are obviously changes for which there is no upstream, or where
> it's not appropriate to forward them, and these could be handled in
> the current fashion, but the default could be to encourage changes
> in the way described above.

I think it does make sense to handle genuinely important problems
differently; Michael Bienia makes a good point that some upstream
projects release only very infrequently. For bugs that are less severe
but not trivial it's a judgement call, usually IME depending on the
complexity of the patch and compatibility implications. There are
certainly plenty of cases where we can fix a bug locally without digging
ourselves into a hole if upstream take a different approach. Still, it
would be excellent discipline for people to get into the habit of
judging the difference.

In a nod to this, I left a note at the end of
https://wiki.ubuntu.com/Bugs/Upstream/OpenSSH about the perils of adding
new configuration options locally (which is the main problem that tends
to arise in practice with that particular project).


Thanks for starting this thread!

-- 
Colin Watson                                       [cjwatson at ubuntu.com]


