ACK: [Oneiric/Natty][SRU][PATCH 1/1] net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy

Wed Sep 21 19:48:05 UTC 2011

On 09/21/2011 01:28 PM, Leann Ogasawara wrote:
> Hi All,
>
> BugLink: http://bugs.launchpad.net/bugs/843892
>
> == SRU Justification ==
> It's been reported that destroying a container causes a kernel Oops and
> will hang the system.  The issue is reproducible.  The user has
> successfully tested the patch against Oneiric and can confirm the Oops
> no longer occurs when using a patched Oneiric kernel.  The patch has
> been submitted upstream (CC'd upstream stable) and is currently queued
> in the -mm tree.  It also appears it will hit the 3.2 merge window.
> Please consider for SRU against Oneiric and Natty.
>
> == Impact ==
> The commit message of the patch notes that this will likely affect
> 2.6.26 and newer kernels, ie affects Lucid, Maverick, Natty, Oneiric.
> However, due to the nature of our SRU process, the bug reporter is
> likely only able to readily test Natty and Oneiric.  Thus I'm only
> submitting this for SRU against Oneiric and Natty.
>
> == Test Case ==
> See reproducer in comment #6 of bug report
>
> Thanks,
> Leann
>
>  From ba1f54e39f8ec3a7db00dd2844dc29988638c5a5 Mon Sep 17 00:00:00 2001
> From: Alex Bligh<alex at alex.org.uk>
> Date: Wed, 14 Sep 2011 13:43:36 -0700
> Subject: [PATCH] net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy
>
> BugLink: http://bugs.launchpad.net/bugs/843892
>
> Problem:
>
> A repeatable Oops can be caused if a container with networking
> unshared is destroyed when it has nf_conntrack entries yet to expire.
>
> A copy of the oops follows below. A perl program generating the oops
> repeatably is attached inline below.
>
> Analysis:
>
> The oops is called from cleanup_net when the namespace is
> destroyed. conntrack iterates through outstanding events and calls
> death_by_timeout on each of them, which in turn produces a call to
> ctnetlink_conntrack_event. This calls nf_netlink_has_listeners, which
> oopses because net->nfnl is NULL.
>
> The perl program generates the container through fork() then
> clone(NS_NEWNET). I does not explicitly	set up netlink
> explicitly set up netlink, but I presume it was set up else net->nfnl
> would have been NULL earlier (i.e. when an earlier connection
> timed out). This would thus suggest that net->nfnl is made NULL
> during the destruction of the container, which I think is done by
> nfnetlink_net_exit_batch.
>
> I can see that the various subsystems are deinitialised in the opposite
> order to which the relevant register_pernet_subsys calls are called,
> and both nf_conntrack and nfnetlink_net_ops register their relevant
> subsystems. If nfnetlink_net_ops registered later than nfconntrack,
> then its exit routine would have been called first, which would cause
> the oops described. I am not sure there is anything to prevent this
> happening in a container environment.
>
> Whilst there's perhaps a more complex problem revolving around ordering
> of subsystem deinit, it seems to me that missing a netlink event on a
> container that is dying is not a disaster. An early check for net->nfnl
> being non-NULL in ctnetlink_conntrack_event appears to fix this. There
> may remain a potential race condition if it becomes NULL immediately
> after being checked (I am not sure any lock is held at this point or
> how synchronisation for subsystem deinitialization works).
>
> Patch:
>
> The patch attached should apply on everything from 2.6.26 (if not before)
> onwards; it appears to be a problem on all kernels. This was taken against
> Ubuntu-3.0.0-11.17 which is very close to 3.0.4. I have torture-tested it
> with the above perl script for 15 minutes or so; the perl script hung the
> machine within 20 seconds without this patch.
>
> Applicability:
>
> If this is the right solution, it should be applied to all stable kernels
> as well as head. Apart from the minor overhead of checking one variable
> against NULL, it can never 'do the wrong thing', because if net->nfnl
> is NULL, an oops will inevitably result. Therefore, checking is a reasonable
> thing to do unless it can be proven than net->nfnl will never be NULL.
>
> Check net->nfnl for NULL in ctnetlink_conntrack_event to avoid Oops on
> container destroy
>
> Signed-off-by: Alex Bligh<alex at alex.org.uk>
> Cc: Patrick McHardy<kaber at trash.net>
> Cc: David Miller<davem at davemloft.net>
> Cc:<stable at kernel.org>
> Signed-off-by: Andrew Morton<akpm at linux-foundation.org>
> (applied from -mm http://marc.info/?l=linux-mm-commits&m=131603308900694&w=2)
>
> Signed-off-by: Leann Ogasawara<leann.ogasawara at canonical.com>
> ---
>   net/netfilter/nf_conntrack_netlink.c |    5 +++++
>   1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> index 482e90c..0790d0a 100644
> --- a/net/netfilter/nf_conntrack_netlink.c
> +++ b/net/netfilter/nf_conntrack_netlink.c
> @@ -570,6 +570,11 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item)
>   		return 0;
>
>   	net = nf_ct_net(ct);
> +
> +	/* container deinit, netlink may have died before death_by_timeout */
> +	if (!net->nfnl)
> +		return 0;
> +
>   	if (!item->report&&  !nfnetlink_has_listeners(net, group))
>   		return 0;
>

Acked-by: Tim Gardner <tim.gardner at canonical.com>
-- 
Tim Gardner tim.gardner at canonical.com