Lucid SRU - UBUNTU: SAUCE: netns: Add quota for number of NET_NS instances.

Fri Dec 2 01:05:04 UTC 2011

On 12/01/2011 03:39 PM, Brad Figg wrote:
> On 12/01/2011 01:48 PM, Tim Gardner wrote:
>> Please consider this (untested) patch for inclusion in Lucid. See the
>> discussion in http://bugs.launchpad.net/bugs/790863 for arguments
>> proposing to restore CONFIG_NET_NS.
>>
>> I'll post a test kernel to the bug in awhile.
>>
>> One of the issues I have with this patch is that it appears that any
>> consumer of network name spaces will have to initially write a
>> non-zero value to netns_max before _any_ name spaces can be
>> successfully allocated. If copy_net_ns() fails in
>> create_new_namespaces(), then it seems the whole allocation is buggered.
>>
>> rtg
>>
>>
>
> Tim,
>
> If you follow the thread that starts at:
> http://www.spinics.net/lists/netdev/msg180263.html
> you will see that Tetsuo actually proposed a modified
> version of this patch: http://www.spinics.net/lists/netdev/msg180360.html.
>
> Brad

I did see the second version, but it's more complicated and I'm not 
convinced that it solves the OOM better than the first (simpler) patch.

This is my (perhaps incorrect) model of the problem.

Consider the workload that first brought this issue to light. vsftpd 
receives a login request for which it forks a process and indirectly 
allocates a network name space. Eventually the login process terminates 
and synchronously frees all of its resources except the network 
namespace (which is now on an RCU list to be freed later). Now imagine 
this happening at a sufficiently high rate that the lower priority RCU 
thread never gets to run and free its list elements. Eventually all slab 
space is exhausted and the OOM killer cranks up.

So, the first patch simply synchronously returns an error if the number 
of network name spaces exceeds the specified maximum. This happens 
within the context of the fork, the login process is aborted, and the 
remote user is told to buzz off.
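
As a rough illustration of that fail-fast behaviour, here is a minimal 
userspace sketch. It is not the patch itself: the names netns_count, 
netns_try_get() and netns_put(), and the choice of -ENOMEM as the error, 
are assumptions made for the example. Only netns_max comes from the 
discussion above.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userspace model; not taken from the actual patch. */
static atomic_int netns_count;  /* namespaces allocated and not yet freed */
static int netns_max = 2;       /* quota; a value of 0 rejects every request */

/*
 * Reserve one namespace slot. Returns 0 on success or -ENOMEM (an
 * assumed error code) when the quota is reached. Never blocks.
 */
static int netns_try_get(void)
{
    if (atomic_fetch_add(&netns_count, 1) >= netns_max) {
        atomic_fetch_sub(&netns_count, 1);  /* undo the reservation */
        return -ENOMEM;
    }
    return 0;
}

/* Release a slot; in the kernel this would happen at deferred-free time. */
static void netns_put(void)
{
    atomic_fetch_sub(&netns_count, 1);
}
```

In this model, with netns_max at 2, a third netns_try_get() fails 
immediately until netns_put() runs. With netns_max at 0 every call 
fails, which matches the concern about having to write a non-zero value 
first.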

With the second patch, once the maximum number of network name spaces 
has been reached, the fork _waits_ until a name space is free (having 
already consumed some non-zero amount of task structure memory). In the 
meantime login requests continue to pour in and vsftpd attempts to fork 
still more processes which consume still more memory. If the login 
attempt rate is sufficiently high, then I think the forks will 
eventually start to fail when they cannot allocate task structure memory.
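
To make the comparison concrete, here is a small tick-based model of the 
two policies. It is only a sketch under stated assumptions: the 
simulate() function, its parameters, and the idea that a fixed number of 
namespaces is freed per tick are all hypothetical simplifications, not 
measurements or code from either patch.

```c
/* Hypothetical tick-based model; all names and parameters are illustrative. */
enum policy { POLICY_FAIL, POLICY_WAIT };

struct result {
    long rejected;  /* logins refused immediately (fail-fast policy) */
    long waiting;   /* forks still blocked at the end, each holding task memory */
};

static struct result simulate(enum policy p, int ticks, int arrivals,
                              int frees, int netns_max)
{
    struct result r = {0, 0};
    long in_use = 0;

    for (int t = 0; t < ticks; t++) {
        /* Deferred frees release at most `frees` namespaces per tick. */
        in_use -= (in_use < frees) ? in_use : frees;

        /* Blocked forks (if any) plus new arrivals compete for free slots. */
        long demand = r.waiting + arrivals;
        long room = netns_max - in_use;
        long granted = (demand < room) ? demand : room;
        in_use += granted;

        if (p == POLICY_WAIT)
            r.waiting = demand - granted;    /* backlog keeps its memory */
        else
            r.rejected += demand - granted;  /* refused, memory released */
    }
    return r;
}
```

With 10 arrivals and 5 frees per tick against a quota of 50, both 
policies turn away the same excess in this model once the quota is hit. 
The waiting policy carries that excess as a growing backlog of blocked 
forks, while the fail-fast policy counts it as rejections.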

Of course, with either patch, failure recovery is deferred to user space, 
but I'm not convinced that the end result is any different.

With both patches, vsftpd fails a login attempt when there are 
insufficient resources, so why not use the simpler approach?

rtg
-- 
Tim Gardner tim.gardner at canonical.com