[RFC/Review] Prevent network namespace memory exhaustion

Daniel Lezcano daniel.lezcano at free.fr
Fri Mar 25 09:46:38 UTC 2011


On 03/25/2011 04:00 AM, Tim Gardner wrote:
> On 03/24/2011 09:41 AM, Stefan Bader wrote:
>> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/720095
>>
>> This series of patches tries to cover a problem that we caught by
>> enabling network namespaces (CONFIG_NET_NS) in Lucid, which was done
>> (although the feature was still marked experimental) to support
>> containerized use cases (and we would get some complaints by
>> removing it now).
>>
>> I tried to come up with a usable solution. Unfortunately, picking the
>> minimal set of patches that prevents the memory buildup also causes
>> the rate of connects (which in that case makes heavy use of network
>> namespace cloning) to go down noticeably.
>>
>> The second half would improve the situation slightly, but still not
>> as much as has been achieved in Maverick. And using the Maverick
>> backport causes other problems in the specific case for which the
>> bug is reported.
>>
>> To quantify that a bit better:
>>
>> Lucid current    10 connections per second
>> Lucid set 1       1 connection every 2 seconds
>> Lucid set 2       2 connections every 3 seconds
>> Maverick          2 connections per second
>>
>> There has not been a way to verify how bad the impact of the slowdown
>> would be in a real production environment. So it might be a viable
>> approach to limit changes to the first set, assuming that creating
>> and destroying namespaces is not the common use case we have.
>>
>> Should there be performance complaints, we still could think of
>> having a closer look at the second set (or more).
>>
>> So generally, does this sound like an approach we can SRU? And
>> second, more eyes looking at the set(s) would be appreciated.
>>
>> -Stefan
>>
>> Those are enough to prevent memory being eaten:
>> * net: Introduce unregister_netdevice_queue()
>> * net: Introduce unregister_netdevice_many()
>> * net: add a list_head parameter to dellink() method
>> * veth: Fix veth_dellink method
>> * veth: Fix unregister_netdevice_queue for veth
>> * net: Implement for_each_netdev_reverse.
>> * net: Batch network namespace destruction.
>>
>> Those seem to increase the rate of connects to vsftp (though
>> not as much as Maverick):
>> * net: Automatically allocate per namespace data.
>> * net: Add support for batching network namespace cleanups
>> * netns: Add an explicit rcu_barrier to unregister_pernet_{device|subsys}
>> * net: Use rcu lookups in inet_twsk_purge.
>> * tcp: fix inet_twsk_deschedule()
>> * net: Batch inet_twsk_purge
>>
>> The following changes since commit 054b34d3a38dc2a775ab722411b934b52a33707f:
>>    Brad Figg (1):
>>          UBUNTU: Ubuntu-2.6.32-31.60
>>
>> are available in the git repository at:
>>
>>    git://kernel.ubuntu.com/smb/ubuntu-lucid netnsbpv2
>>
>> Eric Dumazet (5):
>>        net: Introduce unregister_netdevice_queue()
>>        net: Introduce unregister_netdevice_many()
>>        net: add a list_head parameter to dellink() method
>>        veth: Fix veth_dellink method
>>        tcp: fix inet_twsk_deschedule()
>>
>> Eric W. Biederman (8):
>>        veth: Fix unregister_netdevice_queue for veth
>>        net: Implement for_each_netdev_reverse.
>>        net: Batch network namespace destruction.
>>        net: Automatically allocate per namespace data.
>>        net: Add support for batching network namespace cleanups
>>        netns: Add an explicit rcu_barrier to unregister_pernet_{device|subsys}
>>        net: Use rcu lookups in inet_twsk_purge.
>>        net: Batch inet_twsk_purge
>>
>>   drivers/net/macvlan.c            |    6 +-
>>   drivers/net/veth.c               |    6 +-
>>   include/linux/netdevice.h        |   12 ++-
>>   include/net/inet_timewait_sock.h |    6 +-
>>   include/net/net_namespace.h      |   32 ++++-
>>   include/net/rtnetlink.h          |    3 +-
>>   net/8021q/vlan.c                 |    8 +-
>>   net/8021q/vlan.h                 |    2 +-
>>   net/core/dev.c                   |  120 ++++++++++-----
>>   net/core/net_namespace.c         |  296 +++++++++++++++++++++++---------------
>>   net/core/rtnetlink.c             |   14 +-
>>   net/ipv4/inet_timewait_sock.c    |   47 ++++---
>>   net/ipv4/tcp_ipv4.c              |   11 +-
>>   net/ipv6/tcp_ipv6.c              |   11 +-
>>   14 files changed, 369 insertions(+), 205 deletions(-)
>>
>
> That's a honking big patch set for an SRU. It's not clear to me from the
> commit logs, but I assume they are all clean cherry-picks?
>
> I'm still not convinced that CONFIG_NET_NS=n isn't the best solution,
> despite the complaints that change might elicit. I'd like to hear from
> the consumers of network namespaces about how they are using the
> feature, and possible workarounds if it were to go away.

The users are heavily using all the namespaces and the cgroups through
Linux Containers (http://lxc.sourceforge.net).
There is no workaround if it is not set. If you remove this feature,
IMO people will really complain.

The patchset providing the batching was introduced to speed up
network namespace destruction. Before this patchset, destroying thousands
of network namespaces took a very long time (AFAIR, about 20 minutes).
With this patchset it takes 2 minutes.
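
To illustrate why batching helps, here is a toy cost model in Python. It
is only a sketch, not the kernel code: the function name
teardown_seconds, its parameters, and all the numbers are made up for
illustration. It assumes each batch pays one fixed synchronization wait
(standing in for something like an RCU grace period) and each namespace
pays its own cleanup work.

```python
import math


def teardown_seconds(n_namespaces, per_ns_work, sync_wait, batch_size=1):
    """Toy model (hypothetical): total time to destroy n_namespaces.

    Each batch pays one synchronization wait; each namespace pays
    per_ns_work for its own cleanup. batch_size=1 models the
    unbatched case, where every namespace waits on its own.
    """
    if n_namespaces < 0 or batch_size < 1:
        raise ValueError("n_namespaces must be >= 0 and batch_size >= 1")
    n_batches = math.ceil(n_namespaces / batch_size)
    return n_batches * sync_wait + n_namespaces * per_ns_work


# Hypothetical inputs, not measurements from any kernel.
unbatched = teardown_seconds(1000, per_ns_work=0.125, sync_wait=1.0)
batched = teardown_seconds(1000, per_ns_work=0.125, sync_wait=1.0,
                           batch_size=100)

print(f"unbatched: {unbatched:.0f} s, batched: {batched:.0f} s")
```

In this model the per-namespace work is unchanged; the saving comes
entirely from paying the synchronization wait once per batch instead of
once per namespace.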




