[TRUSTY][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy

Rafael David Tinoco rafael.tinoco at canonical.com
Fri Aug 22 11:48:38 UTC 2014


[TRUSTY][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy
[UTOPIC][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy

SRU Justification:

 Impact: network namespace creation has had a performance regression since v3.5.
 Fix: my analysis, LKML discussion, upstream patch

 Testcase:

   http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
   http://people.canonical.com/~inaddy/lp1328088/parse.py
   http://people.canonical.com/~inaddy/lp1328088/charts/250.html
   http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

   Run make_fake_routers.sh 4000, then use parse.py to check that
   "fake routers" are being created at a good rate per second (you can
   compare the result with the generated charts).

Extra Information:

BugLink: https://bugs.launchpad.net/bugs/1328088

It was brought to my attention that network namespace creation scalability
regressed during kernel development.

The following scripts were used for all tests and chart generation:

http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py

I measured how many "fake routers" (created by the script above) could be added per second, from 0 up to the 4000-router mark. Using this script and a git bisect on the kernel tree, I was led to one specific commit causing the regression: #911af50 "rcu: Provide compile-time control for no-CBs CPUs". It introduced a performance and scalability regression (explained below) that still persists.
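As an illustration of the measurement only (this is not the actual parse.py, whose contents are not reproduced here), a minimal sketch of turning creation timestamps into a per-second rate could look like the following. The function name creation_rate and the timestamp input format are assumptions made for this example.

```python
from collections import Counter


def creation_rate(timestamps, window=1.0):
    """Count how many routers were created in each time window.

    `timestamps` is a hypothetical list of creation times in seconds,
    one entry per "fake router". The result is a list where element i
    is the number of routers created during window i, counted from the
    earliest timestamp.
    """
    if not timestamps:
        return []
    start = min(timestamps)
    # Bucket every timestamp into its window index.
    counts = Counter(int((t - start) // window) for t in timestamps)
    last = max(counts)
    # Windows with no creations are reported as 0, not skipped.
    return [counts.get(i, 0) for i in range(last + 1)]
```

A series that keeps falling as more routers exist is the kind of scalability loss described above.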

RCU-related code appeared to be responsible for the problem. I therefore tested every commit in v3.8..master that changed any of these files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h include/linux/rcupdate.h". The idea was to track the performance regression across rcu development. In the worst case, if the regression turned out not to be related to rcu, I would still have data to interpret the performance/scalability regression.
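The commit selection described above can be sketched with a plain git log restricted to those paths. This is only an assumed reconstruction of the selection step, not the tooling actually used; the helper names rcu_commit_cmd and list_rcu_commits are hypothetical.

```python
import subprocess

# Files named in the text above.
RCU_FILES = [
    "kernel/rcutree.c",
    "kernel/rcutree.h",
    "kernel/rcutree_plugin.h",
    "include/trace/events/rcu.h",
    "include/linux/rcupdate.h",
]


def rcu_commit_cmd(rev_range="v3.8..master", paths=RCU_FILES):
    """Build the git command listing commits in rev_range that touch paths."""
    # --reverse prints the oldest commit first, which suits testing in order.
    return ["git", "log", "--format=%H", "--reverse", rev_range, "--", *paths]


def list_rcu_commits(repo_dir):
    """Run the command inside a kernel checkout and return the commit hashes."""
    result = subprocess.run(
        rcu_commit_cmd(), cwd=repo_dir, check=True,
        capture_output=True, text=True,
    )
    return result.stdout.split()
```

Calling list_rcu_commits on a kernel tree returns one hash per commit to build and measure.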

All text below refers to 2 groups of charts generated during the study:

1) Kernel git tags from 3.8 to 3.14.
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html

2) Kernel git commits for rcu development (111 commits).
http://people.canonical.com/~inaddy/lp1328088/charts/250.html

Since results differed depending on the number of cpus and on how the no-cb cpus were configured, 3 kernel config options were used for every measurement:

- CONFIG_RCU_NOCB_CPU (disabled): nocbno
- CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
- CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone

Note: for the 1 cpu cases, nocbno, nocbnone and nocball behave the same, since with only 1 cpu there is no no-cb cpu.

Once the charts were generated, it was clear that NOCB_CPU_ALL (4 cpus) hurt the performance of the "fake routers" creation process, and that this regression persists up to the current upstream version. It was also clear that, after commit #911af50, having more than 1 cpu does not improve performance/scalability for netns; it makes it worse.

#911af50
...
+#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ pr_info("\tExperimental no-CBs for all CPUs\n");
+ cpumask_setall(rcu_nocb_mask);
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
... 

The following LKML discussion was started: https://lkml.org/lkml/2014/6/11/42. In it, Paul E. McKenney and Eric Biederman discuss the use of the rcu locking mechanism inside the network namespace creation functions. After some suggestions from Paul E. McKenney, we all agreed that rcu locking might not be the best fit for the network namespace workload. Since the first bad commit set the cpu mask for rcu callbacks to run on, we could see that running rcu callbacks on multiple cpus can make a large number of network namespace creations perform poorly.

Please consider including this cherry-pick (from upstream) in our master-next branch. I have already compiled and tested -master-next + this patch (including a small conflict resolution) with the same test case presented at the beginning of this email (add-apt-repository ppa:inaddy/sf00062980).

Thank you in advance.

Best Regards

-Rafael

Rafael David Tinoco
rafael.tinoco at canonical.com



