[TRUSTY][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy
Rafael David Tinoco
rafael.tinoco at canonical.com
Fri Aug 22 11:48:38 UTC 2014
[TRUSTY][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy
[UTOPIC][PATCH] namespaces: Use task_lock and not rcu to protect nsproxy
SRU Justification:
Impact: network namespace creation has had a performance regression since v3.5.
Fix: my analysis, LKML discussion, upstream patch
Testcase:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
By running make_fake_routers.sh 4000 and using parse.py you can check whether
"fake routers" are being created at a good rate per second (and you can
compare the result with all the generated charts).
Extra Information:
BugLink: https://bugs.launchpad.net/bugs/1328088
It was brought to my attention that the scalability of network namespace
creation regressed during kernel development.
The following scripts were used for all the tests and chart generation:
http://people.canonical.com/~inaddy/lp1328088/make_fake_routers.sh
http://people.canonical.com/~inaddy/lp1328088/parse.py
I measured how many "fake routers" (created by the above script) could be added per second, from 0 up to the 4000-router mark. Using this script and a git bisect on the kernel tree, I was led to one specific commit causing the regression: #911af50 "rcu: Provide compile-time control for no-CBs CPUs". It introduced a performance scalability regression (explained below) that still persists.
RCU-related code appeared to be responsible for the problem. Therefore, every commit in v3.8..master that changed any of these files: "kernel/rcutree.c kernel/rcutree.h kernel/rcutree_plugin.h include/trace/events/rcu.h include/linux/rcupdate.h" was tested. The idea was to track the performance regression across rcu development. In the worst case, if the regression turned out not to be related to rcu, I would still have data with which to interpret the performance/scalability regression.
All the text below refers to 2 groups of charts generated during the study:
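As an illustration of the measurement only, here is a minimal Python sketch of how a creation rate per window of routers could be computed. It is not the parse.py linked above; the function name, the input format (one completion timestamp in seconds per created router) and the window size are all assumptions made for this example.

```python
def rates_per_window(start_time, completion_times, window):
    """Return (routers_created_so_far, routers_per_second) for each full window.

    Hypothetical helper: `completion_times` holds one timestamp (in seconds)
    per created router, in creation order; `start_time` is when the run began.
    """
    rates = []
    prev = start_time
    for end in range(window, len(completion_times) + 1, window):
        t_end = completion_times[end - 1]
        elapsed = t_end - prev
        # Guard against a zero-length interval from coarse timestamps.
        rate = window / elapsed if elapsed > 0 else float("inf")
        rates.append((end, rate))
        prev = t_end
    return rates


if __name__ == "__main__":
    # Made-up timestamps: the second window takes twice as long as the first.
    times = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    for created, rate in rates_per_window(0.0, times, window=4):
        print(created, rate)
```

A rate that falls as the number of created routers grows is the kind of scalability loss the charts are meant to expose.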
1) Kernel git tags from 3.8 to 3.14.
http://people.canonical.com/~inaddy/lp1328088/charts/250-tag.html
2) Kernel git commits for rcu development (111 commits).
http://people.canonical.com/~inaddy/lp1328088/charts/250.html
Since the results differed depending on the number of cpus and on how the no-cb cpus were configured, 3 kernel config options were used for every measurement:
- CONFIG_RCU_NOCB_CPU (disabled): nocbno
- CONFIG_RCU_NOCB_CPU_ALL (enabled): nocball
- CONFIG_RCU_NOCB_CPU_NONE (enabled): nocbnone
Note: For the 1 cpu cases, nocbno, nocbnone and nocball behave the same, since with only 1 cpu there is no no-cb cpu.
After the charts were generated, it was clear that NOCB_CPU_ALL (4 cpus) hurt the performance of the "fake routers" creation process, and that this regression persists up to the current upstream version. It was also clear that, after commit #911af50, having more than 1 cpu does not improve performance/scalability for netns; it makes it worse.
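To make the three labels concrete, the following Python sketch maps the contents of a kernel .config file to the label used in the charts. The function name and the "other" fallback are assumptions for illustration; the only format relied upon is the usual .config convention of "CONFIG_FOO=y" for enabled options and "# CONFIG_FOO is not set" for disabled ones.

```python
def nocb_label(config_text):
    """Map kernel .config text to the chart label (hypothetical helper)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled boolean options appear as CONFIG_FOO=y; disabled ones are
        # comments ("# CONFIG_FOO is not set") and are skipped here.
        if line.startswith("CONFIG_") and line.endswith("=y"):
            enabled.add(line[: -len("=y")])
    if "CONFIG_RCU_NOCB_CPU" not in enabled:
        return "nocbno"
    if "CONFIG_RCU_NOCB_CPU_ALL" in enabled:
        return "nocball"
    if "CONFIG_RCU_NOCB_CPU_NONE" in enabled:
        return "nocbnone"
    # Any other no-CBs choice is outside the three cases measured here.
    return "other"


if __name__ == "__main__":
    sample = "CONFIG_RCU_NOCB_CPU=y\nCONFIG_RCU_NOCB_CPU_ALL=y\n"
    print(nocb_label(sample))
```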
#911af50
...
+#ifdef CONFIG_RCU_NOCB_CPU_ALL
+ pr_info("\tExperimental no-CBs for all CPUs\n");
+ cpumask_setall(rcu_nocb_mask);
+#endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
...
The following LKML discussion was started: https://lkml.org/lkml/2014/6/11/42, in which you can read Paul E. McKenney and Eric Biederman discussing the use of the rcu locking mechanism inside the network namespace creation functions. After some suggestions from Paul E. McKenney, we all agreed that rcu locking might not be the best fit for the network namespace workload type/behavior: since the first bad commit was setting the cpu mask for rcu callbacks to run, we could see that running rcu callbacks on multiple cpus can cause a large number of network namespace creations to perform poorly.
Please consider this cherry-pick (from upstream) for inclusion in our master-next branch. I have already compiled and tested -master-next + this patch (including a small conflict resolution) with the same test case presented at the beginning of this email (add-apt-repository ppa:inaddy/sf00062980).
Thank you in advance.
Best Regards
-Rafael
Rafael David Tinoco
rafael.tinoco at canonical.com