[B][D][E][F][SRU][PATCH 0/1] bonding: fix state transition issue in link monitoring
po-hsu.lin at canonical.com
Wed Nov 13 10:51:17 UTC 2019
BTW, I just found out that this patch is queued for the upstream stable kernel.
On Wed, Nov 13, 2019 at 3:33 PM Po-Hsu Lin <po-hsu.lin at canonical.com> wrote:
> == Justification ==
> From the well-explained commit message:
> Since de77ecd4ef02 ("bonding: improve link-status update in
> mii-monitoring"), the bonding driver has utilized two separate variables
> to indicate the next link state a particular slave should transition to.
> Each is used to communicate to a different portion of the link state
> change commit logic; one to the bond_miimon_commit function itself, and
> another to the state transition logic.
> Unfortunately, the two variables can become unsynchronized,
> resulting in incorrect link state transitions within bonding. This can
> cause slaves to become stuck in an incorrect link state until a
> subsequent carrier state transition.
> The issue occurs when a special case in bond_slave_netdev_event
> sets slave->link directly to BOND_LINK_FAIL. On the next pass through
> bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL
> case will set the proposed next state (link_new_state) to BOND_LINK_UP,
> but the new_link to BOND_LINK_DOWN. The setting of the final link state
> from new_link comes after that from link_new_state, and so the slave
> will end up incorrectly in _DOWN state.
> Resolve this by combining the two variables into one.
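To make the ordering problem above concrete, here is a toy model in C. It is only a sketch under assumptions: the struct and function names (toy_slave, commit_two_vars, commit_one_var) and the LINK_* enum are hypothetical and are not the bonding driver's code. It only mirrors what the commit message says: two proposals exist, the one in new_link is applied last, and so it wins.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only. Names echo the commit message, not the kernel source. */
enum link_state { LINK_NOCHANGE = -1, LINK_UP, LINK_FAIL, LINK_DOWN };

/* Before the fix: two separate "next state" variables per slave. */
struct toy_slave {
	enum link_state link;           /* current link state */
	enum link_state link_new_state; /* proposal read by the transition logic */
	enum link_state new_link;       /* proposal read by the commit step */
};

/* After the fix (as modelled here): a single proposal. */
struct toy_slave_fixed {
	enum link_state link;
	enum link_state link_new_state;
};

/* Apply link_new_state first, then new_link; the later write wins. */
static void commit_two_vars(struct toy_slave *s)
{
	assert(s != NULL);
	s->link = s->link_new_state;
	if (s->new_link != LINK_NOCHANGE)
		s->link = s->new_link;
}

/* With one variable there is nothing left to disagree with. */
static void commit_one_var(struct toy_slave_fixed *s)
{
	assert(s != NULL);
	if (s->link_new_state != LINK_NOCHANGE)
		s->link = s->link_new_state;
}
```

In this model, a slave in LINK_FAIL with link_new_state = LINK_UP and new_link = LINK_DOWN ends up in LINK_DOWN after commit_two_vars(), while the same proposal passed through commit_one_var() ends up in LINK_UP.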
> == Fixes ==
> * 1899bb32 (bonding: fix state transition issue in link monitoring)
> This patch can be cherry-picked into E/F.
> For older releases like B/D, it will need to be backported, as they are
> missing the slave_err() printk macro added in 5237ff79 (bonding: add
> slave_foo printk macros) as well as the commit that replaces netdev_err()
> with slave_err(), e2a7420d (bonding/main: convert to using slave
> printk macros).
> For Xenial, the commit that causes this issue, de77ecd4, does not exist.
> == Test ==
> Test kernels can be found here:
> The X-hwe and Disco kernels were tested by the bug reporter, Aleksei;
> the patched kernels work as expected.
> == Regression Potential ==
> This patch just unifies the variables used in the link state change commit
> logic to prevent the occurrence of an incorrect state, and the changes
> are limited to the bonding driver itself.
> (Although include/net/bonding.h is used by other drivers, the changes
> to that file only affect the bond_main.c driver.)
> Jay Vosburgh (1):
> bonding: fix state transition issue in link monitoring
> drivers/net/bonding/bond_main.c | 43 ++++++++++++++++++++---------------------
> include/net/bonding.h | 3 +--
> 2 files changed, 22 insertions(+), 24 deletions(-)