ACK/Cmnt: [SRU B/C] [PATCH 0/4] Line discipline buffer flush/tty_reopen() race fix

Stefan Bader stefan.bader at canonical.com
Wed Jan 9 14:11:50 UTC 2019


On 08.01.19 21:28, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1791758
> 
> ** NOTICE: Patch 1 in this SRU request should be applied only to 4.15,
> since it is already present in 4.18.
> 
> 
> [Impact]
> 
> * Line discipline code is racy when we have buffer being flush while the
> tty is being initialized or reinitialized. For the first problem, we have
> an upstream patch since January 2018:
> b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf");
> although it is not in Ubuntu kernel 4.4, only in kernels 4.15 and subsequent
> ones.
> 
> * For the race between the buffer flush while tty is being reopened, we
> have a patch that addresses this issue recently merged for 5.0-rc1:
> 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()").
> No Ubuntu kernel currently contains this patch, hence we're hereby
> submitting the SRU request. The upstream complete patch series for
> this is in [0].
> 
> * The approach of both patches are similar - they rely in locking/semaphore
> to prevent race conditions. Some additional patches are necessary to prevent
> correlated issues, like preventing a potential deadlock due to bad
> prioritization in servicing I/O over releasing tty_ldisc_lock() - refer to
> c96cf923a98d ("tty: Don't block on IO when ldisc change is pending").
> All the necessary fixes are grouped here in this SRU request.
> 
> * The symptom of the race condition between the buffer flush and the
> tty reopen routine is a kernel crash with the following trace:
> 
> 
> BUG: unable to handle kernel paging request at 0000000000002268
> IP: [<addr>] n_tty_receive_buf_common+0x6a/0xae0
> [...]
> Call Trace:
> [<addr>] ? kvm_sched_clock_read+0x1e/0x30
> [<addr>] n_tty_receive_buf2+0x14/0x20
> [<addr>] flush_to_ldisc+0xd5/0x120
> [<addr>] process_one_work+0x156/0x400
> [<addr>] worker_thread+0x11a/0x480
> [...]
> 
> 
> * A kernel crash was collected from an user, analysis is present in
> comment #4 in LP #1791758.
> 
> 
> [Test Case]
> 
> * It is not trivial to trigger this fault, but the usual recipe is to keep
> accessing a machine through SSH (or keep killing getty when in IPMI serial
> console) and in some way run commands before the terminal is ready in that
> machine (like hacking some echo into ttySx or pts in an infinite loop).
> 
> * We have reports of users that could reproduce this issue in their
> production environment, and with the patches present in this SRU request
> the problem was fixed.
> 
> 
> [Regression Potential]
> 
> * tty subsystem is highly central and patches in that area are always
> delicate. For example, the upstream series [0] is a re-spin (V6) due to
> a hard to reproduce issue reported in the PA-RISC architecture, which was
> found in the V5 iteration [1] but was fixed by the patch c96cf923a98d,
> present in this SRU request.
> 
> * The patchset [0] is present in tty-next tree since mid-November, so
> the overall likelihood of regressions is low.
> 
> * These patches were sniff-tested for the 3 versions (4.4, 4.15 and 4.18)
> and didn't show any issues.
> 
> 
> [0] https://marc.info/?l=linux-kernel&m=154103190111795
> [1] https://marc.info/?l=linux-kernel&m=153737852618183
> 
> 
> Dmitry Safonov (4):
>   tty: Drop tty->count on tty_reopen() failure
>   tty: Hold tty_ldisc_lock() during tty_reopen()
>   tty: Don't block on IO when ldisc change is pending
>   tty: Simplify tty->count math in tty_reopen()
> 
>  drivers/tty/n_hdlc.c    |  4 ++--
>  drivers/tty/n_r3964.c   |  2 +-
>  drivers/tty/n_tty.c     |  8 ++++----
>  drivers/tty/tty_io.c    | 13 ++++++++++---
>  drivers/tty/tty_ldisc.c |  7 +++++++
>  include/linux/tty.h     |  7 +++++++
>  6 files changed, 31 insertions(+), 10 deletions(-)
> 

Indeed same cherry-pick comment and mabye a suggestion for future submissions.
At least the way I work from the threaded view in Thunderbird a set like

<cover> X/B/C SRU...
+- X1
+- X2
+ ...
+- X5
+- B
+- B+C1
+- ...
+- B+C4

would help a bit because cross checking with the lp bug then quickly shows all
nominations and patch parts match. Otherwise the other thread might be somewhere
not close to the currently looked at part.
Acked-by: Stefan Bader <stefan.bader at canonical.com>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20190109/dbc2f113/attachment-0001.sig>


More information about the kernel-team mailing list