ACK/cmnt: [SRU X] [PATCH 0/5] Line discipline buffer flush/tty_reopen() race fix

Kleber Souza kleber.souza at canonical.com
Wed Jan 9 10:32:47 UTC 2019


On 1/9/19 11:10 AM, Stefan Bader wrote:
> On 09.01.19 10:57, Kleber Souza wrote:
>> On 1/8/19 9:18 PM, Guilherme G. Piccoli wrote:
>>> BugLink: https://bugs.launchpad.net/bugs/1791758
>>>
>>> [Impact]
>>>
>>> * Line discipline code is racy when we have buffer being flush while the
>>> tty is being initialized or reinitialized. For the first problem, we have
>>> an upstream patch since January 2018:
>>> b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf");
>>> although it is not in Ubuntu kernel 4.4, only in kernels 4.15 and subsequent
>>> ones.
>>>
>>> * For the race between the buffer flush while tty is being reopened, we
>>> have a patch that addresses this issue recently merged for 5.0-rc1:
>>> 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()").
>>> No Ubuntu kernel currently contains this patch, hence we're hereby
>>> submitting the SRU request. The upstream complete patch series for
>>> this is in [0].
>>>
>>> * The approach of both patches are similar - they rely in locking/semaphore
>>> to prevent race conditions. Some additional patches are necessary to prevent
>>> correlated issues, like preventing a potential deadlock due to bad
>>> prioritization in servicing I/O over releasing tty_ldisc_lock() - refer to
>>> c96cf923a98d ("tty: Don't block on IO when ldisc change is pending").
>>> All the necessary fixes are grouped here in this SRU request.
>>>
>>> * The symptom of the race condition between the buffer flush and the
>>> tty reopen routine is a kernel crash with the following trace:
>>>
>>>
>>> BUG: unable to handle kernel paging request at 0000000000002268
>>> IP: [<addr>] n_tty_receive_buf_common+0x6a/0xae0
>>> [...]
>>> Call Trace:
>>> [<addr>] ? kvm_sched_clock_read+0x1e/0x30
>>> [<addr>] n_tty_receive_buf2+0x14/0x20
>>> [<addr>] flush_to_ldisc+0xd5/0x120
>>> [<addr>] process_one_work+0x156/0x400
>>> [<addr>] worker_thread+0x11a/0x480
>>> [...]
>>>
>>>
>>> * A kernel crash was collected from an user, analysis is present in
>>> comment #4 in LP #1791758.
>>>
>>>
>>> [Test Case]
>>>
>>> * It is not trivial to trigger this fault, but the usual recipe is to keep
>>> accessing a machine through SSH (or keep killing getty when in IPMI serial
>>> console) and in some way run commands before the terminal is ready in that
>>> machine (like hacking some echo into ttySx or pts in an infinite loop).
>>>
>>> * We have reports of users that could reproduce this issue in their
>>> production environment, and with the patches present in this SRU request
>>> the problem was fixed.
>>>
>>>
>>> [Regression Potential]
>>>
>>> * tty subsystem is highly central and patches in that area are always
>>> delicate. For example, the upstream series [0] is a re-spin (V6) due to
>>> a hard to reproduce issue reported in the PA-RISC architecture, which was
>>> found in the V5 iteration [1] but was fixed by the patch c96cf923a98d,
>>> present in this SRU request.
>>>
>>> * The patchset [0] is present in tty-next tree since mid-November, and the
>>> patch b027e2298bd5 is available upstream since January/2018 (it's available
>>> in both Ubuntu kernels 4.15 and 4.18), so the overall likelihood of
>>> regressions is low.
>>>
>>> * These patches were sniff-tested for the 3 versions (4.4, 4.15 and 4.18)
>>> and didn't show any issues.
>>>
>>>
>>> [0] https://marc.info/?l=linux-kernel&m=154103190111795
>>> [1] https://marc.info/?l=linux-kernel&m=153737852618183
>>>
>>>
>>> Dmitry Safonov (4):
>>>   tty: Drop tty->count on tty_reopen() failure
>>>   tty: Hold tty_ldisc_lock() during tty_reopen()
>>>   tty: Don't block on IO when ldisc change is pending
>>>   tty: Simplify tty->count math in tty_reopen()
>>>
>>> Gaurav Kohli (1):
>>>   tty: fix data race between tty_init_dev and flush of buf
>>>
>>>  drivers/tty/n_hdlc.c    |  4 ++--
>>>  drivers/tty/n_r3964.c   |  2 +-
>>>  drivers/tty/n_tty.c     |  8 ++++----
>>>  drivers/tty/tty_io.c    | 20 +++++++++++++++++---
>>>  drivers/tty/tty_ldisc.c | 11 +++++++++--
>>>  include/linux/tty.h     |  9 +++++++++
>>>  6 files changed, 42 insertions(+), 12 deletions(-)
>>>
>> Those are indeed some sensitive changes and a bit intrusive, but they
>> are needed to fix the issue and most of them have been around on
>> mainline for some time now. I guess we'll also detect issues with
>> regression tests if they can be easily broken.
>>
>> Please note that the required format for the provenance of the patch is:
>>
>>  (backported from commit <sha1> <upstream repo name>)
>> or
>>  (cherry-pick from commit <sha1> <upstream repo name>)
> no, "cherry picked from commit ..." without the - and with an "ed" ;)

Thanks for pointing that out. I just copied and pasted from one of our
wiki pages without paying too much attention, I've fixed the wiki page
now :)

>
>> so the "commit" part is missing on those patches. The <upstream repo
>> name> can be omitted if they come from upstream. This can be fixed when
>> applying, but please keep it in mind for the next submissions :-).
>>
>>
>> Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>
>>
>>
>>




More information about the kernel-team mailing list