APPLIED[X/Z]: [SRU][X/Z/A][PATCH] Fix for LP:#1729256
Stefan Bader
stefan.bader at canonical.com
Mon Nov 20 10:42:09 UTC 2017
On 03.11.2017 03:16, Gavin Guo wrote:
> BugLink: http://bugs.launchpad.net/bugs/1729256
>
> [Impact]
> The frequent kernel errors can be seen inside the XFS based OSD
> processes and it causes to crash and restart.
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
> IP: [<ffffffffc03705a0>] xfs_da3_node_read+0x30/0xb0 [xfs]
> CPU: 8 PID: 2855031 Comm: tp_fstore_op Not tainted 4.4.0-78-generic #99-Ubuntu
> Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 09/13/2016
> task: ffff880620f38e00 ti: ffff880678228000 task.ti: ffff880678228000
> RIP: 0010:[<ffffffffc03705a0>] [<ffffffffc03705a0>]
> xfs_da3_node_read+0x30/0xb0 [xfs]
> RSP: 0018:ffff88067822bd00 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff88078abff3f0 RCX: 0000000000000001
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88067822bcb0
> RBP: ffff88067822bd20 R08: 0000000000000001 R09: fffffffffffffffe
> R10: ffffea001e2aff80 R11: 0000000000000001 R12: ffff88067822bd50
> R13: ffff8805a37f0000 R14: 0000000000000001 R15: 000000008d180e3e
> FS: 00007f7573c01700(0000) GS:ffff88103fa00000(0000)knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000a0 CR3: 0000001686f3d000 CR4: 00000000003406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
> ffffffffc03cfb50 ffffffffc03b0ecc ffff88067822bde0 0000000000000001
> ffff88067822bd98 ffffffffc038c8b3 0000000200000008 ffff88081cb06040
> 00000002d8723bf8 ffff880fc142ae80 0000000000000000 0000000000000000
> Call Trace:
> [<ffffffffc03b0ecc>] ? xfs_trans_roll+0x2c/0x50 [xfs]
> [<ffffffffc038c8b3>] xfs_attr3_node_inactive+0x183/0x220 [xfs]
> [<ffffffffc038c9fc>] xfs_attr3_root_inactive+0xac/0x100 [xfs]
> [<ffffffffc038cb9c>] xfs_attr_inactive+0x14c/0x1a0 [xfs]
> [<ffffffffc03a6da5>] xfs_inactive+0x85/0x120 [xfs]
> [<ffffffffc03ac2f5>] xfs_fs_evict_inode+0xa5/0x100 [xfs]
> [<ffffffff8122aaee>] evict+0xbe/0x190
> [<ffffffff8122add1>] iput+0x1c1/0x240
> [<ffffffff8121f859>] do_unlinkat+0x199/0x2d0
> [<ffffffff812203f6>] SyS_unlink+0x16/0x20
> [<ffffffff81840a32>] entry_SYSCALL_64_fastpath+0x16/0x71
>
> [Fix]
> commit e678a63e6c95f140befe6fcd81b49075ecb3c701
> Author: Brian Foster <bfoster at redhat.com>
> Date: Mon Oct 9 11:38:56 2017 -0700
>
> xfs: reinit btree pointer on attr tree inactivation walk
>
> xfs_attr3_root_inactive() walks the attr fork tree to invalidate the
> associated blocks. xfs_attr3_node_inactive() recursively descends from
> internal blocks to leaf blocks, caching block address values along the
> way to revisit parent blocks, locate the next entry and descend down
> that branch of the tree.
>
> The code that attempts to reread the parent block is unsafe because it
> assumes that the local xfs_da_node_entry pointer remains valid after
> an xfs_trans_brelse() and re-read of the parent buffer. Under heavy
> memory pressure, it is possible that the buffer has been reclaimed and
> reallocated by the time the parent block is reread. This means that
> 'btree' can point to an invalid memory address, lead to a
> random/garbage value for child_fsb and cause the subsequent read of
> the attr fork to go off the rails and return a NULL buffer for an attr
> fork offset that is most likely not allocated.
>
> Note that this problem can be manufactured by setting
> XFS_ATTR_BTREE_REF to 0 to prevent LRU caching of attr buffers,
> creating a file with a multi-level attr fork and removing it to
> trigger inactivation.
>
> To address this problem, reinit the node/btree pointers to the parent
> buffer after it has been re-read. This ensures btree points to a valid
> record and allows the walk to proceed.
>
> [Test]
> The patch has been tested on the production system.
>
> [Regression Risk]
> Clean cherry-pick queued for stable, so we'll be picking it up anyway.
>
> Brian Foster (1):
> xfs: reinit btree pointer on attr tree inactivation walk
>
> fs/xfs/xfs_attr_inactive.c | 2 ++
> 1 file changed, 2 insertions(+)
>
Applied to Xenial/Zesty master-next (as mentioned was already applied to
Artful). Thanks.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20171120/9707f842/attachment.sig>
More information about the kernel-team
mailing list