APPLIED[F]: [B/F][PATCH 0/5] ext4/jbd2: data=journal: write-protect pages on transaction commit

Kelsey Skunberg kelsey.skunberg at canonical.com
Thu Sep 23 22:57:09 UTC 2021


Applied to Focal master-next. Thank you!

-Kelsey

On 2021-09-09 17:22:21 , Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1847340
> 
> [Impact]
> 
> With mmap()ed files on ext4 in data journaling mode, it's possible to
> change the contents of a mapped page's buffers during their jbd2
> transaction commit (currently nothing blocks write access at that time).
> 
> This might happen between the buffers' checksum calculation and the
> actual write to the journal, so the (old) checksum is invalid for the
> (new) data.
> 
> If the system crashes after that, but before such journal entry makes
> it to the filesystem, the journal replay on the next mount just fails,
> and the filesystem now requires fsck. (apparently curtin might set up
> /etc/fstab with passno=0, requiring manual intervention.)
> 
>     [39751.096455] EXT4-fs: Warning: mounting with data=journal disables delayed allocation and O_DIRECT support!
>     [39751.114435] JBD2: Invalid checksum recovering block 87305 in log
>     [39751.146133] JBD2: Invalid checksum recovering block 88039 in log
>     [39751.195950] JBD2: Invalid checksum recovering block 49633 in log
>     [39751.265158] JBD2: recovery failed
>     [39751.265163] EXT4-fs (vdc): error loading journal
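
The race described under [Impact] can be illustrated with a small
userspace analogy. This is only a sketch: it uses plain files and
cksum(1) as stand-ins for the page buffers and the jbd2 checksum, and
the function name stale_checksum_demo is made up for illustration. It
does not touch ext4 or jbd2.

```shell
#!/bin/sh
# Userspace analogy only; none of this is jbd2 code.
stale_checksum_demo() {
    dir=$(mktemp -d) || return 1

    # The mapped page's buffer as it looks when the commit starts.
    printf 'old contents' > "$dir/buffer"

    # 1. The commit computes the checksum over the buffer.
    sum_at_commit=$(cksum < "$dir/buffer")

    # 2. Nothing blocks the writer, so the buffer changes mid-commit.
    printf 'new contents' > "$dir/buffer"

    # 3. The (now modified) buffer is what reaches the journal.
    cp "$dir/buffer" "$dir/journal"

    # 4. Replay recomputes the checksum over the journal copy.
    sum_at_replay=$(cksum < "$dir/journal")

    rm -rf "$dir"

    if [ "$sum_at_commit" = "$sum_at_replay" ]; then
        echo "checksum ok"
    else
        echo "checksum mismatch"
    fi
}

stale_checksum_demo
```

Because the write in step 2 lands between steps 1 and 3, the checksum
recorded at commit time no longer matches the journaled data, which is
the situation the "Invalid checksum recovering block" messages report.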
> 
> [Fix]
> 
> The fix is to write-protect the pages during journal transaction commit,
> so that writes to mapped pages hit a page fault, then ext4's page_mkwrite
> hook can block until the commit finishes and the buffers can be modified.
> 
> In order to do that, add jbd2 journal callbacks that the filesystems can
> customize, called before/after the critical region in transaction commit,
> then have ext4 in data journaling mode write-protect the pages whose
> buffers are being committed (and handle cases that need pages redirtied.)
> 
> The changes are restricted to the data journaling mode and the
> page_mkwrite hook; other modes/paths keep the same code/behavior in
> the callbacks.
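
The ordering that the [Fix] section relies on can be modelled with a toy,
purely sequential script. Everything here is an assumption made for
illustration: the names write_protect_demo, store, writable and deferred
are hypothetical, a shell variable stands in for the page, and "blocking
in page_mkwrite" is modelled as deferring the store until the commit is
over. It is not the kernel implementation.

```shell
#!/bin/sh
# Toy sequential model; all names are hypothetical, not kernel symbols.
write_protect_demo() {
    buffer='old contents'   # stand-in for the mapped page's buffer
    deferred=''             # store held back while the page is protected
    writable=1

    # Stand-in for a store to the mapped page. While the page is
    # write-protected the store "faults" and is held until the commit ends.
    store() {
        if [ "$writable" -eq 1 ]; then
            buffer=$1
        else
            deferred=$1
        fi
    }

    # Callback before the critical region: write-protect the page.
    writable=0

    sum_at_commit=$(printf '%s' "$buffer" | cksum)
    store 'new contents'    # arrives mid-commit, gets deferred
    journal=$buffer         # the journal receives the unmodified data

    # Callback after the critical region: allow writes again.
    writable=1
    if [ -n "$deferred" ]; then
        store "$deferred"
        deferred=''
    fi

    sum_at_replay=$(printf '%s' "$journal" | cksum)
    if [ "$sum_at_commit" = "$sum_at_replay" ]; then
        echo "checksum ok, page now holds: $buffer"
    else
        echo "checksum mismatch"
    fi
}

write_protect_demo
```

In this model the store still takes effect, but only after the journal
copy has been made, so the checksum and the journaled data agree.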
> 
> [Test Case]
> 
> Set up an ext4 filesystem in data journaling mode, and run stress-ng's
> mmap file test on it, then crash the system after a bit; check whether
> the filesystem can be mounted again (a failure shows jbd2 checksum
> errors.)
> 
>     # mkfs.ext4 $DEV
>     # mount -o data=journal $DEV $DIR
>     # cd $DIR
>     # stress-ng --mmap $((4*$(nproc))) --mmap-file &
>     # sleep 60
>     # echo c >/proc/sysrq-trigger
>     ...
>     # mount -o data=journal $DEV $DIR   # PASS/FAIL.
>     # dmesg | tail
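
The PASS/FAIL check at the end of the steps above could be scripted along
these lines. This is a sketch under assumptions: journal_replay_verdict is
a hypothetical helper name, and the patterns it looks for are taken only
from the log excerpt in the [Impact] section, so other failure messages
would not be caught.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints FAIL if
# it contains any of the jbd2/ext4 failure messages quoted under
# [Impact], PASS otherwise.
journal_replay_verdict() {
    if grep -q -e 'JBD2: Invalid checksum recovering block' \
               -e 'JBD2: recovery failed' \
               -e 'error loading journal'; then
        echo FAIL
    else
        echo PASS
    fi
}
```

For example, after the second mount attempt: dmesg | journal_replay_verdict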
> 
> [Regression Potential]
> 
> Regressions would likely manifest in ext4 data journaling mode (which
> is not the default mode, 'ordered') with memory mapped access, as the
> other modes/paths are largely unaffected by the changes and keep the
> same behavior.
> 
> This has been tested with (x)fstests, which showed no regressions on
> data=ordered and data=journal on both Bionic and Focal (with kernel
> versions 4.15.0-156-generic and 5.4.0-84-generic) within 10 runs each,
> and with the stress-ng test case as well. (Numbers/details in the LP bug.)
> 
> [Other info]
> 
> The patchset was applied in 5.10, so Hirsute (5.11) is already fixed;
> only Focal and Bionic need it.
> 
> There are few changes in the patches between Focal and Bionic
> (mostly minor backport adjustments, mainly due to the lack of
> vm_fault_t), but unfortunately they require separate versions of
> most patches.
> 
> Thanks!
> 
> -- 
> Mauricio
> 
> 
> 
> Jan Kara (1):
>   ext4: fix mmap write protection for data=journal mode
> 
> Mauricio Faria de Oliveira (4):
>   jbd2: introduce/export functions
>     jbd2_journal_submit|finish_inode_data_buffers()
>   jbd2, ext4, ocfs2: introduce/use journal callbacks
>     j_submit|finish_inode_data_buffers()
>   ext4: data=journal: fixes for ext4_page_mkwrite()
>   ext4: data=journal: write-protect pages on
>     j_submit_inode_data_buffers()
> 
>  fs/ext4/inode.c      | 63 +++++++++++++++++++++++++++-----
>  fs/ext4/super.c      | 87 ++++++++++++++++++++++++++++++++++++++++++++
>  fs/jbd2/commit.c     | 62 ++++++++++++++++---------------
>  fs/jbd2/journal.c    |  2 +
>  fs/ocfs2/journal.c   |  4 ++
>  include/linux/jbd2.h | 29 ++++++++++++++-
>  6 files changed, 207 insertions(+), 40 deletions(-)
> 
> -- 
> 2.30.2
> 
> 
> -- 
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team


