[SRU][F][PATCH 1/1] btrfs: trim: fix underflow in trim length to prevent access beyond device boundary
Colin Ian King
colin.king at canonical.com
Fri Sep 18 09:01:26 UTC 2020
On 18/09/2020 04:27, Matthew Ruffell wrote:
> From: Qu Wenruo <wqu at suse.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1896154
>
> [BUG]
> The following script can lead to tons of beyond device boundary access:
>
> mkfs.btrfs -f $dev -b 10G
> mount $dev $mnt
> trimfs $mnt
> btrfs filesystem resize 1:-1G $mnt
> trimfs $mnt
>
> [CAUSE]
> Since commit 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to
> find_first_clear_extent_bit"), we try to avoid trimming ranges that's
> already trimmed.
>
> So we check device->alloc_state by finding the first range which doesn't
> have CHUNK_TRIMMED and CHUNK_ALLOCATED not set.
>
> But if we shrunk the device, that bits are not cleared, thus we could
> easily got a range starts beyond the shrunk device size.
>
> This results the returned @start and @end are all beyond device size,
> then we call "end = min(end, device->total_bytes -1);" making @end
> smaller than device size.
>
> Then finally we goes "len = end - start + 1", totally underflow the
> result, and lead to the beyond-device-boundary access.
>
> [FIX]
> This patch will fix the problem in two ways:
>
> - Clear CHUNK_TRIMMED | CHUNK_ALLOCATED bits when shrinking device
> This is the root fix
>
> - Add extra safety check when trimming free device extents
> We check and warn if the returned range is already beyond current
> device.
>
> Link: https://github.com/kdave/btrfs-progs/issues/282
> Fixes: 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit")
> CC: stable at vger.kernel.org # 5.4+
> Signed-off-by: Qu Wenruo <wqu at suse.com>
> Reviewed-by: Filipe Manana <fdmanana at suse.com>
> Signed-off-by: David Sterba <dsterba at suse.com>
> (backported from commit c57dd1f2f6a7cd1bb61802344f59ccdc5278c983)
> [mruffell: CHUNK_STATE_MASK file changed to extent_io.h, context fixups]
> Signed-off-by: Matthew Ruffell <matthew.ruffell at canonical.com>
> ---
> fs/btrfs/extent-tree.c | 14 ++++++++++++++
> fs/btrfs/extent_io.h | 2 ++
> fs/btrfs/volumes.c | 4 ++++
> 3 files changed, 20 insertions(+)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 541497036cc2..aad83d3b8f93 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -32,6 +32,7 @@
> #include "block-rsv.h"
> #include "delalloc-space.h"
> #include "block-group.h"
> +#include "rcu-string.h"
>
> #undef SCRAMBLE_DELAYED_REFS
>
> @@ -5599,6 +5600,19 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
> &start, &end,
> CHUNK_TRIMMED | CHUNK_ALLOCATED);
>
> + /* Check if there are any CHUNK_* bits left */
> + if (start > device->total_bytes) {
> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
> + btrfs_warn_in_rcu(fs_info,
> +"ignoring attempt to trim beyond device size: offset %llu length %llu device %s device size %llu",
> + start, end - start + 1,
> + rcu_str_deref(device->name),
> + device->total_bytes);
> + mutex_unlock(&fs_info->chunk_mutex);
> + ret = 0;
> + break;
> + }
> +
> /* Ensure we skip the reserved area in the first 1M */
> start = max_t(u64, start, SZ_1M);
>
> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> index bc858c8cef0a..fcf1807cc8dd 100644
> --- a/fs/btrfs/extent_io.h
> +++ b/fs/btrfs/extent_io.h
> @@ -35,6 +35,8 @@
> */
> #define CHUNK_ALLOCATED EXTENT_DIRTY
> #define CHUNK_TRIMMED EXTENT_DEFRAG
> +#define CHUNK_STATE_MASK (CHUNK_ALLOCATED | \
> + CHUNK_TRIMMED)
>
> /*
> * flags for bio submission. The high bits indicate the compression
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 81be71fb569e..489cc21d7c6f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4907,6 +4907,10 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
> }
>
> mutex_lock(&fs_info->chunk_mutex);
> + /* Clear all state bits beyond the shrunk device size */
> + clear_extent_bits(&device->alloc_state, new_size, (u64)-1,
> + CHUNK_STATE_MASK);
> +
> btrfs_device_set_disk_total_bytes(device, new_size);
> if (list_empty(&device->post_commit_list))
> list_add_tail(&device->post_commit_list,
>
Thanks Matthew,
Fixes corruption issue, the fix win easily outweighs the regression risk
IMHO. Backport looks good, good test and positive test results.
Acked-by: Colin Ian King <colin.king at canonical.com>
More information about the kernel-team
mailing list