[Bug 1914911] [NEW] bluefs doesn't compact log file

dongdong tao 1914911 at bugs.launchpad.net
Sun Feb 7 08:48:48 UTC 2021


Public bug reported:

For certain types of workload, bluefs might never compact its log file,
which causes the log file to grow slowly to a huge size
(in some cases larger than 1 TB on a 1.5 TB device).

This bug can eventually cause the OSD to crash and then fail to restart, because it cannot get through the bluefs replay phase at boot time.
When trying to restart the OSD, we might see the following log message:
bluefs mount failed to replay log: (5) Input/output error

The bluefs perf counters captured when this issue happened give more detail, e.g.:
"bluefs": {
    "gift_bytes": 811748818944,
    "reclaim_bytes": 0,
    "db_total_bytes": 888564350976,
    "db_used_bytes": 867311747072,
    "wal_total_bytes": 0,
    "wal_used_bytes": 0,
    "slow_total_bytes": 0,
    "slow_used_bytes": 0,
    "num_files": 11,
    "log_bytes": 866545131520,
    "log_compactions": 0,
    "logged_bytes": 866542977024,
    "files_written_wal": 2,
    "files_written_sst": 3,
    "bytes_written_wal": 32424281934,
    "bytes_written_sst": 25382201
}

As we can see, log_compactions is 0, which means the log has never been compacted, and the log file size (log_bytes) is already 800+ GB. After compaction, the log file size would be reduced to around 1 GB.
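
The check described above can be automated. The following is only a minimal sketch, assuming the counters were saved as JSON (for example from the output of "ceph daemon osd.N perf dump"). The function name check_bluefs_log and the 0.5 threshold are hypothetical choices for illustration and are not part of Ceph.

```python
import json

GIB = 1024 ** 3


def check_bluefs_log(perf_dump, max_log_fraction=0.5):
    """Flag a bluefs log that was never compacted and fills most of the DB device.

    check_bluefs_log and max_log_fraction are hypothetical names for this
    sketch; only the counter names come from the perf dump in the report.
    """
    bluefs = perf_dump["bluefs"]
    log_bytes = bluefs["log_bytes"]
    db_total = bluefs["db_total_bytes"]
    # Share of the DB device occupied by the bluefs log file.
    fraction = log_bytes / db_total if db_total else 0.0
    never_compacted = bluefs["log_compactions"] == 0
    return {
        "log_gib": log_bytes / GIB,
        "log_fraction_of_db": fraction,
        "never_compacted": never_compacted,
        "suspect": never_compacted and fraction > max_log_fraction,
    }


# Counter values copied from the perf dump in this report.
sample = json.loads("""
{"bluefs": {"db_total_bytes": 888564350976,
            "db_used_bytes": 867311747072,
            "log_bytes": 866545131520,
            "log_compactions": 0,
            "logged_bytes": 866542977024}}
""")

report = check_bluefs_log(sample)
print(report)
```

With the counters from this report, the log file accounts for almost all of db_used_bytes, so an OSD showing this pattern would be flagged well before it fails to replay the log at boot.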

Here is the PR [1] that addresses this bug; we need to backport it to the Ubuntu ceph 12.2.13 package.
[1] https://github.com/ceph/ceph/pull/17354

** Affects: ceph (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1914911

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1914911/+subscriptions


