[SRU][B][F][G][H][PATCH 0/6] raid10: Block discard is very slow, causing severe delays for mkfs and fstrim operations
Matthew Ruffell
matthew.ruffell at canonical.com
Tue May 25 06:00:57 UTC 2021
Hi Tim,
Thanks for the ack. Please also ask your colleagues for a second and maybe
third review.
The patches still apply cleanly to master-next for all series.
I just wanted to circle around with how testing has been going.
My instance on Google Cloud has 4x 375GB NVMe disks, arranged in Raid10 and
mounted as /home.
I changed the fstrim units to include /home by setting ProtectHome=no, and set
the fstrim.timer to run hourly instead of weekly.
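For the record, the overrides were roughly as follows (a sketch from memory; the drop-in file names are arbitrary, and ProtectHome=no is only needed because the stock fstrim.service sandboxes /home away):

```ini
# /etc/systemd/system/fstrim.service.d/override.conf
# Allow fstrim to see /home (the stock unit hides it via ProtectHome)
[Service]
ProtectHome=no

# /etc/systemd/system/fstrim.timer.d/override.conf
# Clear the default weekly schedule, then run hourly
[Timer]
OnCalendar=
OnCalendar=hourly
```

followed by `systemctl daemon-reload` and `systemctl restart fstrim.timer` to pick up the changes.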
I have my kernel git repos on the instance, and over the last week or two I have
been doing kernel builds and general work. I did install ubuntu-desktop and
tried RDP, but it was too slow to be usable, so I stuck to SSH.
Things have been pretty stable.
I have been doing regular reboots, and the instance always comes back up, so
/home hasn't been corrupted.
I ran fsck, and it came back mostly clean; the only complaint was an
off-by-one free inode count:
$ sudo fsck -n -f /dev/md0
fsck from util-linux 2.34
e2fsck 1.45.5 (07-Jan-2020)
Warning! /dev/md0 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free inodes count wrong (49055036, counted=49055037).
Fix? no
/dev/md0: 80580/49135616 files (0.2% non-contiguous), 4370396/196541952 blocks
I also ran git fsck on the kernel git repos, and they came back clean, with
the same warnings as on my regular (non-Raid10) work PC:
Raid10 ext4: https://paste.ubuntu.com/p/HB2CJ7h3jw/
Standard ext4: https://paste.ubuntu.com/p/4rG32zMXJx/
So my kernel git trees experienced no data corruption.
One of my customers has been running the Bionic HWE test kernel on their QA
test clusters for a few weeks now, and they haven't run into any issues. They
are also asking for a hotfix, so they can roll out the kernel more widely, but
I think they can hold on for a kernel in -proposed.
One of my other customers has been doing some more testing with the test
kernels, and is happy to help test as soon as new kernels land in -proposed.
I also have two community users testing the test kernels.
Thimo, the original regression reporter in LP #1907262 [1], has been running
the test kernel on a machine, and hasn't found any issues.
Evan, in LP #1896578 [2], has installed the test kernel and can attest to its
speed, but hasn't yet done any long runs with the kernel to vouch for its data
safety.
If the patches get another ack and are applied and built, I am confident we
can get everyone testing the kernel while it is in -proposed. I will also switch
my cloud instance to the -proposed kernel.
Let me know if there is any more testing I can do in the meantime.
Thanks,
Matthew
On 11/05/21 12:08 am, Tim Gardner wrote:
>
>
> On 5/9/21 7:25 PM, Matthew Ruffell wrote:
>> Hi Tim,
>>
>> I appreciate your cautiousness, and I too am happy to play the conservative game
>> on this particular patchset. I _really_ don't want to cause another regression,
>> the one in December caused too much trouble for everyone as it was.
>>
>> I'm happy to NACK the patchset for this cycle, and resubmit in the next cycle,
>> as long as we set out and agree on what additional regression testing should be
>> performed, and if I go perform those tests then you would seriously consider
>> accepting these patches.
>>
>> Testing that I have currently performed:
>> - Testing against the testcase of mkfs.xfs / mkfs.ext4 / fstrim on a Raid10 md
>> block device, as specified in LP #1896578 [1].
>> - Testing against the disk corruption regression testcase, as reported in
>> LP #1907262 [2]. All disks now fsck clean with this new revision of the
>> patchset, which you can see in comment #15 [3] on LP #1896578.
>> - Three customers have tested test kernels and haven't found any issues (yet).
>> Testing that I could go perform over the next week or two:
>> - Run xfstests with the generic/*, xfs/* and ext4/* testsuites over two Raid10
>> md block devices, one test block device, one scratch block device. I would
>> do two runs, one with a released kernel without the patches, and one with the
>> test kernels in [4], to see if there are any regressions.
>> - I could use a cloud instance with NVMe drives as my primary computer over the
>> next two or so weeks, and have my /home on a Raid10 md block device, and
>> change the fstrim systemd timer from weekly to every 30 minutes, and see if I
>> come across data corruption.
>>
>
> I like this setup ^. It's a bit more live and random than focused testing. Plus, you're more likely to notice regressions or delays in file system access.
>
>> Just so that you are aware, I have three different customers who would very much
>> like these patches to land in the Ubuntu kernels. Two are deploying systems via
>> MAAS or curtin, and are seeing deployment timeouts when deploying Raid10 to
>> NVMe disks, since a block discard on their larger arrays takes 2-3 hours with
>> current kernels, while normal systems take only 15-30 mins to deploy, and they
>> don't want to increase the deployment timeout setting for this one outlier.
>> The other customer doesn't want to spend 2-3 hours waiting for their Raid10 md
>> arrays to format with a filesystem, when it could take 4 seconds instead.
>>
>
> I can see why they are pushing to get this fixed.
>
>> I know this is a big change, and I know that this set has already caused grief
>> with a regression in December, but customers are requesting this feature, and
>> because of that, I'm willing to work with you to figure out appropriate testing
>> required, and hopefully get this landed in Ubuntu kernels safely in the near
>> future.
>>
>> Let me know what additional testing you would like to see, and I will go and
>> complete it.
>>
>> Thanks,
>> Matthew
>>
>> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1896578
>> [2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262
>> [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1896578/comments/15
>> [4] https://launchpad.net/~mruffell/+archive/ubuntu/lp1896578-test
>>
>
> This patch set won't make the 2021.05.10 SRU cycle, so let's schedule it for the next SRU cycle (beginning about June 1). That'll give another few weeks of testing. If there are no regressions then I think we can get it included. Be sure to ping me about then so I can annoy someone else on the team to review as well. Watch for upstream Fix patches in the meantime.
>
> rtg
> -----------
> Tim Gardner
> Canonical, Inc