[stable] [PATCH 00/16]: Fix writeback regressions in 2.6.32

Jan Kara jack at suse.cz
Tue Aug 17 00:15:45 UTC 2010


On Mon 16-08-10 17:44:04, Stefan Bader wrote:
> On 08/16/2010 05:27 PM, Jan Kara wrote:
> > On Mon 16-08-10 14:32:55, Stefan Bader wrote:
> >> On 08/12/2010 05:28 AM, Jens Axboe wrote:
> >>> On 08/11/2010 04:12 PM, Jens Axboe wrote:
> >>>> On 08/11/2010 03:59 PM, Greg KH wrote:
> >>>>> On Tue, Aug 10, 2010 at 11:28:48AM +0200, Stefan Bader wrote:
> >>>>>> Impact: 2.6.32 is facing some serious performance regression which are visible
> >>>>>> plainly when unmounting filesystems as async writeback degrades to sync in
> >>>>>> that case. But there seem to be other less visible problems as well.
> >>>>>> I believe this set of patches also would be benefitial to go into upstream
> >>>>>> stable.
> >>>>>>
> >>>>>> Fix: This series of patches was picked and backported when needed to apply
> >>>>>> to 2.6.32.y (there is another set for 2.6.34.y[1]). It probably does not
> >>>>>> solve all currently known problems but testing has shown considerable
> >>>>>> improvements. At the beginning of the set there are some patches that were
> >>>>>> picked to make the later patches applicable. Only one of them is a bit more
> >>>>>> intrusive the others rather trivial.
> >>>>>>
> >>>>>> Testcase: Unmounting a fs that still has larger amounts of data to write back
> >>>>>> to disk will take noticably longer time to complete (IIRC might be 30min).
> >>>>>> Also general responsiveness was reported to improve. Currently there are no
> >>>>>> known regressions reported when testing this backport.
> >>>>>
> >>>>> Jens, what do you think about this series?
> >>>>
> >>>> I think it is scary :-)
> >>>>
> >>>> But seriously, we need to do something about this umount regression and
> >>>> there's likely no way around this series. Doing the full backport is
> >>>> the best option. Stefan, to what extent did you test this backport?
> >>>
> >>> So I discussed this with Jan tonight, and a better solution may be
> >>> something similar to my first attempt at fixing this patch earlier.
> >>> Basically just pass in the info on whether this is done from umount or
> >>> not. The bug here is that umount already holds the sb sem when doing
> >>> WB_SYNC_NONE writeback, so we skip all dirty inodes for that pass and
> >>> end up doing everything as WB_SYNC_ALL. If we pass in this lock
> >>> information, then wb_writeback() knows when to skip pinning the sb.
> >>>
> >>> The below patch is totally untested. Stefan, care to give it a spin with
> >>> the test case? This would be a lot less intrusive than the full
> >>> backport, which is primarily backporting cleanups and optimizations that
> >>> we really need not put in 2.6.32-stable if at all avoidable.
> >>>
> >> While applying the patch for testing I saw that stable 2.6.32 carries
> >>
> >> commit b78a38dca6e04634ddc718e315712b45abcf92fd
> >> Author: Eric Sandeen <sandeen at redhat.com>
> >> Date:   Wed Dec 23 07:57:07 2009 -0500
> >>
> >>     fs-writeback: Add helper function to start writeback if idle
> >>
> >>     commit 17bd55d037a02b04d9119511cfd1a4b985d20f63 upstream.
> >>
> >> which needs an additional change in writeback_inodes_sb_if_idle(). Making an
> >> educated guess I changed that call into the "not locked" variant (patch
> >> attached). Is that correct?
> >   Yes, that is correct.
> > 
> >> From e8f179b9dc92fe91d7379a3ac330a57a2ef9fc87 Mon Sep 17 00:00:00 2001
> >> From: Jens Axboe <axboe at kernel.dk>
> >> Date: Fri, 13 Aug 2010 14:45:16 +0200
> >> Subject: [PATCH] writeback: Alternate fix for umount regression
> >>
> >> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
> >   The patch looks OK to me. You can add
> > Acked-by: Jan Kara <jack at suse.cz>
> 
> Just to quickly note: this just makes it compile. Due to workload I did not yet
> succeed in providing test kernels to people for testing. So still in untested stage.
  OK, I gave it some testing and the patch seems to have a bug - from
__sync_filesystem() we are called with locked == 1, which is correct, but
bdi_start_writeback() does not wait for work to finish and thus the lock
gets released before the work finishes.
  An easy solution is to make bdi_start_writeback() wait for locked works
but then we don't use the paralellism of multiple bdis. To do that, we
would have to modify sync_filesystems() to submit unlocked work, while
sync_filesystem() would still submit locked work. Kind of nasty and looking
at upstream we don't do it there either...

								Honza
-- 
Jan Kara <jack at suse.cz>
SUSE Labs, CR




More information about the kernel-team mailing list