[Fwd: (fwd) BUG: mmapfile/writev spurious zero bytes still in the wild]
Stefan Bader
stefan.bader at canonical.com
Tue Nov 18 23:37:59 UTC 2008
Tim Gardner wrote:
> Stefan,
>
> Dunno if you saw this on the list awhile back, but it looks like a
No, that mail must somehow slipped by or drowned somewhere. The patch
description rings a distant bell, though.
> candidate for stable update. It cherry-pickled cleanly from 2.6.25.y to
> Hardy, but I haven't pushed from my repo pending your thoughts on the issue.
This definitely looks like a candidate. ACK from me. It should go in with the
next -proposed round.
>
> rtg
>
>
> ------------------------------------------------------------------------
>
> Subject:
> (fwd) BUG: mmapfile/writev spurious zero bytes still in the wild
> From:
> Bron Gondwana <brong at fastmail.fm>
> Date:
> Sat, 4 Oct 2008 10:30:06 +1000
> To:
> Ubuntu Kernel Team <kernel-team at lists.ubuntu.com>
>
> To:
> Ubuntu Kernel Team <kernel-team at lists.ubuntu.com>
> CC:
> emlists at gmail.com
>
>
> I should have just added you as CC. I've tried reporting this via
> launchpad, but it just keeps timing out on me again :(
>
> I'm also CCing the user who reported the problem to me. He's running
> 2.6.14-19-xen, but the problem exists in all 64 bit kernels in Ubuntu
> if it exists in one. This causes massive data corruption to Cyrus
> running on kernels between 2.6.23 and 2.6.25.8.
>
> You can read the full backthread here:
>
> http://lkml.org/lkml/2008/6/17/9
>
> Linus posted the patch here:
>
> http://lkml.org/lkml/2008/6/17/234
>
> And the fix was rolled into 2.6.25.8:
>
> commit 864f24395c72b6a6c48d13f409f986dc71a5cf4a
> Author: Linus Torvalds <torvalds at linux-foundation.org>
> Date: Tue Jun 17 17:47:50 2008 -0700
>
> x86-64: Fix "bytes left to copy" return value for copy_from_user()
>
> commit 42a886af728c089df8da1b0017b0e7e6c81b5335 upstream
>
> Most users by far do not care about the exact return value (they only
> really care about whether the copy succeeded in its entirety or not),
> but a few special core routines actually care deeply about exactly how
> many bytes were copied from user space.
>
> And the unrolled versions of the x86-64 user copy routines would
> sometimes report that it had copied more bytes than it actually had.
>
> Very few uses actually have partial copies to begin with, but to make
> this bug even harder to trigger, most x86 CPU's use the "rep string"
> instructions for normal user copies, and that version didn't have this
> issue.
>
> To make it even harder to hit, the one user of this that really cared
> about the return value (and used the uncached version of the copy that
> doesn't use the "rep string" instructions) was the generic write
> routine, which pre-populated its source, once more hiding the problem by
> avoiding the exception case that triggers the bug.
>
> In other words, very special thanks to Bron Gondwana who not only
> triggered this, but created a test-program to show it, and bisected the
> behavior down to commit 08291429cfa6258c4cd95d8833beb40f828b194e ("mm:
> fix pagecache write deadlocks") which changed the access pattern just
> enough that you can now trigger it with 'writev()' with multiple
> iovec's.
>
> That commit itself was not the cause of the bug, it just allowed all the
> stars to align just right that you could trigger the problem.
>
> [ Side note: this is just the minimal fix to make the copy routines
> (with __copy_from_user_inatomic_nocache as the particular version that
> was involved in showing this) have the right return values.
>
> We really should improve on the exceptional case further - to make the
> copy do a byte-accurate copy up to the exact page limit that causes it
> to fail. As it is, the callers have to do extra work to handle the
> limit case gracefully. ]
>
> Reported-by: Bron Gondwana <brong at fastmail.fm>
> Cc: Nick Piggin <npiggin at suse.de>
> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: Andi Kleen <andi at firstfloor.org>
> Cc: Al Viro <viro at ZenIV.linux.org.uk>
> Acked-by: Ingo Molnar <mingo at elte.hu>
> Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>
>
>
> ----- Forwarded message from Bron Gondwana <brong at fastmail.fm> -----
>
> Date: Fri, 3 Oct 2008 21:44:14 +1000
> From: Bron Gondwana <brong at fastmail.fm>
> Subject: BUG: mmapfile/writev spurious zero bytes still in the wild
> Cc: Bron Gondwana <brong at fastmail.fm>,
> Linux Kernel Mailing List <linux-kernel at vger.kernel.org>,
> Nick Piggin <npiggin at suse.de>,
> Andrew Morton <akpm at linux-foundation.org>,
> Rob Mueller <robm at fastmail.fm>, Andi Kleen <andi at firstfloor.org>,
> Ingo Molnar <mingo at elte.hu>
> To: Linus Torvalds <torvalds at linux-foundation.org>
> Organization: brong.net
>
> On Tue, Jun 17, 2008 at 02:20:49PM -0700, Linus Torvalds wrote:
>
> [reminder from way back: this bug was caused by writev containing
> mmaped pages that weren't paged in, it's 64 bit only. It
> particularly affects Cyrus Imapd's database formats]
>
>> On Tue, 17 Jun 2008, Linus Torvalds wrote:
>>> Hmm. Something like this *may* salvage it.
>>>
>>> Untested, so far (I'll reboot and test soon enough), but even if it fixes
>>> things, it's not really very good.
>> Ok, so I just rebooted with this, and it does indeed fix the bug.
>>
>> I'd be happier with a more complete fix (ie being byte-accurate and
>> actually doing the partial copy when it hits a fault in the middle), but
>> this seems to be the minimal fix, and at least fixes the totally bogus
>> return values from the x86-64 __copy_user*() functions.
>
> Has this been revisited since? I haven't noticed, but I really only
> skim LKML - have to save some time in the day for my real job[tm] of
> keeping an email service running!
>
>> Not that I checked that I got _all_ cases correct (and maybe there are
>> other versions of __copy_user that I missed entirely), but Bron's
>> test-case at least seems to work properly for me now.
>>
>> Bron? If you have a more complete test-suite (ie the real-world case that
>> made you find this), it would be good to verify the whole thing.
>
> It's been fine for us since, but unfortunately most of the world is
> still running distribution "stable" kernels. I've just been helping a
> user who's getting corrupted flat file databases on Ubuntu's stable 64
> bit xen kernels, and it looks like it's the same issue.
>
> Is there a standard way to tell backporters "you really need to add this
> patch for your users' sanity"?
>
> Bron ( I tried reporting it in Launchpad, but kept getting timeout
> errors, so I figured reposting here might get noticed. Besides,
> I can follow up on the "more complete fix" at the same time! )
>
> ----- End forwarded message -----
>
--
When all other means of communication fail, try words!
More information about the kernel-team
mailing list