[Fwd: (fwd) BUG: mmapfile/writev spurious zero bytes still in the wild]

Tue Nov 18 23:37:59 UTC 2008

Tim Gardner wrote:
> Stefan,
> 
> Dunno if you saw this on the list awhile back, but it looks like a

No, that mail must somehow slipped by or drowned somewhere. The patch 
description rings a distant bell, though.

> candidate for stable update. It cherry-pickled cleanly from 2.6.25.y to
> Hardy, but I haven't pushed from my repo pending your thoughts on the issue.

This definitely looks like a candidate. ACK from me. It should go in with the 
next -proposed round.

> 
> rtg
> 
> 
> ------------------------------------------------------------------------
> 
> Subject:
> (fwd) BUG: mmapfile/writev spurious zero bytes still in the wild
> From:
> Bron Gondwana <brong at fastmail.fm>
> Date:
> Sat, 4 Oct 2008 10:30:06 +1000
> To:
> Ubuntu Kernel Team <kernel-team at lists.ubuntu.com>
> 
> To:
> Ubuntu Kernel Team <kernel-team at lists.ubuntu.com>
> CC:
> emlists at gmail.com
> 
> 
> I should have just added you as CC.  I've tried reporting this via
> launchpad, but it just keeps timing out on me again :(
> 
> I'm also CCing the user who reported the problem to me.  He's running
> 2.6.14-19-xen, but the problem exists in all 64 bit kernels in Ubuntu
> if it exists in one.  This causes massive data corruption to Cyrus
> running on kernels between 2.6.23 and 2.6.25.8.
> 
> You can read the full backthread here:
> 
> http://lkml.org/lkml/2008/6/17/9
> 
> Linus posted the patch here:
> 
> http://lkml.org/lkml/2008/6/17/234
> 
> And the fix was rolled into 2.6.25.8:
> 
> commit 864f24395c72b6a6c48d13f409f986dc71a5cf4a
> Author: Linus Torvalds <torvalds at linux-foundation.org>
> Date:   Tue Jun 17 17:47:50 2008 -0700
> 
>     x86-64: Fix "bytes left to copy" return value for copy_from_user()
>     
>     commit 42a886af728c089df8da1b0017b0e7e6c81b5335 upstream
>     
>     Most users by far do not care about the exact return value (they only
>     really care about whether the copy succeeded in its entirety or not),
>     but a few special core routines actually care deeply about exactly how
>     many bytes were copied from user space.
>     
>     And the unrolled versions of the x86-64 user copy routines would
>     sometimes report that it had copied more bytes than it actually had.
>     
>     Very few uses actually have partial copies to begin with, but to make
>     this bug even harder to trigger, most x86 CPU's use the "rep string"
>     instructions for normal user copies, and that version didn't have this
>     issue.
>     
>     To make it even harder to hit, the one user of this that really cared
>     about the return value (and used the uncached version of the copy that
>     doesn't use the "rep string" instructions) was the generic write
>     routine, which pre-populated its source, once more hiding the problem by
>     avoiding the exception case that triggers the bug.
>     
>     In other words, very special thanks to Bron Gondwana who not only
>     triggered this, but created a test-program to show it, and bisected the
>     behavior down to commit 08291429cfa6258c4cd95d8833beb40f828b194e ("mm:
>     fix pagecache write deadlocks") which changed the access pattern just
>     enough that you can now trigger it with 'writev()' with multiple
>     iovec's.
>     
>     That commit itself was not the cause of the bug, it just allowed all the
>     stars to align just right that you could trigger the problem.
>     
>     [ Side note: this is just the minimal fix to make the copy routines
>       (with __copy_from_user_inatomic_nocache as the particular version that
>       was involved in showing this) have the right return values.
>     
>       We really should improve on the exceptional case further - to make the
>       copy do a byte-accurate copy up to the exact page limit that causes it
>       to fail.  As it is, the callers have to do extra work to handle the
>       limit case gracefully. ]
>     
>     Reported-by: Bron Gondwana <brong at fastmail.fm>
>     Cc: Nick Piggin <npiggin at suse.de>
>     Cc: Andrew Morton <akpm at linux-foundation.org>
>     Cc: Andi Kleen <andi at firstfloor.org>
>     Cc: Al Viro <viro at ZenIV.linux.org.uk>
>     Acked-by: Ingo Molnar <mingo at elte.hu>
>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>     Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>
> 
> 
> ----- Forwarded message from Bron Gondwana <brong at fastmail.fm> -----
> 
> Date: Fri, 3 Oct 2008 21:44:14 +1000
> From: Bron Gondwana <brong at fastmail.fm>
> Subject: BUG: mmapfile/writev spurious zero bytes still in the wild
> Cc: Bron Gondwana <brong at fastmail.fm>,
> 	Linux Kernel Mailing List <linux-kernel at vger.kernel.org>,
> 	Nick Piggin <npiggin at suse.de>,
> 	Andrew Morton <akpm at linux-foundation.org>,
> 	Rob Mueller <robm at fastmail.fm>, Andi Kleen <andi at firstfloor.org>,
> 	Ingo Molnar <mingo at elte.hu>
> To: Linus Torvalds <torvalds at linux-foundation.org>
> Organization: brong.net
> 
> On Tue, Jun 17, 2008 at 02:20:49PM -0700, Linus Torvalds wrote:
> 
> [reminder from way back: this bug was caused by writev containing 
> mmaped pages that weren't paged in, it's 64 bit only.  It
> particularly affects Cyrus Imapd's database formats]
> 
>> On Tue, 17 Jun 2008, Linus Torvalds wrote:
>>> Hmm. Something like this *may* salvage it.
>>>
>>> Untested, so far (I'll reboot and test soon enough), but even if it fixes 
>>> things, it's not really very good. 
>> Ok, so I just rebooted with this, and it does indeed fix the bug.
>>
>> I'd be happier with a more complete fix (ie being byte-accurate and 
>> actually doing the partial copy when it hits a fault in the middle), but 
>> this seems to be the minimal fix, and at least fixes the totally bogus 
>> return values from the x86-64 __copy_user*() functions.
> 
> Has this been revisited since?  I haven't noticed, but I really only
> skim LKML - have to save some time in the day for my real job[tm] of
> keeping an email service running!
> 
>> Not that I checked that I got _all_ cases correct (and maybe there are 
>> other versions of __copy_user that I missed entirely), but Bron's 
>> test-case at least seems to work properly for me now.
>>
>> Bron? If you have a more complete test-suite (ie the real-world case that 
>> made you find this), it would be good to verify the whole thing. 
> 
> It's been fine for us since, but unfortunately most of the world is
> still running distribution "stable" kernels.  I've just been helping a
> user who's getting corrupted flat file databases on Ubuntu's stable 64
> bit xen kernels, and it looks like it's the same issue.
> 
> Is there a standard way to tell backporters "you really need to add this
> patch for your users' sanity"?
> 
> Bron ( I tried reporting it in Launchpad, but kept getting timeout
>        errors, so I figured reposting here might get noticed.  Besides,
>        I can follow up on the "more complete fix" at the same time! )
> 
> ----- End forwarded message -----
> 

-- 

When all other means of communication fail, try words!