Likely Duplicate Bugs

Colin Ian King colin.king at canonical.com
Thu Feb 11 18:16:28 UTC 2010


Hiya,

On Thu, 2010-02-11 at 09:52 -0800, Brian Murray wrote:
> On Mon, Feb 08, 2010 at 03:15:57PM +0000, Colin Ian King wrote:
> > Hi Brian,
> > On Sun, 2010-02-07 at 18:25 -0800, Brian Murray wrote:
> > > On Mon, Feb 01, 2010 at 11:56:00AM +0000, Colin Ian King wrote:
> > > > Hi Brian,
> > > > 
> > > > These kernel messages are dups of the same BIOS corruption message, but
> > > > occur on a wide range of machines. I had a look at the first 15 or so
> > > > of the dups and saw that there was a wide spread of Aspire, HP Compaqs
> > > > and Pavilions and quite a number of unknown systems too.
> > > 
> > > Should these duplicate bug reports be consolidated into one bug or
> > > should there be multiple bug reports for each system model?
> > 
> > Well, the warning is the same manifestation of possibly different BIOS
> > issues from many different BIOSes (possibly even different versions of
> > the BIOS on the *same* system model), so I'd count these as separate
> > reports even if there are a lot of them.   I think I would only attempt
> > to consolidate bug reports based on the same system model or the
> > "Hardware name" field as reported in the OopsText.txt, e.g.:
> > 
> > Hardware name: HP Compaq nc6120 (PY390ES#ABZ)
> > 
> > ..not sure how easy that is to do automatically.
> 
> It wasn't that hard so I've consolidated the bug reports based on the
> hardware models per your suggestion.  There were quite a few with a
> Hardware name of INVALID which I left alone.  
>
INVALID is tricky - best to silo those into the lame BIOS category.

> Maybe it would be a good
> idea to also modify the bug summary / title with the hardware name?

I'm happy with the idea of modifying the bug summary / title like that.
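
Something along these lines is what I have in mind - a rough Python sketch,
assuming each report's OopsText.txt is already available as a string. The
function names, the bug ids and the "[name] title" format are made up for
illustration; none of this is an existing Launchpad or apport interface.

```python
import re
from collections import defaultdict

# Matches the "Hardware name: ..." line as it appears in OopsText.txt.
HARDWARE_RE = re.compile(r"^Hardware name:(.*)$", re.MULTILINE)


def hardware_name(oops_text):
    """Return the reported hardware name, or "INVALID" if missing or empty."""
    match = HARDWARE_RE.search(oops_text)
    name = match.group(1).strip() if match else ""
    return name or "INVALID"


def group_by_hardware(reports):
    """Group bug ids by hardware name; `reports` maps bug id -> oops text."""
    groups = defaultdict(list)
    for bug_id, oops_text in sorted(reports.items()):
        groups[hardware_name(oops_text)].append(bug_id)
    return dict(groups)


def retitle(title, name):
    """Prefix the title with the hardware name; INVALID reports are left alone."""
    if name == "INVALID" or title.startswith("[" + name + "]"):
        return title
    return "[{}] {}".format(name, title)
```

Each list returned by group_by_hardware() is a candidate set of dups for one
model, and the "INVALID" key collects the reports that cannot be tied to a
model at all.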

> 
> > > > Specifically, the kernel fills known regions of the low 64K of memory
> > > > with a known pattern and periodically monitors them.  Any buggy BIOS
> > > > that writes to these regions gets detected and the warning is issued.
> > > > 
> > > > BIOS corruption of these regions can occur when doing suspend/resume or
> > > > HDMI cable unplugging.
> > > > 
> > > > The error message is a warning - the system's stability is not
> > > > compromised as the pages being monitored are already reserved for the
> > > > purpose of being monitored for corruption in the first place.
> > > > 
> > > > This check can be disabled by setting the kernel boot parameter
> > > > memory_corruption_check=0
> > > > 
> > > > Since this is intended as a BIOS corruption detection tool perhaps it
> > > > should be disabled as a compile time option to stop getting these
> > > > messages. However, it does have some value in showing that the BIOS may
> > > > be dodgy. 
> > > 
> > > Perhaps the ideal solution is to keep it enabled until the final version
> > > of the release.  This will still provide us with useful information
> > > about buggy BIOSes but by disabling it for the final release we will
> > > reduce the quantity of redundant bugs.
> > > 
> > Information about buggy BIOSes is useful in one respect, but generally
> > requires a BIOS fix which means we cannot do much apart from perhaps
> > recommend a BIOS upgrade - but that's a risky operation which may
> > provide little or no benefit.
> > 
> > Since the kernel isn't using the memory for anything important,
> > these kinds of warnings are of limited use. However, they can
> > alarm users when in fact they are just helpful
> > for some classes of debugging. Hence they should be totally disabled on
> > the final release IMHO.
> > 
> > Any comments anyone?

Please - would someone else like to comment? :-)
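
For anyone who hasn't looked at how the check works, here is a rough Python
model of the fill-and-scan idea described above. It is only a sketch: the
region list, the fill value and the function names are invented for
illustration, and a bytearray stands in for the low 64K. The real check lives
in the kernel and is written in C.

```python
PATTERN = 0x00  # assumed fill value, chosen only for this sketch


def fill_regions(memory, regions, pattern=PATTERN):
    """Write the known pattern into each (start, length) region."""
    for start, length in regions:
        memory[start:start + length] = bytes([pattern]) * length


def scan_regions(memory, regions, pattern=PATTERN):
    """Return the offsets that no longer hold the pattern, resetting each one."""
    corrupted = []
    for start, length in regions:
        for offset in range(start, start + length):
            if memory[offset] != pattern:
                corrupted.append(offset)
                memory[offset] = pattern  # reset so the next scan starts clean
    return corrupted


low_memory = bytearray(64 * 1024)   # stand-in for the low 64K
reserved = [(0x1000, 0x1000)]       # hypothetical reserved region
fill_regions(low_memory, reserved)
low_memory[0x1234] = 0xFF           # simulate a stray BIOS write
for offset in scan_regions(low_memory, reserved):
    print("corrupted byte at offset 0x%04x" % offset)
```

Because the regions are set aside purely for this check, a hit only says that
something scribbled on memory nobody else was using, which is why the message
is a warning rather than a sign of instability.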

> 
> It shouldn't be hard to write bug patterns for the check_bios_corruption
> messages and the specific hardware.  Does that seem like a good idea to
> anyone else?
>  

More information about the kernel-team mailing list