8.04-1 won't boot from degraded raid
sam.howard at gmail.com
Tue Aug 26 19:56:24 UTC 2008
I really don't want to get into the middle of a flame war, but I don't
understand something you wrote and would like clarification so that I am not
assuming something incorrectly.
On Tue, Aug 26, 2008 at 9:54 AM, Soren Hansen <soren at ubuntu.com> wrote:
> I'm quite happy that the server doesn't boot if my raid array is broken,
> Imagine a scenario where the disk controller is flaky. Disk A goes away
> while the system is running, and is then out of date. You reboot the
> machine (or perhaps it rebooted itself because the flaky controller
> short circuited or whatever), and for whatever reason (flaky controller,
> remember?), the system boots from disk B instead. The changes to your
> filesystem since disk A disappeared away are not there, and new changes
> are being written to disk B, and there's no chance of merging the two.
> This is what I refer to as "having a very bad day".
> There are lots of other scenarios where you really don't want to boot if
> your RAID array is not in tip-top shape. If the system is already
> running, it knows something about its current state, which disk is the
> more trustworthy one, etc. When booting, this is not the case.
> I value data over uptime.
I agree that data is of the utmost importance, but in your scenario, you
lose disk A in a running system, and you imply that upon reboot on disk B,
your data between A and B is not in sync. It is no more out of sync than
when the system was running with a broken A disk anyway. I am assuming you
are talking about RAID1, which would keep the disks in sync until one of
them goes away, at which point B is your current disk anyway.
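To make the scenario concrete, here is a toy model of a two-disk mirror. It is
only a sketch of my reading of the example, not how md actually works: the
Mirror class and its method names are made up for illustration. It shows that
after A drops out, B holds the current data and A holds the stale copy.

```python
# Toy model of a two-disk RAID1 mirror. Hypothetical names throughout;
# this is NOT how the md driver is implemented.
class Mirror:
    def __init__(self):
        self.disks = {"A": {}, "B": {}}   # contents of each member disk
        self.online = {"A", "B"}          # members currently receiving writes

    def fail(self, name):
        # The disk disappears from the running array.
        self.online.discard(name)

    def write(self, key, value):
        # Writes only reach the members that are still online.
        for name in self.online:
            self.disks[name][key] = value

    def boot_from(self, name):
        # The data you would see if the system came up on this member alone.
        return dict(self.disks[name])


m = Mirror()
m.write("orders", 1)   # both disks in sync
m.fail("A")            # flaky controller drops disk A
m.write("orders", 2)   # only disk B sees this change

print(m.boot_from("B"))  # current data
print(m.boot_from("A"))  # stale data from before the failure
```

In this model, booting from B loses nothing; the missing changes only show up
if the system comes back on A.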
"The changes to your filesystem since disk A disappeared away are not there,
and new changes are being written to disk B, and there's no chance of
merging the two."
Did you just mistype your example, or am I missing something really obvious?
Just to muddy the waters a bit more about which boot-on-broken-raid behavior
is more useful, I have to vote for booting from 1 disk of a broken raid. I say
this for a few reasons:
1 - since I run RAID1, my disks are always in sync (or the broken disk is
broken and out of sync and needs to be replaced anyway)
2 - I expect to be alerted by the mdadm daemon when a disk fails, so I
should know I have something to go fix (note: make sure mdadm is
configured to send e-mail to someone who will actually *see* it)
3 - most of my servers are remote, so the ability to effect a repair and
recovery w/o a booting system (albeit on the surviving disk) is between slim
and none ... if you've ever tried to talk a non-technical user through
booting from a live CD and then configuring the networking, you know what I'm
talking about.
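As a rough illustration of point 2, here is a minimal sketch of a check that
could run from cron on a remote box alongside mdadm's own mail alerts. It
assumes the usual /proc/mdstat layout, where each array's status line ends in
something like [UU] (healthy) or [_U] (one member missing), and array names of
the form mdN. The function name and the sample text are made up for
illustration.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array stanza, e.g. "md0 : active raid1 ..."
            current = header.group(1)
            continue
        # Member status looks like [UU] or [_U]; "_" marks a missing disk.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded


# Hypothetical sample in the style of /proc/mdstat, not taken from a real box.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1]
      104320 blocks [2/1] [_U]

md1 : active raid1 sdb2[1] sda2[0]
      4883648 blocks [2/2] [UU]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a live system you would pass in the contents of /proc/mdstat instead of the
sample. This does not replace the MAILADDR setting in mdadm.conf; it is just a
second pair of eyes.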
Specifically, I am working on a system about 2,000 miles away from me,
trying to recover to a new disk ... were I not able to boot off of the
surviving disk, we would be talking about FedEx'ing a server to me to try to
boot off of CD (after installing a CD drive, of course) or network, replace
and repair, and then FedEx it back. Seems sort of silly, doesn't it? It also
opens the door for additional damage or data loss during shipping.
I support (professionally) servers literally around the world, of many *nix
operating systems, and the ability to remotely recover a server is
Ironically, the server I am recovering now is an old Debian server that I
have built Hardy replacements for, but now I am a bit nervous about sending
the replacement servers into the field. I would very much like to have a
workaround or fix that allows me to remotely repair a Hardy server ...
especially since most everything else seems to work so nicely in Hardy (Xen
server up in <30 minutes and only a handful of apt-gets and such before