Discussion about UUID and ide-generic
bcollins at ubuntu.com
Tue Mar 21 16:00:12 GMT 2006
This was a lengthy discussion that actually touched on two subjects. The
quick summary is:
- Changed device names can cause failure to mount rootfs, and thus
failure to boot the system.
- The ProbeForRootFS goal addressed this for removable devices (USB
sticks and such) by mounting based on UUID, instead of device path.
- This doesn't help cases such as bug 6367, where the change is
caused by a docking station on a laptop (hda becomes hde).
- Widespread usage of UUID is not acceptable because of limitations with
ide-generic (we do not want to load it without reason, because it can
break SATA and such), and because it is so late in dapper development.
So, several things were decided:
- ide-generic should not be loaded so soon. There is almost no reason
for it to be loaded at all. It was suggested that initramfs-tools wait
10 seconds for other drivers to finish initialization before falling
back to the failsafe ide-generic. This will fix a lot of other bugs.
Although only indirectly related to the original problem, it seems like
a good idea anyway. This would partially fix the limitations with
widespread usage of UUID for root-fs, but it is too late in dapper to
move to UUID for anything other than removable media.
- To fix the original problem, Mithrandir suggested storing the UUID of
the root fs in the initramfs when it is created. This can be used as a
fallback if the original device path does not become available, even
after all drivers are exhausted. This alone would fix the original
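
The two decisions above (delay ide-generic, and fall back to a stored
UUID) can be sketched together. This is only an illustrative Python
sketch of the logic discussed on IRC, not the initramfs-tools
implementation. The function names, the /dev/disk/by-uuid/ lookup, the
1-second poll interval and the injectable `exists`/`sleep`/`load_module`
hooks are all assumptions made for illustration. The 10-second grace
period is the figure suggested in the discussion, and the 30-second
second wait was only floated tentatively.

```python
import os
import time
from typing import Callable, Iterable, Optional


def poll_for_root(candidates: Iterable[str], timeout: float,
                  exists: Callable[[str], bool],
                  sleep: Callable[[float], None],
                  poll: float = 1.0) -> Optional[str]:
    """Return the first candidate path that appears within `timeout` seconds."""
    candidates = list(candidates)
    waited = 0.0
    while True:
        for path in candidates:
            if exists(path):
                return path
        if waited >= timeout:
            return None
        sleep(poll)
        waited += poll


def find_root(device_path: str, stored_uuid: str,
              load_module: Callable[[str], None],
              exists: Callable[[str], bool] = os.path.exists,
              sleep: Callable[[float], None] = time.sleep,
              grace: float = 10.0,
              final_wait: float = 30.0) -> Optional[str]:
    """Locate the root device, loading ide-generic only as a last resort.

    device_path: the path recorded at install time (e.g. /dev/hda1).
    stored_uuid: the root fs UUID recorded when the initramfs was built.
    load_module: hypothetical hook that loads a kernel module by name.
    """
    # Prefer the original device path; fall back to the stored UUID.
    candidates = [device_path, "/dev/disk/by-uuid/" + stored_uuid]

    # Give the regular drivers a grace period to finish initialising.
    root = poll_for_root(candidates, grace, exists, sleep)
    if root is None:
        # Neither name showed up: only now try the failsafe driver.
        load_module("ide-generic")
        root = poll_for_root(candidates, final_wait, exists, sleep)
    return root
```

With hooks like these, a docked laptop whose disk moved from hda to hde
would still be found through the by-uuid candidate, and ide-generic
would never be loaded on a machine whose SATA driver came up within the
grace period. As noted in the discussion, any fixed timeout is still
racy; the sketch only shows the ordering of the steps.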
Primary IRC participants included Kamion, Mithrandir, mjg59, mdz and
BenC.

BenC Anyone know if https://wiki.ubuntu.com/ProbeForRootFilesystem is
implemented and supported now?
* jdub wonders why evince-gtk isn't built from evince
Mithrandir BenC: it is
jdub *cry* and it's for xubuntu...
seb128 jdub: because I refuse to do that
jdub seb128: because it would ditch cdbs?
seb128 jdub: that would mean I've to update a 1000 lines patches on
evince to be able to upload a new version for Ubuntu
Mithrandir BenC: as you'd have seen if you followed the LP link and
looked at the status.
jdub because jbailey hasn't fixed it to do multibuild? :)
* holycow (n=a at mail.wjsgroup.com) has joined #ubuntu-devel
jdub seb128: oh - i thought the gtk build was integrated now?
seb128 jdub: no, it's a xubuntu hack, nothing upstreamish
BenC Mithrandir: my issue was that a bug report seems to show that it
at least wasn't working recently
jdub seb128: i thought it was sucked in already for 770 builds
jdub maybe they're maintaining a crazy patch
Mithrandir BenC: which bug?
jdub seb128: i figured xubuntu would have a bit of gnomey stuff anyway -
seb128 jdub: it didn't make to a tarball yet if that's the case
Mithrandir BenC: it's only done for removable devices; talk to scott for
the exact details on why it's done that way.
seb128 jdub: no, CVS has no mention of such change
* jsgotangco (n=jsg at ubuntu/member/jsgotangco) has joined #ubuntu-devel
BenC Mithrandir: Problem is, docking stations are "removable" devices
Mithrandir BenC: how can I detect that from d-i?
BenC Even then, removable devices can change the device names of
BenC For instance, I had a USB stick that loaded up as sda in the
installer, and my internal SATA disk later came up as sdb, during
BenC on boot, the SATA disk was sda
Kamion I think that's a hw-detect bug
Kamion it's loading usb before scsi
Mithrandir BenC: how would we solve that, then?
BenC IMO, the UUID thing needs to be for everything
Kamion if Tollef has time to test switching those round and upload if it
works, be my guest, otherwise I'll try to get round to it later
Kamion BenC: I don't think we can do that for dapper
BenC especially when we move to 2.6.17, where disks that were once hda
will suddenly become sda
* zul (n=chuck at ubuntu/member/zul) has joined #ubuntu-devel
Kamion we looked at it at the distro sprint, and it was hard
* Lure has quit ("Konversation terminated!")
Kamion post-dapper, can rethink, sure
jdub that's going to be an amusing migration
BenC Kamion: what's difficult about it?
Mithrandir BenC: that you don't know what bus your root device is on.
* LeeJunFan has quit (Remote closed the connection)
BenC what does that have to do with UUID though?
Kamion I just remember that the initramfs-tools issues were hideously
Kamion Mithrandir probably has more state in his head about it
Mithrandir hda is ide, sda is scsi/sata, so you know if you're _ever_
going to need ide-generic to find /
BenC assumption of bus is a bad thing to depend on, why does it do that?
Mithrandir and to make this even more amusing, if your sata device
hasn't initialised completely and you decide that you want to load
ide-generic you risk that your nice and shiny controller falls back into
80s-mode since some controllers react quicker to old-style IDE commands
than to SATA commands.
BenC sda doesn't even assume scsi, it could be sata, scsi, usb,
Mithrandir it assumes !ide which is basically what we want to know.
Kamion the relevant bit is that we know sda isn't ide-generic (*at the
moment*) and loading ide-generic hurts us bad
Kamion if it's SATA
mjg59 Surely this can be dealt with by never loading ide-generic before
loading any sata modules?
Kamion mjg59: that might work if loading sata were synchronous
BenC what if they have a PATA disk using SATA module, and they move that
disk to an IDE controller (replace mobo, or just move it to a different
bus in the same system)?
BenC that should keep them from booting
Mithrandir mjg59: doesn't help when you have the sequence: load
sata_foo ; wait a bit; load ide-generic and bam - controller gets
claimed by ide-generic.
mjg59 Mithrandir: Hm. We really need some way of fixing that.
BenC shouldn't keep
Kamion BenC: when we deferred this, we were aware that there were still
some broken cases, but I think it's more important to avoid regressions
Mithrandir mjg59: iz hardware bug. I don't think we can.
mjg59 Mithrandir: ?
BenC Ok, then 6367 is just going to have to wait for this to all get
sorted out :)
mjg59 Mithrandir: The fact that some SATA controllers use the legacy
ports is hardly a bug...
Kamion ide-generic has a history of causing us horrible pain
BenC BTW, MacOSX uses UUID for the root device, and seems to handle it
Kamion you mean that operating system that runs on highly constrained
Mithrandir mjg59: the problem here is mostly everything is async.
Arbitration of, say, USB or SCSI is basically "wake up" and then devices
reporting back a bit later. Where "a bit" may be measured in minutes.
* bddebian (n=bddebian at mail.ottens.com) has joined #ubuntu-devel
BenC I'd hardly call my off-the-shelf AMD64 highly constrained
mjg59 Hang on a second. Why do we /ever/ need ide-generic on a PCI
Kamion BenC: you have Mac OS X running on your AMD64?
BenC yeah, on a disk that I installed from a P4
Mithrandir mjg59: I think it's needed for some machines like mdz's
mjg59 Mithrandir: ?
* Kamion scratches his head and wonders, er, *how*
mjg59 Mithrandir: If it is, then something else is going /seriously/
BenC so what package handles this probe-root-fs thing?
mjg59 I do dislike the way that ide-generic has reached this
mjg59 It's a very simple IDE driver that will bind to legacy i/o ports
and do pio
BenC I think ide-generic need only be loaded if the root-fs doesn't show
up after 10 seconds
Mithrandir BenC: congratulations, you just broke fabio's boot
Kamion BenC: kind of spread out over partman-target, initramfs-tools,
Kamion oh, and grub-installer I suppose
mjg59 The only cases where it is needed are systems where we don't have
a driver for the PCI interface
BenC Kamion: which one decides "use UUID or device"?
mdz BenC,mjg59: it's a tangled mess brimming with regression
mjg59 Or where it's an ISA system
Mithrandir BenC: grub-installer.
mdz mjg59: that doesn't seem to be the case; it's needed in some cases
even where we do have a driver for the PCI interface
Kamion debian-installer is an acceptable catch-all for installer bugs
mjg59 mdz: If anyone is currently using ide-generic, that's already a
mjg59 mdz: No, that was a bug in an old Debian patch. That hasn't been
true for some time
BenC I'm going to kick this bug over to grub-installer then, since
there's no way udev or the kernel can fix it
mjg59 mdz: The problem there was that the call that probed PCI IDE
interfaces was in ide-generic (by accident)
mdz mjg59: we've done it this way because we had bugs otherwise
Mithrandir BenC: we could _possibly_ change to by-id since we then get
* pitti_ is now known as pitti
mjg59 mdz: And we've never attempted to understand the bug properly
mdz Keybuk may recall the details better
* trulux (n=lorenzo at unaffiliated/trulux) has joined #ubuntu-devel
mjg59 If we can find any cases where ide-generic is necessary with a
current kernel, I'd be amazed
mjg59 (And will also fix it)
pitti hi mdz
* sbalneav (n=sbalneav at mail.legalaid.mb.ca) has joined #ubuntu-devel
BenC I agree with mjg59
BenC I cannot even remember the last time I actually had to use
Gwynn seveas: point taken, how long?
BenC even my old legacy systems don't require it
mdz pitti: morning
mdz I'll do a test on my laptop
BenC Mithrandir: are you saying that fabio has a machine that needs
* G0SUB (n=ghoseb at ubuntu/member/g0sub) has joined #ubuntu-devel
mdz oh, my laptop doesn't load ide_generic anymore
mjg59 mdz: Heh
Mithrandir BenC: no, I'm saying he has a machine where loading
ide-generic after ten seconds makes it not boot.
mjg59 Mithrandir: How?
mdz Mithrandir: if the root device hasn't appeared after 10 seconds, it
wouldn't boot anyway
Mithrandir mjg59: he has a complicated setup with multiple IDE and SCSI
Mithrandir mdz: why not?
mjg59 Mithrandir: That doesn't really answer my question :)
Mithrandir mdz: most scsi arrays need more than that to wake up.
BenC Mithrandir: No, I said do not load ide-generic unless the rootfs
isn't visible after 10 seconds of hw-detect
mdz Mithrandir: mine doesn't
BenC if his rootfs isn't visible after 10 seconds, he has a worse
Kamion BenC: I don't think you mean literally "hw-detect" there ...
(that's an installer component)
BenC I mean where it loads all the modules to support the PCI devices
Robot101 Mithrandir: my scsi arrays block the kernel/bios when they're
taking their stupidly long time to fiddle around
Mithrandir mjg59: ask fabio, I'm not intimately familiar with his exact
BenC if the rootfs doesn't appear for 10 seconds after that, load
ide-generic as a fallback _only_
mdz under what circumstances do we load ide_generic currently?
mdz I thought we loaded it unconditionally, but apparently not
mjg59 Mithrandir: When you load ide-generic, the only thing it does is
call ideprobe_init. That does nothing other than scan legacy i/o ports
and then bind to them if they're not already in use
BenC mdz: when the rootfs isn't loaded after loading the ide drivers
that it detects are needed
Mithrandir Kamion: ok, on-the fly selection of keymaps in
espresso-kbd-chooser works now.
BenC but it is immediate
Kamion Mithrandir: wow, great
Kamion I have an upload coming up soon if you want to sneak it in for
BenC mdz: we loaded it unconditionally in breezy
BenC that changed in dapper
BenC Mithrandir: if you implemented this 10 second rule, then you could
forget all about bus type, and implement UUID more widely
Kamion why 10 seconds, incidentally?
Mithrandir BenC: post-ff, too late.
Kamion just random?
* Gwynn has quit (Read error: 104 (Connection reset by peer))
BenC seemed like plenty of time without being too long
mjg59 Kamion: Basically guarantees that PCI drivers have had time to
* Gwynn (i=Gwynn at cp106356-a.tilbu1.nb.home.nl) has joined #ubuntu-devel
BenC 5 seconds may be racey
Mithrandir Kamion: upload of espresso or kbd-chooser?
Kamion Mithrandir: espresso
Kamion if you've got "n seconds" in there, it's racy anyway :-)
mjg59 Really we want some way of querying whether drivers have finished
their init code
Kamion you're just breaking the legs of one of the runners
mjg59 mdz: The comment in initramfs-tools claims that we only load it
because it may be needed for ISA devices
BenC Kamion: anything that can't beat 10 seconds needs to not even
finish the race :)
mdz mjg59: perhaps that is true these days; it's changed since the last
time I looked at it
Kamion I really think that installer changes that sweeping (affecting
just about all mainstream hardware installations) needed to go in before
mjg59 mdz: Yes. As I said, in the past it was needed in order to trigger
registration of the PCI IDE drivers (which was a bug)
Kamion I'm sorry we didn't have everyone in the right place to talk
about this earlier, but still
mjg59 Though, unrelatedly, I /would/ like it if we could get the "Don't
load ide-generic unless we really need to" into dapper regardless of
BenC Kamion: well, we have 6 extra weeks...any way we could crunch this
mjg59 It blocks some Toshibas for 15 seconds or so on load
Kamion installer changes like this need to go in at the start, surely
mdz mjg59: yes, I recall that not being necessary anymore
BenC yeah, not so much the UUID thing, but the ide-generic delay
Kamion BenC: Parkinson's Law
mdz I'm with Kamion; it's quite late to fiddle with regression-prone
bits like this
* stratus has quit (Read error: 113 (No route to host))
Kamion bottom line: is there hardware where this is a regression from
breezy that we could fix by moving to UUID?
BenC ide-generic is actually causing immediate problems, and the "bus
type" thing is just a hack
* ubuntulog (i=ubuntulo at trider-g7.fabbione.net) has joined #ubuntu-devel
Kamion and is there any other safe way to fix that hardware?
mdz BenC: what problems?
BenC mdz: problem where SATA isn't detecting fast enough, so ide-generic
gets loaded and your SATA controller is suddenly a legacy IDE controller
BenC Kamion: not so much a regression as it is that it's much more
Kamion can we do the 10-second thing (although I thought we did it
already) and *not* do UUID?
BenC that coupled with the fact that we need UUID support in order to
make the change from IDE to SATA/PATA drivers next release
mdz where does UUID enter into it?
Kamion BenC: we have to cope with migration from breezy anyway, doesn't
much matter whether we do that in dapper
BenC switching IDE to PATA requires us to do UUID at least one release
mdz we made an explicit decision way back at UBZ not to rely on UUID
by default for dapper
mjg59 BenC: No, since we can't enforce UUIDs on upgrades
jdub highvoltage: ping
Kamion but breezy->dapper upgrades wouldn't switch fstab/grub menu.lst
mdz BenC: how so? I don't see why we couldn't make that change on
BenC ok, UUID makes it _simpler_
mjg59 Or, if we /can/ enforce that on upgrades, we can do it on upgrade
Kamion mjg59: right
Mithrandir I'm thinking of an evil hack.
mdz mounting root by UUID by default is a completely appropriate dapper
mdz trivial to switch on early, plenty of time to sort it out, dapper
waiting in the wings if it turns out to be chancy
BenC UUID is the minor issue here, the ide-generic delay is the major
Mithrandir we could encode the uuid of the current root fs in the initrd
and fall back to that if / doesn't appear.
Kamion I've got no objection to changing initramfs-tools for that at
BenC Mithrandir: there's a sweet idea
mjg59 BenC: I think we should fix that in any case. It's initramfs-tools
that needs fixing - who's working on that right now?
Kamion if we get it wrong, it'll be noticed quickly, and it's fixable on
Mithrandir it's a hack, but it'd probably save our backside, wouldn't
Kamion installer bugs take much much longer for people to notice
BenC infinity is the culprit :)
BenC ok, so initramfs-tools, record UUID in image, check it if / doesn't
appear, if neither shows up, delay 10 seconds, load ide-generic and
repeat with current delay (30 seconds?)
BenC that sound good?
Kamion also, for the IDE->PATA upgrade, I'd actually prefer if dapper
didn't do UUIDs so that the upgrade issues from installations made using
breezy or earlier were right in our face rather than being hidden
BenC Kamion: good point
Kamion (since I doubt we have time to get the upgrades right for all
boot loaders etc. at this point)
Mithrandir Kamion: didn't do, as in, didn't fall back?
highvoltage jdub: pong
Kamion Mithrandir: no, I mean didn't mount root by UUID by default in
the simple way
Mithrandir Kamion: or are you talking about what's put on the kernel
Kamion not referring to your hack
Kamion command line
Mithrandir Kamion: sure, so not changing the current state, which is to
do it for stuff we see is on removable devices in the installer, but