[Bug 1869116] Re: smartctl-validate is borked in a recent release
cees
1869116 at bugs.launchpad.net
Tue Mar 31 04:58:32 UTC 2020
Rereading my comment above, I see I made a big typo: I meant to say
that the script output for 18/16 LTS said "the drivers do not support
SMART", rather than using the word RAID. Sorry about the confusion!
-cees
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to util-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1869116
Title:
smartctl-validate is borked in a recent release
Status in lxd:
Unknown
Status in MAAS:
Triaged
Status in util-linux package in Ubuntu:
New
Bug description:
Bug (maybe?) details first, diatribe second.
Bug Summary: multi-hdd / raid with multiple drives / multiple devices
or something along those lines cannot be commissioned anymore: 2.4.x
worked fine, 2.7.0 does not.
Here is the script output of smartctl-validate:
-----
# /dev/sda (Model: PERC 6/i, Serial: 6842b2b0740e9900260e66c9220df4ac)
Unable to run 'smartctl-validate': Storage device 'PERC 6/i' with serial '6842b2b0740e9900260e66c9220df4ac' not found!
This indicates the storage device has been removed or the OS is unable to find it due to a hardware failure. Please re-commission this node to re-discover the storage devices, or delete this device manually.
Given parameters:
{'storage': {'argument_format': '{path}', 'type': 'storage', 'value': {'id_path': '/dev/disk/by-id/wwn-0x6842b2b0740e9900260e66c9220df4ac', 'model': 'PERC 6/i', 'name': 'sda', 'physical_blockdevice_id': 33, 'serial': '6842b2b0740e9900260e66c9220df4ac'}}}
Discovered storage devices: [
{'NAME': 'sda', 'MODEL': 'PERC_6/i', 'SERIAL': '6842b2b0740e9900260e66c9220df4ac'},
{'NAME': 'sdb', 'MODEL': 'PERC_6/i', 'SERIAL': '6842b2b0740e9900260e66f924ecece0'},
{'NAME': 'sr0', 'MODEL': 'TEAC_DVD-ROM_DV-28SW', 'SERIAL': '10092013112645'}
]
Discovered interfaces: {'xx: xx: xx: xx: xx: xx': 'eno1'}
-----
-----
# /dev/sdb (Model: PERC 6/i, Serial: 6842b2b0740e9900260e66f924ecece0)
Unable to run 'smartctl-validate': Storage device 'PERC 6/i' with serial '6842b2b0740e9900260e66f924ecece0' not found!
This indicates the storage device has been removed or the OS is unable to find it due to a hardware failure. Please re-commission this node to re-discover the storage devices, or delete this device manually.
Given parameters:
{'storage': {'argument_format': '{path}', 'type': 'storage', 'value': {'id_path': '/dev/disk/by-id/wwn-0x6842b2b0740e9900260e66f924ecece0', 'model': 'PERC 6/i', 'name': 'sdb', 'physical_blockdevice_id': 34, 'serial': '6842b2b0740e9900260e66f924ecece0'}}}
Discovered storage devices: [
{'NAME': 'sda', 'MODEL': 'PERC_6/i', 'SERIAL': '6842b2b0740e9900260e66c9220df4ac'},
{'NAME': 'sdb', 'MODEL': 'PERC_6/i', 'SERIAL': '6842b2b0740e9900260e66f924ecece0'},
{'NAME': 'sr0', 'MODEL': 'TEAC_DVD-ROM_DV-28SW', 'SERIAL': '10092013112645'}
]
Discovered interfaces: {'xx: xx: xx: xx: xx: xx': 'eno1'}
-----
You can see that it says the storage cannot be found and then
immediately lists it as a discovered device. It does this for both
tests (one for each drive), and on both servers.
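One detail visible in the output above: the given parameters carry the model as 'PERC 6/i' (with a space), while the discovered devices report 'PERC_6/i' (with an underscore); the serials are identical. Whether this is the actual cause is not established in this report. The sketch below is purely illustrative and is not the smartctl-validate code: the function names are hypothetical, and the assumption that the lookup compares model and serial as exact strings is only a guess. It shows how such a comparison would report "not found" for a device that is plainly in the discovered list.

```python
# Hypothetical sketch only: NOT the real smartctl-validate logic.
# Values are copied from the script output in this report.
given = {
    "model": "PERC 6/i",
    "serial": "6842b2b0740e9900260e66c9220df4ac",
}
discovered = [
    {"NAME": "sda", "MODEL": "PERC_6/i",
     "SERIAL": "6842b2b0740e9900260e66c9220df4ac"},
    {"NAME": "sdb", "MODEL": "PERC_6/i",
     "SERIAL": "6842b2b0740e9900260e66f924ecece0"},
    {"NAME": "sr0", "MODEL": "TEAC_DVD-ROM_DV-28SW",
     "SERIAL": "10092013112645"},
]


def find_exact(params, devices):
    """Assumed behaviour: match on exact model and serial strings."""
    for dev in devices:
        if (dev["MODEL"] == params["model"]
                and dev["SERIAL"] == params["serial"]):
            return dev["NAME"]
    return None


def normalise(model):
    """Treat spaces and underscores in a model name as equivalent."""
    return model.replace(" ", "_")


def find_normalised(params, devices):
    """Same lookup, but comparing normalised model names."""
    for dev in devices:
        if (normalise(dev["MODEL"]) == normalise(params["model"])
                and dev["SERIAL"] == params["serial"]):
            return dev["NAME"]
    return None


print(find_exact(given, discovered))       # None
print(find_normalised(given, discovered))  # sda
```

If the real script does compare this way, the space/underscore difference alone would be enough to produce the error shown, even though the device is present.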
Bug Details:
I ran maas 2.4.x for most of my journey (see the diatribe below) and never had any problems re-commissioning (or deleting and re-discovering over PXE boot) 2 of my servers (r610, r710).
r610 has an iPERC 6, four 10K X00GB drives configured in a RAID10, 1 virtual disk.
r710 has an iPERC 6, 6x 2TB drives, configured in a RAID10, 2 virtual disks.
So, commission after commission while working through my journey, 0
problems. After I finally got everything figured out on the juju,
network/vlan, quad-nic end, I went to re-commission and could not.
smartctl-validate fails on both, over and over again. I even destroyed
and re-created the raid/VDs, and it still fails.
After spending so much time on it I remembered that this was the first
time I had tried to re-commission these two servers since upgrading
from 2.4.x->2.7 in an effort to use the updated KVM integration to add
a couple more guests. Once I got everything figured out I went to
re-commission everything and boom.
[Upgrade path notes]
In full disclosure, in case this matters: I was on an apt install of 2.4.x and used snap for 2.7, except it didn't work. So I read up on how to do apt 2.7 and did that, and have not uninstalled snap 2.7 yet. I wanted to migrate from apt to snap but do not know how to without losing all maas data, and could not find docs on it, so that is a problem for another day. But in case that is part of the problem for some odd reason, I wanted to share.
[Diatribe]
My journey to get maas+juju+openstack+kubernetes has been less than stellar. I have run into problem after problem, albeit some of them were my own. I am so close, after spending the last 6 months on/off when I had time, and really hardcore the last 4 days, the last half day of which has been this little gem. Maas has been pretty fun to work with, but some things have been the biggest pain in the a-hole to understand. Un/managed subnets come to mind: "Managed: we're going to use IPs, even with DHCP off. Unmanaged: We're still going to use IPs, but be different". Anyway, this doesn't belong here; if it gets modded out that's fine. It makes me feel a little better typing it, knowing that I *think* my last problem was solved to get this up and running; just trying to contribute something that I can, back.
I did want to say thanks to those who made/maintain maas. Despite the
problems I somehow always run into, I have enjoyed figuring it out.
-Red
To manage notifications about this bug go to:
https://bugs.launchpad.net/lxd/+bug/1869116/+subscriptions