How does MAAS pick which volume to boot from?
Daniel K
sathackr at gmail.com
Thu Dec 7 17:26:49 UTC 2017
Dmitrii,
Will do -- I've been encouraged off-list to attempt to contribute -- I
registered a launchpad account and may give it a shot.
Thanks,
Daniel
On Thu, Dec 7, 2017 at 3:37 AM, Dmitrii Shcherbakov <
dmitrii.shcherbakov at canonical.com> wrote:
> Daniel,
>
> > Thanks for the response -- I'm only working with a 32GB micro-sd card,
> nowhere near the 2TB for auto-gpt.
>
> Could you please create a bug here
> https://bugs.launchpad.net/maas/+filebug so that you and the dev team
> can track it?
>
> From the data model perspective a node in MAAS has a boot method stored in
> the database ("pxe" and "uefi" are relevant in this case). This is later
> used for validating the partition table type for a boot device (See below).
>
> # Holds the known `bios_boot_methods`. If `bios_boot_method` is not in this
> # list then it will fallback to `DEFAULT_BIOS_BOOT_METHOD`.
> KNOWN_BIOS_BOOT_METHODS = frozenset(["pxe", "uefi", "powernv", "powerkvm"])
>
> Non-UEFI QEMU/KVM virtual machines with MAAS 2.3.0 get "pxe" boot method:
>
> maasdb=# select id,bios_boot_method,boot_disk_id from maasserver_node;
>  id  | bios_boot_method | boot_disk_id
> -----+------------------+--------------
>   37 | *pxe*            |
>
>
> Which results in MBR usage:
>
> sudo gdisk /dev/sda
> GPT fdisk (gdisk) version 1.0.1
>
> Partition table scan:
> MBR: MBR only
> BSD: not present
> APM: not present
> GPT: not present
>
>
>
> > gives me a "success" message but the type remains "MBR"
>
> I think this is why you are getting this:
>
> https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L164-L194
> class PartitionTable(CleanSave, TimestampedModel):
>     ...
>     def save(self, *args, **kwargs):
>         self._set_and_validate_table_type_for_boot_disk()
>         return super(PartitionTable, self).save(*args, **kwargs)
>
>     def _set_and_validate_table_type_for_boot_disk(self):
>         ...
>         if boot_disk is not None and self.block_device.id == boot_disk.id:
>             bios_boot_method = node.get_bios_boot_method()
>             if bios_boot_method in ["uefi", "powernv", "powerkvm"]:
>                 # <---- UEFI GPT code path
>                 ...
>             else:
>                 # <---- non-UEFI code path
>                 # Don't even check if its 'pxe', because we always fallback
>                 # to MBR unless the disk is larger than 2TiB in that case
>                 # it is GPT.
>                 disk_size = self.block_device.size
>                 if not self.table_type:
>                     if disk_size >= GPT_REQUIRED_SIZE:
>                         self.table_type = PARTITION_TABLE_TYPE.GPT
>                     else:
>                         # <---- auto-selection of MBR
>                         self.table_type = PARTITION_TABLE_TYPE.MBR
>                 elif (disk_size < GPT_REQUIRED_SIZE and
>                         self.table_type != PARTITION_TABLE_TYPE.MBR):
>                     raise ValidationError({
>                         "table_type": [
>                             "Partition table on this node's boot disk "
>                             "must be using '%s'." % (
>                                 PARTITION_TABLE_TYPE.MBR)]
>                     })
>
> So, there's definitely code there that blocks you from explicitly setting
> a partition table type to GPT on small devices.
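The non-UEFI branch quoted above can be sketched as a standalone function, which makes the behaviour for a small disk easy to check. This is only an illustration: the name `select_non_uefi_table_type` and the string constants are hypothetical stand-ins rather than MAAS's API, and `ValueError` stands in for Django's `ValidationError`.

```python
# Standalone sketch of the non-UEFI branch; this is not MAAS code.
GPT_REQUIRED_SIZE = 2 * 1024 ** 4  # 2 TiB, the value from partitiontable.py

MBR = "MBR"  # stand-in for PARTITION_TABLE_TYPE.MBR
GPT = "GPT"  # stand-in for PARTITION_TABLE_TYPE.GPT


def select_non_uefi_table_type(disk_size, requested=None):
    """Return the table type for a non-UEFI boot disk of disk_size bytes.

    requested is the explicitly chosen type, or None to auto-select.
    """
    if not requested:
        # Auto-selection: GPT only at or above 2 TiB, otherwise MBR.
        return GPT if disk_size >= GPT_REQUIRED_SIZE else MBR
    if disk_size < GPT_REQUIRED_SIZE and requested != MBR:
        # An explicit non-MBR choice on a small boot disk is rejected.
        raise ValueError(
            "Partition table on this node's boot disk "
            "must be using '%s'." % MBR)
    return requested


sd_card = 32 * 1000 ** 3  # roughly a 32GB card
print(select_non_uefi_table_type(sd_card))  # MBR
```

Calling `select_non_uefi_table_type(sd_card, "GPT")` raises in this sketch, which corresponds to the validation error in the quoted code.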
>
> In my view, it should be optional to use MBR and GPT should be used even
> for non-UEFI systems, unless a very old Linux distribution is used or
> x86_64 Windows has to be deployed.
>
> Best Regards,
> Dmitrii Shcherbakov
>
> Field Software Engineer
> IRC (freenode): Dmitrii-Sh
>
> On Wed, Dec 6, 2017 at 10:25 PM, Daniel K <sathackr at gmail.com> wrote:
>
>> Dmitrii,
>>
>> Thanks for the response -- I'm only working with a 32GB micro-sd card,
>> nowhere near the 2TB for auto-gpt.
>>
>> I tried setting the partition table type to GPT for a node in the "Ready"
>> state and the request was ignored. I'm not sure how to force GPT without
>> changing it after install (which causes problems with automation).
>>
>> What I would expect to work:
>>
>> > maas maasadmin block-device update 753 partition_table_type=GPT
>>
>> gives me a "success" message but the type remains "MBR"
>>
>>
>>
>>
>> On Wed, Dec 6, 2017 at 2:56 AM, Dmitrii Shcherbakov <
>> dmitrii.shcherbakov at canonical.com> wrote:
>>
>>> For non-UEFI systems it appears that MAAS 2.3.0 only generates a curtin
>>> operation with "gpt" when the storage device is at least 2 TiB (and the
>>> node architecture is amd64):
>>>
>>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/preseed_storage.py#L224-L237
>>> def _generate_disk_operation(self, block_device):
>>>     ...
>>>     if bios_boot_method in [
>>>             "uefi", "powernv", "powerkvm"]:
>>>         disk_operation["ptable"] = "gpt"
>>>         if node_arch == "ppc64el":
>>>             add_prep_partition = True
>>>     elif (block_device.size >= GPT_REQUIRED_SIZE and  # <---- this
>>>             node_arch == "amd64"):
>>>         disk_operation["ptable"] = "gpt"
>>>         add_bios_grub_partition = True
>>>     else:  # <---- and this
>>>         disk_operation["ptable"] = "msdos"
>>>
>>> class CurtinStorageGenerator:
>>>     ...
>>>     def generate(self):
>>>         ...
>>>         # Generate each YAML operation in the storage_config.
>>>         self._generate_disk_operations()
>>>         ...
>>>     def _generate_disk_operations(self):
>>>         """Generate all disk operations."""
>>>         for block_device in self.operations["disk"]:
>>>             self._generate_disk_operation(block_device)
>>>
>>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L51-L53
>>> GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024
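To see what the branches quoted above produce for a given node, they can be pulled out into a small pure function. This is a sketch under assumptions: `choose_ptable` is a hypothetical name, and it returns the extra partition as a string instead of setting the `add_prep_partition` / `add_bios_grub_partition` flags used in the real method.

```python
GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024  # 2 TiB


def choose_ptable(bios_boot_method, node_arch, disk_size):
    """Return (ptable, extra_partition) following the quoted branches.

    extra_partition is "prep", "bios_grub" or None; it is a simplification
    of the boolean flags in the original code.
    """
    if bios_boot_method in ("uefi", "powernv", "powerkvm"):
        extra = "prep" if node_arch == "ppc64el" else None
        return "gpt", extra
    if disk_size >= GPT_REQUIRED_SIZE and node_arch == "amd64":
        # Large disk on a BIOS-booted amd64 node: GPT plus a bios_grub partition.
        return "gpt", "bios_grub"
    # Everything else falls back to an MBR ("msdos") table.
    return "msdos", None


# A BIOS ("pxe") amd64 node with a 32GB boot device.
print(choose_ptable("pxe", "amd64", 32 * 1000 ** 3))  # ('msdos', None)
```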
>>>
>>>
>>>
>>> Best Regards,
>>> Dmitrii Shcherbakov
>>>
>>> Field Software Engineer
>>> IRC (freenode): Dmitrii-Sh
>>>
>>> On Wed, Dec 6, 2017 at 7:40 AM, Daniel K <sathackr at gmail.com> wrote:
>>>
>>>> Digging into chain.c32, it looks like several options could be passed
>>>> for the boot drive.
>>>>
>>>> > Usage:
>>>> > chain.c32 hd<disk#> [<partition>] [options]
>>>> > chain.c32 fd<disk#> [options]
>>>> > chain.c32 mbr:<id> [<partition>] [options]
>>>> > chain.c32 boot [<partition>] [options]
>>>> > chain.c32 fs [options]
>>>> > chain.c32 label=<label> [options]
>>>> > chain.c32 guid=<label> [options]
>>>>
>>>> It would seem that label= or guid= would be the most sure-fire way to
>>>> boot the drive you want to boot, but that requires a GPT partition table
>>>> instead of MBR. A fallback for MBR could be to use the mbr:<id> option:
>>>>
>>>> > The mbr: syntax means search all the hard disks until one with a
>>>> specific MBR serial number (bytes 440-443) is found.
>>>> > You can get the MBR serial number, by running the following command
>>>> (change /dev/sda to the correct device):
>>>> > $ hexdump -s 440 -n 4 -e '"0x%08x\n"' /dev/sda
>>>> > 0x0ec8694c
>>>> > Or by running:
>>>> > $ fdisk -l /dev/sda
>>>> > ...
>>>> > Disk identifier: 0x0ec8694c
>>>> > Example:
>>>> > LABEL mbr_serial
>>>> > COM32 chain.c32
>>>> > APPEND mbr:0x0ec8694c
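The hexdump command quoted above reads four bytes at offset 440 and prints them as one 32-bit value. A rough Python equivalent, which also formats the matching APPEND line, could look like the following. Both function names are hypothetical, and the little-endian byte order is an assumption that matches hexdump's output on x86 hosts.

```python
import struct


def read_mbr_signature(path):
    """Return the 32-bit disk signature stored at bytes 440-443 of path."""
    with open(path, "rb") as disk:
        disk.seek(440)
        raw = disk.read(4)
    if len(raw) != 4:
        raise ValueError("%s is too short to hold an MBR" % path)
    # Little-endian, matching what hexdump prints on an x86 host.
    (signature,) = struct.unpack("<I", raw)
    return signature


def chain_append_line(signature):
    """Format the APPEND line for chain.c32's mbr: syntax."""
    return "APPEND mbr:0x%08x" % signature
```

Reading a real device such as /dev/sda this way needs the same privileges as the hexdump command; for the disk in the quoted example, `chain_append_line(0x0ec8694c)` gives `APPEND mbr:0x0ec8694c`.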
>>>>
>>>> So for an MBR boot it would seem to make sense that if, during
>>>> commissioning, grub is installed on /dev/sdc, then MAAS should pass hd2
>>>> instead of hd0 to chain.c32, or pass mbr:<serial>. That way, regardless of
>>>> the BIOS configuration, the correct drive would always boot. It looks like
>>>> there are some options to use variables in the pxe template files, but I
>>>> doubt a guid or mbr serial number would be available.
>>>>
>>>> Looks like I may be able to sidestep this by hardcoding something like
>>>> "label=boot" instead of hd0 in the template file, then forcing curtin to
>>>> use a gpt table instead of mbr, and ensuring the disk/partition with grub
>>>> is labeled "boot" and no others are labeled as such. Still not quite
>>>> familiar enough with MAAS to know where to make that adjustment.
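As an illustration of the idea, the local-boot config shown elsewhere in this thread could be generated with a configurable chain.c32 target instead of a fixed hd0. This sketch uses plain string formatting; `render_local_boot_config` is a hypothetical helper, and how MAAS actually renders its template is not shown in this thread.

```python
# The config text mirrors config.local.amd64.template with hd0 parameterised.
LOCAL_BOOT_TEMPLATE = """\
DEFAULT local

LABEL local
SAY Booting local disk ...
KERNEL chain.c32
APPEND {target}
"""


def render_local_boot_config(target="hd0"):
    """Return a PXELINUX local-boot config that chains to target.

    target is passed to chain.c32 unchanged, for example "hd2",
    "mbr:0x0ec8694c" or "label=boot".
    """
    if not target.strip() or "\n" in target:
        raise ValueError("target must be a single non-empty line")
    return LOCAL_BOOT_TEMPLATE.format(target=target)


print(render_local_boot_config("label=boot"))
```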
>>>>
>>>> This of course would also not be a problem if HPE would put the drives
>>>> in the right order, or with UEFI, which is not supported by these servers
>>>> as far as I can tell.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Dec 5, 2017 at 11:06 PM, Daniel K <sathackr at gmail.com> wrote:
>>>>
>>>>> Looks like the hd0 may be hardcoded :-/
>>>>>
>>>>> > root@maas1:~# cat /usr/lib/python3/dist-packages/provisioningserver/templates/pxe/config.local.amd64.template
>>>>> > DEFAULT local
>>>>> >
>>>>> > LABEL local
>>>>> > SAY Booting local disk ...
>>>>> > KERNEL chain.c32
>>>>> > APPEND hd0
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Dec 5, 2017 at 10:38 PM, Daniel K <sathackr at gmail.com> wrote:
>>>>>
>>>>>> So, attacking this from the angle I'm most familiar with, I've captured
>>>>>> the traffic between the maas server and a booting node to see if I can
>>>>>> catch the data in flight.
>>>>>>
>>>>>> PXE first downloads a file called pxelinux.0 -- I see this file in
>>>>>> /var/lib/maas/boot-resources.
>>>>>> It then requests (and receives) a file called ldlinux.c32.
>>>>>>
>>>>>> It then requests a nonexistent file: pxelinux.cfg/<some sort of
>>>>>> uuid/guid?>
>>>>>> > # Packet 344 from C:\Users\user\asdf.pcap
>>>>>> > - 345
>>>>>> > - 166.825599
>>>>>> > - 10.20.128.111
>>>>>> > - 10.20.4.30
>>>>>> > - TFTP
>>>>>> > - 121
>>>>>> > - Read Request, File: pxelinux.cfg/36383031-3839-3255-5831-353130303732,
>>>>>> Transfer type: octet, tsize=0, blksize=1408
>>>>>> > - # Packet 345 from C:\Users\user\asdf.pcap
>>>>>> > - 346
>>>>>> > - 166.836864
>>>>>> > - 10.20.4.30
>>>>>> > - 10.20.128.111
>>>>>> > - TFTP
>>>>>> > - 61
>>>>>> > - Error Code, Code: File not found, Message: File not found
>>>>>>
>>>>>> It then requests and receives a file called pxelinux.cfg/01-<mac address>:
>>>>>> > # Packet 346 from C:\Users\user\asdf.pcap
>>>>>> > - 347
>>>>>> > - 166.837008
>>>>>> > - 10.20.128.111
>>>>>> > - 10.20.4.30
>>>>>> > - TFTP
>>>>>> > - 105
>>>>>> > - Read Request, File: pxelinux.cfg/01-e8-39-35-2b-c9-5c, Transfer
>>>>>> type: octet, tsize=0, blksize=1408
>>>>>>
>>>>>> which contains the following printable text:
>>>>>> > 95+\W1ExL@@T
>>>>>> > oad*DEFAULT local
>>>>>> > LABEL local
>>>>>> > SAY Booting local disk ...
>>>>>> > KERNEL chain.c32
>>>>>> > APPEND hd0
>>>>>>
>>>>>> which I can correlate to log entries:
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info]
>>>>>> ldlinux.c32 requested by e8:39:35:2b:c9:5c
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info]
>>>>>> pxelinux.cfg/36383031-3839-3255-5831-353130303732 requested by
>>>>>> e8:39:35:2b:c9:5c
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info]
>>>>>> pxelinux.cfg/01-e8-39-35-2b-c9-5c requested by e8:39:35:2b:c9:5c
>>>>>>
>>>>>> I assume the "APPEND hd0" is what is telling the pxelinux loader
>>>>>> which disk to boot.
>>>>>> I searched but I cannot find a directory called pxelinux.cfg anywhere
>>>>>> on the maas servers, nor a file with any part of the MAC address in its
>>>>>> name. I'll assume then that some piece of maas is responding to that
>>>>>> request after fetching the config from some sort of database for that MAC
>>>>>> address/node.
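The order of requests in the capture matches the config search order described in the PXELINUX documentation: the client UUID first, then `01-` followed by the MAC address, then the IPv4 address in upper-case hex shortened one digit at a time, and finally `default`. A small sketch that lists those candidates follows; the function name is hypothetical and the order is taken from the PXELINUX documentation rather than from MAAS.

```python
import ipaddress


def pxelinux_config_candidates(system_uuid, mac, ip):
    """List config paths in the order PXELINUX is documented to request them."""
    candidates = []
    if system_uuid:
        candidates.append("pxelinux.cfg/%s" % system_uuid.lower())
    # "01" is the ARP hardware type for Ethernet.
    candidates.append("pxelinux.cfg/01-%s" % mac.lower().replace(":", "-"))
    # The IPv4 address as 8 upper-case hex digits, then one digit fewer each try.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))
    for length in range(8, 0, -1):
        candidates.append("pxelinux.cfg/%s" % hex_ip[:length])
    candidates.append("pxelinux.cfg/default")
    return candidates


# Values taken from the capture above.
for path in pxelinux_config_candidates(
        "36383031-3839-3255-5831-353130303732",
        "e8:39:35:2b:c9:5c",
        "10.20.128.111"):
    print(path)
```

In the capture the search stops at the second candidate, because the server answers the `01-<mac address>` request.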
>>>>>>
>>>>>> So then there must be a knob somewhere in MAAS that I can tweak to
>>>>>> cause a different disk to be sent in the APPEND hd0 command.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Dec 5, 2017 at 5:24 PM, Lloyd Parkes <
>>>>>> lloyd+lp at must-have-coffee.gen.nz> wrote:
>>>>>>
>>>>>>> I originally sent this from the wrong email address and so it got
>>>>>>> hung
>>>>>>> up on list moderation.
>>>>>>>
>>>>>>>
>>>>>>> On 5 December 2017 at 10:45, Daniel K <sathackr at gmail.com> wrote:
>>>>>>> >
>>>>>>> > There must be something that tells the PXE loader which physical
>>>>>>> disk to try
>>>>>>> > to boot
>>>>>>>
>>>>>>> This is almost certainly hard coded in the PXELinux boot script to
>>>>>>> default to BIOS disk 0x80. Have a look at
>>>>>>> http://www.syslinux.org/wiki/index.php?title=SYSLINUX#LOCALBOOT_type
>>>>>>> and see if it helps.
>>>>>>>
>>>>>>> I would dig into this myself because I want to make my HPE servers
>>>>>>> boot as well, but I'm 3265km and two months away from my MAAS
>>>>>>> servers.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Lloyd
>>>>>>>
>>>>>>> --
>>>>>>> Maas-devel mailing list
>>>>>>> Maas-devel at lists.ubuntu.com
>>>>>>> Modify settings or unsubscribe at:
>>>>>>> https://lists.ubuntu.com/mailman/listinfo/maas-devel
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>