How does MAAS pick which volume to boot from?

Daniel K sathackr at gmail.com
Thu Dec 7 17:26:49 UTC 2017


Dmitrii,

Will do -- I've been encouraged off-list to attempt to contribute -- I
registered a launchpad account and may give it a shot.

Thanks,

Daniel
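
P.S. For the bug report, the non-UEFI branch quoted below seems to boil
down to roughly the following. This is only a sketch under assumptions:
the function name and the plain "MBR"/"GPT" strings are made up for
illustration (MAAS itself uses PARTITION_TABLE_TYPE and raises
ValidationError), and it models nothing beyond the disk size and the
requested table type.

```python
from typing import Optional

# 2 TiB, the same value as GPT_REQUIRED_SIZE in the quoted partitiontable.py.
GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024

# Illustrative stand-ins for the PARTITION_TABLE_TYPE values (assumption).
MBR = "MBR"
GPT = "GPT"


def non_uefi_boot_disk_table_type(
        disk_size: int, requested: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the quoted non-UEFI code path.

    disk_size is in bytes; requested is the table type the user asked
    for, or None when nothing was set explicitly.
    """
    if not requested:
        # Nothing requested: auto-select by size.
        return GPT if disk_size >= GPT_REQUIRED_SIZE else MBR
    if disk_size < GPT_REQUIRED_SIZE and requested != MBR:
        # Small boot disk with anything other than MBR is rejected.
        raise ValueError(
            "Partition table on this node's boot disk must be using "
            "'%s'." % MBR)
    return requested
```

With a 32GB card, non_uefi_boot_disk_table_type(32 * 10**9) picks MBR,
and asking for GPT on the same card raises the error instead of
switching the type.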


On Thu, Dec 7, 2017 at 3:37 AM, Dmitrii Shcherbakov <
dmitrii.shcherbakov at canonical.com> wrote:

> Daniel,
>
> > Thanks for the response -- I'm only working with a 32GB micro-sd card,
> > nowhere near the 2TB for auto-gpt.
>
> Could you please create a bug here:
> https://bugs.launchpad.net/maas/+filebug
> so that you and the dev team can track it?
>
> From the data model perspective a node in MAAS has a boot method stored in
> the database ("pxe" and "uefi" are relevant in this case). This is later
> used for validating the partition table type for a boot device (See below).
>
> # Holds the known `bios_boot_methods`. If `bios_boot_method` is not in this
> # list then it will fallback to `DEFAULT_BIOS_BOOT_METHOD`.
> KNOWN_BIOS_BOOT_METHODS = frozenset(["pxe", "uefi", "powernv", "powerkvm"])
>
> Non-UEFI QEMU/KVM virtual machines with MAAS 2.3.0 get the "pxe" boot method:
>
> maasdb=# select id,bios_boot_method,boot_disk_id from maasserver_node;
>  id  | bios_boot_method | boot_disk_id
> -----+------------------+--------------
>   37 | pxe              |
>
>
> Which results in MBR usage:
>
> sudo gdisk /dev/sda
> GPT fdisk (gdisk) version 1.0.1
>
> Partition table scan:
>   MBR: MBR only
>   BSD: not present
>   APM: not present
>   GPT: not present
>
>
>
> > gives me a "success" message but the type remains "MBR"
>
> I think this is why you are getting this:
>
> https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L164-L194
> class PartitionTable(CleanSave, TimestampedModel):
> ...
>     def save(self, *args, **kwargs):
>         self._set_and_validate_table_type_for_boot_disk()
>         return super(PartitionTable, self).save(*args, **kwargs)
>
>     def _set_and_validate_table_type_for_boot_disk(self):
> ...
>             if boot_disk is not None and self.block_device.id == boot_disk.id:
>                 bios_boot_method = node.get_bios_boot_method()
>                 if bios_boot_method in ["uefi", "powernv", "powerkvm"]:  # <---- UEFI GPT code path
>
>                 else:  # <---- non-UEFI code path
>                     # Don't even check if its 'pxe', because we always fallback
>                     # to MBR unless the disk is larger than 2TiB in that case
>                     # it is GPT.
>                     disk_size = self.block_device.size
>                     if not self.table_type:
>                         if disk_size >= GPT_REQUIRED_SIZE:
>                             self.table_type = PARTITION_TABLE_TYPE.GPT
>                         else:
>                             self.table_type = PARTITION_TABLE_TYPE.MBR  # <---- auto-selection of MBR
>                     elif (disk_size < GPT_REQUIRED_SIZE and
>                             self.table_type != PARTITION_TABLE_TYPE.MBR):
>                         raise ValidationError({
>                             "table_type": [
>                                 "Partition table on this node's boot disk "
>                                 "must be using '%s'." % (
>                                     PARTITION_TABLE_TYPE.MBR)]
>                             })
>
> So, there's definitely code there that blocks you from explicitly setting
> a partition table type to GPT on small devices.
>
> In my view, it should be optional to use MBR and GPT should be used even
> for non-UEFI systems, unless a very old Linux distribution is used or
> x86_64 Windows has to be deployed.
>
> Best Regards,
> Dmitrii Shcherbakov
>
> Field Software Engineer
> IRC (freenode): Dmitrii-Sh
>
> On Wed, Dec 6, 2017 at 10:25 PM, Daniel K <sathackr at gmail.com> wrote:
>
>> Dmitrii,
>>
>> Thanks for the response -- I'm only working with a 32GB micro-sd card,
>> nowhere near the 2TB for auto-gpt.
>>
>> I tried setting the partition table type to GPT for a node in the "Ready"
>> state and the request was ignored -- not sure how to force GPT without just
>> changing it after install (which causes problems with automation).
>>
>> What I would expect to work:
>>
>> > maas maasadmin block-device update 753 partition_table_type=GPT
>>
>> gives me a "success" message but the type remains "MBR"
>>
>>
>>
>>
>> On Wed, Dec 6, 2017 at 2:56 AM, Dmitrii Shcherbakov <
>> dmitrii.shcherbakov at canonical.com> wrote:
>>
>>> For non-UEFI systems it appears that MAAS 2.3.0 only generates a
>>> curtin operation with "gpt" when storage devices are larger than 2 TiB:
>>>
>>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/preseed_storage.py#L224-L237
>>>     def _generate_disk_operation(self, block_device):
>>> ...
>>>             if bios_boot_method in [
>>>                     "uefi", "powernv", "powerkvm"]:
>>>                 disk_operation["ptable"] = "gpt"
>>>                 if node_arch == "ppc64el":
>>>                     add_prep_partition = True
>>>             elif (block_device.size >= GPT_REQUIRED_SIZE and # <---- this
>>>                     node_arch == "amd64"):
>>>                 disk_operation["ptable"] = "gpt"
>>>                 add_bios_grub_partition = True
>>>             else: # <---- and this
>>>                 disk_operation["ptable"] = "msdos"
>>>
>>> class CurtinStorageGenerator:
>>> ...
>>>     def generate(self):
>>> ...
>>>         # Generate each YAML operation in the storage_config.
>>>         self._generate_disk_operations()
>>> ...
>>>     def _generate_disk_operations(self):
>>>         """Generate all disk operations."""
>>>         for block_device in self.operations["disk"]:
>>>             self._generate_disk_operation(block_device)
>>>
>>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L51-L53
>>> GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024
>>>
>>>
>>>
>>> Best Regards,
>>> Dmitrii Shcherbakov
>>>
>>> Field Software Engineer
>>> IRC (freenode): Dmitrii-Sh
>>>
>>> On Wed, Dec 6, 2017 at 7:40 AM, Daniel K <sathackr at gmail.com> wrote:
>>>
>>>> Digging into chain.c32, it looks like several options could be passed
>>>> for the boot drive.
>>>>
>>>> > Usage:
>>>> > chain.c32 hd<disk#> [<partition>] [options]
>>>> > chain.c32 fd<disk#> [options]
>>>> > chain.c32 mbr:<id> [<partition>] [options]
>>>> > chain.c32 boot [<partition>] [options]
>>>> > chain.c32 fs [options]
>>>> > chain.c32 label=<label> [options]
>>>> > chain.c32 guid=<label> [options]
>>>>
>>>> It would seem that label= or guid= would be the most sure-fire way to
>>>> boot the drive you want to boot, but that requires a GPT partition table
>>>> instead of MBR. A fallback for MBR could be to use the mbr:<id> option:
>>>>
>>>> > The mbr: syntax means search all the hard disks until one with a
>>>> > specific MBR serial number (bytes 440-443) is found.
>>>> > You can get the MBR serial number, by running the following command
>>>> > (change /dev/sda to the correct device):
>>>> > $ hexdump -s 440 -n 4 -e '"0x%08x\n"' /dev/sda
>>>> > 0x0ec8694c
>>>> > Or by running:
>>>> > $ fdisk -l /dev/sda
>>>> > ...
>>>> > Disk identifier: 0x0ec8694c
>>>> > Example:
>>>> > LABEL mbr_serial
>>>> > COM32 chain.c32
>>>> > APPEND mbr:0x0ec8694c
>>>>
>>>> So for an MBR boot it would seem to make sense that if, during
>>>> commissioning, grub is installed on /dev/sdc, then it should pass hd2
>>>> instead of hd0 to chain.c32, or pass mbr:<serial>. That way, regardless of
>>>> the BIOS configuration, the correct drive would always boot. Looks like
>>>> there are some options to use variables in the pxe template files, but I
>>>> doubt a guid or mbr serial number would be available.
>>>>
>>>> Looks like I may be able to sidestep this by hardcoding something like
>>>> "label=boot" instead of hd0 in the template file, then forcing curtin to
>>>> use a gpt table instead of mbr, and ensuring the disk/partition with grub
>>>> is labeled "boot" and no others are labeled as such. Still not quite
>>>> familiar enough with MAAS to know where to make that adjustment.
>>>>
>>>> This of course would also not be a problem if HPE would put the drives
>>>> in the right order. Or with UEFI, which is not supported by these servers
>>>> as far as I can tell.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Dec 5, 2017 at 11:06 PM, Daniel K <sathackr at gmail.com> wrote:
>>>>
>>>>> Looks like the hd0 may be hardcoded :-/
>>>>>
>>>>> > root@maas1:~# cat /usr/lib/python3/dist-packages/provisioningserver/templates/pxe/config.local.amd64.template
>>>>> > DEFAULT local
>>>>> >
>>>>> > LABEL local
>>>>> >   SAY Booting local disk ...
>>>>> >   KERNEL chain.c32
>>>>> >   APPEND hd0
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Dec 5, 2017 at 10:38 PM, Daniel K <sathackr at gmail.com> wrote:
>>>>>
>>>>>> So, attacking this from the angle I'm most familiar with, I've captured
>>>>>> the traffic between the maas server and a booting node to see if I can
>>>>>> catch the data in flight.
>>>>>>
>>>>>> PXE first downloads a file called pxelinux.0 - I see this file in
>>>>>> /var/lib/maas/boot-resources.
>>>>>> It then requests (and receives) a file called ldlinux.c32.
>>>>>>
>>>>>> Then it requests a nonexistent file: pxelinux.cfg/<some sort of uuid/guid?>
>>>>>> > # Packet 344 from C:\Users\user\asdf.pcap
>>>>>> > - 345
>>>>>> > - 166.825599
>>>>>> > - 10.20.128.111
>>>>>> > - 10.20.4.30
>>>>>> > - TFTP
>>>>>> > - 121
>>>>>> > - Read Request, File: pxelinux.cfg/36383031-3839-3255-5831-353130303732, Transfer type: octet, tsize=0, blksize=1408
>>>>>> > - # Packet 345 from C:\Users\user\asdf.pcap
>>>>>> > - 346
>>>>>> > - 166.836864
>>>>>> > - 10.20.4.30
>>>>>> > - 10.20.128.111
>>>>>> > - TFTP
>>>>>> > - 61
>>>>>> > - Error Code, Code: File not found, Message: File not found
>>>>>>
>>>>>> It then requests and receives a file called pxelinux.cfg/01-<mac address>
>>>>>> > # Packet 346 from C:\Users\user\asdf.pcap
>>>>>> > - 347
>>>>>> > - 166.837008
>>>>>> > - 10.20.128.111
>>>>>> > - 10.20.4.30
>>>>>> > - TFTP
>>>>>> > - 105
>>>>>> > - Read Request, File: pxelinux.cfg/01-e8-39-35-2b-c9-5c, Transfer type: octet, tsize=0, blksize=1408
>>>>>>
>>>>>> which contains the following printable text:
>>>>>> > 95+\W1ExL@@T
>>>>>> > oad*DEFAULT local
>>>>>> > LABEL local
>>>>>> >   SAY Booting local disk ...
>>>>>> >   KERNEL chain.c32
>>>>>> >   APPEND hd0
>>>>>>
>>>>>> which I can correlate to log entries:
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] ldlinux.c32 requested by e8:39:35:2b:c9:5c
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/36383031-3839-3255-5831-353130303732 requested by e8:39:35:2b:c9:5c
>>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/01-e8-39-35-2b-c9-5c requested by e8:39:35:2b:c9:5c
>>>>>>
>>>>>> I assume the "APPEND hd0" is what is telling the pxelinux loader
>>>>>> which disk to boot.
>>>>>> I searched but I cannot find a directory called pxelinux.cfg anywhere
>>>>>> on the maas servers, nor a file with any part of the mac address in its
>>>>>> name. I'll assume then that some piece of maas is responding to that
>>>>>> request after fetching the config from some sort of database for that MAC
>>>>>> address/node.
>>>>>>
>>>>>> So then there must be a knob somewhere in MAAS that I can tweak to
>>>>>> cause a different disk to be sent in the APPEND hd0 command.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Dec 5, 2017 at 5:24 PM, Lloyd Parkes <
>>>>>> lloyd+lp at must-have-coffee.gen.nz> wrote:
>>>>>>
>>>>>>> I originally sent this from the wrong email address and so it got
>>>>>>> hung up on list moderation.
>>>>>>>
>>>>>>>
>>>>>>> On 5 December 2017 at 10:45, Daniel K <sathackr at gmail.com> wrote:
>>>>>>> >
>>>>>>> > There must be something that tells the PXE loader which physical
>>>>>>> > disk to try to boot
>>>>>>>
>>>>>>> This is almost certainly hard coded in the PXELinux boot script to
>>>>>>> default to BIOS disk 0x80. Have a look at
>>>>>>> http://www.syslinux.org/wiki/index.php?title=SYSLINUX#LOCALBOOT_type
>>>>>>> and see if it helps.
>>>>>>>
>>>>>>> I would dig into this myself because I want to make my HPE servers
>>>>>>> boot as well, but I'm 3265km and two months away from my MAAS
>>>>>>> servers.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Lloyd
>>>>>>>
>>>>>>> --
>>>>>>> Maas-devel mailing list
>>>>>>> Maas-devel at lists.ubuntu.com
>>>>>>> Modify settings or unsubscribe at:
>>>>>>> https://lists.ubuntu.com/mailman/listinfo/maas-devel
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
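
For reference, the MBR serial number that the chain.c32 notes quoted
above read with hexdump is the 32-bit value stored at bytes 440-443 of
the first sector, little-endian on x86. A minimal sketch of the same
lookup follows; the function names are hypothetical, and it works on an
in-memory copy of the sector rather than opening /dev/sda, so it can be
tried without root.

```python
import struct

# Offset of the 4-byte disk signature within the first sector,
# per the chain.c32 notes quoted above (bytes 440-443).
MBR_SERIAL_OFFSET = 440


def mbr_serial(first_sector: bytes) -> str:
    """Return the MBR serial formatted like `hexdump -e '"0x%08x\\n"'`."""
    if len(first_sector) < MBR_SERIAL_OFFSET + 4:
        raise ValueError("need at least the first 444 bytes of the disk")
    # "<I" = unsigned 32-bit little-endian integer.
    (serial,) = struct.unpack_from("<I", first_sector, MBR_SERIAL_OFFSET)
    return "0x%08x" % serial


def chain_append_line(first_sector: bytes) -> str:
    """Build the APPEND line used in the quoted mbr_serial example."""
    return "APPEND mbr:%s" % mbr_serial(first_sector)
```

Whether MAAS could substitute such a value into its PXE template is a
separate question; this only shows where the number comes from.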