How does MAAS pick which volume to boot from?

Dmitrii Shcherbakov dmitrii.shcherbakov at canonical.com
Thu Dec 7 08:37:13 UTC 2017


Daniel,

> Thanks for the response -- I'm only working with a 32GB micro-sd card,
> nowhere near the 2TB for auto-gpt.

Could you please create a bug here https://bugs.launchpad.net/maas/+filebug
so that you and the dev team can track it?

From the data model perspective, a node in MAAS has a boot method stored in
the database ("pxe" and "uefi" are relevant in this case). This is later
used for validating the partition table type for a boot device (see below).

# Holds the known `bios_boot_methods`. If `bios_boot_method` is not in this
# list then it will fallback to `DEFAULT_BIOS_BOOT_METHOD`.
KNOWN_BIOS_BOOT_METHODS = frozenset(["pxe", "uefi", "powernv", "powerkvm"])
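
A minimal sketch of the fallback that comment describes. It is not the
actual MAAS code: the helper name resolve_bios_boot_method is hypothetical,
and the value of DEFAULT_BIOS_BOOT_METHOD is an assumption, since the excerpt
above does not show it.

```python
# Sketch only: mirrors the comment above, not the MAAS implementation.
KNOWN_BIOS_BOOT_METHODS = frozenset(["pxe", "uefi", "powernv", "powerkvm"])
DEFAULT_BIOS_BOOT_METHOD = "pxe"  # assumption: value not shown in the excerpt


def resolve_bios_boot_method(stored_method):
    """Return the stored boot method if it is known, else the default."""
    if stored_method in KNOWN_BIOS_BOOT_METHODS:
        return stored_method
    return DEFAULT_BIOS_BOOT_METHOD
```

Under that assumption, an unknown or missing value stored for a node would
be treated the same way as "pxe".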

Non-UEFI QEMU/KVM virtual machines with MAAS 2.3.0 get the "pxe" boot method:

maasdb=# select id,bios_boot_method,boot_disk_id from maasserver_node;
 id  | bios_boot_method | boot_disk_id
-----+------------------+--------------
  37 | pxe              |


This results in an MBR partition table:

sudo gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: not present
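
If gdisk is not at hand, the same check can be roughly approximated by
looking at the first two sectors of the disk. This is only a sketch: it
assumes 512-byte logical sectors, the function names are made up for
illustration, and it does not validate the GPT header or inspect the
partition entries the way gdisk does.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def detect_table_type(first_two_sectors):
    """Classify a disk from its first 1024 bytes as 'GPT', 'MBR' or 'none'."""
    if len(first_two_sectors) < 2 * SECTOR_SIZE:
        raise ValueError("need at least the first two sectors")
    # A GPT header sits at LBA 1 and starts with the signature "EFI PART".
    if first_two_sectors[SECTOR_SIZE:SECTOR_SIZE + 8] == b"EFI PART":
        return "GPT"
    # An MBR (sector 0) ends with the boot signature bytes 0x55 0xAA.
    if first_two_sectors[510:512] == b"\x55\xaa":
        return "MBR"
    return "none"


def read_table_type(device_path):
    """Read the first two sectors of a device or image file and classify it."""
    with open(device_path, "rb") as device:
        return detect_table_type(device.read(2 * SECTOR_SIZE))
```

For example, read_table_type("/dev/sda") run as root should report "MBR"
for the disk shown above. A GPT disk also carries a protective MBR in
sector 0, which is why the GPT signature is checked first.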



> gives me a "success" message but the type remains "MBR"

I think this is why you are getting this:

https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L164-L194
class PartitionTable(CleanSave, TimestampedModel):
...    def save(self, *args, **kwargs):
        self._set_and_validate_table_type_for_boot_disk()
        return super(PartitionTable, self).save(*args, **kwargs)

    def _set_and_validate_table_type_for_boot_disk(self):
...
            if boot_disk is not None and self.block_device.id == boot_disk.id:
                bios_boot_method = node.get_bios_boot_method()
                if bios_boot_method in ["uefi", "powernv", "powerkvm"]:  # <---- UEFI GPT code path
...
                else:  # <---- non-UEFI code path
                    # Don't even check if its 'pxe', because we always fallback
                    # to MBR unless the disk is larger than 2TiB in that case
                    # it is GPT.
                    disk_size = self.block_device.size
                    if not self.table_type:
                        if disk_size >= GPT_REQUIRED_SIZE:
                            self.table_type = PARTITION_TABLE_TYPE.GPT
                        else:
                            self.table_type = PARTITION_TABLE_TYPE.MBR  # <---- auto-selection of MBR
                    elif (disk_size < GPT_REQUIRED_SIZE and
                            self.table_type != PARTITION_TABLE_TYPE.MBR):
                        raise ValidationError({
                            "table_type": [
                                "Partition table on this node's boot disk "
                                "must be using '%s'." % (
                                    PARTITION_TABLE_TYPE.MBR)]
                            })

So, there's definitely code there that blocks you from explicitly setting a
partition table type to GPT on small devices.
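
To make the effect on your 32GB card concrete, here is a standalone sketch
of the non-UEFI branch only. The function name is hypothetical, plain
strings stand in for PARTITION_TABLE_TYPE, and ValueError stands in for
Django's ValidationError; the UEFI branch is not modelled because its body
is elided in the excerpt above.

```python
GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024  # 2 TiB, as in partitiontable.py


def resolve_non_uefi_table_type(disk_size, requested=None):
    """Mimic the non-UEFI branch: return the table type that would be saved.

    `requested` plays the role of self.table_type before save(); None or ""
    means no type was set explicitly.
    """
    if not requested:
        # Auto-selection: GPT only for disks of 2 TiB or more, else MBR.
        return "GPT" if disk_size >= GPT_REQUIRED_SIZE else "MBR"
    if disk_size < GPT_REQUIRED_SIZE and requested != "MBR":
        # Stand-in for the ValidationError raised in the excerpt above.
        raise ValueError(
            "Partition table on this node's boot disk must be using 'MBR'.")
    return requested
```

With these stand-ins, resolve_non_uefi_table_type(32 * 10**9) returns "MBR",
and passing requested="GPT" for the same size raises the error instead of
switching the table type.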

In my view, MBR should be optional, and GPT should be used even for
non-UEFI systems, unless a very old Linux distribution is used or
x86_64 Windows has to be deployed.

Best Regards,
Dmitrii Shcherbakov

Field Software Engineer
IRC (freenode): Dmitrii-Sh

On Wed, Dec 6, 2017 at 10:25 PM, Daniel K <sathackr at gmail.com> wrote:

> Dmitrii,
>
> Thanks for the response -- I'm only working with a 32GB micro-sd card,
> nowhere near the 2TB for auto-gpt.
>
> I tried setting the partition table type to GPT for a node in the "Ready"
> state and the request was ignored -- not sure how to force GPT without just
> changing it after install (which causes problems with automation).
>
> What I would expect to work:
>
> > maas maasadmin block-device update 753 partition_table_type=GPT
>
> gives me a "success" message but the type remains "MBR"
>
>
>
>
> On Wed, Dec 6, 2017 at 2:56 AM, Dmitrii Shcherbakov <dmitrii.shcherbakov@canonical.com> wrote:
>
>> For non-UEFI systems it appears that MAAS 2.3.0 only generates a
>> curtin operation with "gpt" when storage devices are larger than 2 TiB:
>>
>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/preseed_storage.py#L224-L237
>>     def _generate_disk_operation(self, block_device):
>> ...
>>             if bios_boot_method in [
>>                     "uefi", "powernv", "powerkvm"]:
>>                 disk_operation["ptable"] = "gpt"
>>                 if node_arch == "ppc64el":
>>                     add_prep_partition = True
>>             elif (block_device.size >= GPT_REQUIRED_SIZE and # <---- this
>>                     node_arch == "amd64"):
>>                 disk_operation["ptable"] = "gpt"
>>                 add_bios_grub_partition = True
>>             else: # <---- and this
>>                 disk_operation["ptable"] = "msdos"
>>
>> class CurtinStorageGenerator:
>> ...   def generate(self):
>> ...
>>         # Generate each YAML operation in the storage_config.
>>         self._generate_disk_operations()
>> ...
>>     def _generate_disk_operations(self):
>>         """Generate all disk operations."""
>>         for block_device in self.operations["disk"]:
>>             self._generate_disk_operation(block_device)
>>
>> https://github.com/maas/maas/blob/2.3.0/src/maasserver/models/partitiontable.py#L51-L53
>> GPT_REQUIRED_SIZE = 2 * 1024 * 1024 * 1024 * 1024
>>
>>
>>
>> Best Regards,
>> Dmitrii Shcherbakov
>>
>> Field Software Engineer
>> IRC (freenode): Dmitrii-Sh
>>
>> On Wed, Dec 6, 2017 at 7:40 AM, Daniel K <sathackr at gmail.com> wrote:
>>
>>> Digging into chain.c32, it looks like several options could be passed
>>> for the boot drive.
>>>
>>> > Usage:
>>> > chain.c32 hd<disk#> [<partition>] [options]
>>> > chain.c32 fd<disk#> [options]
>>> > chain.c32 mbr:<id> [<partition>] [options]
>>> > chain.c32 boot [<partition>] [options]
>>> > chain.c32 fs [options]
>>> > chain.c32 label=<label> [options]
>>> > chain.c32 guid=<label> [options]
>>>
>>> It would seem that label= or guid= would be the most sure-fire way to
>>> boot the drive you want to boot, but that requires a GPT partition table
>>> instead of MBR. A fallback method for MBR could be to use the mbr:<id> option:
>>>
>>> > The mbr: syntax means search all the hard disks until one with a
>>> > specific MBR serial number (bytes 440-443) is found.
>>> > You can get the MBR serial number, by running the following command
>>> > (change /dev/sda to the correct device):
>>> > $ hexdump -s 440 -n 4 -e '"0x%08x\n"' /dev/sda
>>> > 0x0ec8694c
>>> > Or by running:
>>> > $ fdisk -l /dev/sda
>>> > ...
>>> > Disk identifier: 0x0ec8694c
>>> > Example:
>>> > LABEL mbr_serial
>>> > COM32 chain.c32
>>> > APPEND mbr:0x0ec8694c
>>>
>>> So for an MBR boot it would seem to make sense that if, during
>>> commissioning, grub is installed on /dev/sdc, then it should pass hd2
>>> instead of hd0 to chain.c32, or pass mbr:<serial>. That way, regardless of
>>> the BIOS configuration, the correct drive would always boot. It looks like
>>> there are some options to use variables in the pxe template files, but I
>>> doubt a GUID or MBR serial number would be available.
>>>
>>> Looks like I may be able to sidestep this by hardcoding something like
>>> "label=boot" instead of hd0 in the template file, then forcing curtin to
>>> use a gpt table instead of mbr, and ensuring the disk/partition with grub
>>> is labeled "boot" and no others are labeled as such. Still not quite
>>> familiar enough with MAAS to know where to make that adjustment.
>>>
>>> This of course would also not be a problem if HPE would put the drives
>>> in the right order, or with UEFI, which is not supported by these servers
>>> as far as I can tell.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Dec 5, 2017 at 11:06 PM, Daniel K <sathackr at gmail.com> wrote:
>>>
>>>> Looks like the hd0 may be hardcoded :-/
>>>>
>>>> > root@maas1:~# cat /usr/lib/python3/dist-packages/provisioningserver/templates/pxe/config.local.amd64.template
>>>> > DEFAULT local
>>>> >
>>>> > LABEL local
>>>> >   SAY Booting local disk ...
>>>> >   KERNEL chain.c32
>>>> >   APPEND hd0
>>>>
>>>>
>>>>
>>>> On Tue, Dec 5, 2017 at 10:38 PM, Daniel K <sathackr at gmail.com> wrote:
>>>>
>>>>> So, attacking this from the angle I'm most familiar with, I've captured
>>>>> the traffic between the MAAS server and a booting node to see if I can
>>>>> catch the data in flight.
>>>>>
>>>>> PXE first downloads a file called pxelinux.0; I see this file in
>>>>> /var/lib/maas/boot-resources.
>>>>> It then requests (and receives) a file called ldlinux.c32.
>>>>>
>>>>> It then requests a nonexistent file: pxelinux.cfg/<some sort of
>>>>> uuid/guid?>
>>>>> > # Packet 344 from C:\Users\user\asdf.pcap
>>>>> > - 345
>>>>> > - 166.825599
>>>>> > - 10.20.128.111
>>>>> > - 10.20.4.30
>>>>> > - TFTP
>>>>> > - 121
>>>>> > - Read Request, File: pxelinux.cfg/36383031-3839-3255-5831-353130303732, Transfer type: octet, tsize=0, blksize=1408
>>>>> > - # Packet 345 from C:\Users\user\asdf.pcap
>>>>> > - 346
>>>>> > - 166.836864
>>>>> > - 10.20.4.30
>>>>> > - 10.20.128.111
>>>>> > - TFTP
>>>>> > - 61
>>>>> > - Error Code, Code: File not found, Message: File not found
>>>>>
>>>>> Then requests and receives a file called pxelinux.cfg/01-<mac address>
>>>>> > # Packet 346 from C:\Users\user\asdf.pcap
>>>>> > - 347
>>>>> > - 166.837008
>>>>> > - 10.20.128.111
>>>>> > - 10.20.4.30
>>>>> > - TFTP
>>>>> > - 105
>>>>> > - Read Request, File: pxelinux.cfg/01-e8-39-35-2b-c9-5c, Transfer type: octet, tsize=0, blksize=1408
>>>>>
>>>>> which contains the following printable text:
>>>>> > 95+\W1ExL@@T
>>>>> > oad*DEFAULT local
>>>>> > LABEL local
>>>>> >   SAY Booting local disk ...
>>>>> >   KERNEL chain.c32
>>>>> >   APPEND hd0
>>>>>
>>>>> which I can correlate to log entries:
>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] ldlinux.c32 requested by e8:39:35:2b:c9:5c
>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/36383031-3839-3255-5831-353130303732 requested by e8:39:35:2b:c9:5c
>>>>> > 2017-12-05 22:17:49 provisioningserver.rackdservices.tftp: [info] pxelinux.cfg/01-e8-39-35-2b-c9-5c requested by e8:39:35:2b:c9:5c
>>>>>
>>>>> I assume the "APPEND hd0" is what is telling the pxelinux loader which
>>>>> disk to boot.
>>>>> I searched but I cannot find a directory called pxelinux.cfg anywhere
>>>>> on the MAAS servers, nor a file with any part of the MAC address in its
>>>>> name. I'll assume then that some piece of MAAS is responding to that
>>>>> request after fetching the config from some sort of database for that MAC
>>>>> address/node.
>>>>>
>>>>> So then there must be a knob somewhere in MAAS that I can tweak to
>>>>> cause a different disk to be sent in the APPEND hd0 command.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Dec 5, 2017 at 5:24 PM, Lloyd Parkes <lloyd+lp at must-have-coffee.gen.nz> wrote:
>>>>>
>>>>>> I originally sent this from the wrong email address and so it got hung
>>>>>> up on list moderation.
>>>>>>
>>>>>>
>>>>>> On 5 December 2017 at 10:45, Daniel K <sathackr at gmail.com> wrote:
>>>>>> >
>>>>>> > There must be something that tells the PXE loader which physical
>>>>>> disk to try
>>>>>> > to boot
>>>>>>
>>>>>> This is almost certainly hard coded in the PXELinux boot script to
>>>>>> default to BIOS disk 0x80. Have a look at
>>>>>> http://www.syslinux.org/wiki/index.php?title=SYSLINUX#LOCALBOOT_type
>>>>>> and see if it helps.
>>>>>>
>>>>>> I would dig into this myself because I want to make my HPE servers
>>>>>> boot as well, but I'm 3265km and two months away from my MAAS servers.
>>>>>>
>>>>>> Cheers,
>>>>>> Lloyd
>>>>>>
>>>>>> --
>>>>>> Maas-devel mailing list
>>>>>> Maas-devel at lists.ubuntu.com
>>>>>> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/maas-devel
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>