[0/5] s390x: Add Full Boot Order Support

Message ID 20240529154311.734548-1-jrossi@linux.ibm.com (mailing list archive)

Message

Jared Rossi May 29, 2024, 3:43 p.m. UTC
From: Jared Rossi <jrossi@linux.ibm.com>

This patch set primarily adds support for the specification of multiple boot
devices, allowing for the guest to automatically use an alternative device on
a failed boot without needing to be reconfigured. It additionally provides the
ability to define the loadparm attribute on a per-device basis, which allows
boot devices to use different loadparm values if needed.
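
As a rough illustration of the per-device loadparm idea, here is a minimal
sketch, not code from the series. The helper names are hypothetical. The
fixed width of 8 matches the eight blanks the BIOS prints in
"LOADPARM=[        ]"; the upper-casing and the "device value wins, else
machine-wide value" rule are assumptions made only for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define LOADPARM_LEN 8  /* fixed field width, as printed by the BIOS */

/*
 * Hypothetical helper: copy src into an 8-byte, blank-padded field.
 * Upper-casing is an assumption for this sketch.
 * Returns 0 on success, -1 if src is longer than LOADPARM_LEN.
 */
static int loadparm_normalize(const char *src, char out[LOADPARM_LEN])
{
    size_t len = strlen(src);

    if (len > LOADPARM_LEN) {
        return -1;
    }
    memset(out, ' ', LOADPARM_LEN);
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)toupper((unsigned char)src[i]);
    }
    return 0;
}

/*
 * Hypothetical selection rule (an assumption, not taken from the series):
 * a non-empty per-device loadparm wins, otherwise use the machine-wide one.
 */
static const char *loadparm_pick(const char *dev_lp, const char *machine_lp)
{
    return (dev_lp && dev_lp[0] != '\0') ? dev_lp : machine_lp;
}
```

With this shape, each boot device could carry its own value while devices
without one keep whatever the machine-level setting provides.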

In brief, an IPLB is generated for each designated boot device (up to a maximum
of 8) and stored in guest memory immediately before BIOS. If a device fails to
boot, the next IPLB is retrieved and we jump back to the start of BIOS.
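
The fallback flow described above can be sketched roughly as follows. This
is a simplified model, not the BIOS code: the struct layouts and the
function name are hypothetical, the "bootable" flag stands in for an actual
IPL attempt, and a loop iteration stands in for "retrieve the next IPLB and
jump back to the start of BIOS". Only the limit of 8 comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BOOT_DEVS 8  /* upper bound stated in the cover letter */

/* Hypothetical stand-in for one IPLB; the real layout is not shown here. */
struct iplb_stub {
    int devno;      /* illustrative device identifier */
    bool bootable;  /* stands in for "this device IPLs successfully" */
};

/* Hypothetical chain of IPLBs, already stored in boot order. */
struct iplb_chain {
    struct iplb_stub entry[MAX_BOOT_DEVS];
    size_t count;
};

/*
 * Try each entry in order. Returns the index of the first entry that
 * boots, or -1 if every entry fails (or the chain is empty).
 */
static int boot_with_fallback(const struct iplb_chain *chain)
{
    size_t n = chain->count < MAX_BOOT_DEVS ? chain->count : MAX_BOOT_DEVS;

    for (size_t i = 0; i < n; i++) {
        if (chain->entry[i].bootable) {
            return (int)i;
        }
        /* Boot failed: move on to the next IPLB in the chain. */
    }
    return -1;
}
```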

Devices can be specified using the standard QEMU device property "bootindex",
as with other architectures. Lower-numbered indices are tried first, with
"bootindex=0" indicating the first device to try.
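
To make the ordering rule concrete, here is a small sketch under stated
assumptions. The struct and function names are hypothetical, and treating a
negative bootindex as "not a boot candidate" is an assumption of the
example. It keeps the candidates, sorts them by ascending bootindex, and
caps the result at the 8-device limit mentioned above.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_BOOT_DEVS 8  /* upper bound stated in the cover letter */

/* Hypothetical device record used only for this illustration. */
struct boot_dev {
    const char *id;
    int bootindex;  /* negative means "not a boot candidate" (assumption) */
};

/* qsort comparator: ascending bootindex, so lower indices come first. */
static int cmp_bootindex(const void *a, const void *b)
{
    const struct boot_dev *x = a;
    const struct boot_dev *y = b;

    return (x->bootindex > y->bootindex) - (x->bootindex < y->bootindex);
}

/*
 * Compact the boot candidates to the front of devs (non-candidates are
 * discarded), sort them by bootindex, and return how many would be used.
 */
static size_t order_boot_devs(struct boot_dev *devs, size_t n)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].bootindex >= 0) {
            devs[k++] = devs[i];
        }
    }
    qsort(devs, k, sizeof(devs[0]), cmp_bootindex);
    return k < MAX_BOOT_DEVS ? k : MAX_BOOT_DEVS;
}
```

Note that only the relative order of the indices matters in this sketch, so
sparse values such as 2 and 3 order the same way as 0 and 1.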

A subsequent libvirt patch will be necessary to allow assignment of per-device
loadparms in the guest XML.

Jared Rossi (5):
  Create include files for s390x IPL definitions
  Add loadparm to CcwDevice
  Build IPLB chain for multiple boot devices
  Add boot device fallback infrastructure
  Enable and document boot device fallback on panic

 docs/system/bootindex.rst         |   7 +-
 docs/system/s390x/bootdevices.rst |   9 +-
 hw/s390x/ccw-device.h             |   2 +
 hw/s390x/ipl.h                    | 117 +-------------------
 include/hw/s390x/ipl/qipl.h       | 128 ++++++++++++++++++++++
 pc-bios/s390-ccw/bootmap.h        |   5 +
 pc-bios/s390-ccw/iplb.h           | 108 +++++--------------
 pc-bios/s390-ccw/s390-ccw.h       |   6 ++
 hw/s390x/ccw-device.c             |  49 +++++++++
 hw/s390x/ipl.c                    | 170 ++++++++++++++++++++++--------
 hw/s390x/s390-virtio-ccw.c        |  18 +---
 hw/s390x/sclp.c                   |   3 +-
 pc-bios/s390-ccw/bootmap.c        |  41 ++++---
 pc-bios/s390-ccw/main.c           |  25 +++--
 pc-bios/s390-ccw/netmain.c        |   4 +
 pc-bios/s390-ccw/Makefile         |   2 +-
 16 files changed, 413 insertions(+), 281 deletions(-)
 create mode 100644 include/hw/s390x/ipl/qipl.h

Comments

Thomas Huth June 4, 2024, 6:35 p.m. UTC | #1
  Hi Jared!

For v2, could you please add at least two tests: one that checks booting from 
the second disk and one that checks booting from the last boot disk when the 
previous ones are invalid?

I could think of two easy ways for adding such tests, up to you what you prefer:

- Extend the tests/qtest/cdrom-test.c - see add_s390x_tests() there

- Add an avocado test - see "grep -l s390 tests/avocado/*.py" for examples.

  Thomas

Thomas Huth June 5, 2024, 8:02 a.m. UTC | #2
On 29/05/2024 17.43, jrossi@linux.ibm.com wrote:
> From: Jared Rossi <jrossi@linux.ibm.com>
> 
> This patch set primarily adds support for the specification of multiple boot
> devices, allowing for the guest to automatically use an alternative device on
> a failed boot without needing to be reconfigured. It additionally provides the
> ability to define the loadparm attribute on a per-device basis, which allows
> boot devices to use different loadparm values if needed.
> 
> In brief, an IPLB is generated for each designated boot device (up to a maximum
> of 8) and stored in guest memory immediately before BIOS. If a device fails to
> boot, the next IPLB is retrieved and we jump back to the start of BIOS.
> 
> Devices can be specified using the standard qemu device tag "bootindex" as with
> other architectures. Lower number indices are tried first, with "bootindex=0"
> indicating the first device to try.

Is this supposed to work with multiple scsi-hd devices, too? I tried to boot a guest 
with two scsi disks (attached to a single virtio-scsi-ccw adapter) where 
only the second disk had a bootable installation, but that failed...?

  Thomas

Jared Rossi June 6, 2024, 7:22 p.m. UTC | #3
On 6/5/24 4:02 AM, Thomas Huth wrote:
> Is this supposed to work with multiple scsi-hd devices, too? I tried to boot a 
> guest with two scsi disks (attached to a single virtio-scsi-ccw 
> adapter) where only the second disk had a bootable installation, but 
> that failed...?
>
>  Thomas
>
>

Hi Thomas,

Yes, I would expect that to work. I tried to reproduce this using a 
non-bootable scsi disk as the first boot device and then a known-good 
bootable scsi disk as the second boot device, with one controller.  In 
my instance the BIOS was not able to identify the first disk as bootable 
and so that device failed to IPL, but it did move on to the next disk 
after that, and the guest successfully IPL'd from the second device.

When you say it failed, do you mean the first disk failed to boot (as 
expected), but then the guest died without attempting to boot from the 
second disk?  Or did something else happen? I am either not 
understanding your configuration or I am not understanding your error.

Regards,

Jared Rossi

Thomas Huth June 7, 2024, 6:19 a.m. UTC | #4
On 06/06/2024 21.22, Jared Rossi wrote:
> Hi Thomas,
> 
> Yes, I would expect that to work. I tried to reproduce this using a 
> non-bootable scsi disk as the first boot device and then a known-good 
> bootable scsi disk as the second boot device, with one controller.  In my 
> instance the BIOS was not able to identify the first disk as bootable and so 
> that device failed to IPL, but it did move on to the next disk after that, 
> and the guest successfully IPL'd from the second device.
> 
> When you say it failed, do you mean the first disk failed to boot (as 
> expected), but then the guest died without attempting to boot from the 
> second disk?  Or did something else happen? I am either not understanding 
> your configuration or I am not understanding your error.

I did this:

  $ ./qemu-system-s390x -bios pc-bios/s390-ccw/s390-ccw.img -accel kvm \
    -device virtio-scsi-ccw  -drive if=none,id=d2,file=/tmp/bad.qcow2 \
    -device scsi-hd,drive=d2,bootindex=2 \
    -drive if=none,id=d8,file=/tmp/good.qcow2 \
    -device scsi-hd,drive=d8,bootindex=3 -m 4G -nographic
  LOADPARM=[        ]
  Using virtio-scsi.
  Using guessed DASD geometry.
  Using ECKD scheme (block size   512), CDL
  No zIPL section in IPL2 record.
  zIPL load failed.

  Trying next boot device...
  LOADPARM=[        ]
  Using virtio-scsi.
  Using guessed DASD geometry.
  Using ECKD scheme (block size   512), CDL
  No zIPL section in IPL2 record.
  zIPL load failed.

So it claims to try to load from the second disk, but it fails.
If I change the "bootindex=3" of the second disk to "bootindex=1", it boots 
perfectly fine, so I'm sure that the installation on good.qcow2 is working fine.

  Thomas

Jared Rossi June 10, 2024, 3:58 a.m. UTC | #5
On 6/7/24 2:19 AM, Thomas Huth wrote:
> I did this:
>
>  $ ./qemu-system-s390x -bios pc-bios/s390-ccw/s390-ccw.img -accel kvm \
>    -device virtio-scsi-ccw  -drive if=none,id=d2,file=/tmp/bad.qcow2 \
>    -device scsi-hd,drive=d2,bootindex=2 \
>    -drive if=none,id=d8,file=/tmp/good.qcow2 \
>    -device scsi-hd,drive=d8,bootindex=3 -m 4G -nographic
>  LOADPARM=[        ]
>  Using virtio-scsi.
>  Using guessed DASD geometry.
>  Using ECKD scheme (block size   512), CDL
>  No zIPL section in IPL2 record.
>  zIPL load failed.
>
>  Trying next boot device...
>  LOADPARM=[        ]
>  Using virtio-scsi.
>  Using guessed DASD geometry.
>  Using ECKD scheme (block size   512), CDL
>  No zIPL section in IPL2 record.
>  zIPL load failed.
>
> So it claims to try to load from the second disk, but it fails.
> If I change the "bootindex=3" of the second disk to "bootindex=1", it 
> boots perfectly fine, so I'm sure that the installation on good.qcow2 
> is working fine.
>
>  Thomas
>

I am able to reproduce this now; I'll investigate the problem.