diff mbox series

[v2,5/7] virtio-mem: s390 support

Message ID 20241014144622.876731-6-david@redhat.com (mailing list archive)
State New
Headers show
Series virtio-mem: s390 support | expand

Commit Message

David Hildenbrand Oct. 14, 2024, 2:46 p.m. UTC
Now that s390 code is prepared for memory devices that reside above the
maximum storage increment exposed through SCLP, everything is in place
to unlock virtio-mem support.

As virtio-mem in Linux currently supports logically onlining/offlining
memory in pageblock granularity, we have an effective hot(un)plug
granularity of 1 MiB on s390.

As virito-mem adds/removes individual Linux memory blocks (256MB), we
will currently never use gigantic pages in the identity mapping.

It is worth noting that neither storage keys nor storage attributes (e.g.,
data / nodat) are touched when onlining memory blocks, which is good
because we are not supposed to touch these parts for unplugged device
blocks that are logically offline in Linux.

We will currently never initialize storage keys for virtio-mem
memory -- IOW, storage_key_init_range() is never called. It could be added
in the future when plugging device blocks. But as that function
essentially does nothing without modifying the code (changing
PAGE_DEFAULT_ACC), that's just fine for now.

kexec should work as intended and just like on other architectures that
support virtio-mem: we will never place kexec binaries on virtio-mem
memory, and never indicate virtio-mem memory to the 2nd kernel. The
device driver in the 2nd kernel can simply reset the device --
turning all memory unplugged, to then start plugging memory and adding
them to Linux, without causing trouble because the memory is already
used elsewhere.

The special s390 kdump mode, whereby the 2nd kernel creates the ELF
core header, won't currently dump virtio-mem memory. The virtio-mem
driver has a special kdump mode, from where we can detect memory ranges
to dump. Based on this, support for dumping virtio-mem memory can be
added in the future fairly easily.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/Kconfig | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Heiko Carstens Oct. 14, 2024, 6:48 p.m. UTC | #1
On Mon, Oct 14, 2024 at 04:46:17PM +0200, David Hildenbrand wrote:
> The special s390 kdump mode, whereby the 2nd kernel creates the ELF
> core header, won't currently dump virtio-mem memory. The virtio-mem
> driver has a special kdump mode, from where we can detect memory ranges
> to dump. Based on this, support for dumping virtio-mem memory can be
> added in the future fairly easily.

Hm.. who will add this support? This looks like a showstopper to me.
Who is supposed to debug crash dumps where memory parts are missing?
David Hildenbrand Oct. 14, 2024, 7:16 p.m. UTC | #2
On 14.10.24 20:48, Heiko Carstens wrote:
> On Mon, Oct 14, 2024 at 04:46:17PM +0200, David Hildenbrand wrote:
>> The special s390 kdump mode, whereby the 2nd kernel creates the ELF
>> core header, won't currently dump virtio-mem memory. The virtio-mem
>> driver has a special kdump mode, from where we can detect memory ranges
>> to dump. Based on this, support for dumping virtio-mem memory can be
>> added in the future fairly easily.
> 

Thanks for the review.

> Hm.. who will add this support? This looks like a showstopper to me.

The cover letter is clearer on that: "One remaining work item is kdump 
support for virtio-mem memory. This will be sent out separately once 
initial support landed."

I had a prototype, but need to spend some time to clean it up -- or find 
someone to hand it over to clean it up.

I have to chose wisely what I work on nowadays, and cannot spend that 
time if the basic support won't get ACKed.

> Who is supposed to debug crash dumps where memory parts are missing?

For many production use cases it certainly needs to exist.

But note that virtio-mem can be used with ZONE_MOVABLE, in which case 
mostly only user data (e.g., pagecache,anon) ends up on hotplugged 
memory, that would get excluded from makedumpfile in the default configs 
either way.

It's not uncommon to let kdump support be added later (e.g., AMD SNP 
variants).
Heiko Carstens Oct. 15, 2024, 8:37 a.m. UTC | #3
On Mon, Oct 14, 2024 at 09:16:45PM +0200, David Hildenbrand wrote:
> On 14.10.24 20:48, Heiko Carstens wrote:
> > On Mon, Oct 14, 2024 at 04:46:17PM +0200, David Hildenbrand wrote:
> > > to dump. Based on this, support for dumping virtio-mem memory can be
> > Hm.. who will add this support? This looks like a showstopper to me.
> 
> The cover letter is clearer on that: "One remaining work item is kdump
> support for virtio-mem memory. This will be sent out separately once initial
> support landed."
> 
> I had a prototype, but need to spend some time to clean it up -- or find
> someone to hand it over to clean it up.
> 
> I have to chose wisely what I work on nowadays, and cannot spend that time
> if the basic support won't get ACKed.
> 
> > Who is supposed to debug crash dumps where memory parts are missing?
> 
> For many production use cases it certainly needs to exist.
> 
> But note that virtio-mem can be used with ZONE_MOVABLE, in which case mostly
> only user data (e.g., pagecache,anon) ends up on hotplugged memory, that
> would get excluded from makedumpfile in the default configs either way.
> 
> It's not uncommon to let kdump support be added later (e.g., AMD SNP
> variants).

I'll leave it up to kvm folks to decide if we need kdump support from
the beginning or if we are good with the current implementation.
Christian Borntraeger Oct. 21, 2024, 6:33 a.m. UTC | #4
Am 15.10.24 um 10:37 schrieb Heiko Carstens:
> On Mon, Oct 14, 2024 at 09:16:45PM +0200, David Hildenbrand wrote:
>> On 14.10.24 20:48, Heiko Carstens wrote:
>>> On Mon, Oct 14, 2024 at 04:46:17PM +0200, David Hildenbrand wrote:
>>>> to dump. Based on this, support for dumping virtio-mem memory can be
>>> Hm.. who will add this support? This looks like a showstopper to me.
>>
>> The cover letter is clearer on that: "One remaining work item is kdump
>> support for virtio-mem memory. This will be sent out separately once initial
>> support landed."
>>
>> I had a prototype, but need to spend some time to clean it up -- or find
>> someone to hand it over to clean it up.
>>
>> I have to chose wisely what I work on nowadays, and cannot spend that time
>> if the basic support won't get ACKed.
>>
>>> Who is supposed to debug crash dumps where memory parts are missing?
>>
>> For many production use cases it certainly needs to exist.
>>
>> But note that virtio-mem can be used with ZONE_MOVABLE, in which case mostly
>> only user data (e.g., pagecache,anon) ends up on hotplugged memory, that
>> would get excluded from makedumpfile in the default configs either way.
>>
>> It's not uncommon to let kdump support be added later (e.g., AMD SNP
>> variants).
> 
> I'll leave it up to kvm folks to decide if we need kdump support from
> the beginning or if we are good with the current implementation.

If David confirms that he has a plan for this, I am fine with a staged approach
for upstream.
David Hildenbrand Oct. 21, 2024, 12:19 p.m. UTC | #5
Am 21.10.24 um 08:33 schrieb Christian Borntraeger:
> 
> 
> Am 15.10.24 um 10:37 schrieb Heiko Carstens:
>> On Mon, Oct 14, 2024 at 09:16:45PM +0200, David Hildenbrand wrote:
>>> On 14.10.24 20:48, Heiko Carstens wrote:
>>>
>>> The cover letter is clearer on that: "One remaining work item is kdump
>>> support for virtio-mem memory. This will be sent out separately once initial
>>> support landed."
>>>
>>> I had a prototype, but need to spend some time to clean it up -- or find
>>> someone to hand it over to clean it up.
>>>
>>> I have to chose wisely what I work on nowadays, and cannot spend that time
>>> if the basic support won't get ACKed.
>>>
>>>
>>> For many production use cases it certainly needs to exist.
>>>
>>> But note that virtio-mem can be used with ZONE_MOVABLE, in which case mostly
>>> only user data (e.g., pagecache,anon) ends up on hotplugged memory, that
>>> would get excluded from makedumpfile in the default configs either way.
>>>
>>> It's not uncommon to let kdump support be added later (e.g., AMD SNP
>>> variants).
>>
>> I'll leave it up to kvm folks to decide if we need kdump support from
>> the beginning or if we are good with the current implementation.
> 
> If David confirms that he has a plan for this, I am fine with a staged approach
> for upstream.

I do have a plan and a even a semi-working prototype that I am currently 
improving. In summary, the virtio-mem driver in kdump mode can report ranges 
with plugged memory to the core so we can include them in the elfcore hdr. That 
is the easy part.

The "challenge" is when the virtio-mem driver is built as a module and gets 
loaded after building/allocating the elfcore hdr (which happens when creating 
/proc/vmcore). We have to defer detecting+adding the ranges to the time 
/proc/vmcore gets opened. Not super complicated, but needs some thought to get 
it done in a clean way / with minimal churn.
diff mbox series

Patch

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 42a48ac763ee..fb320eea70fe 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -122,7 +122,7 @@  config VIRTIO_BALLOON
 
 config VIRTIO_MEM
 	tristate "Virtio mem driver"
-	depends on X86_64 || ARM64 || RISCV
+	depends on X86_64 || ARM64 || RISCV || S390
 	depends on VIRTIO
 	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
@@ -132,11 +132,11 @@  config VIRTIO_MEM
 	 This driver provides access to virtio-mem paravirtualized memory
 	 devices, allowing to hotplug and hotunplug memory.
 
-	 This driver currently only supports x86-64 and arm64. Although it
-	 should compile on other architectures that implement memory
-	 hot(un)plug, architecture-specific and/or common
-	 code changes may be required for virtio-mem, kdump and kexec to work as
-	 expected.
+	 This driver currently supports x86-64, arm64, riscv and s390x.
+	 Although it should compile on other architectures that implement
+	 memory hot(un)plug, architecture-specific and/or common
+	 code changes may be required for virtio-mem, kdump and kexec to
+	 work as expected.
 
 	 If unsure, say M.