diff mbox series

[v2] KVM: x86: Complain about an attempt to change the APIC base address

Message ID 20240625235554.2576349-1-jmattson@google.com (mailing list archive)
State New, archived
Headers show
Series [v2] KVM: x86: Complain about an attempt to change the APIC base address | expand

Commit Message

Jim Mattson June 25, 2024, 11:53 p.m. UTC
KVM does not support changing the APIC's base address. Prior to commit
3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or
APIC base"), it emitted a rate-limited warning about this. Now, it's
just silently broken.

Use vcpu_unimpl() to complain about this unsupported operation. Even a
rate-limited error message is better than complete silence.

Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 Changes in v2:
  * Changed format specifiers from "%#llx" to "%#x"
  * Cast apic->base_address to unsigned int for printing
 
 arch/x86/kvm/lapic.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jim Mattson July 24, 2024, 6:05 p.m. UTC | #1
On Tue, Jun 25, 2024 at 4:56 PM Jim Mattson <jmattson@google.com> wrote:
>
> KVM does not support changing the APIC's base address. Prior to commit
> 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or
> APIC base"), it emitted a rate-limited warning about this. Now, it's
> just silently broken.
>
> Use vcpu_unimpl() to complain about this unsupported operation. Even a
> rate-limited error message is better than complete silence.
>
> Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  Changes in v2:
>   * Changed format specifiers from "%#llx" to "%#x"
>   * Cast apic->base_address to unsigned int for printing
>
>  arch/x86/kvm/lapic.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index acd7d48100a1..43ac05d10b2e 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2583,6 +2583,9 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
>
>         if ((value & MSR_IA32_APICBASE_ENABLE) &&
>              apic->base_address != APIC_DEFAULT_PHYS_BASE) {
> +               vcpu_unimpl(vcpu, "APIC base %#x is not %#x",
> +                           (unsigned int)apic->base_address,
> +                           APIC_DEFAULT_PHYS_BASE);
>                 kvm_set_apicv_inhibit(apic->vcpu->kvm,
>                                       APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
>         }
> --
> 2.45.2.741.gdbec12cfda-goog

Ping.
Maxim Levitsky July 24, 2024, 6:13 p.m. UTC | #2
On Wed, 2024-07-24 at 11:05 -0700, Jim Mattson wrote:
> On Tue, Jun 25, 2024 at 4:56 PM Jim Mattson <jmattson@google.com> wrote:
> > KVM does not support changing the APIC's base address. Prior to commit
> > 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or
> > APIC base"), it emitted a rate-limited warning about this. Now, it's
> > just silently broken.
> > 
> > Use vcpu_unimpl() to complain about this unsupported operation. Even a
> > rate-limited error message is better than complete silence.
> > 
> > Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
> > Signed-off-by: Jim Mattson <jmattson@google.com>
> > ---
> >  Changes in v2:
> >   * Changed format specifiers from "%#llx" to "%#x"
> >   * Cast apic->base_address to unsigned int for printing
> > 
> >  arch/x86/kvm/lapic.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index acd7d48100a1..43ac05d10b2e 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -2583,6 +2583,9 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
> > 
> >         if ((value & MSR_IA32_APICBASE_ENABLE) &&
> >              apic->base_address != APIC_DEFAULT_PHYS_BASE) {
> > +               vcpu_unimpl(vcpu, "APIC base %#x is not %#x",
> > +                           (unsigned int)apic->base_address,
> > +                           APIC_DEFAULT_PHYS_BASE);
> >                 kvm_set_apicv_inhibit(apic->vcpu->kvm,
> >                                       APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
> >         }
> > --
> > 2.45.2.741.gdbec12cfda-goog
> 
> Ping.
> 
I think that we talked about this once, that nobody looks at these dmesg warnings,
its just a way for a malicious guest to fill up the host log (yes rate limit helps,
but slowly you can still fill it up),
but if you think that this is valuable, I am not against putting it back.

I wonder....

What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES,
and when enabled, outright crash the guest when it attempts things like changing APIC base,
APIC IDs, and other unsupported things like that?

Then we can make qemu set it by default, and if users have to use an unsupported feature,
they could always add a qemu flag that will disable this capability.

Best regards,
	Maxim Levitsky
Jim Mattson July 24, 2024, 6:34 p.m. UTC | #3
On Wed, Jul 24, 2024 at 11:13 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Wed, 2024-07-24 at 11:05 -0700, Jim Mattson wrote:
> > On Tue, Jun 25, 2024 at 4:56 PM Jim Mattson <jmattson@google.com> wrote:
> > > KVM does not support changing the APIC's base address. Prior to commit
> > > 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or
> > > APIC base"), it emitted a rate-limited warning about this. Now, it's
> > > just silently broken.
> > >
> > > Use vcpu_unimpl() to complain about this unsupported operation. Even a
> > > rate-limited error message is better than complete silence.
> > >
> > > Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
> > > Signed-off-by: Jim Mattson <jmattson@google.com>
> > > ---
> > >  Changes in v2:
> > >   * Changed format specifiers from "%#llx" to "%#x"
> > >   * Cast apic->base_address to unsigned int for printing
> > >
> > >  arch/x86/kvm/lapic.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index acd7d48100a1..43ac05d10b2e 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -2583,6 +2583,9 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
> > >
> > >         if ((value & MSR_IA32_APICBASE_ENABLE) &&
> > >              apic->base_address != APIC_DEFAULT_PHYS_BASE) {
> > > +               vcpu_unimpl(vcpu, "APIC base %#x is not %#x",
> > > +                           (unsigned int)apic->base_address,
> > > +                           APIC_DEFAULT_PHYS_BASE);
> > >                 kvm_set_apicv_inhibit(apic->vcpu->kvm,
> > >                                       APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
> > >         }
> > > --
> > > 2.45.2.741.gdbec12cfda-goog
> >
> > Ping.
> >
> I think that we talked about this once, that nobody looks at these dmesg warnings,
> its just a way for a malicious guest to fill up the host log (yes rate limit helps,
> but slowly you can still fill it up),
> but if you think that this is valuable, I am not against putting it back.

Funny that you mention this. I'm about to send out another change to
curtail some ratelimited spam that *quickly* fills up the host log.

> I wonder....
>
> What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES,
> and when enabled, outright crash the guest when it attempts things like changing APIC base,
> APIC IDs, and other unsupported things like that?
>
> Then we can make qemu set it by default, and if users have to use an unsupported feature,
> they could always add a qemu flag that will disable this capability.

Alternatively, why not devise a way to inform userspace that the guest
has exercised an unsupported feature? Unless you're a hobbyist working
on your desktop, kernel messages are a *terrible* mechanism for
communicating with the end user.

> Best regards,
>         Maxim Levitsky
>
Sean Christopherson July 24, 2024, 7:38 p.m. UTC | #4
On Wed, Jul 24, 2024, Jim Mattson wrote:
> On Wed, Jul 24, 2024 at 11:13 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES,
> > and when enabled, outright crash the guest when it attempts things like
> > changing APIC base, APIC IDs, and other unsupported things like that?
> >
> > Then we can make qemu set it by default, and if users have to use an
> > unsupported feature, they could always add a qemu flag that will disable
> > this capability.
> 
> Alternatively, why not devise a way to inform userspace that the guest
> has exercised an unsupported feature? Unless you're a hobbyist working on
> your desktop, kernel messages are a *terrible* mechanism for communicating
> with the end user.

A per-vCPU/VM stat would suffice in most cases, e.g. similar to the proposed
auto-EOI stat[*].  But I honestly don't see the point for APIC base relocation
and changing x2APIC IDs.  We _know_ these things don't work.  Having a flag might
save a bit of triage when debugging a guest issue, but I fail to see what userspace
would do with the information outside of a debug scenario.

And for APIC base relocation, userspace already has the ability to detect this
unuspported behavior.  Trap writes to MSR_IA32_APICBASE via MSR filtering, then
reflect the value back into KVM.  Performance would suck, but writes to
MSR_IA32_APICBASE should be very rare, especially if the host forces x2APIC via
guest firmware.

As for changing x2APIC IDs, that is the architectural behavior on Intel.  If a
kernel is trying to change x2APIC IDs, it's going to have a bad day regardless.

So I guess the question is, what motivated this patch?
Jim Mattson July 24, 2024, 9:22 p.m. UTC | #5
On Wed, Jul 24, 2024 at 12:38 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Jul 24, 2024, Jim Mattson wrote:
> > On Wed, Jul 24, 2024 at 11:13 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > > What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES,
> > > and when enabled, outright crash the guest when it attempts things like
> > > changing APIC base, APIC IDs, and other unsupported things like that?
> > >
> > > Then we can make qemu set it by default, and if users have to use an
> > > unsupported feature, they could always add a qemu flag that will disable
> > > this capability.
> >
> > Alternatively, why not devise a way to inform userspace that the guest
> > has exercised an unsupported feature? Unless you're a hobbyist working on
> > your desktop, kernel messages are a *terrible* mechanism for communicating
> > with the end user.
>
> A per-vCPU/VM stat would suffice in most cases, e.g. similar to the proposed
> auto-EOI stat[*].  But I honestly don't see the point for APIC base relocation
> and changing x2APIC IDs.  We _know_ these things don't work.  Having a flag might
> save a bit of triage when debugging a guest issue, but I fail to see what userspace
> would do with the information outside of a debug scenario.

I would argue that insider knowledge about what does and doesn't work
isn't particularly helpful to the end user.

A non-standard flag isn't particularly helpful either. Nor is a kernel
log message. Perhaps these solutions are fine for hobbyists, but they
are not useful in an enterprise environment

If a guest OS tries to change the APIC base address, and KVM silently
ignores it, the guest is unlikely to get very far. Imagine what would
happen if the guest tried to change GS_BASE, and KVM silently ignored
it.

Maybe KVM should return KVM_INTERNAL_ERROR_EMULATION if the guest
attempts to relocate the APIC base--even without a new "pedantic"
flag. What is the point in trying to continue without relocation?

> And for APIC base relocation, userspace already has the ability to detect this
> unuspported behavior.  Trap writes to MSR_IA32_APICBASE via MSR filtering, then
> reflect the value back into KVM.  Performance would suck, but writes to
> MSR_IA32_APICBASE should be very rare, especially if the host forces x2APIC via
> guest firmware.

This "unsupported behavior" should at least be documented somewhere.

> As for changing x2APIC IDs, that is the architectural behavior on Intel.  If a
> kernel is trying to change x2APIC IDs, it's going to have a bad day regardless.
>
> So I guess the question is, what motivated this patch?

I don't recall, but I now withdraw the patch. We really should do better.

BTW, there is a KUT that supposedly verifies that APIC base relocation
works. See KUT commit b6bdb3f6ab6 ("apic: Add test case for relocation
and writing reserved bits"). That's the test that Nadav Amit refers to
in the commit message for commit db324fe6f20b ("KVM: x86: Warn on APIC
base relocation").
Sean Christopherson July 24, 2024, 10:22 p.m. UTC | #6
On Wed, Jul 24, 2024, Jim Mattson wrote:
> On Wed, Jul 24, 2024 at 12:38 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Jul 24, 2024, Jim Mattson wrote:
> > > On Wed, Jul 24, 2024 at 11:13 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > > > What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES,
> > > > and when enabled, outright crash the guest when it attempts things like
> > > > changing APIC base, APIC IDs, and other unsupported things like that?
> > > >
> > > > Then we can make qemu set it by default, and if users have to use an
> > > > unsupported feature, they could always add a qemu flag that will disable
> > > > this capability.
> > >
> > > Alternatively, why not devise a way to inform userspace that the guest
> > > has exercised an unsupported feature? Unless you're a hobbyist working on
> > > your desktop, kernel messages are a *terrible* mechanism for communicating
> > > with the end user.
> >
> > A per-vCPU/VM stat would suffice in most cases, e.g. similar to the proposed
> > auto-EOI stat[*].  But I honestly don't see the point for APIC base relocation
> > and changing x2APIC IDs.  We _know_ these things don't work.  Having a flag might
> > save a bit of triage when debugging a guest issue, but I fail to see what userspace
> > would do with the information outside of a debug scenario.
> 
> I would argue that insider knowledge about what does and doesn't work
> isn't particularly helpful to the end user.
> 
> A non-standard flag isn't particularly helpful either. Nor is a kernel
> log message. Perhaps these solutions are fine for hobbyists, but they
> are not useful in an enterprise environment

I don't disagree, but at the same time, I don't think it's unreasonable to expect
an enterprise environment to be aware of KVM's _documented_ errata (see below).
Of course, this one ain't documented...

> If a guest OS tries to change the APIC base address, and KVM silently
> ignores it, the guest is unlikely to get very far. Imagine what would
> happen if the guest tried to change GS_BASE, and KVM silently ignored
> it.
> 
> Maybe KVM should return KVM_INTERNAL_ERROR_EMULATION if the guest
> attempts to relocate the APIC base--even without a new "pedantic"
> flag. What is the point in trying to continue without relocation?

I'm definitely not opposed to this, though there's a non-zero risk would be killing
a weird-but-functional guest, e.g. if it "relocates" the APIC base on its way to
enabling x2APIC.   Maybe a quirk is warranted for this one in particular?  (where
disabling the quirk triggers KVM_INTERNAL_ERROR_EMULATION).

> > And for APIC base relocation, userspace already has the ability to detect this
> > unuspported behavior.  Trap writes to MSR_IA32_APICBASE via MSR filtering, then
> > reflect the value back into KVM.  Performance would suck, but writes to
> > MSR_IA32_APICBASE should be very rare, especially if the host forces x2APIC via
> > guest firmware.
> 
> This "unsupported behavior" should at least be documented somewhere.

Ya, Documentation/virt/kvm/x86/errata.rst has a few things, but we need to keep
building it out.
diff mbox series

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index acd7d48100a1..43ac05d10b2e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2583,6 +2583,9 @@  void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
 
 	if ((value & MSR_IA32_APICBASE_ENABLE) &&
 	     apic->base_address != APIC_DEFAULT_PHYS_BASE) {
+		vcpu_unimpl(vcpu, "APIC base %#x is not %#x",
+			    (unsigned int)apic->base_address,
+			    APIC_DEFAULT_PHYS_BASE);
 		kvm_set_apicv_inhibit(apic->vcpu->kvm,
 				      APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
 	}