Message ID | 20230908041115.987682-1-tao1.su@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] KVM: x86: Clear bit12 of ICR after APIC-write VM-exit | expand |
On Fri, Sep 08, 2023 at 12:11:15PM +0800, Tao Su wrote: >When IPI virtualization is enabled, a WARN is triggered if bit12 of ICR >MSR is set after APIC-write VM-exit. The reason is kvm_apic_send_ipi() >thinks the APIC_ICR_BUSY bit should be cleared because KVM has no delay, >but kvm_apic_write_nodecode() doesn't clear the APIC_ICR_BUSY bit. > >Bit12 of ICR is different from other reserved bits(31:20, 17:16 and 13). >When bit12 is set, it will cause APIC-wirte VM-exit but not #GP. For s/wirte/write >reading bit12 back as '0' which is a safer approach, clearing bit12 in >x2APIC mode is needed. how about quoting what Sean said: (w/ a slight change to the last sentence) Under the x2APIC section, regarding ICR, the SDM says: It remains readable only to aid in debugging; however, software should not assume the value returned by reading the ICR is the last written value. I.e. KVM basically has free reign to do whatever it wants, so long as it doesn't confuse userspace or break KVM's ABI. Clear bit12 so that it reads back as '0'. This approach is safer than "do nothing" and is consistent with the case where IPI virtualization is disabled or not supported, i.e., handle_fastpath_set_x2apic_icr_irqoff() -> kvm_x2apic_icr_write() > >Although bit12 of ICR is no longer APIC_ICR_BUSY in x2APIC, keeping it >is far easier to understand what's going on, especially given that it >may be repurposed for something new. Probably you can remove this paragraph. it is not clear w/o the context that there was an attempt to rename APIC_ICR_BUSY for x2apic while fixing the issue. > >Link: https://lore.kernel.org/all/ZPj6iF0Q7iynn62p@google.com/ >Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") >Signed-off-by: Tao Su <tao1.su@linux.intel.com> >Tested-by: Yi Lai <yi1.lai@intel.com> Apart from above nits on the changelog, this patch looks good to me. Reviewed-by: Chao Gao <chao.gao@intel.com> >--- >Changelog: > >v2: > - Drop the unnecessary alias for bit12 of ICR. > - Add back kvm_lapic_get_reg64() that was removed by mistake. > - Modify the commit message to make it clearer. > >v1: https://lore.kernel.org/all/20230904013555.725413-1-tao1.su@linux.intel.com/ >--- > arch/x86/kvm/lapic.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > >diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >index dcd60b39e794..664d5a78b46a 100644 >--- a/arch/x86/kvm/lapic.c >+++ b/arch/x86/kvm/lapic.c >@@ -2450,13 +2450,13 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset) > * ICR is a single 64-bit register when x2APIC is enabled. For legacy > * xAPIC, ICR writes need to go down the common (slightly slower) path > * to get the upper half from ICR2. >+ * >+ * TODO: optimize to just emulate side effect w/o one more write > */ > if (apic_x2apic_mode(apic) && offset == APIC_ICR) { > val = kvm_lapic_get_reg64(apic, APIC_ICR); >- kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32)); >- trace_kvm_apic_write(APIC_ICR, val); >+ kvm_x2apic_icr_write(apic, val); > } else { >- /* TODO: optimize to just emulate side effect w/o one more write */ > val = kvm_lapic_get_reg(apic, offset); > kvm_lapic_reg_write(apic, offset, (u32)val); > } > >base-commit: a48fa7efaf1161c1c898931fe4c7f0070964233a >-- >2.34.1 >
On Fri, Sep 08, 2023 at 06:12:12PM +0800, Chao Gao wrote: > On Fri, Sep 08, 2023 at 12:11:15PM +0800, Tao Su wrote: > >When IPI virtualization is enabled, a WARN is triggered if bit12 of ICR > >MSR is set after APIC-write VM-exit. The reason is kvm_apic_send_ipi() > >thinks the APIC_ICR_BUSY bit should be cleared because KVM has no delay, > >but kvm_apic_write_nodecode() doesn't clear the APIC_ICR_BUSY bit. > > > >Bit12 of ICR is different from other reserved bits(31:20, 17:16 and 13). > >When bit12 is set, it will cause APIC-wirte VM-exit but not #GP. For > > s/wirte/write > > >reading bit12 back as '0' which is a safer approach, clearing bit12 in > >x2APIC mode is needed. > > how about quoting what Sean said: > (w/ a slight change to the last sentence) > > Under the x2APIC section, regarding ICR, the SDM says: > > It remains readable only to aid in debugging; however, software should not > assume the value returned by reading the ICR is the last written value. > > I.e. KVM basically has free reign to do whatever it wants, so long as it doesn't > confuse userspace or break KVM's ABI. > > Clear bit12 so that it reads back as '0'. This approach is safer than "do > nothing" and is consistent with the case where IPI virtualization is > disabled or not supported, i.e., > > handle_fastpath_set_x2apic_icr_irqoff() -> kvm_x2apic_icr_write() > > > > >Although bit12 of ICR is no longer APIC_ICR_BUSY in x2APIC, keeping it > >is far easier to understand what's going on, especially given that it > >may be repurposed for something new. > > Probably you can remove this paragraph. it is not clear w/o the context > that there was an attempt to rename APIC_ICR_BUSY for x2apic while fixing > the issue. Yes, agree with all the above, this is more correct and clear description. Thanks, Tao > > > > >Link: https://lore.kernel.org/all/ZPj6iF0Q7iynn62p@google.com/ > >Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode") > >Signed-off-by: Tao Su <tao1.su@linux.intel.com> > >Tested-by: Yi Lai <yi1.lai@intel.com> > > Apart from above nits on the changelog, this patch looks good to me. > > Reviewed-by: Chao Gao <chao.gao@intel.com> > > >--- > >Changelog: > > > >v2: > > - Drop the unnecessary alias for bit12 of ICR. > > - Add back kvm_lapic_get_reg64() that was removed by mistake. > > - Modify the commit message to make it clearer. > > > >v1: https://lore.kernel.org/all/20230904013555.725413-1-tao1.su@linux.intel.com/ > >--- > > arch/x86/kvm/lapic.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > >diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > >index dcd60b39e794..664d5a78b46a 100644 > >--- a/arch/x86/kvm/lapic.c > >+++ b/arch/x86/kvm/lapic.c > >@@ -2450,13 +2450,13 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset) > > * ICR is a single 64-bit register when x2APIC is enabled. For legacy > > * xAPIC, ICR writes need to go down the common (slightly slower) path > > * to get the upper half from ICR2. > >+ * > >+ * TODO: optimize to just emulate side effect w/o one more write > > */ > > if (apic_x2apic_mode(apic) && offset == APIC_ICR) { > > val = kvm_lapic_get_reg64(apic, APIC_ICR); > >- kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32)); > >- trace_kvm_apic_write(APIC_ICR, val); > >+ kvm_x2apic_icr_write(apic, val); > > } else { > >- /* TODO: optimize to just emulate side effect w/o one more write */ > > val = kvm_lapic_get_reg(apic, offset); > > kvm_lapic_reg_write(apic, offset, (u32)val); > > } > > > >base-commit: a48fa7efaf1161c1c898931fe4c7f0070964233a > >-- > >2.34.1 > >
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index dcd60b39e794..664d5a78b46a 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2450,13 +2450,13 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset) * ICR is a single 64-bit register when x2APIC is enabled. For legacy * xAPIC, ICR writes need to go down the common (slightly slower) path * to get the upper half from ICR2. + * + * TODO: optimize to just emulate side effect w/o one more write */ if (apic_x2apic_mode(apic) && offset == APIC_ICR) { val = kvm_lapic_get_reg64(apic, APIC_ICR); - kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32)); - trace_kvm_apic_write(APIC_ICR, val); + kvm_x2apic_icr_write(apic, val); } else { - /* TODO: optimize to just emulate side effect w/o one more write */ val = kvm_lapic_get_reg(apic, offset); kvm_lapic_reg_write(apic, offset, (u32)val); }