diff mbox series

[RFC,05/14] x86/apic: Initialize APIC ID for Secure AVIC

Message ID 20240913113705.419146-6-Neeraj.Upadhyay@amd.com (mailing list archive)
State New, archived
Headers show
Series AMD: Add Secure AVIC Guest Support | expand

Commit Message

Neeraj Upadhyay Sept. 13, 2024, 11:36 a.m. UTC
Initialize the APIC ID in the APIC backing page with the
CPUID function 0000_000bh_EDX (Extended Topology Enumeration),
and ensure that APIC ID msr read from hypervisor is consistent
with the value read from CPUID.

Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
 arch/x86/kernel/apic/x2apic_savic.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Melody Wang Nov. 9, 2024, 8:13 p.m. UTC | #1
Hi Neeraj,

On 9/13/2024 4:36 AM, Neeraj Upadhyay wrote:
> Initialize the APIC ID in the APIC backing page with the
> CPUID function 0000_000bh_EDX (Extended Topology Enumeration),
> and ensure that APIC ID msr read from hypervisor is consistent
> with the value read from CPUID.
> 
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
>  arch/x86/kernel/apic/x2apic_savic.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
> index 99151be4e173..09fbc1857bf3 100644
> --- a/arch/x86/kernel/apic/x2apic_savic.c
> +++ b/arch/x86/kernel/apic/x2apic_savic.c
> @@ -14,6 +14,7 @@
>  #include <linux/sizes.h>
>  
>  #include <asm/apic.h>
> +#include <asm/cpuid.h>
>  #include <asm/sev.h>
>  
>  #include "local.h"
> @@ -200,6 +201,8 @@ static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
>  
>  static void init_backing_page(void *backing_page)
>  {
> +	u32 hv_apic_id;
> +	u32 apic_id;
>  	u32 val;
>  	int i;
>  
> @@ -220,6 +223,13 @@ static void init_backing_page(void *backing_page)
>  
>  	val = read_msr_from_hv(APIC_LDR);
>  	set_reg(backing_page, APIC_LDR, val);
> +
> +	/* Read APIC ID from Extended Topology Enumeration CPUID */
> +	apic_id = cpuid_edx(0x0000000b);
> +	hv_apic_id = read_msr_from_hv(APIC_ID);
> +	WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
> +			apic_id, hv_apic_id);
> +	set_reg(backing_page, APIC_ID, apic_id);
>  }
>  
With this warning that hv_apic_id and apic_id  is different, do you still want to set_reg after that? If so, wonder why we have this warning?

Thanks,
Melody
>  static void x2apic_savic_setup(void)
Neeraj Upadhyay Nov. 10, 2024, 3:55 a.m. UTC | #2
>>  static void init_backing_page(void *backing_page)
>>  {
>> +	u32 hv_apic_id;
>> +	u32 apic_id;
>>  	u32 val;
>>  	int i;
>>  
>> @@ -220,6 +223,13 @@ static void init_backing_page(void *backing_page)
>>  
>>  	val = read_msr_from_hv(APIC_LDR);
>>  	set_reg(backing_page, APIC_LDR, val);
>> +
>> +	/* Read APIC ID from Extended Topology Enumeration CPUID */
>> +	apic_id = cpuid_edx(0x0000000b);
>> +	hv_apic_id = read_msr_from_hv(APIC_ID);
>> +	WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
>> +			apic_id, hv_apic_id);
>> +	set_reg(backing_page, APIC_ID, apic_id);
>>  }
>>  
> With this warning that hv_apic_id and apic_id  is different, do you still want to set_reg after that? If so, wonder why we have this warning?
> 

"apic_id" as read from cpuid is the source of truth for guest and is the one
guest would be  using for its interrupt/IPI flow.

Guest IPI flow does below:

1. Source vCPU updates the IRR bit in the destination vCPU's backing page.
2. Source vCPU takes an Automatic Exit to hv by doing ICR wrmsr operation.
   The destination APIC ID in ICR write data contains "apic_id".
3. Hv uses "apic_id" to either kick the corresponding destination vCPU (
   if not running) or write to AVIC doorbell to notify the running
   destination vCPU about the new interrupt.

Given that in step 3, hv uses "apic_id" (provided by guest) to find the
corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
Mismatch is not considered as a fatal event for guest (snp_abort() is not
triggered) and a warning is raise, as even if hv fails to kick or notify
the target vCPU, the IPI (though delayed) will get handled the next time
target vCPU vmrun happens.

I will include this information in commit message and change WARN_ONCE() to
pr_warn() (while at it, will change the format specifiers from %d to %u).


- Neeraj
Borislav Petkov Nov. 10, 2024, 12:12 p.m. UTC | #3
On Sun, Nov 10, 2024 at 09:25:34AM +0530, Neeraj Upadhyay wrote:
> Given that in step 3, hv uses "apic_id" (provided by guest) to find the
> corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
> Mismatch is not considered as a fatal event for guest (snp_abort() is not
> triggered) and a warning is raise,

What is it considered then and why does the warning even exist?

What can anyone do about it?

If you don't kill the guest, what should the guest owner do if she sees that
warning?
Neeraj Upadhyay Nov. 10, 2024, 3:22 p.m. UTC | #4
On 11/10/2024 5:42 PM, Borislav Petkov wrote:
> On Sun, Nov 10, 2024 at 09:25:34AM +0530, Neeraj Upadhyay wrote:
>> Given that in step 3, hv uses "apic_id" (provided by guest) to find the
>> corresponding vCPU information, "apic_id" and "hv_apic_id" need to match.
>> Mismatch is not considered as a fatal event for guest (snp_abort() is not
>> triggered) and a warning is raise,
> 
> What is it considered then and why does the warning even exist?
> 

APIC ID mismatch can delay IPI handling, which can result in slow guest by
delaying activities like scheduling of tasks within guest.

> What can anyone do about it?
> 

The misconfiguration would require fixing the vCPUs' APIC ID in the host.


> If you don't kill the guest, what should the guest owner do if she sees that
> warning?
> 

If I get your point, unless a corrective action is possible without
hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
here. That way, the issue can be caught early, and it does not disrupt the running
applications on a limping guest (which happens for the case where we only emit
a warning). So, thinking more, snp_abort() looks more apt here.


- Neeraj
Borislav Petkov Nov. 10, 2024, 4:34 p.m. UTC | #5
On Sun, Nov 10, 2024 at 08:52:03PM +0530, Neeraj Upadhyay wrote:
> If I get your point, unless a corrective action is possible without
> hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
> here. That way, the issue can be caught early, and it does not disrupt the running
> applications on a limping guest (which happens for the case where we only emit
> a warning). So, thinking more, snp_abort() looks more apt here.

Well, sometimes you have no influence on the HV (public cloud, for example).

So WARN_ONCE was on the right track but the error message should be more
user-friendly:

	WARN_ONCE(hv_apic_id != apic_id, 
		  "APIC IDs mismatch: %d (HV: %d). IPI handling will suffer!",
		  apic_id, hv_apic_id);
Neeraj Upadhyay Nov. 11, 2024, 3:45 a.m. UTC | #6
On 11/10/2024 10:04 PM, Borislav Petkov wrote:
> On Sun, Nov 10, 2024 at 08:52:03PM +0530, Neeraj Upadhyay wrote:
>> If I get your point, unless a corrective action is possible without
>> hard reboot of the guest, doing a snp_abort() on detecting mismatch works better
>> here. That way, the issue can be caught early, and it does not disrupt the running
>> applications on a limping guest (which happens for the case where we only emit
>> a warning). So, thinking more, snp_abort() looks more apt here.
> 
> Well, sometimes you have no influence on the HV (public cloud, for example).
> 

I see. Ok.

> So WARN_ONCE was on the right track but the error message should be more
> user-friendly:
> 
> 	WARN_ONCE(hv_apic_id != apic_id, 
> 		  "APIC IDs mismatch: %d (HV: %d). IPI handling will suffer!",
> 		  apic_id, hv_apic_id);
> 

Ok sure, I will update it.


- Neeraj
diff mbox series

Patch

diff --git a/arch/x86/kernel/apic/x2apic_savic.c b/arch/x86/kernel/apic/x2apic_savic.c
index 99151be4e173..09fbc1857bf3 100644
--- a/arch/x86/kernel/apic/x2apic_savic.c
+++ b/arch/x86/kernel/apic/x2apic_savic.c
@@ -14,6 +14,7 @@ 
 #include <linux/sizes.h>
 
 #include <asm/apic.h>
+#include <asm/cpuid.h>
 #include <asm/sev.h>
 
 #include "local.h"
@@ -200,6 +201,8 @@  static void x2apic_savic_send_IPI_mask_allbutself(const struct cpumask *mask, in
 
 static void init_backing_page(void *backing_page)
 {
+	u32 hv_apic_id;
+	u32 apic_id;
 	u32 val;
 	int i;
 
@@ -220,6 +223,13 @@  static void init_backing_page(void *backing_page)
 
 	val = read_msr_from_hv(APIC_LDR);
 	set_reg(backing_page, APIC_LDR, val);
+
+	/* Read APIC ID from Extended Topology Enumeration CPUID */
+	apic_id = cpuid_edx(0x0000000b);
+	hv_apic_id = read_msr_from_hv(APIC_ID);
+	WARN_ONCE(hv_apic_id != apic_id, "Inconsistent APIC_ID values: %d (cpuid), %d (msr)",
+			apic_id, hv_apic_id);
+	set_reg(backing_page, APIC_ID, apic_id);
 }
 
 static void x2apic_savic_setup(void)