Message ID | 1575021698-10589-1-git-send-email-yi.y.sun@linux.intel.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v3] psr: fix bug which may cause crash |
On 29.11.2019 11:01, Yi Sun wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -1271,7 +1271,13 @@ static void do_write_psr_msrs(void *data)
>
>          for ( j = 0; j < cos_num; j++, index++ )
>          {
> -            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
> +            /*
> +             * Multiple RDT features may co-exist and their COS_MAX may be
> +             * different. So we should prevent one feature to write COS
> +             * register which exceeds its COS_MAX. Otherwise, panic may happen.

I don't think the last sentence adds much value. Early on I said "brief"
for a reason.

> +             */
> +            if ( cos <= feat->cos_max &&
> +                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )

As indicated in reply to v2, the added condition is loop invariant and
hence should lead to the loop not getting entered in the first place.

Jan
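The review point about the loop-invariant condition can be illustrated with a minimal sketch. This is not the code of any posted patch version: `struct feat_sketch`, `write_feat_vals()`, the `SKETCH_*` limits and the `writes` counter are hypothetical stand-ins, and the real `props->write_msr()` call is replaced by a counter. The one detail that seems easy to miss is that the index into the shared value array presumably still has to advance past the skipped feature's values, so that the next feature reads its own slice.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_COS  16  /* hypothetical: COS ids this sketch can store */
#define SKETCH_MAX_VALS 2   /* hypothetical: max values per COS id */

/* Simplified stand-in for a feature's state; not Xen's real structure. */
struct feat_sketch {
    unsigned int cos_max;  /* highest COS id the feature supports */
    unsigned int cos_num;  /* values per COS id, e.g. 2 for CDP, 1 for MBA */
    uint32_t cos_reg_val[SKETCH_MAX_COS * SKETCH_MAX_VALS];
    unsigned int writes;   /* counts simulated MSR writes */
};

/*
 * Apply one feature's slice of the shared value array for COS id `cos`.
 * Returns the index of the next feature's first value.
 */
static unsigned int write_feat_vals(struct feat_sketch *feat, unsigned int cos,
                                    const uint32_t *val, unsigned int index)
{
    unsigned int j;

    assert(feat->cos_num <= SKETCH_MAX_VALS);

    /* Loop-invariant check, done once: skip the feature's whole slice. */
    if ( cos > feat->cos_max )
        return index + feat->cos_num;

    assert(cos < SKETCH_MAX_COS);

    for ( j = 0; j < feat->cos_num; j++, index++ )
    {
        if ( feat->cos_reg_val[cos * feat->cos_num + j] != val[index] )
        {
            feat->cos_reg_val[cos * feat->cos_num + j] = val[index];
            feat->writes++;  /* stands in for props->write_msr() */
        }
    }

    return index;
}
```

Calling `write_feat_vals()` once per feature and feeding each returned index into the next call walks the whole value array; a feature whose `cos_max` is below the selected COS id is skipped without the inner loop being entered.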
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 5866a26..943a1e0 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -1271,7 +1271,13 @@ static void do_write_psr_msrs(void *data)
 
         for ( j = 0; j < cos_num; j++, index++ )
         {
-            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
+            /*
+             * Multiple RDT features may co-exist and their COS_MAX may be
+             * different. So we should prevent one feature to write COS
+             * register which exceeds its COS_MAX. Otherwise, panic may happen.
+             */
+            if ( cos <= feat->cos_max &&
+                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
             {
                 feat->cos_reg_val[cos * cos_num + j] = info->val[index];
                 props->write_msr(cos, info->val[index], props->type[j]);
During testing, we found a crash on Xen with the trace below.

(XEN) Xen call trace:
(XEN)    [<ffff82d0802a065a>] R psr.c#l3_cdp_write_msr+0x1e/0x22
(XEN)    [<ffff82d0802a0858>] F psr.c#do_write_psr_msrs+0x6d/0x109
(XEN)    [<ffff82d08023e000>] F smp_call_function_interrupt+0x5a/0xac
(XEN)    [<ffff82d0802a2b89>] F call_function_interrupt+0x20/0x34
(XEN)    [<ffff82d080282c64>] F do_IRQ+0x175/0x6ae
(XEN)    [<ffff82d08038b8ba>] F common_interrupt+0x10a/0x120
(XEN)    [<ffff82d0802ec616>] F cpu_idle.c#acpi_idle_do_entry+0x9d/0xb1
(XEN)    [<ffff82d0802ecc01>] F cpu_idle.c#acpi_processor_idle+0x41d/0x626
(XEN)    [<ffff82d08027353b>] F domain.c#idle_loop+0xa5/0xa7
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 20:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=0000]
(XEN) ****************************************

The bug happens when CDP and MBA co-exist and MBA COS_MAX is bigger
than CDP COS_MAX. E.g. MBA has 8 COS registers but CDP only has 6.
When setting MBA throttling value for the 7th guest, the value array
would be:
    +------------------+------------------+--------------+
    | Data default val | Code default val | MBA throttle |
    +------------------+------------------+--------------+

Then, COS id 7 will be selected for writing the values. We should
avoid writing CDP data/code values to COS id 7 MSR because it
exceeds the CDP COS_MAX.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
 xen/arch/x86/psr.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
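The scenario in the commit message can be modelled with a small sketch of the value-array layout. Everything here is illustrative: `struct feat_layout` and `writable_feats()` are hypothetical names, and the numbers are derived from the example above on the assumption that COS_MAX is the highest valid COS id (so 6 CDP registers give COS_MAX 5, and 8 MBA registers give COS_MAX 7).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one feature's share of the value array. */
struct feat_layout {
    const char *name;
    unsigned int cos_max;  /* highest valid COS id */
    unsigned int cos_num;  /* number of values the feature contributes */
};

/*
 * For COS id `cos`, record in `offsets` the value-array offset of every
 * feature that may be written, and return how many features qualified.
 * `offsets` must have room for `nr` entries.
 */
static size_t writable_feats(const struct feat_layout *feats, size_t nr,
                             unsigned int cos, unsigned int *offsets)
{
    size_t i, count = 0;
    unsigned int index = 0;

    for ( i = 0; i < nr; i++ )
    {
        /* Only features whose COS range covers `cos` may be written. */
        if ( cos <= feats[i].cos_max )
            offsets[count++] = index;

        /* The array position advances for every feature, written or not. */
        index += feats[i].cos_num;
    }

    return count;
}
```

With CDP described as `{ "CDP", 5, 2 }` and MBA as `{ "MBA", 7, 1 }`, COS id 7 leaves only the MBA entry at array offset 2 eligible, which matches the commit message's point that the CDP data/code defaults must not be written for that COS id.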