Message ID | 1574936284-5139-1-git-send-email-yi.y.sun@linux.intel.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] psr: fix bug which may cause crash |

On 28.11.2019 11:18, Yi Sun wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -1271,7 +1271,8 @@ static void do_write_psr_msrs(void *data)
>
>          for ( j = 0; j < cos_num; j++, index++ )
>          {
> -            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
> +            if ( cos <= feat->cos_max &&
> +                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )

The description is indeed much better now, thanks. However,
as indicated in reply to v1, this extra (and at the first glance
unmotivated) bounds check wants to be accompanied by a brief but
precise comment. Furthermore with the loop bounded by a local
variable, why not

    cos_num = min(props->cos_num, feat->cos_max + 1);

a few lines up from here (again suitable commented)?

Jan

On 19-11-28 12:25:44, Jan Beulich wrote:
> On 28.11.2019 11:18, Yi Sun wrote:
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -1271,7 +1271,8 @@ static void do_write_psr_msrs(void *data)
> >
> >          for ( j = 0; j < cos_num; j++, index++ )
> >          {
> > -            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
> > +            if ( cos <= feat->cos_max &&
> > +                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
>
> The description is indeed much better now, thanks. However,
> as indicated in reply to v1, this extra (and at the first glance
> unmotivated) bounds check wants to be accompanied by a brief but

I will add the comment.

> precise comment. Furthermore with the loop bounded by a local
> variable, why not
>
>     cos_num = min(props->cos_num, feat->cos_max + 1);
>
> a few lines up from here (again suitable commented)?
>
cos_num is a different thing with the number of COS registers. The
meaning of it is "COS registers number that feature uses for one COS
ID". E.g. MBA/CAT cos_num is always 1. But CDP cos_num is 2 because it
uses 2 COS registers for one COS ID.

> Jan
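
To make the distinction above concrete, here is a minimal sketch of the flat indexing used by `feat->cos_reg_val[cos * cos_num + j]`. The helper name `cos_reg_slot` is hypothetical and does not exist in Xen; it only illustrates that `cos_num` is a per-COS-ID stride (1 for CAT/MBA, 2 for CDP) rather than a count of COS IDs, which is why clamping it against `cos_max + 1` would mix two different quantities.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Xen): slot in a feature's flat
 * register-value array for COS ID `cos` and per-ID register `j`.
 * `cos_num` is the number of registers one COS ID occupies.
 */
static size_t cos_reg_slot(unsigned int cos, unsigned int cos_num,
                           unsigned int j)
{
    assert(j < cos_num);   /* j selects a register within one COS ID */

    return (size_t)cos * cos_num + j;
}
```

With a stride of 2, as for CDP, COS ID 3 occupies slots 6 and 7; with a stride of 1, as for MBA or CAT, it occupies slot 3 only.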

On 29.11.2019 02:41, Yi Sun wrote:
> On 19-11-28 12:25:44, Jan Beulich wrote:
>> On 28.11.2019 11:18, Yi Sun wrote:
>>> --- a/xen/arch/x86/psr.c
>>> +++ b/xen/arch/x86/psr.c
>>> @@ -1271,7 +1271,8 @@ static void do_write_psr_msrs(void *data)
>>>
>>>          for ( j = 0; j < cos_num; j++, index++ )
>>>          {
>>> -            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
>>> +            if ( cos <= feat->cos_max &&
>>> +                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
>>
>> The description is indeed much better now, thanks. However,
>> as indicated in reply to v1, this extra (and at the first glance
>> unmotivated) bounds check wants to be accompanied by a brief but
>
> I will add the comment.
>
>> precise comment. Furthermore with the loop bounded by a local
>> variable, why not
>>
>>     cos_num = min(props->cos_num, feat->cos_max + 1);
>>
>> a few lines up from here (again suitable commented)?
>>
> cos_num is a different thing with the number of COS registers. The
> meaning of it is "COS registers number that feature uses for one COS
> ID". E.g. MBA/CAT cos_num is always 1. But CDP cos_num is 2 because it
> uses 2 COS registers for one COS ID.

Oh, right, I got confused.

The cos <= feat->cos_max is entirely loop invariant. Hence the
loop shouldn't even be entered, i.e. you want to add a conditional
ahead of it.

Jan
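
One way the suggested shape could look, as a self-contained sketch rather than the actual follow-up patch: the loop-invariant `cos <= feat->cos_max` test is hoisted ahead of the inner loop. `struct feat_sketch`, `write_feat_sketch`, `MAX_REGS` and the `msr_writes` counter are all hypothetical stand-ins for the Xen structures and for `props->write_msr()`. The sketch also assumes that `index` must still advance by `cos_num` when the loop is skipped, since in the posted code `index++` sits in the loop header and later features read their values from further along the same array.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REGS 16   /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for per-feature state; not the real Xen struct. */
struct feat_sketch {
    unsigned int cos_max;            /* highest valid COS ID */
    unsigned int cos_num;            /* registers per COS ID (CDP: 2) */
    uint32_t cos_reg_val[MAX_REGS];  /* cached register values */
    unsigned int msr_writes;         /* counts simulated MSR writes */
};

/*
 * Apply vals[*index .. *index + cos_num) to one feature for COS ID `cos`.
 * The bounds check does not depend on j, so it is done once up front.
 */
static void write_feat_sketch(struct feat_sketch *feat, unsigned int cos,
                              const uint32_t *vals, unsigned int *index)
{
    unsigned int j;

    if ( cos > feat->cos_max )
    {
        /* Assumption: skip this feature's values so the next one lines up. */
        *index += feat->cos_num;
        return;
    }

    assert((cos + 1) * feat->cos_num <= MAX_REGS);

    for ( j = 0; j < feat->cos_num; j++, (*index)++ )
    {
        if ( feat->cos_reg_val[cos * feat->cos_num + j] != vals[*index] )
        {
            feat->cos_reg_val[cos * feat->cos_num + j] = vals[*index];
            feat->msr_writes++;   /* stands in for props->write_msr() */
        }
    }
}
```

Calling this for a CDP-like feature with `cos_max` 5 and then an MBA-like feature with `cos_max` 7, both with COS ID 7, leaves the first feature untouched while the second still picks up the third array element.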

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 5866a26..ecca5b4 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -1271,7 +1271,8 @@ static void do_write_psr_msrs(void *data)
 
         for ( j = 0; j < cos_num; j++, index++ )
         {
-            if ( feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
+            if ( cos <= feat->cos_max &&
+                 feat->cos_reg_val[cos * cos_num + j] != info->val[index] )
             {
                 feat->cos_reg_val[cos * cos_num + j] = info->val[index];
                 props->write_msr(cos, info->val[index], props->type[j]);

During testing, we found a crash on Xen with the trace below.

(XEN) Xen call trace:
(XEN)    [<ffff82d0802a065a>] R psr.c#l3_cdp_write_msr+0x1e/0x22
(XEN)    [<ffff82d0802a0858>] F psr.c#do_write_psr_msrs+0x6d/0x109
(XEN)    [<ffff82d08023e000>] F smp_call_function_interrupt+0x5a/0xac
(XEN)    [<ffff82d0802a2b89>] F call_function_interrupt+0x20/0x34
(XEN)    [<ffff82d080282c64>] F do_IRQ+0x175/0x6ae
(XEN)    [<ffff82d08038b8ba>] F common_interrupt+0x10a/0x120
(XEN)    [<ffff82d0802ec616>] F cpu_idle.c#acpi_idle_do_entry+0x9d/0xb1
(XEN)    [<ffff82d0802ecc01>] F cpu_idle.c#acpi_processor_idle+0x41d/0x626
(XEN)    [<ffff82d08027353b>] F domain.c#idle_loop+0xa5/0xa7
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 20:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=0000]
(XEN) ****************************************

The bug happens when CDP and MBA co-exist and MBA COS_MAX is bigger
than CDP COS_MAX. E.g. MBA has 8 COS registers but CDP only has 6.
When setting MBA throttling value for the 7th guest, the value array
would be:
    +------------------+------------------+--------------+
    | Data default val | Code default val | MBA throttle |
    +------------------+------------------+--------------+

Then, COS id 7 will be selected for writing the values. We should
avoid writing CDP data/code values to COS id 7 MSR because it
exceeds the CDP COS_MAX.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
---
 xen/arch/x86/psr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
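
The scenario in the description can be modelled with a small sketch. Everything here is hypothetical naming (`struct feat_desc`, `val_offset`, `cos_in_range`), and it assumes COS_MAX is the highest valid COS ID, i.e. the register count minus one, so 8 MBA registers give a COS_MAX of 7 and 6 CDP registers give 5. It shows where each feature's values start in the shared value array and which features can legally be written for a given COS ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-feature description; names are illustrative only. */
struct feat_desc {
    const char *name;
    unsigned int cos_max;   /* highest valid COS ID (register count - 1) */
    unsigned int cos_num;   /* values per COS ID: 2 for CDP, 1 for MBA */
};

/* Offset of feature `idx`'s first value in the shared value array. */
static size_t val_offset(const struct feat_desc *feats, size_t n, size_t idx)
{
    size_t i, off = 0;

    assert(idx < n);

    for ( i = 0; i < idx; i++ )
        off += feats[i].cos_num;

    return off;
}

/* Non-zero if `cos` addresses an existing COS register of the feature. */
static int cos_in_range(const struct feat_desc *f, unsigned int cos)
{
    return cos <= f->cos_max;
}
```

For a feature list of CDP (COS_MAX 5, two values) followed by MBA (COS_MAX 7, one value), the MBA throttle value sits at offset 2, and COS id 7 is in range for MBA only, which matches the value array drawn above.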