Message ID | 20210915232739.6367-2-Smita.KoralahalliChannabasappa@amd.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86/mce: Handle error simulation failures in mce-inject module |
On Wed, Sep 15, 2021 at 06:27:35PM -0500, Smita Koralahalli wrote:
> The MCA_IPID register uniquely identifies a bank's type on Scalable MCA
> (SMCA) systems. When an MCA bank is not populated, the MCA_IPID register
> will read as zero and writes to it will be ignored. Check the value of
> this register before trying to simulate the error.
>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>  arch/x86/kernel/cpu/mce/inject.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
> index 0bfc14041bbb..51ac575c4605 100644
> --- a/arch/x86/kernel/cpu/mce/inject.c
> +++ b/arch/x86/kernel/cpu/mce/inject.c
> @@ -577,6 +577,24 @@ static int inj_bank_set(void *data, u64 val)
>  	}
>
>  	m->bank = val;
> +
> +	/* Read IPID value to determine if a bank is unpopulated on the target
> +	 * CPU.
> +	 */

Kernel comments style format is:

	/*
	 * A sentence ending with a full-stop.
	 * Another sentence. ...
	 * More sentences. ...
	 */

> +	if (boot_cpu_has(X86_FEATURE_SMCA)) {

This whole thing belongs into inj_ipid_set() where you should verify
whether the bank is set when you try to set the IPID for that bank.

> +
> +		/* Check for user provided IPID value. */
> +		if (!m->ipid) {
> +			rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(val),
> +				      &m->ipid);

Oh well, one IPI per ipid write. We're doing injection so we can't be on
a production machine so who cares about IPIs there.

> +			if (!m->ipid) {
> +				pr_err("Error simulation not possible: Bank %llu unpopulated\n",

"Cannot set IPID for bank... - bank %d unpopulated\n"

Also, in all your text, use "injection" instead of "simulation" so that
there's no confusion.

Thx.
Hi Boris,

On 9/24/21 3:26 AM, Borislav Petkov wrote:
> On Wed, Sep 15, 2021 at 06:27:35PM -0500, Smita Koralahalli wrote:
>> The MCA_IPID register uniquely identifies a bank's type on Scalable MCA
>> (SMCA) systems. When an MCA bank is not populated, the MCA_IPID register
>> will read as zero and writes to it will be ignored. Check the value of
>> this register before trying to simulate the error.
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>  arch/x86/kernel/cpu/mce/inject.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
>> index 0bfc14041bbb..51ac575c4605 100644
>> --- a/arch/x86/kernel/cpu/mce/inject.c
>> +++ b/arch/x86/kernel/cpu/mce/inject.c
>> @@ -577,6 +577,24 @@ static int inj_bank_set(void *data, u64 val)
>>  	}
>>
>> +	if (boot_cpu_has(X86_FEATURE_SMCA)) {
> This whole thing belongs into inj_ipid_set() where you should verify
> whether the bank is set when you try to set the IPID for that bank.

Can you please elaborate on this? I'm not sure if I understood this right.
Should I read the ipid file to verify that the user has input proper ipid?
If ipid file reads zero then do rdmsrl_on_cpu?

Thanks,
Smita

>
> +			if (!m->ipid) {
> +				pr_err("Error simulation not possible: Bank %llu unpopulated\n",
>> +
>> +		/* Check for user provided IPID value. */
>> +		if (!m->ipid) {
>> +			rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(val),
>> +				      &m->ipid);
> Oh well, one IPI per ipid write. We're doing injection so we can't be on
> a production machine so who cares about IPIs there.
>
On Mon, Sep 27, 2021 at 02:51:56PM -0500, Smita Koralahalli Channabasappa wrote:
> Can you please elaborate on this? I'm not sure if I understood this
> right. Should I read the ipid file to verify that the user has input
> proper ipid? If ipid file reads zero then do rdmsrl_on_cpu?

No, on a write to the ipid file you should do that checking and write if
the bank is populated or fail the write otherwise. And you should put
all that code in inj_bank_set() - that's why I say "on a write to the
ipid file".

And instead of boot_cpu_has() you should use cpu_feature_enabled().

Makes sense?
On 9/27/21 3:15 PM, Borislav Petkov wrote:
> On Mon, Sep 27, 2021 at 02:51:56PM -0500, Smita Koralahalli Channabasappa wrote:
>> Can you please elaborate on this? I'm not sure if I understood this
>> right. Should I read the ipid file to verify that the user has input
>> proper ipid? If ipid file reads zero then do rdmsrl_on_cpu?
> No, on a write to the ipid file you should do that checking and write if
> the bank is populated or fail the write otherwise. And you should put
> all that code in inj_bank_set() - that's why I say "on a write to the
> ipid file".
>
> And instead of boot_cpu_has() you should use cpu_feature_enabled().
>
> Makes sense?

Yes, this makes sense to me now. But you meant to say inj_ipid_set()
instead of inj_bank_set()..?

Something like this:

-MCE_INJECT_SET(ipid)

+static int inj_ipid_set(void *data, u64 val)
+{
+	struct mce *m = (struct mce *)data;

+	if (cpu_feature_enabled(X86_FEATURE_SMCA)) {

+		rdmsrl_on_cpu(..
..
..
+		m->ipid = val;
+	..
+}

Thanks,
On Mon, Sep 27, 2021 at 04:56:17PM -0500, Smita Koralahalli Channabasappa wrote:
> Yes, this makes sense to me now. But you meant to say inj_ipid_set()
> instead of inj_bank_set()..?

Yeah, I had it correct before:

"This whole thing belongs into inj_ipid_set() where you should verify... "

>
> Something like this:
>
> -MCE_INJECT_SET(ipid)
>
> +static int inj_ipid_set(void *data, u64 val)
> +{
> +	struct mce *m = (struct mce *)data;
>
> +	if (cpu_feature_enabled(X86_FEATURE_SMCA)) {
>
> +		rdmsrl_on_cpu(..
> ..
> ..
> +		m->ipid = val;
> +	..
> +}

Yes, and return proper error codes.

Thx.
Hi Boris,

Sorry for the delayed response. When I was coding this up, I came across
a few issues, which I mention below.

On 9/24/21 3:26 AM, Borislav Petkov wrote:
> On Wed, Sep 15, 2021 at 06:27:35PM -0500, Smita Koralahalli wrote:
>> The MCA_IPID register uniquely identifies a bank's type on Scalable MCA
>> (SMCA) systems. When an MCA bank is not populated, the MCA_IPID register
>> will read as zero and writes to it will be ignored. Check the value of
>> this register before trying to simulate the error.
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>  arch/x86/kernel/cpu/mce/inject.c | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
>> index 0bfc14041bbb..51ac575c4605 100644
>> --- a/arch/x86/kernel/cpu/mce/inject.c
>> +++ b/arch/x86/kernel/cpu/mce/inject.c
>> @@ -577,6 +577,24 @@ static int inj_bank_set(void *data, u64 val)
>>  	}
>>
>>  	m->bank = val;
>> +
>> +	/* Read IPID value to determine if a bank is unpopulated on the target
>> +	 * CPU.
>> +	 */
> Kernel comments style format is:
>
> 	/*
> 	 * A sentence ending with a full-stop.
> 	 * Another sentence. ...
> 	 * More sentences. ...
> 	 */
>
>> +	if (boot_cpu_has(X86_FEATURE_SMCA)) {
> This whole thing belongs into inj_ipid_set() where you should verify
> whether the bank is set when you try to set the IPID for that bank.

I do not have the bank number in order to look up the IPID for that bank.
I couldn't know the bank number because mce-inject files are synchronized
in a way that once the bank number is written the injection starts.
Can you please suggest what needs to be done here?

Also, the IPID register is read only from the OS, hence the user provided
IPID values could be useful for "sw" error injection types. For "hw" error
injection types we need to read from the registers to determine the IPID
value.

Should there be two cases where on a "sw" injection use the user provided
IPID value whereas on "hw" injection read from registers?

I'm pasting the code snippet after rework on the comments.

static int inj_ipid_set(void *data, u64 val)
{
	struct mce *m = (struct mce *)data;

	if (cpu_feature_enabled(X86_FEATURE_SMCA)) {
		if (val && inj_type == SW_INJ)
			m->ipid = val;
		else {
			rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(?),
				      &m->ipid); // Requires bank number here.

			if (!m->ipid) {
				pr_err("Cannot set IPID - unpopulated bank\n");
				return -ENODEV;
			}
		}
	}

	return 0;
}

Please let me know what you think.

Thanks,

>> +
>> +		/* Check for user provided IPID value. */
>> +		if (!m->ipid) {
>> +			rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(val),
>> +				      &m->ipid);
> Oh well, one IPI per ipid write. We're doing injection so we can't be on
> a production machine so who cares about IPIs there.
>
>> +			if (!m->ipid) {
>> +				pr_err("Error simulation not possible: Bank %llu unpopulated\n",
> "Cannot set IPID for bank... - bank %d unpopulated\n"
>
> Also, in all your text, use "injection" instead of "simulation" so that
> there's no confusion.
>
> Thx.
>
On Mon, Oct 11, 2021 at 04:12:14PM -0500, Koralahalli Channabasappa, Smita wrote:
> I do not have the bank number in order to look up the IPID for that bank.
> I couldn't know the bank number because mce-inject files are synchronized
> in a way that once the bank number is written the injection starts.
> Can you please suggest what needs to be done here?
>
> Also, the IPID register is read only from the OS, hence the user provided
> IPID values could be useful for "sw" error injection types. For "hw" error
> injection types we need to read from the registers to determine the IPID
> value.
>
> Should there be two cases where on a "sw" injection use the user provided
> IPID value whereas on "hw" injection read from registers?

Right, that's a good point. So the way I see it is, we need to decide
what is allowed for sw injection and what for hw injection, wrt to IPID
value.

I think for sw injection, we probably should say that since this is
sw only and its purpose is to test the code only, there should not be
any limitations imposed by the underlying machine. Like using the bank
number, for example.

So what you do now for sw injection:

	if (val && inj_type == SW_INJ)
		m->ipid = val;

should be good enough. User simply sets some IPID value and that value
will be used for the bank which is written when injecting.

Now, for hw injection, you have two cases:

1. The bank is unpopulated so setting the IPID there doesn't make any sense.

2. The bank *is* populated and the respective IPID MSR has a value
   describing what that bank is.

And in that case, does it even make sense to set the IPID? I don't think
so because that IP block's type - aka IPID - has been set already by
hardware/firmware.

So the way I see it, it makes no sense whatsoever to set the IPID of a
bank during hw injection.

Right?
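
The sw/hw split described in this message can be sketched in userspace as follows. This is a minimal mock and not the kernel code: `struct mce_mock`, `enum inj_type_mock`, and `ipid_set_mock()` are hypothetical stand-ins for the mce-inject state, and returning -EINVAL for a hw-injection write is an assumption, since the thread only asks for "proper error codes".

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the mce-inject state; not the kernel definitions. */
enum inj_type_mock { SW_INJ_MOCK, HW_INJ_MOCK };

struct mce_mock {
	uint64_t ipid;
};

/*
 * Accept a user-provided IPID only for software injection. For hardware
 * injection the IPID is already fixed by hardware/firmware, so the write
 * is rejected and the stored value is left untouched. The -EINVAL return
 * value is an assumption of this sketch.
 */
static int ipid_set_mock(struct mce_mock *m, enum inj_type_mock type,
			 uint64_t val)
{
	if (type != SW_INJ_MOCK)
		return -EINVAL;

	m->ipid = val;
	return 0;
}
```

With this shape, a sw injection simply carries whatever IPID the user wrote, while a hw injection never overrides what the hardware reports.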
On 10/14/21 1:22 PM, Borislav Petkov wrote:
> On Mon, Oct 11, 2021 at 04:12:14PM -0500, Koralahalli Channabasappa, Smita wrote:
>> I do not have the bank number in order to look up the IPID for that bank.
>> I couldn't know the bank number because mce-inject files are synchronized
>> in a way that once the bank number is written the injection starts.
>> Can you please suggest what needs to be done here?
>>
>> Also, the IPID register is read only from the OS, hence the user provided
>> IPID values could be useful for "sw" error injection types. For "hw" error
>> injection types we need to read from the registers to determine the IPID
>> value.
>>
>> Should there be two cases where on a "sw" injection use the user provided
>> IPID value whereas on "hw" injection read from registers?
> Right, that's a good point. So the way I see it is, we need to decide
> what is allowed for sw injection and what for hw injection, wrt to IPID
> value.
>
> I think for sw injection, we probably should say that since this is
> sw only and its purpose is to test the code only, there should not be
> any limitations imposed by the underlying machine. Like using the bank
> number, for example.
>
> So what you do now for sw injection:
>
> 	if (val && inj_type == SW_INJ)
> 		m->ipid = val;
>
> should be good enough. User simply sets some IPID value and that value
> will be used for the bank which is written when injecting.
>
> Now, for hw injection, you have two cases:
>
> 1. The bank is unpopulated so setting the IPID there doesn't make any sense.
>
> 2. The bank *is* populated and the respective IPID MSR has a value
>    describing what that bank is.
>
> And in that case, does it even make sense to set the IPID? I don't think
> so because that IP block's type - aka IPID - has been set already by
> hardware/firmware.
>
> So the way I see it, it makes no sense whatsoever to set the IPID of a
> bank during hw injection.
>
> Right?

Yes, I agree.
inj_ipid_set() can be used to serve the purpose of setting user provided
IPID on a sw injection only.

My concern was, we need to determine whether the bank is unpopulated or
populated before trying to inject the errors on a hw injection, for which
we need to read the IPID MSR of that bank. We cannot do that inside
inj_ipid_set() as we do not know the bank number until inj_bank_set()
executes, which is called after inj_ipid_set(). mce-inject files are
synchronized in a way that once the bank number is written in
inj_bank_set(), injection starts.

So this snippet of code:

	if (inj_type != SW_INJ) {
		rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(val), &m->ipid);

		if (!m->ipid) {
			pr_err("Error injection not possible - bank %llu unpopulated\n", val);
			return -ENODEV;
		}
	}

should be retained inside inj_bank_set() ?

And inj_ipid_set() should just set m->ipid = val on a SW_INJ as you mentioned
above?

Thanks,
On Thu, Oct 14, 2021 at 03:26:13PM -0500, Koralahalli Channabasappa, Smita wrote:
> My concern was, we need to determine whether the bank is unpopulated or
> populated before trying to inject the errors on a hw injection, for which
> we need to read the IPID MSR of that bank.

Ah, that. Look at the smca_banks[] array in .../mce/amd.c and how
smca_configure() prepares all banks in there. You could use that array
to query which SMCA bank on which CPU is initialized, before injecting
into it.

> should be retained inside inj_bank_set() ?

And yes, I guess you'll have to do it there because then you know which
bank and which CPU the hw injection is supposed to happen on.

> And inj_ipid_set() should just set m->ipid = val on a SW_INJ as you mentioned
> above?

Yap.

Thx.
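
The check agreed on here, done at bank-write time because only then are both the CPU and the bank known, might look roughly like the userspace sketch below. Everything in it is hypothetical: `bank_populated_mock[][]` merely stands in for whatever per-CPU bank information the kernel keeps (the real layout of smca_banks[] is not shown in this thread), and `bank_check_mock()`, the table sizes, and the -EINVAL range check are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MOCK  2
#define NR_BANKS_MOCK 4

/*
 * Hypothetical per-CPU table of populated banks. It stands in for the
 * information the kernel collects while configuring SMCA banks; the real
 * data structure is not reproduced here.
 */
static const bool bank_populated_mock[NR_CPUS_MOCK][NR_BANKS_MOCK] = {
	{ true, true,  false, true },
	{ true, false, false, true },
};

enum inj_type_mock { SW_INJ_MOCK, HW_INJ_MOCK };

/*
 * Validate a bank write before injecting. Out-of-range CPU or bank numbers
 * are rejected with -EINVAL (an assumption of this sketch). Software
 * injection is otherwise unrestricted; hardware injection fails with
 * -ENODEV when the target bank is unpopulated on the target CPU.
 */
static int bank_check_mock(enum inj_type_mock type, unsigned int cpu,
			   uint64_t bank)
{
	if (cpu >= NR_CPUS_MOCK || bank >= NR_BANKS_MOCK)
		return -EINVAL;

	if (type == SW_INJ_MOCK)
		return 0;

	return bank_populated_mock[cpu][bank] ? 0 : -ENODEV;
}
```

A table lookup like this avoids reading the IPID MSR on the target CPU for every bank write, which is the appeal of querying already-collected bank state instead.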
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 0bfc14041bbb..51ac575c4605 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -577,6 +577,24 @@ static int inj_bank_set(void *data, u64 val)
 	}
 
 	m->bank = val;
+
+	/* Read IPID value to determine if a bank is unpopulated on the target
+	 * CPU.
+	 */
+	if (boot_cpu_has(X86_FEATURE_SMCA)) {
+
+		/* Check for user provided IPID value. */
+		if (!m->ipid) {
+			rdmsrl_on_cpu(m->extcpu, MSR_AMD64_SMCA_MCx_IPID(val),
+				      &m->ipid);
+			if (!m->ipid) {
+				pr_err("Error simulation not possible: Bank %llu unpopulated\n",
+				       val);
+				return -ENODEV;
+			}
+		}
+	}
+
 	do_inject();
 
 	/* Reset injection struct */
The MCA_IPID register uniquely identifies a bank's type on Scalable MCA
(SMCA) systems. When an MCA bank is not populated, the MCA_IPID register
will read as zero and writes to it will be ignored. Check the value of
this register before trying to simulate the error.

Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
 arch/x86/kernel/cpu/mce/inject.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)