Message ID | 20241028081142.66028-2-xueshuai@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [v15,1/3] ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered | expand |
On Mon, Oct 28, 2024 at 04:11:40PM +0800, Shuai Xue wrote: > Synchronous error was detected as a result of user-space process accessing > a 2-bit uncorrected error. The CPU will take a synchronous error exception > such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a > memory_failure() work which poisons the related page, unmaps the page, and > then sends a SIGBUS to the process, so that a system wide panic can be > avoided. > > However, no memory_failure() work will be queued when abnormal synchronous > errors occur. These errors can include situations such as invalid PA, > unexpected severity, no memory failure config support, invalid GUID > section, etc. In such case, the user-space process will trigger SEA again. > This loop can potentially exceed the platform firmware threshold or even > trigger a kernel hard lockup, leading to a system reboot. > > Fix it by performing a force kill if no memory_failure() work is queued > for synchronous errors. > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > drivers/acpi/apei/ghes.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index ada93cfde9ba..f2ee28c44d7a 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -801,6 +801,16 @@ static bool ghes_do_proc(struct ghes *ghes, > } > } > > + /* > + * If no memory failure work is queued for abnormal synchronous > + * errors, do a force kill. > + */ > + if (sync && !queued) { > + pr_err("%s:%d: hardware memory corruption (SIGBUS)\n", > + current->comm, task_pid_nr(current)); I think it would help to include the GHES_PFX to indicate where this message is coming from. The pr_fmt() macro could also be introduced instead. Also, you may want to include the HW_ERR prefix. Not all kernel messages related to hardware errors have this prefix today. But maybe that should be changed so there is more consistent messaging. Thanks, Yazen
在 2024/10/30 04:48, Yazen Ghannam 写道: > On Mon, Oct 28, 2024 at 04:11:40PM +0800, Shuai Xue wrote: >> Synchronous error was detected as a result of user-space process accessing >> a 2-bit uncorrected error. The CPU will take a synchronous error exception >> such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a >> memory_failure() work which poisons the related page, unmaps the page, and >> then sends a SIGBUS to the process, so that a system wide panic can be >> avoided. >> >> However, no memory_failure() work will be queued when abnormal synchronous >> errors occur. These errors can include situations such as invalid PA, >> unexpected severity, no memory failure config support, invalid GUID >> section, etc. In such case, the user-space process will trigger SEA again. >> This loop can potentially exceed the platform firmware threshold or even >> trigger a kernel hard lockup, leading to a system reboot. >> >> Fix it by performing a force kill if no memory_failure() work is queued >> for synchronous errors. >> >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> --- >> drivers/acpi/apei/ghes.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c >> index ada93cfde9ba..f2ee28c44d7a 100644 >> --- a/drivers/acpi/apei/ghes.c >> +++ b/drivers/acpi/apei/ghes.c >> @@ -801,6 +801,16 @@ static bool ghes_do_proc(struct ghes *ghes, >> } >> } >> >> + /* >> + * If no memory failure work is queued for abnormal synchronous >> + * errors, do a force kill. >> + */ >> + if (sync && !queued) { >> + pr_err("%s:%d: hardware memory corruption (SIGBUS)\n", >> + current->comm, task_pid_nr(current)); > > I think it would help to include the GHES_PFX to indicate where this > message is coming from. The pr_fmt() macro could also be introduced > instead. Yes, GHES_PFX is a effective prefix and will be consistent to other message in GHES driver. Will add it in next version. What do you mean about pr_fmt()? > > Also, you may want to include the HW_ERR prefix. Not all kernel messages > related to hardware errors have this prefix today. But maybe that should > be changed so there is more consistent messaging. > Do we really need a HW_ERR prefix? The other case which use HW_ERR prefix are for hardware registers. The messages which send SIGBUS does not include HW_ERR, e.g. in kill_proc(), kill_procs(). pr_err("%#lx: Sending SIGBUS to %s:%d due to hardware memory corruption\n",... pr_err("%#lx: forcibly killing %s:%d because of failure to unmap corrupted page\n",... > Thanks, > Yazen Thanks for valuable comments. Best Regards, Shuai
On Wed, Oct 30, 2024 at 09:54:00AM +0800, Shuai Xue wrote: > > > 在 2024/10/30 04:48, Yazen Ghannam 写道: > > On Mon, Oct 28, 2024 at 04:11:40PM +0800, Shuai Xue wrote: > > > Synchronous error was detected as a result of user-space process accessing > > > a 2-bit uncorrected error. The CPU will take a synchronous error exception > > > such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a > > > memory_failure() work which poisons the related page, unmaps the page, and > > > then sends a SIGBUS to the process, so that a system wide panic can be > > > avoided. > > > > > > However, no memory_failure() work will be queued when abnormal synchronous > > > errors occur. These errors can include situations such as invalid PA, > > > unexpected severity, no memory failure config support, invalid GUID > > > section, etc. In such case, the user-space process will trigger SEA again. > > > This loop can potentially exceed the platform firmware threshold or even > > > trigger a kernel hard lockup, leading to a system reboot. > > > > > > Fix it by performing a force kill if no memory_failure() work is queued > > > for synchronous errors. > > > > > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > > > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > --- > > > drivers/acpi/apei/ghes.c | 10 ++++++++++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > > > index ada93cfde9ba..f2ee28c44d7a 100644 > > > --- a/drivers/acpi/apei/ghes.c > > > +++ b/drivers/acpi/apei/ghes.c > > > @@ -801,6 +801,16 @@ static bool ghes_do_proc(struct ghes *ghes, > > > } > > > } > > > + /* > > > + * If no memory failure work is queued for abnormal synchronous > > > + * errors, do a force kill. > > > + */ > > > + if (sync && !queued) { > > > + pr_err("%s:%d: hardware memory corruption (SIGBUS)\n", > > > + current->comm, task_pid_nr(current)); > > > > I think it would help to include the GHES_PFX to indicate where this > > message is coming from. The pr_fmt() macro could also be introduced > > instead. > > Yes, GHES_PFX is a effective prefix and will be consistent to other message > in GHES driver. Will add it in next version. > > What do you mean about pr_fmt()? This can be used to set a prefix for an entire section of code. The pr_*() macros will pick it up without needing to include a prefix for each call. This is described in "Documentation/core-api/printk-basics.rst". > > > > > Also, you may want to include the HW_ERR prefix. Not all kernel messages > > related to hardware errors have this prefix today. But maybe that should > > be changed so there is more consistent messaging. > > > > Do we really need a HW_ERR prefix? The other case which use HW_ERR prefix > are for hardware registers. The messages which send SIGBUS does > not include HW_ERR, e.g. in kill_proc(), kill_procs(). > > pr_err("%#lx: Sending SIGBUS to %s:%d due to hardware memory > corruption\n",... > pr_err("%#lx: forcibly killing %s:%d because of failure to unmap > corrupted page\n",... > > Correct, HW_ERR isn't used there. My interpretation is that it can be used whenever an event is due to a hardware error (real or simulated). This is a very clear message to a user. It may be redundant in some cases (like here where the message already says "hardware memory corruption"). But I think it would be go to use it anyway for consistency. I think other relevant places in the kernel should also be updated. But that is beyond this patch, and I don't expect it to be done here and now. Thanks, Yazen
在 2024/10/30 22:08, Yazen Ghannam 写道: > On Wed, Oct 30, 2024 at 09:54:00AM +0800, Shuai Xue wrote: >> >> >> 在 2024/10/30 04:48, Yazen Ghannam 写道: >>> On Mon, Oct 28, 2024 at 04:11:40PM +0800, Shuai Xue wrote: >>>> Synchronous error was detected as a result of user-space process accessing >>>> a 2-bit uncorrected error. The CPU will take a synchronous error exception >>>> such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a >>>> memory_failure() work which poisons the related page, unmaps the page, and >>>> then sends a SIGBUS to the process, so that a system wide panic can be >>>> avoided. >>>> >>>> However, no memory_failure() work will be queued when abnormal synchronous >>>> errors occur. These errors can include situations such as invalid PA, >>>> unexpected severity, no memory failure config support, invalid GUID >>>> section, etc. In such case, the user-space process will trigger SEA again. >>>> This loop can potentially exceed the platform firmware threshold or even >>>> trigger a kernel hard lockup, leading to a system reboot. >>>> >>>> Fix it by performing a force kill if no memory_failure() work is queued >>>> for synchronous errors. >>>> >>>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >>>> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> >>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >>>> --- >>>> drivers/acpi/apei/ghes.c | 10 ++++++++++ >>>> 1 file changed, 10 insertions(+) >>>> >>>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c >>>> index ada93cfde9ba..f2ee28c44d7a 100644 >>>> --- a/drivers/acpi/apei/ghes.c >>>> +++ b/drivers/acpi/apei/ghes.c >>>> @@ -801,6 +801,16 @@ static bool ghes_do_proc(struct ghes *ghes, >>>> } >>>> } >>>> + /* >>>> + * If no memory failure work is queued for abnormal synchronous >>>> + * errors, do a force kill. >>>> + */ >>>> + if (sync && !queued) { >>>> + pr_err("%s:%d: hardware memory corruption (SIGBUS)\n", >>>> + current->comm, task_pid_nr(current)); >>> >>> I think it would help to include the GHES_PFX to indicate where this >>> message is coming from. The pr_fmt() macro could also be introduced >>> instead. >> >> Yes, GHES_PFX is a effective prefix and will be consistent to other message >> in GHES driver. Will add it in next version. >> >> What do you mean about pr_fmt()? > > This can be used to set a prefix for an entire section of code. The > pr_*() macros will pick it up without needing to include a prefix for > each call. > > This is described in "Documentation/core-api/printk-basics.rst". Got you point, a pr_fmt is much more convenient, but it is beyond this patch. I would like to send a patch to add pr_fmt and then replace each call after this patch merged. > >> >>> >>> Also, you may want to include the HW_ERR prefix. Not all kernel messages >>> related to hardware errors have this prefix today. But maybe that should >>> be changed so there is more consistent messaging. >>> >> >> Do we really need a HW_ERR prefix? The other case which use HW_ERR prefix >> are for hardware registers. The messages which send SIGBUS does >> not include HW_ERR, e.g. in kill_proc(), kill_procs(). >> >> pr_err("%#lx: Sending SIGBUS to %s:%d due to hardware memory >> corruption\n",... >> pr_err("%#lx: forcibly killing %s:%d because of failure to unmap >> corrupted page\n",... >> >> > > Correct, HW_ERR isn't used there. My interpretation is that it can be > used whenever an event is due to a hardware error (real or simulated). > This is a very clear message to a user. > > It may be redundant in some cases (like here where the message already > says "hardware memory corruption"). But I think it would be go to use it > anyway for consistency. > > I think other relevant places in the kernel should also be updated. But > that is beyond this patch, and I don't expect it to be done here and > now. Ok, I will add the HW_ERR in next version, if no one else objects. > > Thanks, > Yazen Thank you. Best Regards, Shuai
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index ada93cfde9ba..f2ee28c44d7a 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -801,6 +801,16 @@ static bool ghes_do_proc(struct ghes *ghes, } } + /* + * If no memory failure work is queued for abnormal synchronous + * errors, do a force kill. + */ + if (sync && !queued) { + pr_err("%s:%d: hardware memory corruption (SIGBUS)\n", + current->comm, task_pid_nr(current)); + force_sig(SIGBUS); + } + return queued; }