Message ID | 20200228174817.74278-1-james.morse@arm.com (mailing list archive) |
---|---|
Headers | show |
Series | ACPI / APEI: Kick the memory_failure() queue for synchronous errors | expand |
Hello James, I think my one comment on patch 2 is valid, right? But for this series: Tested-by: Tyler Baicar <baicar@os.amperecomputing.com> Thanks, Tyler On Fri, Feb 28, 2020 at 12:48 PM James Morse <james.morse@arm.com> wrote: > > Hello! > > These are the remaining patches from the SDEI series[0] that fix > a race between memory_failure() and user-space re-triggering the error > in ghes.c. > > > ghes_handle_memory_failure() calls memory_failure_queue() from > IRQ context to schedule memory_failure()s work as it needs to sleep. > Once the GHES machinery returns from the IRQ, it may return to user-space > before memory_failure() runs. > > If the error that kicked all this off is specific to user-space, e.g. a > load from corrupted memory, we may find ourselves taking the error > again. If the user-space task is scheduled out, and memory_failure() runs, > the same user-space task may be scheduled in on another CPU, which could > also take the same error. > > These lead to exaggerated error counters, which may cause some threshold > to be reached early. > > This can happen with any error that causes a Synchronous External Abort > on arm64. I can't see why the same wouldn't happen with a machine-check > handled firmware first on x86. > > > This series adds a memory_failure_queue_kick() helper to > memory-failure.c, and calls it as task-work before returning to > user-space. > > > Currently arm64 papers over this problem by ignoring ghes_notify_sea()'s > return code as it knows there is still work to do. arm64 generates its > own signal to user-space, which means the first task to discover an > error will always be killed, even if the error was later handled. > (which is no improvement on the no-RAS behaviour) > > As a final piece, arm64 can try to process the irq work queued by > ghes_notify_sea() while its still in the external abort handler. A succesfull > return value here now means the memory_failure() work will be done before we > return to user-space, we no longer need to generate our own signal. > This lets the original task survive the error if memory_failure() can > recover the corrupted memory. > > Based on v5.6-rc2. I'm afraid it touches three different trees. > $subject says ACPI as that is where the bulk of the diffstat is. > > This series may conflict in arm64 with a series from Mark Rutland to > cleanup the daif/PMR toggling. > > > This would be v9 of these patches, but after a year I figure I should > start the numbering again. I've dropped any collected tags. > > Known issues: > * arm64's apei_claim_sea() may unwittingly re-enable debug if it takes > an external-abort from debug context. Patch 3 makes this worse > instead of fixing it. The fix would make use of helpers from Mark R's > series. > > > Thanks, > > James > > > [0] https://lore.kernel.org/linux-arm-kernel/20190129184902.102850-1-james.morse@arm.com/ > [1] https://lore.kernel.org/linux-acpi/1506516620-20033-3-git-send-email-xiexiuqi@huawei.com/ > > James Morse (3): > mm/memory-failure: Add memory_failure_queue_kick() > ACPI / APEI: Kick the memory_failure() queue for synchronous errors > arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work > > arch/arm64/kernel/acpi.c | 25 +++++++++++++++ > arch/arm64/mm/fault.c | 12 ++++--- > drivers/acpi/apei/ghes.c | 68 +++++++++++++++++++++++++++++++++------- > include/acpi/ghes.h | 3 ++ > include/linux/mm.h | 1 + > mm/memory-failure.c | 15 ++++++++- > 6 files changed, 107 insertions(+), 17 deletions(-) > > -- > 2.24.1