Message ID | 20231123003513.24292-4-ankita@nvidia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: Implement ECC handling for pfn with no struct page | expand |
On Thu, Nov 23, 2023 at 06:05:11AM +0530, ankita@nvidia.com wrote: > - pfn = PHYS_PFN(physical_addr); > - if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { > - pr_warn_ratelimited(FW_WARN GHES_PFX > - "Invalid address in generic error data: %#llx\n", > - physical_addr); > - return false; > - } You don't just remove a pfn valid test just because your weird device can't stomach it - you extend it, like 3ad6fd77a2d6 ("x86/sgx: Add check for SGX pages to ghes_do_memory_failure()") did, for example.
On Sun, Dec 03, 2023 at 12:23:19AM +0100, Borislav Petkov wrote: > On Thu, Nov 23, 2023 at 06:05:11AM +0530, ankita@nvidia.com wrote: > > - pfn = PHYS_PFN(physical_addr); > > - if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { > > - pr_warn_ratelimited(FW_WARN GHES_PFX > > - "Invalid address in generic error data: %#llx\n", > > - physical_addr); > > - return false; > > - } > > You don't just remove a pfn valid test just because your weird device > can't stomach it - you extend it, like It wasn't removed. patch 1 moved it to memory_failure() where it makes a lot more sense. Jason
On Mon, Dec 04, 2023 at 10:36:50AM -0400, Jason Gunthorpe wrote: > It wasn't removed. patch 1 moved it to memory_failure() where it makes > a lot more sense. Why is this a separate patch then?
>> It wasn't removed. patch 1 moved it to memory_failure() where it makes >> a lot more sense. > > Why is this a separate patch then? This was done to keep ghes code separate from the memory failure code. I can merge them if that is preferable.
On Mon, Dec 04, 2023 at 03:54:52PM +0000, Ankit Agrawal wrote: > >> It wasn't removed. patch 1 moved it to memory_failure() where it makes > >> a lot more sense. > > > > Why is this a separate patch then? > > This was done to keep ghes code separate from the memory failure code. > I can merge them if that is preferable. A single patch to move just this code could be a good idea Jason
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 63ad0541db38..0ca6ab9fccbe 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -471,20 +471,10 @@ static void ghes_kick_task_work(struct callback_head *head) static bool ghes_do_memory_failure(u64 physical_addr, int flags) { - unsigned long pfn; - if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE)) return false; - pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { - pr_warn_ratelimited(FW_WARN GHES_PFX - "Invalid address in generic error data: %#llx\n", - physical_addr); - return false; - } - - memory_failure_queue(pfn, flags); + memory_failure_queue(PHYS_PFN(physical_addr), flags); return true; }