Message ID | 20250128001749.3132656-1-oliver.upton@linux.dev (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | ACPI: GTDT: Relax sanity checking on Platform Timers array count | expand |
On Tue, 28 Jan 2025 00:17:49 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > Perhaps unsurprisingly there are some platforms where the GTDT isn't > quite right and the Platforms Timer array overflows the length of the > overall table. Colour me shocked! Broken ACPI tables? [insert REM song here]. > > While the recently-added sanity checking isn't wrong, it makes it > impossible to boot the kernel on offending platforms. Try to hobble > along and limit the Platform Timer count to the bounds of the table. > > Cc: Marc Zyngier <maz@kernel.org> > Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> > Cc: Zheng Zengkai <zhengzengkai@huawei.com> > Cc: stable@vger.kernel.org > Fixes: 263e22d6bd1f ("ACPI: GTDT: Tighten the check for the array of platform timer structures") > Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Marc Zyngier <maz@kernel.org> M.
On Tue, Jan 28, 2025 at 12:17:49AM +0000, Oliver Upton wrote: > Perhaps unsurprisingly there are some platforms where the GTDT isn't https://lore.kernel.org/lkml/Zw6b3V5Mk2tIGmy5@lpieralisi An accident waiting to happen :) > quite right and the Platforms Timer array overflows the length of the > overall table. > > While the recently-added sanity checking isn't wrong, it makes it > impossible to boot the kernel on offending platforms. Try to hobble > along and limit the Platform Timer count to the bounds of the table. > > Cc: Marc Zyngier <maz@kernel.org> > Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> > Cc: Zheng Zengkai <zhengzengkai@huawei.com> > Cc: stable@vger.kernel.org > Fixes: 263e22d6bd1f ("ACPI: GTDT: Tighten the check for the array of platform timer structures") > Signed-off-by: Oliver Upton <oliver.upton@linux.dev> > --- > drivers/acpi/arm64/gtdt.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c > index 3561553eff8b..70f8290b659d 100644 > --- a/drivers/acpi/arm64/gtdt.c > +++ b/drivers/acpi/arm64/gtdt.c > @@ -163,7 +163,7 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, > { > void *platform_timer; > struct acpi_table_gtdt *gtdt; > - int cnt = 0; > + u32 cnt = 0; > > gtdt = container_of(table, struct acpi_table_gtdt, header); > acpi_gtdt_desc.gtdt = gtdt; > @@ -188,13 +188,17 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, > cnt++; > > if (cnt != gtdt->platform_timer_count) { > + cnt = min(cnt, gtdt->platform_timer_count); Thank you for reporting this. There is something I need to understand. What's wrong cnt (because platform_timer_valid() fails for some reason on some entries whereas before the commit we are fixing was applied we *were* parsing those entries) or gtdt->platform_timer_count ? I *guess* the issue is the following: gtdt->platform_timer_count reports the number of GT blocks in the GTDT not including Arm generic watchdogs, whereas cnt counts both structure types (and that's what gtdt->platform_timer_count should report too if it was correct). I am asking just to understand if platform_timer_valid() forced skipping some entries that we were parsing before the commit we are fixing was applied I doubt it but it is worth checking. > + pr_err(FW_BUG "limiting Platform Timer count to %d\n", cnt); > + }` > + > + if (!cnt) { > acpi_gtdt_desc.platform_timer = NULL; > - pr_err(FW_BUG "invalid timer data.\n"); > - return -EINVAL; > + return 0; > } > > if (platform_timer_count) > - *platform_timer_count = gtdt->platform_timer_count; > + *platform_timer_count = cnt; I think this should be fine as things stand (but see above). It is used in: gtdt_sbsa_gwdt_init() - just to check if there are platform timers entries arch_timer_mem_acpi_init() - to create a temporary array to init arch mem timer entries (the array is oversized because it includes watchdog entries in the count) In both cases taking the min(cnt, gtdt->platform_timer_count); should work AFAICS (hard to grok though, we - as in ACPI maintainers - need to clean this up). Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> > > return 0; > } > > base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04 > -- > 2.48.1.262.g85cc9f2d1e-goog >
Hi Lorenzo, On Tue, Jan 28, 2025 at 11:50:55AM +0100, Lorenzo Pieralisi wrote: > > @@ -188,13 +188,17 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, > > cnt++; > > > > if (cnt != gtdt->platform_timer_count) { > > + cnt = min(cnt, gtdt->platform_timer_count); > > Thank you for reporting this. > > There is something I need to understand. > > What's wrong cnt (because platform_timer_valid() fails for some > reason on some entries whereas before the commit we > are fixing was applied we *were* parsing those entries) or > gtdt->platform_timer_count ? > > I *guess* the issue is the following: > > gtdt->platform_timer_count reports the number of GT blocks in the > GTDT not including Arm generic watchdogs, whereas cnt counts both > structure types (and that's what gtdt->platform_timer_count should > report too if it was correct). I've seen two different issues so far: - In one case, the offset of the platform timer array is entirely beyond the GTDT - In another, the GTDT has a timer array of length 2, but only the first structure falls within the length of the overall GTDT Since cnt is the result of doing a bounds-checked walk of the platform timer array, both of these issues cause the sanity check to fail. > > if (platform_timer_count) > > - *platform_timer_count = gtdt->platform_timer_count; > > + *platform_timer_count = cnt; > > I think this should be fine as things stand (but see above). > > It is used in: > > gtdt_sbsa_gwdt_init() - just to check if there are platform timers entries > > arch_timer_mem_acpi_init() - to create a temporary array to init arch mem timer > entries (the array is oversized because it > includes watchdog entries in the count) > > In both cases taking the > > min(cnt, gtdt->platform_timer_count); > > should work AFAICS It was probably worth noting in the changelog that I did this to gracefully handle the reverse of this issue where we could dereference platform timer entries that are within the bounds of the GTDT but exceed gtdt->platform_timer_count. > (hard to grok though, we - as in ACPI maintainers - > need to clean this up). Heh, thought this smelled a little ripe ;-) Went for the minimal fix first.
On Tue, Jan 28, 2025 at 12:42:24PM -0800, Oliver Upton wrote: > Hi Lorenzo, > > On Tue, Jan 28, 2025 at 11:50:55AM +0100, Lorenzo Pieralisi wrote: > > > @@ -188,13 +188,17 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, > > > cnt++; > > > > > > if (cnt != gtdt->platform_timer_count) { > > > + cnt = min(cnt, gtdt->platform_timer_count); > > > > Thank you for reporting this. > > > > There is something I need to understand. > > > > What's wrong cnt (because platform_timer_valid() fails for some > > reason on some entries whereas before the commit we > > are fixing was applied we *were* parsing those entries) or > > gtdt->platform_timer_count ? > > > > I *guess* the issue is the following: > > > > gtdt->platform_timer_count reports the number of GT blocks in the > > GTDT not including Arm generic watchdogs, whereas cnt counts both > > structure types (and that's what gtdt->platform_timer_count should > > report too if it was correct). > > I've seen two different issues so far: > > - In one case, the offset of the platform timer array is entirely > beyond the GTDT And we were parsing it before the commit you are fixing was applied because we weren't validating the first entry ? That's how the loop was working before: #define for_each_platform_timer(_g) for (_g = acpi_gtdt_desc.platform_timer; _g; _g = next_platform_timer(_g)) We were parsing the first entry and now we don't anymore (rightly so), even with this fix applied, correct ? > - In another, the GTDT has a timer array of length 2, but only the > first structure falls within the length of the overall GTDT Right. > Since cnt is the result of doing a bounds-checked walk of the platform > timer array, both of these issues cause the sanity check to fail. Yep. > > > if (platform_timer_count) > > > - *platform_timer_count = gtdt->platform_timer_count; > > > + *platform_timer_count = cnt; > > > > I think this should be fine as things stand (but see above). > > > > It is used in: > > > > gtdt_sbsa_gwdt_init() - just to check if there are platform timers entries > > > > arch_timer_mem_acpi_init() - to create a temporary array to init arch mem timer > > entries (the array is oversized because it > > includes watchdog entries in the count) > > > > In both cases taking the > > > > min(cnt, gtdt->platform_timer_count); > > > > should work AFAICS > > It was probably worth noting in the changelog that I did this to > gracefully handle the reverse of this issue where we could dereference > platform timer entries that are within the bounds of the GTDT but exceed > gtdt->platform_timer_count. Yes, that's strike three, where platform_timer_count is borked, I understand you have not hit this one though but it seems right to fix it the way you did. I am just trying to understand given that we are sending the fix to stable whether this fix can affect other broken GTDTs (or better, whether other borked FW can affect this change) but I hope not, it should not. > > (hard to grok though, we - as in ACPI maintainers - > > need to clean this up). > > Heh, thought this smelled a little ripe ;-) Went for the minimal fix > first. I am sorry you have to deal with this, the kernel is not there to validate the firmware but on the other hand when it doesn't that's the outcome. Apologies. Lorenzo
diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c index 3561553eff8b..70f8290b659d 100644 --- a/drivers/acpi/arm64/gtdt.c +++ b/drivers/acpi/arm64/gtdt.c @@ -163,7 +163,7 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, { void *platform_timer; struct acpi_table_gtdt *gtdt; - int cnt = 0; + u32 cnt = 0; gtdt = container_of(table, struct acpi_table_gtdt, header); acpi_gtdt_desc.gtdt = gtdt; @@ -188,13 +188,17 @@ int __init acpi_gtdt_init(struct acpi_table_header *table, cnt++; if (cnt != gtdt->platform_timer_count) { + cnt = min(cnt, gtdt->platform_timer_count); + pr_err(FW_BUG "limiting Platform Timer count to %d\n", cnt); + } + + if (!cnt) { acpi_gtdt_desc.platform_timer = NULL; - pr_err(FW_BUG "invalid timer data.\n"); - return -EINVAL; + return 0; } if (platform_timer_count) - *platform_timer_count = gtdt->platform_timer_count; + *platform_timer_count = cnt; return 0; }
Perhaps unsurprisingly there are some platforms where the GTDT isn't quite right and the Platforms Timer array overflows the length of the overall table. While the recently-added sanity checking isn't wrong, it makes it impossible to boot the kernel on offending platforms. Try to hobble along and limit the Platform Timer count to the bounds of the table. Cc: Marc Zyngier <maz@kernel.org> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Zheng Zengkai <zhengzengkai@huawei.com> Cc: stable@vger.kernel.org Fixes: 263e22d6bd1f ("ACPI: GTDT: Tighten the check for the array of platform timer structures") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> --- drivers/acpi/arm64/gtdt.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04