Message ID | 20240913080053.36636-1-mikisabate@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9510c5b0db36f1727ffd1204146ee8f68bb88035 |
Series | riscv: Prevent a bad reference count on CPU nodes |
On Fri, Sep 13, 2024 at 10:00:52AM +0200, Miquel Sabaté Solà wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just past the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>

LGTM,

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
>

Right, it is not a fix per se and hence not stable material, as ACPI is not
accessing the node pointer.
Hi Miquel,

On Fri, Sep 13, 2024 at 4:02 PM Miquel Sabaté Solà <mikisabate@gmail.com> wrote:
>
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just past the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>
> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
>
>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index d6c108c50cba..d32dfdba083e 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>  {
>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>  	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> -	struct device_node *np = of_cpu_device_node_get(cpu);
> -	struct device_node *prev = NULL;
> +	struct device_node *np, *prev;
>  	int levels = 1, level = 1;
>
>  	if (!acpi_disabled) {
> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>  		return 0;
>  	}
>
> +	np = of_cpu_device_node_get(cpu);
> +	if (!np)
> +		return -ENOENT;
> +

This check is necessary because the caller of populate_cache_leaves() checks
its return value.

So,
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>

>  	if (of_property_read_bool(np, "cache-size"))
>  		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>  	if (of_property_read_bool(np, "i-cache-size"))
> --
> 2.46.0
>

Thanks,
Yunhui
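To illustrate the point about the caller checking the return value, here is a minimal userspace sketch. It is not kernel code: `mock_populate()` and `mock_setup_cacheinfo()` are hypothetical stand-ins, and the real caller of populate_cache_leaves() is not shown in this thread, so its exact behavior is an assumption here. The sketch only shows why returning -ENOENT is useful when a caller stops on a non-zero result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for populate_cache_leaves(): it fails with
 * -ENOENT when no CPU node could be acquired.
 */
static int mock_populate(bool node_found)
{
	if (!node_found)
		return -ENOENT;
	return 0;
}

/*
 * Hypothetical caller (an assumption, not the real kernel caller):
 * it checks the return value and only publishes the cache information
 * when population succeeded.
 */
static int mock_setup_cacheinfo(bool node_found, bool *published)
{
	int ret = mock_populate(node_found);

	if (ret) {
		*published = false;	/* nothing half-initialized is exposed */
		return ret;		/* error is propagated upwards */
	}

	*published = true;
	return 0;
}
```

With a silent success in the missing-node case, a caller shaped like this would have no way to tell that nothing was populated.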
On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:

> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> [...]

Gently ping :)

Could you take a look at this fix?

Thanks,
Miquel
On Fri, Sep 13, 2024 at 10:00:52AM +0200, Miquel Sabaté Solà wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just past the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>
> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
>
>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index d6c108c50cba..d32dfdba083e 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>  {
>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>  	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> -	struct device_node *np = of_cpu_device_node_get(cpu);
> -	struct device_node *prev = NULL;
> +	struct device_node *np, *prev;
>  	int levels = 1, level = 1;
>
>  	if (!acpi_disabled) {
> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>  		return 0;
>  	}
>
> +	np = of_cpu_device_node_get(cpu);
> +	if (!np)
> +		return -ENOENT;
> +

LGTM.

Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>

Thanks,
Sunil
On Mon, Sep 30 2024, Miquel Sabaté Solà wrote:

> On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:
>
>> When populating cache leaves we previously fetched the CPU device node
>> at the very beginning. But when ACPI is enabled we go through a
>> specific branch which returns early and does not call 'of_node_put' for
>> the node that was acquired.
>>
>> [...]
>
> Gently ping :)
>
> Could you take a look at this fix?
>
> Thanks,
> Miquel

Hello,

Would it make sense to have this fix for rc3?

Thanks,
Miquel
Hi Miquel,

On 08/10/2024 15:38, Miquel Sabaté Solà wrote:
> On Mon, Sep 30 2024, Miquel Sabaté Solà wrote:
>
>> On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:
>>
>>> When populating cache leaves we previously fetched the CPU device node
>>> at the very beginning. But when ACPI is enabled we go through a
>>> specific branch which returns early and does not call 'of_node_put' for
>>> the node that was acquired.
>>>
>>> [...]
>>
>> Gently ping :)
>>
>> Could you take a look at this fix?
>>
>> Thanks,
>> Miquel
>
> Hello,
>
> Would it make sense to have this fix for rc3?

Sorry for the late response. It probably won't make it to rc3 but I'll make
sure it will in rc4 :)

First:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

And it needs the following Fixes tag (but no need to send a new version, b4
will pick it up):

Fixes: 604f32ea6909 ("riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT")

And about ccing stable, I'm not sure what could be the impact of this bad
reference count (some warnings could appear, etc.) so as it is a small patch,
I think it's worth backporting to stable.

Thanks,

Alex

>
> Thanks,
> Miquel
On Thu, Oct 10 2024, Alexandre Ghiti wrote:

> Hi Miquel,
>
> On 08/10/2024 15:38, Miquel Sabaté Solà wrote:
>> [...]
>>
>> Would it make sense to have this fix for rc3?
>
> Sorry for the late response. It probably won't make it to rc3 but I'll make sure
> it will in rc4 :)
>
> First:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>
> And it needs the following Fixes tag (but no need to send a new version, b4 will
> pick it up):
>
> Fixes: 604f32ea6909 ("riscv: cacheinfo: initialize cacheinfo's level and type
> from ACPI PPTT")
>
> And about ccing stable, I'm not sure what could be the impact of this bad
> reference count (some warnings could appear, etc.) so as it is a small patch,
> I think it's worth backporting to stable.
>
> Thanks,
>
> Alex

Nice, thank you!
Hello:

This patch was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Fri, 13 Sep 2024 10:00:52 +0200 you wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just past the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> [...]

Here is the summary with links:
  - riscv: Prevent a bad reference count on CPU nodes
    https://git.kernel.org/riscv/c/9510c5b0db36

You are awesome, thank you!
diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index d6c108c50cba..d32dfdba083e 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
-	struct device_node *np = of_cpu_device_node_get(cpu);
-	struct device_node *prev = NULL;
+	struct device_node *np, *prev;
 	int levels = 1, level = 1;
 
 	if (!acpi_disabled) {
@@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
 		return 0;
 	}
 
+	np = of_cpu_device_node_get(cpu);
+	if (!np)
+		return -ENOENT;
+
 	if (of_property_read_bool(np, "cache-size"))
 		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
 	if (of_property_read_bool(np, "i-cache-size"))
When populating cache leaves we previously fetched the CPU device node
at the very beginning. But when ACPI is enabled we go through a
specific branch which returns early and does not call 'of_node_put' for
the node that was acquired.

Since we are not using a CPU device node for the ACPI code anyways, we
can simply move the initialization of it just past the ACPI block, and
we are guaranteed to have an 'of_node_put' call for the acquired node.
This prevents a bad reference count of the CPU device node.

Moreover, the previous function did not check for errors when acquiring
the device node, so a return -ENOENT has been added for that case.

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
---
I was wondering if this should also be sent to stable, but I have not seen
a report on it, and this is not responsible for an oops or anything like that.
So in the end I decided not to, but maybe you consider otherwise.

 arch/riscv/kernel/cacheinfo.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--
2.46.0
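The reference-count imbalance described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct mock_node`, `mock_node_get()`, `mock_node_put()`, `populate_leaky()` and `populate_fixed()` are hypothetical names that mimic the get/put pattern; they are not the kernel's device-tree API, and the real function does more work on the device-tree path than is shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node: only a reference count. */
struct mock_node {
	int refcount;
};

/* Mock "get": takes a reference, or returns NULL when there is no node. */
static struct mock_node *mock_node_get(struct mock_node *node)
{
	if (node)
		node->refcount++;
	return node;
}

/* Mock "put": drops a reference; a NULL node is tolerated. */
static void mock_node_put(struct mock_node *node)
{
	if (node)
		node->refcount--;
}

/* Old shape: the reference is taken before the early-return branch. */
static int populate_leaky(struct mock_node *cpu_node, bool acpi_enabled)
{
	struct mock_node *np = mock_node_get(cpu_node);

	if (acpi_enabled)
		return 0;	/* returns without dropping the reference */

	/* ... device-tree-only work would use np here ... */
	mock_node_put(np);
	return 0;
}

/* New shape: the reference is taken only on the path that releases it. */
static int populate_fixed(struct mock_node *cpu_node, bool acpi_enabled)
{
	struct mock_node *np;

	if (acpi_enabled)
		return 0;	/* no reference was taken, nothing to drop */

	np = mock_node_get(cpu_node);
	if (!np)
		return -ENOENT;	/* acquisition failure is now reported */

	/* ... device-tree-only work would use np here ... */
	mock_node_put(np);
	return 0;
}
```

In this sketch, calling `populate_leaky()` with `acpi_enabled` set leaves the count one higher than before the call, while `populate_fixed()` leaves it unchanged on every path and reports a missing node instead of continuing.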