diff mbox series

riscv: Prevent a bad reference count on CPU nodes

Message ID 20240913080053.36636-1-mikisabate@gmail.com (mailing list archive)
State Accepted
Commit 9510c5b0db36f1727ffd1204146ee8f68bb88035
Series riscv: Prevent a bad reference count on CPU nodes

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh took 135.52s
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh took 1355.38s
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh took 1602.16s
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh took 20.31s
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh took 22.85s
conchuod/patch-1-test-6 success .github/scripts/patches/tests/checkpatch.sh took 0.43s
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh took 43.43s
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh took 0.00s
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh took 0.54s
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh took 0.01s
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh took 0.00s
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh took 0.03s

Commit Message

Miquel Sabaté Solà Sept. 13, 2024, 8 a.m. UTC
When populating cache leaves we previously fetched the CPU device node
at the very beginning. But when ACPI is enabled we go through a
specific branch which returns early and does not call 'of_node_put' for
the node that was acquired.

Since we are not using a CPU device node for the ACPI code anyway, we
can simply move the initialization of it just past the ACPI block, and
we are guaranteed to have an 'of_node_put' call for the acquired node.
This prevents a bad reference count of the CPU device node.

Moreover, the previous function did not check for errors when acquiring
the device node, so a return -ENOENT has been added for that case.

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
---
I was wondering if this should also be sent to stable, but I have not seen
a report on it, and this is not responsible for an oops or anything like that.
So in the end I decided not to, but you may think otherwise.

 arch/riscv/kernel/cacheinfo.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--
2.46.0

Comments

Sudeep Holla Sept. 13, 2024, 9:07 a.m. UTC | #1
On Fri, Sep 13, 2024 at 10:00:52AM +0200, Miquel Sabaté Solà wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
> 
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just passed the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
> 
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>

LGTM,

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but  I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
> 

Right, it is not a fix per se and hence not stable material, as ACPI
is not accessing the node pointer.
Yunhui Cui Sept. 18, 2024, 2:19 a.m. UTC | #2
Hi Miquel,

On Fri, Sep 13, 2024 at 4:02 PM Miquel Sabaté Solà <mikisabate@gmail.com> wrote:
>
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just passed the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>
> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but  I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
>
>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index d6c108c50cba..d32dfdba083e 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>  {
>         struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>         struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> -       struct device_node *np = of_cpu_device_node_get(cpu);
> -       struct device_node *prev = NULL;
> +       struct device_node *np, *prev;
>         int levels = 1, level = 1;
>
>         if (!acpi_disabled) {
> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>                 return 0;
>         }
>
> +       np = of_cpu_device_node_get(cpu);
> +       if (!np)
> +               return -ENOENT;
> +

This check is necessary because the caller of populate_cache_leaves()
checks its return value.
So, Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>

>         if (of_property_read_bool(np, "cache-size"))
>                 ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>         if (of_property_read_bool(np, "i-cache-size"))
> --
> 2.46.0
>

Thanks,
Yunhui
Miquel Sabaté Solà Sept. 30, 2024, 12:35 p.m. UTC | #3
On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:

> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
>
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just passed the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
>
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
>
> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but  I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
>
>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index d6c108c50cba..d32dfdba083e 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>  {
>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>  	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> -	struct device_node *np = of_cpu_device_node_get(cpu);
> -	struct device_node *prev = NULL;
> +	struct device_node *np, *prev;
>  	int levels = 1, level = 1;
>
>  	if (!acpi_disabled) {
> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>  		return 0;
>  	}
>
> +	np = of_cpu_device_node_get(cpu);
> +	if (!np)
> +		return -ENOENT;
> +
>  	if (of_property_read_bool(np, "cache-size"))
>  		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>  	if (of_property_read_bool(np, "i-cache-size"))

Gentle ping :)

Could you take a look at this fix?

Thanks,
Miquel
Sunil V L Sept. 30, 2024, 4:28 p.m. UTC | #4
On Fri, Sep 13, 2024 at 10:00:52AM +0200, Miquel Sabaté Solà wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
> 
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just passed the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
> 
> Moreover, the previous function did not check for errors when acquiring
> the device node, so a return -ENOENT has been added for that case.
> 
> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
> ---
> I was wondering if this should also be sent to stable, but  I have not seen
> a report on it, and this is not responsible for an oops or anything like that.
> So in the end I decided not to, but maybe you consider otherwise.
> 
>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index d6c108c50cba..d32dfdba083e 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>  {
>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>  	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
> -	struct device_node *np = of_cpu_device_node_get(cpu);
> -	struct device_node *prev = NULL;
> +	struct device_node *np, *prev;
>  	int levels = 1, level = 1;
> 
>  	if (!acpi_disabled) {
> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>  		return 0;
>  	}
> 
> +	np = of_cpu_device_node_get(cpu);
> +	if (!np)
> +		return -ENOENT;
> +
LGTM.

Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>

Thanks,
Sunil
Miquel Sabaté Solà Oct. 8, 2024, 1:38 p.m. UTC | #5
On Mon, Sep 30 2024, Miquel Sabaté Solà wrote:

> On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:
>
>> When populating cache leaves we previously fetched the CPU device node
>> at the very beginning. But when ACPI is enabled we go through a
>> specific branch which returns early and does not call 'of_node_put' for
>> the node that was acquired.
>>
>> Since we are not using a CPU device node for the ACPI code anyways, we
>> can simply move the initialization of it just passed the ACPI block, and
>> we are guaranteed to have an 'of_node_put' call for the acquired node.
>> This prevents a bad reference count of the CPU device node.
>>
>> Moreover, the previous function did not check for errors when acquiring
>> the device node, so a return -ENOENT has been added for that case.
>>
>> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
>> ---
>> I was wondering if this should also be sent to stable, but  I have not seen
>> a report on it, and this is not responsible for an oops or anything like that.
>> So in the end I decided not to, but maybe you consider otherwise.
>>
>>  arch/riscv/kernel/cacheinfo.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>> index d6c108c50cba..d32dfdba083e 100644
>> --- a/arch/riscv/kernel/cacheinfo.c
>> +++ b/arch/riscv/kernel/cacheinfo.c
>> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>>  {
>>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>>  	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
>> -	struct device_node *np = of_cpu_device_node_get(cpu);
>> -	struct device_node *prev = NULL;
>> +	struct device_node *np, *prev;
>>  	int levels = 1, level = 1;
>>
>>  	if (!acpi_disabled) {
>> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>>  		return 0;
>>  	}
>>
>> +	np = of_cpu_device_node_get(cpu);
>> +	if (!np)
>> +		return -ENOENT;
>> +
>>  	if (of_property_read_bool(np, "cache-size"))
>>  		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>>  	if (of_property_read_bool(np, "i-cache-size"))
>
> Gently ping :)
>
> Could you take a look at this fix?
>
> Thanks,
> Miquel

Hello,

Would it make sense to have this fix for rc3?

Thanks,
Miquel
Alexandre Ghiti Oct. 10, 2024, 12:29 p.m. UTC | #6
Hi Miquel,

On 08/10/2024 15:38, Miquel Sabaté Solà wrote:
> On Mon, Sep 30 2024, Miquel Sabaté Solà wrote:
>
>> On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:
>>
>>> When populating cache leaves we previously fetched the CPU device node
>>> at the very beginning. But when ACPI is enabled we go through a
>>> specific branch which returns early and does not call 'of_node_put' for
>>> the node that was acquired.
>>>
>>> Since we are not using a CPU device node for the ACPI code anyways, we
>>> can simply move the initialization of it just passed the ACPI block, and
>>> we are guaranteed to have an 'of_node_put' call for the acquired node.
>>> This prevents a bad reference count of the CPU device node.
>>>
>>> Moreover, the previous function did not check for errors when acquiring
>>> the device node, so a return -ENOENT has been added for that case.
>>>
>>> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
>>> ---
>>> I was wondering if this should also be sent to stable, but  I have not seen
>>> a report on it, and this is not responsible for an oops or anything like that.
>>> So in the end I decided not to, but maybe you consider otherwise.
>>>
>>>   arch/riscv/kernel/cacheinfo.c | 7 +++++--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>>> index d6c108c50cba..d32dfdba083e 100644
>>> --- a/arch/riscv/kernel/cacheinfo.c
>>> +++ b/arch/riscv/kernel/cacheinfo.c
>>> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>>>   {
>>>   	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>>>   	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
>>> -	struct device_node *np = of_cpu_device_node_get(cpu);
>>> -	struct device_node *prev = NULL;
>>> +	struct device_node *np, *prev;
>>>   	int levels = 1, level = 1;
>>>
>>>   	if (!acpi_disabled) {
>>> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>>>   		return 0;
>>>   	}
>>>
>>> +	np = of_cpu_device_node_get(cpu);
>>> +	if (!np)
>>> +		return -ENOENT;
>>> +
>>>   	if (of_property_read_bool(np, "cache-size"))
>>>   		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>>>   	if (of_property_read_bool(np, "i-cache-size"))
>> Gently ping :)
>>
>> Could you take a look at this fix?
>>
>> Thanks,
>> Miquel
> Hello,
>
> Would it make sense to have this fix for rc3?


Sorry for the late response. It probably won't make it to rc3 but I'll 
make sure it will in rc4 :)

First:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

And it needs the following Fixes tag (but no need to send a new version, 
b4 will pick it up):

Fixes: 604f32ea6909 ("riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT")

And about ccing stable, I'm not sure what could be the impact of this 
bad reference count (some warnings could appear, etc...) so as it is a 
small patch, I think it's worth backporting to stable.

Thanks,

Alex


>
> Thanks,
> Miquel
>
Miquel Sabaté Solà Oct. 10, 2024, 2:32 p.m. UTC | #7
On Thu, Oct 10 2024, Alexandre Ghiti wrote:

> Hi Miquel,
>
> On 08/10/2024 15:38, Miquel Sabaté Solà wrote:
>> On Mon, Sep 30 2024, Miquel Sabaté Solà wrote:
>>
>>> On Fri, Sep 13 2024, Miquel Sabaté Solà wrote:
>>>
>>>> When populating cache leaves we previously fetched the CPU device node
>>>> at the very beginning. But when ACPI is enabled we go through a
>>>> specific branch which returns early and does not call 'of_node_put' for
>>>> the node that was acquired.
>>>>
>>>> Since we are not using a CPU device node for the ACPI code anyways, we
>>>> can simply move the initialization of it just passed the ACPI block, and
>>>> we are guaranteed to have an 'of_node_put' call for the acquired node.
>>>> This prevents a bad reference count of the CPU device node.
>>>>
>>>> Moreover, the previous function did not check for errors when acquiring
>>>> the device node, so a return -ENOENT has been added for that case.
>>>>
>>>> Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
>>>> ---
>>>> I was wondering if this should also be sent to stable, but  I have not seen
>>>> a report on it, and this is not responsible for an oops or anything like that.
>>>> So in the end I decided not to, but maybe you consider otherwise.
>>>>
>>>>   arch/riscv/kernel/cacheinfo.c | 7 +++++--
>>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>>>> index d6c108c50cba..d32dfdba083e 100644
>>>> --- a/arch/riscv/kernel/cacheinfo.c
>>>> +++ b/arch/riscv/kernel/cacheinfo.c
>>>> @@ -75,8 +75,7 @@ int populate_cache_leaves(unsigned int cpu)
>>>>   {
>>>>   	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>>>>   	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
>>>> -	struct device_node *np = of_cpu_device_node_get(cpu);
>>>> -	struct device_node *prev = NULL;
>>>> +	struct device_node *np, *prev;
>>>>   	int levels = 1, level = 1;
>>>>
>>>>   	if (!acpi_disabled) {
>>>> @@ -100,6 +99,10 @@ int populate_cache_leaves(unsigned int cpu)
>>>>   		return 0;
>>>>   	}
>>>>
>>>> +	np = of_cpu_device_node_get(cpu);
>>>> +	if (!np)
>>>> +		return -ENOENT;
>>>> +
>>>>   	if (of_property_read_bool(np, "cache-size"))
>>>>   		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
>>>>   	if (of_property_read_bool(np, "i-cache-size"))
>>> Gently ping :)
>>>
>>> Could you take a look at this fix?
>>>
>>> Thanks,
>>> Miquel
>> Hello,
>>
>> Would it make sense to have this fix for rc3?
>
>
> Sorry for the late response. It probably won't make it to rc3 but I'll make sure
> it will in rc4 :)
>
> First:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>
> And it needs the following Fixes tag (but no need to send a new version, b4 will
> pick it up):
>
> Fixes: 604f32ea6909 ("riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT")
>
> And about ccing stable, I'm not sure what could be the impact of this bad
> reference count (some warnings could appear, etc...) so as it is a small patch,
> I think it's worth backporting to stable.
>
> Thanks,
>
> Alex
>
>
>>
>> Thanks,
>> Miquel
>>

Nice, thank you!
patchwork-bot+linux-riscv@kernel.org Oct. 17, 2024, 4:30 p.m. UTC | #8
Hello:

This patch was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Fri, 13 Sep 2024 10:00:52 +0200 you wrote:
> When populating cache leaves we previously fetched the CPU device node
> at the very beginning. But when ACPI is enabled we go through a
> specific branch which returns early and does not call 'of_node_put' for
> the node that was acquired.
> 
> Since we are not using a CPU device node for the ACPI code anyways, we
> can simply move the initialization of it just passed the ACPI block, and
> we are guaranteed to have an 'of_node_put' call for the acquired node.
> This prevents a bad reference count of the CPU device node.
> 
> [...]

Here is the summary with links:
  - riscv: Prevent a bad reference count on CPU nodes
    https://git.kernel.org/riscv/c/9510c5b0db36

You are awesome, thank you!
Patch

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index d6c108c50cba..d32dfdba083e 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -75,8 +75,7 @@  int populate_cache_leaves(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	struct cacheinfo *this_leaf = this_cpu_ci->info_list;
-	struct device_node *np = of_cpu_device_node_get(cpu);
-	struct device_node *prev = NULL;
+	struct device_node *np, *prev;
 	int levels = 1, level = 1;

 	if (!acpi_disabled) {
@@ -100,6 +99,10 @@  int populate_cache_leaves(unsigned int cpu)
 		return 0;
 	}

+	np = of_cpu_device_node_get(cpu);
+	if (!np)
+		return -ENOENT;
+
 	if (of_property_read_bool(np, "cache-size"))
 		ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
 	if (of_property_read_bool(np, "i-cache-size"))
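
For readers unfamiliar with the get/put pattern, the leak described in the commit message can be sketched in user space. This is only an illustration and not the kernel code: struct mock_node, mock_node_get(), mock_node_put(), populate_leaky() and populate_fixed() are hypothetical stand-ins for struct device_node, of_cpu_device_node_get(), of_node_put() and the two shapes of populate_cache_leaves(), and the cache-leaf work itself is omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only a reference count. */
struct mock_node {
	int refcount;
};

static struct mock_node cpu_node = { .refcount = 1 };
static bool node_present = true;	/* set to false to simulate a missing node */

/* Mimics the "get" side: returns the node with one extra reference, or NULL. */
static struct mock_node *mock_node_get(void)
{
	if (!node_present)
		return NULL;
	cpu_node.refcount++;
	return &cpu_node;
}

/* Mimics the "put" side: drops one reference; NULL is ignored. */
static void mock_node_put(struct mock_node *np)
{
	if (np)
		np->refcount--;
}

/* Old shape: the node is acquired before the ACPI branch returns early. */
static int populate_leaky(bool acpi_enabled)
{
	struct mock_node *np = mock_node_get();

	if (acpi_enabled)
		return 0;	/* the reference taken above is never dropped */

	mock_node_put(np);
	return 0;
}

/* New shape: acquire only after the ACPI branch, and check for failure. */
static int populate_fixed(bool acpi_enabled)
{
	struct mock_node *np;

	if (acpi_enabled)
		return 0;	/* nothing was acquired, so nothing can leak */

	np = mock_node_get();
	if (!np)
		return -ENOENT;

	mock_node_put(np);
	return 0;
}
```

In this sketch, populate_leaky(true) leaves the mock count one above its starting value, while populate_fixed() leaves it unchanged on both paths and returns -ENOENT when the node is absent, which a caller that checks the return value can then act on.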