diff mbox

[2/2] of/fdt: skip unflattening of disabled nodes

Message ID 20171003161815.25999-2-robh@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Rob Herring (Arm) Oct. 3, 2017, 4:18 p.m. UTC
For static DT usecases, we don't need the disabled nodes and can skip
unflattening. This saves a significant amount of RAM in memory constrained
cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.

There are a few cases in the kernel that modify the status property
dynamically. These all are changes from enabled to disabled, depend on
OF_DYNAMIC or are not FDT based (PDT based).

Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
For more background, see this presentation from Nico:

https://connect.linaro.org/resource/sfo17/sfo17-100/

 drivers/of/fdt.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Nicolas Pitre Oct. 3, 2017, 6:46 p.m. UTC | #1
On Tue, 3 Oct 2017, Rob Herring wrote:

> For static DT usecases, we don't need the disabled nodes and can skip
> unflattening. This saves a significant amount of RAM in memory constrained
> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
> 
> There are a few cases in the kernel that modify the status property
> dynamically. These all are changes from enabled to disabled, depend on
> OF_DYNAMIC or are not FDT based (PDT based).
> 
> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
> Cc: Frank Rowand <frowand.list@gmail.com>
> Signed-off-by: Rob Herring <robh@kernel.org>

Tested-by: Nicolas Pitre <nico@linaro.org>

> ---
> For more background, see this presentation from Nico:
> 
> https://connect.linaro.org/resource/sfo17/sfo17-100/

Oh my ... this is horrible ...

/me hides in shame

> 
>  drivers/of/fdt.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index f8c39705418b..efe91c6856a0 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -396,6 +396,10 @@ static int unflatten_dt_nodes(const void *blob,
>  		if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
>  			continue;
>  
> +		if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
> +		    !of_fdt_device_is_available(blob, offset))
> +			continue;
> +
>  		if (!populate_node(blob, offset, &mem, nps[depth],
>  				   &nps[depth+1], dryrun))
>  			return mem - base;
> -- 
> 2.11.0
> 
>
Frank Rowand Oct. 3, 2017, 9:58 p.m. UTC | #2
On 10/03/17 11:46, Nicolas Pitre wrote:
> On Tue, 3 Oct 2017, Rob Herring wrote:
> 
>> For static DT usecases, we don't need the disabled nodes and can skip
>> unflattening. This saves a significant amount of RAM in memory constrained
>> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
>>
>> There are a few cases in the kernel that modify the status property
>> dynamically. These all are changes from enabled to disabled, depend on
>> OF_DYNAMIC or are not FDT based (PDT based).
>>
>> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
>> Cc: Frank Rowand <frowand.list@gmail.com>
>> Signed-off-by: Rob Herring <robh@kernel.org>
> 
> Tested-by: Nicolas Pitre <nico@linaro.org>
> 
>> ---
>> For more background, see this presentation from Nico:
>>
>> https://connect.linaro.org/resource/sfo17/sfo17-100/
> 
> Oh my ... this is horrible ...

I found the presentation to be quite good and useful.
The Linaro site does not have the slides, could you please send
me a copy?

Thanks,

Frank

> 
> /me hides in shame
> 
>>
>>  drivers/of/fdt.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> index f8c39705418b..efe91c6856a0 100644
>> --- a/drivers/of/fdt.c
>> +++ b/drivers/of/fdt.c
>> @@ -396,6 +396,10 @@ static int unflatten_dt_nodes(const void *blob,
>>  		if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
>>  			continue;
>>  
>> +		if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
>> +		    !of_fdt_device_is_available(blob, offset))
>> +			continue;
>> +
>>  		if (!populate_node(blob, offset, &mem, nps[depth],
>>  				   &nps[depth+1], dryrun))
>>  			return mem - base;
>> -- 
>> 2.11.0
>>
>>
>
Nicolas Pitre Oct. 5, 2017, 6:29 p.m. UTC | #3
On Tue, 3 Oct 2017, Frank Rowand wrote:

> On 10/03/17 11:46, Nicolas Pitre wrote:
> > On Tue, 3 Oct 2017, Rob Herring wrote:
> > 
> >> For static DT usecases, we don't need the disabled nodes and can skip
> >> unflattening. This saves a significant amount of RAM in memory constrained
> >> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
> >>
> >> There are a few cases in the kernel that modify the status property
> >> dynamically. These all are changes from enabled to disabled, depend on
> >> OF_DYNAMIC or are not FDT based (PDT based).
> >>
> >> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
> >> Cc: Frank Rowand <frowand.list@gmail.com>
> >> Signed-off-by: Rob Herring <robh@kernel.org>
> > 
> > Tested-by: Nicolas Pitre <nico@linaro.org>
> > 
> >> ---
> >> For more background, see this presentation from Nico:
> >>
> >> https://connect.linaro.org/resource/sfo17/sfo17-100/
> > 
> > Oh my ... this is horrible ...
> 
> I found the presentation to be quite good and useful.

Thanks. I think the content is good. I'm just a terrible speaker.

> The Linaro site does not have the slides, could you please send
> me a copy?

I've asked for the slides to be uploaded to the site. I've sent you a 
copy privately in the mean time.


Nicolas
Frank Rowand Oct. 8, 2017, 10:57 p.m. UTC | #4
On 10/03/17 09:18, Rob Herring wrote:
> For static DT usecases, we don't need the disabled nodes and can skip
> unflattening. This saves a significant amount of RAM in memory constrained
> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
> 
> There are a few cases in the kernel that modify the status property
> dynamically. These all are changes from enabled to disabled, depend on
> OF_DYNAMIC or are not FDT based (PDT based).
> 
> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
> Cc: Frank Rowand <frowand.list@gmail.com>
> Signed-off-by: Rob Herring <robh@kernel.org>
> ---
> For more background, see this presentation from Nico:
> 
> https://connect.linaro.org/resource/sfo17/sfo17-100/
> 
>  drivers/of/fdt.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index f8c39705418b..efe91c6856a0 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -396,6 +396,10 @@ static int unflatten_dt_nodes(const void *blob,
>  		if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
>  			continue;
>  
> +		if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
> +		    !of_fdt_device_is_available(blob, offset))
> +			continue;
> +
>  		if (!populate_node(blob, offset, &mem, nps[depth],
>  				   &nps[depth+1], dryrun))
>  			return mem - base;
> 

Hi Rob,

I strongly support the idea of this patch, but there may be an
issue we have to resolve.  I'm pretty sure we had talked about
the issue a long time ago, and it has been sitting on my todo
list.

We have two sets of node traversal macros and functions.  One
set honors the status property, and the other ignores it.  If
I recall our previous discussion properly, we want the normal
usage to honor the status property and only a tiny (or maybe
non-existent) set of locations to be allowed to ignore the
status property.

A rough sense of how often the status property is honored or
not is:

   $ git grep for_each_child_of_node | wc -l
   293
   $ git grep of_get_next_child | wc -l
   103

   $ git grep for_each_available_child_of_node | wc -l
   106
   $ git grep of_get_next_available_child | wc -l
   20

Many of the cases where the status flag is ignored will not
actually encounter a node that is not available, so many of
the cases where status is not checked could currently be
checking status.

And just for completeness, there are a number of standalone
checks for whether a node is available:

   $ git grep of_device_is_available | wc -l
   128

It will be a pain to manually check all of the sites that
ignore the status property, but that task should be done.

In the mean time, maybe we could flush out the few cases
that currently depend on ignoring the status property by

   - making for_each_child_of_node() and of_get_next_child()
     actually check for valid status

   - provide a temporary (one or two kernel release)
     CONFIG option to allow the old behavior for
     for_each_child_of_node() and of_get_next_child()
     just in case we miss any locations that need to
     be fixed

   - fix up the few places in core device tree code that
     actually need to ignore status (if such places exist)

In the end, the *_available_*() interfaces should be
removed, because the normal behavior of node traversal
should be to only traverse nodes that are available.

-Frank
Rob Herring (Arm) Oct. 9, 2017, 6:59 p.m. UTC | #5
On Sun, Oct 8, 2017 at 5:57 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> On 10/03/17 09:18, Rob Herring wrote:
>> For static DT usecases, we don't need the disabled nodes and can skip
>> unflattening. This saves a significant amount of RAM in memory constrained
>> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
>>
>> There are a few cases in the kernel that modify the status property
>> dynamically. These all are changes from enabled to disabled, depend on
>> OF_DYNAMIC or are not FDT based (PDT based).
>>
>> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
>> Cc: Frank Rowand <frowand.list@gmail.com>
>> Signed-off-by: Rob Herring <robh@kernel.org>
>> ---
>> For more background, see this presentation from Nico:
>>
>> https://connect.linaro.org/resource/sfo17/sfo17-100/
>>
>>  drivers/of/fdt.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> index f8c39705418b..efe91c6856a0 100644
>> --- a/drivers/of/fdt.c
>> +++ b/drivers/of/fdt.c
>> @@ -396,6 +396,10 @@ static int unflatten_dt_nodes(const void *blob,
>>               if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
>>                       continue;
>>
>> +             if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
>> +                 !of_fdt_device_is_available(blob, offset))
>> +                     continue;
>> +
>>               if (!populate_node(blob, offset, &mem, nps[depth],
>>                                  &nps[depth+1], dryrun))
>>                       return mem - base;
>>
>
> Hi Rob,
>
> I strongly support the idea of this patch, but there may be an
> issue we have to resolve.  I'm pretty sure we had talked about
> the issue a long time ago, and it has been sitting on my todo
> list.
>
> We have two sets of node traversal macros and functions.  One
> set honors the status property, and the other ignores it.  If
> I recall our previous discussion properly, we want the normal
> usage to honor the status property and only a tiny (or maybe
> non-existent) set of locations to be allowed to ignore the
> status property.

Ignoring status is a bug for a static DT. There could be places that
expect the node to be present, but disabled. Those may be bugs too.

> A rough sense of how often the status property is honored or
> not is:
>
>    $ git grep for_each_child_of_node | wc -l
>    293
>    $ git grep of_get_next_child | wc -l
>    103
>
>    $ git grep for_each_available_child_of_node | wc -l
>    106
>    $ git grep of_get_next_available_child | wc -l
>    20
>
> Many of the cases where the status flag is ignored will not
> actually encounter a node that is not available, so many of
> the cases where status is not checked could currently be
> checking status.

For many nodes, status simply makes no sense or at least is undefined.

> And just for completeness, there are a number of standalone
> checks for whether a node is available:
>
>    $ git grep of_device_is_available | wc -l
>    128

I'm surprised it's that many. It's a low-level detail that the core
should handle. We'd also need to make things like of_find_node_by_name
honor status.

> It will be a pain to manually check all of the sites that
> ignore the status property, but that task should be done.
>
> In the mean time, maybe we could flush out the few cases
> that currently depend on ignoring the status property by
>
>    - making for_each_child_of_node() and of_get_next_child()
>      actually check for valid status
>
>    - provide a temporary (one or two kernel release)
>      CONFIG option to allow the old behavior for
>      for_each_child_of_node() and of_get_next_child()
>      just in case we miss any locations that need to
>      be fixed
>
>    - fix up the few places in core device tree code that
>      actually need to ignore status (if such places exist)
>
> In the end, the *_available_*() interfaces should be
> removed, because the normal behavior of node traversal
> should be to only traverse nodes that are available.

I'm not sure this is really something we want or need to fix.

I could just make this depend on OF_KOBJ instead. Then practically no
one would see any change as almost everyone enables sysfs (and in turn
/proc/device-tree).

Rob
Frank Rowand Oct. 9, 2017, 8:20 p.m. UTC | #6
On 10/09/17 11:59, Rob Herring wrote:
> On Sun, Oct 8, 2017 at 5:57 PM, Frank Rowand <frowand.list@gmail.com> wrote:
>> On 10/03/17 09:18, Rob Herring wrote:
>>> For static DT usecases, we don't need the disabled nodes and can skip
>>> unflattening. This saves a significant amount of RAM in memory constrained
>>> cases. In one example on STM32F469, the RAM usage goes from 118K to 26K.
>>>
>>> There are a few cases in the kernel that modify the status property
>>> dynamically. These all are changes from enabled to disabled, depend on
>>> OF_DYNAMIC or are not FDT based (PDT based).
>>>
>>> Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
>>> Cc: Frank Rowand <frowand.list@gmail.com>
>>> Signed-off-by: Rob Herring <robh@kernel.org>
>>> ---
>>> For more background, see this presentation from Nico:
>>>
>>> https://connect.linaro.org/resource/sfo17/sfo17-100/
>>>
>>>  drivers/of/fdt.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>>> index f8c39705418b..efe91c6856a0 100644
>>> --- a/drivers/of/fdt.c
>>> +++ b/drivers/of/fdt.c
>>> @@ -396,6 +396,10 @@ static int unflatten_dt_nodes(const void *blob,
>>>               if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
>>>                       continue;
>>>
>>> +             if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
>>> +                 !of_fdt_device_is_available(blob, offset))
>>> +                     continue;
>>> +
>>>               if (!populate_node(blob, offset, &mem, nps[depth],
>>>                                  &nps[depth+1], dryrun))
>>>                       return mem - base;
>>>
>>
>> Hi Rob,
>>
>> I strongly support the idea of this patch, but there may be an
>> issue we have to resolve.  I'm pretty sure we had talked about
>> the issue a long time ago, and it has been sitting on my todo
>> list.
>>
>> We have two sets of node traversal macros and functions.  One
>> set honors the status property, and the other ignores it.  If
>> I recall our previous discussion properly, we want the normal
>> usage to honor the status property and only a tiny (or maybe
>> non-existent) set of locations to be allowed to ignore the
>> status property.
> 
> Ignoring status is a bug for a static DT. There could be places that
> expect the node to be present, but disabled. Those may be bugs too.
> 
>> A rough sense of how often the status property is honored or
>> not is:
>>
>>    $ git grep for_each_child_of_node | wc -l
>>    293
>>    $ git grep of_get_next_child | wc -l
>>    103
>>
>>    $ git grep for_each_available_child_of_node | wc -l
>>    106
>>    $ git grep of_get_next_available_child | wc -l
>>    20
>>
>> Many of the cases where the status flag is ignored will not
>> actually encounter a node that is not available, so many of
>> the cases where status is not checked could currently be
>> checking status.
> 
> For many nodes, status simply makes no sense or at least is undefined.
> 
>> And just for completeness, there are a number of standalone
>> checks for whether a node is available:
>>
>>    $ git grep of_device_is_available | wc -l
>>    128
> 
> I'm surprised it's that many. It's a low-level detail that the core
> should handle. We'd also need to make things like of_find_node_by_name
> honor status.
> 
>> It will be a pain to manually check all of the sites that
>> ignore the status property, but that task should be done.
>>
>> In the mean time, maybe we could flush out the few cases
>> that currently depend on ignoring the status property by
>>
>>    - making for_each_child_of_node() and of_get_next_child()
>>      actually check for valid status
>>
>>    - provide a temporary (one or two kernel release)
>>      CONFIG option to allow the old behavior for
>>      for_each_child_of_node() and of_get_next_child()
>>      just in case we miss any locations that need to
>>      be fixed
>>
>>    - fix up the few places in core device tree code that
>>      actually need to ignore status (if such places exist)
>>
>> In the end, the *_available_*() interfaces should be
>> removed, because the normal behavior of node traversal
>> should be to only traverse nodes that are available.
> 
> I'm not sure this is really something we want or need to fix.
> 
> I could just make this depend on OF_KOBJ instead. Then practically no
> one would see any change as almost everyone enables sysfs (and in turn
> /proc/device-tree).

Yes, depending on OF_KOBJ should reduce the risk significantly.  Good
enough for me (you can take my reviewed-by for that).

I'm still leaving a clean up of checking or ignoring status on my
long term todo list.

-Frank

> 
> Rob
>
diff mbox

Patch

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index f8c39705418b..efe91c6856a0 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -396,6 +396,10 @@  static int unflatten_dt_nodes(const void *blob,
 		if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
 			continue;
 
+		if (!IS_ENABLED(CONFIG_OF_DYNAMIC) &&
+		    !of_fdt_device_is_available(blob, offset))
+			continue;
+
 		if (!populate_node(blob, offset, &mem, nps[depth],
 				   &nps[depth+1], dryrun))
 			return mem - base;