diff mbox series

[for-4.19(?)] xen/arm: bootfdt: Fix device tree memory node probing

Message ID 20240626080428.18480-1-michal.orzel@amd.com (mailing list archive)
State Superseded
Headers show
Series [for-4.19(?)] xen/arm: bootfdt: Fix device tree memory node probing | expand

Commit Message

Michal Orzel June 26, 2024, 8:04 a.m. UTC
Memory node probing is done as part of early_scan_node() that is called
for each node with depth >= 1 (root node is at depth 0). According to
Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
as a top level node. However, Xen incorrectly considers all the nodes with
unit node name "memory" as RAM. This buggy behavior can result in a
failure if there are other nodes in the device tree (at depth >= 2) with
"memory" as unit node name. An example can be a "memory@xxx" node under
/reserved-memory. Fix it by introducing device_tree_is_memory_node() to
perform all the required checks to assess if a node is a proper /memory
node.

Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
---
4.19: This patch is fixing a possible early boot Xen failure (before main
console is initialized). In my case it results in a warning "Shattering
superpage is not supported" and panic "Unable to setup the directmap mappings".

If this is too late for this patch to go in, we can backport it after the tree
re-opens.
---
 xen/arch/arm/bootfdt.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

Comments

Luca Fancellu June 26, 2024, 2:42 p.m. UTC | #1
Hi Michal,

> On 26 Jun 2024, at 09:04, Michal Orzel <michal.orzel@amd.com> wrote:
> 
> Memory node probing is done as part of early_scan_node() that is called
> for each node with depth >= 1 (root node is at depth 0). According to
> Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
> as a top level node. However, Xen incorrectly considers all the nodes with
> unit node name "memory" as RAM. This buggy behavior can result in a
> failure if there are other nodes in the device tree (at depth >= 2) with
> "memory" as unit node name. An example can be a "memory@xxx" node under
> /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
> perform all the required checks to assess if a node is a proper /memory
> node.
> 
> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
> ---
> 4.19: This patch is fixing a possible early boot Xen failure (before main
> console is initialized). In my case it results in a warning "Shattering
> superpage is not supported" and panic "Unable to setup the directmap mappings".
> 
> If this is too late for this patch to go in, we can backport it after the tree
> re-opens.
> ---

It looks ok to me, 

Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>

I’ve also tested on FVP, I’ll put it through our CI and I’ll let you know.

Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Julien Grall June 26, 2024, 8:13 p.m. UTC | #2
Hi Michal,

On 26/06/2024 09:04, Michal Orzel wrote:
> Memory node probing is done as part of early_scan_node() that is called
> for each node with depth >= 1 (root node is at depth 0). According to
> Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
> as a top level node. However, Xen incorrectly considers all the nodes with
> unit node name "memory" as RAM. This buggy behavior can result in a
> failure if there are other nodes in the device tree (at depth >= 2) with
> "memory" as unit node name. An example can be a "memory@xxx" node under
> /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
> perform all the required checks to assess if a node is a proper /memory
> node.
> 
> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
> ---
> 4.19: This patch is fixing a possible early boot Xen failure (before main
> console is initialized). In my case it results in a warning "Shattering
> superpage is not supported" and panic "Unable to setup the directmap mappings".
> 
> If this is too late for this patch to go in, we can backport it after the tree
> re-opens.

The code looks correct to me, but I am not sure about merging it to 4.19.

This is not a new bug (in fact has been there since pretty much Xen on 
Arm was created) and we haven't seen any report until today. So in some 
way it would be best to merge it after 4.19 so it can get more testing.

In the other hand, I guess this will block you. Is this a new platform? 
Is it available?

> ---
>   xen/arch/arm/bootfdt.c | 28 +++++++++++++++++++++++++++-
>   1 file changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index 6e060111d95b..23c968e6955d 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -83,6 +83,32 @@ static bool __init device_tree_node_compatible(const void *fdt, int node,
>       return false;
>   }
>   
> +/*
> + * Check if a node is a proper /memory node according to Devicetree
> + * Specification v0.4, chapter 3.4.
> + */
> +static bool __init device_tree_is_memory_node(const void *fdt, int node,
> +                                              int depth)
> +{
> +    const char *type;
> +    int len;
> +
> +    if ( depth != 1 )
> +        return false;
> +
> +    if ( !device_tree_node_matches(fdt, node, "memory") )
> +        return false;
> +
> +    type = fdt_getprop(fdt, node, "device_type", &len);
> +    if ( !type )
> +        return false;
> +
> +    if ( (len <= 0) || strcmp(type, "memory") )

I would consider to use strncmp() to avoid relying on the property to be 
well-formed (i.e. nul-terminated).

> +        return false;
> +
> +    return true;
> +}
> +
>   void __init device_tree_get_reg(const __be32 **cell, uint32_t address_cells,
>                                   uint32_t size_cells, paddr_t *start,
>                                   paddr_t *size)
> @@ -448,7 +474,7 @@ static int __init early_scan_node(const void *fdt,
>        * populated. So we should skip the parsing.
>        */
>       if ( !efi_enabled(EFI_BOOT) &&
> -         device_tree_node_matches(fdt, node, "memory") )
> +         device_tree_is_memory_node(fdt, node, depth) )
>           rc = process_memory_node(fdt, node, name, depth,
>                                    address_cells, size_cells, bootinfo_get_mem());
>       else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )

Cheers,
Stefano Stabellini June 26, 2024, 11:37 p.m. UTC | #3
On Wed, 26 Jun 2024, Julien Grall wrote:
> Hi Michal,
> 
> On 26/06/2024 09:04, Michal Orzel wrote:
> > Memory node probing is done as part of early_scan_node() that is called
> > for each node with depth >= 1 (root node is at depth 0). According to
> > Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
> > as a top level node. However, Xen incorrectly considers all the nodes with
> > unit node name "memory" as RAM. This buggy behavior can result in a
> > failure if there are other nodes in the device tree (at depth >= 2) with
> > "memory" as unit node name. An example can be a "memory@xxx" node under
> > /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
> > perform all the required checks to assess if a node is a proper /memory
> > node.
> > 
> > Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and
> > size")
> > Signed-off-by: Michal Orzel <michal.orzel@amd.com>
> > ---
> > 4.19: This patch is fixing a possible early boot Xen failure (before main
> > console is initialized). In my case it results in a warning "Shattering
> > superpage is not supported" and panic "Unable to setup the directmap
> > mappings".
> > 
> > If this is too late for this patch to go in, we can backport it after the
> > tree
> > re-opens.
> 
> The code looks correct to me, but I am not sure about merging it to 4.19.
> 
> This is not a new bug (in fact has been there since pretty much Xen on Arm was
> created) and we haven't seen any report until today. So in some way it would
> be best to merge it after 4.19 so it can get more testing.

First it was found on a new board, but then the issue appeared also on
an old board (the Ultrascale+). I think the reason is that a
reserved-memory node was added triggering the bug.


> In the other hand, I guess this will block you. Is this a new platform? Is it
> available?

Yes the platform is available so I would be more concerned about Xen
4.19 not booting on newer Ultrascale+ device trees. That said, we can
also backport the fix later to staging-4.19. I'll leave the decision to
you.
Michal Orzel June 27, 2024, 7:40 a.m. UTC | #4
Hi Julien,

On 26/06/2024 22:13, Julien Grall wrote:
> 
> 
> Hi Michal,
> 
> On 26/06/2024 09:04, Michal Orzel wrote:
>> Memory node probing is done as part of early_scan_node() that is called
>> for each node with depth >= 1 (root node is at depth 0). According to
>> Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
>> as a top level node. However, Xen incorrectly considers all the nodes with
>> unit node name "memory" as RAM. This buggy behavior can result in a
>> failure if there are other nodes in the device tree (at depth >= 2) with
>> "memory" as unit node name. An example can be a "memory@xxx" node under
>> /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
>> perform all the required checks to assess if a node is a proper /memory
>> node.
>>
>> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
>> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
>> ---
>> 4.19: This patch is fixing a possible early boot Xen failure (before main
>> console is initialized). In my case it results in a warning "Shattering
>> superpage is not supported" and panic "Unable to setup the directmap mappings".
>>
>> If this is too late for this patch to go in, we can backport it after the tree
>> re-opens.
> 
> The code looks correct to me, but I am not sure about merging it to 4.19.
> 
> This is not a new bug (in fact has been there since pretty much Xen on
> Arm was created) and we haven't seen any report until today. So in some
> way it would be best to merge it after 4.19 so it can get more testing.
> 
> In the other hand, I guess this will block you. Is this a new platform?
> Is it available?
Stefano answered this question. Also, IMO this change is quite straightforward
and does not introduce any engineering doubt, so I'm not really sure if it needs
more testing.

> 
>> ---
>>   xen/arch/arm/bootfdt.c | 28 +++++++++++++++++++++++++++-
>>   1 file changed, 27 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index 6e060111d95b..23c968e6955d 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -83,6 +83,32 @@ static bool __init device_tree_node_compatible(const void *fdt, int node,
>>       return false;
>>   }
>>
>> +/*
>> + * Check if a node is a proper /memory node according to Devicetree
>> + * Specification v0.4, chapter 3.4.
>> + */
>> +static bool __init device_tree_is_memory_node(const void *fdt, int node,
>> +                                              int depth)
>> +{
>> +    const char *type;
>> +    int len;
>> +
>> +    if ( depth != 1 )
>> +        return false;
>> +
>> +    if ( !device_tree_node_matches(fdt, node, "memory") )
>> +        return false;
>> +
>> +    type = fdt_getprop(fdt, node, "device_type", &len);
>> +    if ( !type )
>> +        return false;
>> +
>> +    if ( (len <= 0) || strcmp(type, "memory") )
> 
> I would consider to use strncmp() to avoid relying on the property to be
> well-formed (i.e. nul-terminated).
Are you sure? AFAIR, libfdt returns NULL and -FDT_ERR_TRUNCATED as len if a string is non null terminated.
Also, let's suppose it is somehow not terminated. In that case how could libfdt set len to be > 0?
It needs to know where is the end of the string to calculate the length.

> 
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>>   void __init device_tree_get_reg(const __be32 **cell, uint32_t address_cells,
>>                                   uint32_t size_cells, paddr_t *start,
>>                                   paddr_t *size)
>> @@ -448,7 +474,7 @@ static int __init early_scan_node(const void *fdt,
>>        * populated. So we should skip the parsing.
>>        */
>>       if ( !efi_enabled(EFI_BOOT) &&
>> -         device_tree_node_matches(fdt, node, "memory") )
>> +         device_tree_is_memory_node(fdt, node, depth) )
>>           rc = process_memory_node(fdt, node, name, depth,
>>                                    address_cells, size_cells, bootinfo_get_mem());
>>       else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
> 
> Cheers,
> 
> --
> Julien Grall

~Michal
Oleksii Kurochko June 27, 2024, 9:53 a.m. UTC | #5
On Wed, 2024-06-26 at 10:04 +0200, Michal Orzel wrote:
> Memory node probing is done as part of early_scan_node() that is
> called
> for each node with depth >= 1 (root node is at depth 0). According to
> Devicetree Specification v0.4, chapter 3.4, /memory node can only
> exists
> as a top level node. However, Xen incorrectly considers all the nodes
> with
> unit node name "memory" as RAM. This buggy behavior can result in a
> failure if there are other nodes in the device tree (at depth >= 2)
> with
> "memory" as unit node name. An example can be a "memory@xxx" node
> under
> /reserved-memory. Fix it by introducing device_tree_is_memory_node()
> to
> perform all the required checks to assess if a node is a proper
> /memory
> node.
> 
> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM
> location and size")
> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
> ---
> 4.19: This patch is fixing a possible early boot Xen failure (before
> main
> console is initialized). In my case it results in a warning
> "Shattering
> superpage is not supported" and panic "Unable to setup the directmap
> mappings".
> 
> If this is too late for this patch to go in, we can backport it after
> the tree
> re-opens.
So if we have a warning/panic and there is no random behaviour, I think
that it would be better to postpone until 4.20 branching.

~ Oleksii

> ---
>  xen/arch/arm/bootfdt.c | 28 +++++++++++++++++++++++++++-
>  1 file changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
> index 6e060111d95b..23c968e6955d 100644
> --- a/xen/arch/arm/bootfdt.c
> +++ b/xen/arch/arm/bootfdt.c
> @@ -83,6 +83,32 @@ static bool __init
> device_tree_node_compatible(const void *fdt, int node,
>      return false;
>  }
>  
> +/*
> + * Check if a node is a proper /memory node according to Devicetree
> + * Specification v0.4, chapter 3.4.
> + */
> +static bool __init device_tree_is_memory_node(const void *fdt, int
> node,
> +                                              int depth)
> +{
> +    const char *type;
> +    int len;
> +
> +    if ( depth != 1 )
> +        return false;
> +
> +    if ( !device_tree_node_matches(fdt, node, "memory") )
> +        return false;
> +
> +    type = fdt_getprop(fdt, node, "device_type", &len);
> +    if ( !type )
> +        return false;
> +
> +    if ( (len <= 0) || strcmp(type, "memory") )
> +        return false;
> +
> +    return true;
> +}
> +
>  void __init device_tree_get_reg(const __be32 **cell, uint32_t
> address_cells,
>                                  uint32_t size_cells, paddr_t *start,
>                                  paddr_t *size)
> @@ -448,7 +474,7 @@ static int __init early_scan_node(const void
> *fdt,
>       * populated. So we should skip the parsing.
>       */
>      if ( !efi_enabled(EFI_BOOT) &&
> -         device_tree_node_matches(fdt, node, "memory") )
> +         device_tree_is_memory_node(fdt, node, depth) )
>          rc = process_memory_node(fdt, node, name, depth,
>                                   address_cells, size_cells,
> bootinfo_get_mem());
>      else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )
Julien Grall June 27, 2024, 10:32 a.m. UTC | #6
On 27/06/2024 08:40, Michal Orzel wrote:
> Hi Julien,
> 
> On 26/06/2024 22:13, Julien Grall wrote:
>>
>>
>> Hi Michal,
>>
>> On 26/06/2024 09:04, Michal Orzel wrote:
>>> Memory node probing is done as part of early_scan_node() that is called
>>> for each node with depth >= 1 (root node is at depth 0). According to
>>> Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
>>> as a top level node. However, Xen incorrectly considers all the nodes with
>>> unit node name "memory" as RAM. This buggy behavior can result in a
>>> failure if there are other nodes in the device tree (at depth >= 2) with
>>> "memory" as unit node name. An example can be a "memory@xxx" node under
>>> /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
>>> perform all the required checks to assess if a node is a proper /memory
>>> node.
>>>
>>> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
>>> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
>>> ---
>>> 4.19: This patch is fixing a possible early boot Xen failure (before main
>>> console is initialized). In my case it results in a warning "Shattering
>>> superpage is not supported" and panic "Unable to setup the directmap mappings".
>>>
>>> If this is too late for this patch to go in, we can backport it after the tree
>>> re-opens.
>>
>> The code looks correct to me, but I am not sure about merging it to 4.19.
>>
>> This is not a new bug (in fact has been there since pretty much Xen on
>> Arm was created) and we haven't seen any report until today. So in some
>> way it would be best to merge it after 4.19 so it can get more testing.
>>
>> In the other hand, I guess this will block you. Is this a new platform?
>> Is it available?
> Stefano answered this question. Also, IMO this change is quite straightforward
> and does not introduce any engineering doubt, so I'm not really sure if it needs
> more testing.

At this stage of the release we should think whether the bug is critical 
enough (rather than the risk is low enough) to be merged. IMHO, it is 
not because this has been there for the past 12 years...

But if we focus on the "riskness". We are rightly changing an interface 
which possibly someone may have (bogusly) relied on. So there is a 
lowish risk that you may end up to break use-case late in the release 
(we are a couple of weeks away) for use-case that never worked in the 
past 12 years.

We should also think what the worse that can happen if there is a bug in 
your series. The worse is we would not be able to boot on already 
supported platform. This would be quite a bad regression this late in 
the release. For Device-Tree parsing, I don't think it is enough to just 
test on just a handful of platforms this late in the release.

Which is why to me the answer to "It should be in 4.19" is not 
straightforward. If we merge post 4.19, then we give the chance to 
people to test, update & adjust their setup if needed.

Anyway, ultimately this is Oleksii's decision as the release manager.

[...]

>>> +/*
>>> + * Check if a node is a proper /memory node according to Devicetree
>>> + * Specification v0.4, chapter 3.4.
>>> + */
>>> +static bool __init device_tree_is_memory_node(const void *fdt, int node,
>>> +                                              int depth)
>>> +{
>>> +    const char *type;
>>> +    int len;
>>> +
>>> +    if ( depth != 1 )
>>> +        return false;
>>> +
>>> +    if ( !device_tree_node_matches(fdt, node, "memory") )
>>> +        return false;
>>> +
>>> +    type = fdt_getprop(fdt, node, "device_type", &len);
>>> +    if ( !type )
>>> +        return false;
>>> +
>>> +    if ( (len <= 0) || strcmp(type, "memory") )
>>
>> I would consider to use strncmp() to avoid relying on the property to be
>> well-formed (i.e. nul-terminated).
> Are you sure? AFAIR, libfdt returns NULL and -FDT_ERR_TRUNCATED as len if a string is non null terminated.

I can't find such code in path. Do you have any pointer?

> Also, let's suppose it is somehow not terminated. In that case how could libfdt set len to be > 0?

The FDT will store the length of the property otherwise you would not be 
able to encode property that are just a list of cells (i.e. numbers).

> It needs to know where is the end of the string to calculate the length.

For the name and the description, it is unclear to why would 
fdt_getprop() would only work for string property.

Cheers,
Oleksii Kurochko June 27, 2024, 11:34 a.m. UTC | #7
On Thu, 2024-06-27 at 11:32 +0100, Julien Grall wrote:
> 
> 
> On 27/06/2024 08:40, Michal Orzel wrote:
> > Hi Julien,
> > 
> > On 26/06/2024 22:13, Julien Grall wrote:
> > > 
> > > 
> > > Hi Michal,
> > > 
> > > On 26/06/2024 09:04, Michal Orzel wrote:
> > > > Memory node probing is done as part of early_scan_node() that
> > > > is called
> > > > for each node with depth >= 1 (root node is at depth 0).
> > > > According to
> > > > Devicetree Specification v0.4, chapter 3.4, /memory node can
> > > > only exists
> > > > as a top level node. However, Xen incorrectly considers all the
> > > > nodes with
> > > > unit node name "memory" as RAM. This buggy behavior can result
> > > > in a
> > > > failure if there are other nodes in the device tree (at depth
> > > > >= 2) with
> > > > "memory" as unit node name. An example can be a "memory@xxx"
> > > > node under
> > > > /reserved-memory. Fix it by introducing
> > > > device_tree_is_memory_node() to
> > > > perform all the required checks to assess if a node is a proper
> > > > /memory
> > > > node.
> > > > 
> > > > Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM
> > > > location and size")
> > > > Signed-off-by: Michal Orzel <michal.orzel@amd.com>
> > > > ---
> > > > 4.19: This patch is fixing a possible early boot Xen failure
> > > > (before main
> > > > console is initialized). In my case it results in a warning
> > > > "Shattering
> > > > superpage is not supported" and panic "Unable to setup the
> > > > directmap mappings".
> > > > 
> > > > If this is too late for this patch to go in, we can backport it
> > > > after the tree
> > > > re-opens.
> > > 
> > > The code looks correct to me, but I am not sure about merging it
> > > to 4.19.
> > > 
> > > This is not a new bug (in fact has been there since pretty much
> > > Xen on
> > > Arm was created) and we haven't seen any report until today. So
> > > in some
> > > way it would be best to merge it after 4.19 so it can get more
> > > testing.
> > > 
> > > In the other hand, I guess this will block you. Is this a new
> > > platform?
> > > Is it available?
> > Stefano answered this question. Also, IMO this change is quite
> > straightforward
> > and does not introduce any engineering doubt, so I'm not really
> > sure if it needs
> > more testing.
> 
> At this stage of the release we should think whether the bug is
> critical 
> enough (rather than the risk is low enough) to be merged. IMHO, it is
> not because this has been there for the past 12 years...
> 
> But if we focus on the "riskness". We are rightly changing an
> interface 
> which possibly someone may have (bogusly) relied on. So there is a 
> lowish risk that you may end up to break use-case late in the release
> (we are a couple of weeks away) for use-case that never worked in the
> past 12 years.
> 
> We should also think what the worse that can happen if there is a bug
> in 
> your series. The worse is we would not be able to boot on already 
> supported platform. This would be quite a bad regression this late in
> the release. For Device-Tree parsing, I don't think it is enough to
> just 
> test on just a handful of platforms this late in the release.
> 
> Which is why to me the answer to "It should be in 4.19" is not 
> straightforward. If we merge post 4.19, then we give the chance to 
> people to test, update & adjust their setup if needed.
> 
> Anyway, ultimately this is Oleksii's decision as the release manager.
I agree with Julien, it would be better to postpone this patch series
until 4.20 branching.

~ Oleksii

> 
> [...]
> 
> > > > +/*
> > > > + * Check if a node is a proper /memory node according to
> > > > Devicetree
> > > > + * Specification v0.4, chapter 3.4.
> > > > + */
> > > > +static bool __init device_tree_is_memory_node(const void *fdt,
> > > > int node,
> > > > +                                              int depth)
> > > > +{
> > > > +    const char *type;
> > > > +    int len;
> > > > +
> > > > +    if ( depth != 1 )
> > > > +        return false;
> > > > +
> > > > +    if ( !device_tree_node_matches(fdt, node, "memory") )
> > > > +        return false;
> > > > +
> > > > +    type = fdt_getprop(fdt, node, "device_type", &len);
> > > > +    if ( !type )
> > > > +        return false;
> > > > +
> > > > +    if ( (len <= 0) || strcmp(type, "memory") )
> > > 
> > > I would consider to use strncmp() to avoid relying on the
> > > property to be
> > > well-formed (i.e. nul-terminated).
> > Are you sure? AFAIR, libfdt returns NULL and -FDT_ERR_TRUNCATED as
> > len if a string is non null terminated.
> 
> I can't find such code in path. Do you have any pointer?
> 
> > Also, let's suppose it is somehow not terminated. In that case how
> > could libfdt set len to be > 0?
> 
> The FDT will store the length of the property otherwise you would not
> be 
> able to encode property that are just a list of cells (i.e. numbers).
> 
> > It needs to know where is the end of the string to calculate the
> > length.
> 
> For the name and the description, it is unclear to why would 
> fdt_getprop() would only work for string property.
> 
> Cheers,
>
Michal Orzel June 27, 2024, 12:01 p.m. UTC | #8
On 27/06/2024 12:32, Julien Grall wrote:
> 
> 
> On 27/06/2024 08:40, Michal Orzel wrote:
>> Hi Julien,
>>
>> On 26/06/2024 22:13, Julien Grall wrote:
>>>
>>>
>>> Hi Michal,
>>>
>>> On 26/06/2024 09:04, Michal Orzel wrote:
>>>> Memory node probing is done as part of early_scan_node() that is called
>>>> for each node with depth >= 1 (root node is at depth 0). According to
>>>> Devicetree Specification v0.4, chapter 3.4, /memory node can only exists
>>>> as a top level node. However, Xen incorrectly considers all the nodes with
>>>> unit node name "memory" as RAM. This buggy behavior can result in a
>>>> failure if there are other nodes in the device tree (at depth >= 2) with
>>>> "memory" as unit node name. An example can be a "memory@xxx" node under
>>>> /reserved-memory. Fix it by introducing device_tree_is_memory_node() to
>>>> perform all the required checks to assess if a node is a proper /memory
>>>> node.
>>>>
>>>> Fixes: 3e99c95ba1c8 ("arm, device tree: parse the DTB for RAM location and size")
>>>> Signed-off-by: Michal Orzel <michal.orzel@amd.com>
>>>> ---
>>>> 4.19: This patch is fixing a possible early boot Xen failure (before main
>>>> console is initialized). In my case it results in a warning "Shattering
>>>> superpage is not supported" and panic "Unable to setup the directmap mappings".
>>>>
>>>> If this is too late for this patch to go in, we can backport it after the tree
>>>> re-opens.
>>>
>>> The code looks correct to me, but I am not sure about merging it to 4.19.
>>>
>>> This is not a new bug (in fact has been there since pretty much Xen on
>>> Arm was created) and we haven't seen any report until today. So in some
>>> way it would be best to merge it after 4.19 so it can get more testing.
>>>
>>> In the other hand, I guess this will block you. Is this a new platform?
>>> Is it available?
>> Stefano answered this question. Also, IMO this change is quite straightforward
>> and does not introduce any engineering doubt, so I'm not really sure if it needs
>> more testing.
> 
> At this stage of the release we should think whether the bug is critical
> enough (rather than the risk is low enough) to be merged. IMHO, it is
> not because this has been there for the past 12 years...
> 
> But if we focus on the "riskness". We are rightly changing an interface
> which possibly someone may have (bogusly) relied on. So there is a
> lowish risk that you may end up to break use-case late in the release
> (we are a couple of weeks away) for use-case that never worked in the
> past 12 years.
> 
> We should also think what the worse that can happen if there is a bug in
> your series. The worse is we would not be able to boot on already
> supported platform. This would be quite a bad regression this late in
> the release. For Device-Tree parsing, I don't think it is enough to just
> test on just a handful of platforms this late in the release.
> 
> Which is why to me the answer to "It should be in 4.19" is not
> straightforward. If we merge post 4.19, then we give the chance to
> people to test, update & adjust their setup if needed.
> 
> Anyway, ultimately this is Oleksii's decision as the release manager.
Ok, I agree with your reasoning.

> 
> [...]
> 
>>>> +/*
>>>> + * Check if a node is a proper /memory node according to Devicetree
>>>> + * Specification v0.4, chapter 3.4.
>>>> + */
>>>> +static bool __init device_tree_is_memory_node(const void *fdt, int node,
>>>> +                                              int depth)
>>>> +{
>>>> +    const char *type;
>>>> +    int len;
>>>> +
>>>> +    if ( depth != 1 )
>>>> +        return false;
>>>> +
>>>> +    if ( !device_tree_node_matches(fdt, node, "memory") )
>>>> +        return false;
>>>> +
>>>> +    type = fdt_getprop(fdt, node, "device_type", &len);
>>>> +    if ( !type )
>>>> +        return false;
>>>> +
>>>> +    if ( (len <= 0) || strcmp(type, "memory") )
>>>
>>> I would consider to use strncmp() to avoid relying on the property to be
>>> well-formed (i.e. nul-terminated).
>> Are you sure? AFAIR, libfdt returns NULL and -FDT_ERR_TRUNCATED as len if a string is non null terminated.
> 
> I can't find such code in path. Do you have any pointer?
I checked and I cannot find such code either. I made the wrong assumption thinking that fdt_getprop can only work with strings.
Therefore, I'm ok with changing s/strcmp/strncmp/ for hardening. Shall I respin the patch marking it as for-4.20?


~Michal
Julien Grall July 10, 2024, 9:22 a.m. UTC | #9
Hi Michal,

On 27/06/2024 13:01, Michal Orzel wrote:
>>>>> +/*
>>>>> + * Check if a node is a proper /memory node according to Devicetree
>>>>> + * Specification v0.4, chapter 3.4.
>>>>> + */
>>>>> +static bool __init device_tree_is_memory_node(const void *fdt, int node,
>>>>> +                                              int depth)
>>>>> +{
>>>>> +    const char *type;
>>>>> +    int len;
>>>>> +
>>>>> +    if ( depth != 1 )
>>>>> +        return false;
>>>>> +
>>>>> +    if ( !device_tree_node_matches(fdt, node, "memory") )
>>>>> +        return false;
>>>>> +
>>>>> +    type = fdt_getprop(fdt, node, "device_type", &len);
>>>>> +    if ( !type )
>>>>> +        return false;
>>>>> +
>>>>> +    if ( (len <= 0) || strcmp(type, "memory") )
>>>>
>>>> I would consider to use strncmp() to avoid relying on the property to be
>>>> well-formed (i.e. nul-terminated).
>>> Are you sure? AFAIR, libfdt returns NULL and -FDT_ERR_TRUNCATED as len if a string is non null terminated.
>>
>> I can't find such code in path. Do you have any pointer?
> I checked and I cannot find such code either. I made the wrong assumption thinking that fdt_getprop can only work with strings.
> Therefore, I'm ok with changing s/strcmp/strncmp/ for hardening. Shall I respin the patch marking it as for-4.20?

Sorry this e-mail fell through the cracks. Yes please. I need to create 
a branch to collect all simple patches for 4.20.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 6e060111d95b..23c968e6955d 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -83,6 +83,32 @@  static bool __init device_tree_node_compatible(const void *fdt, int node,
     return false;
 }
 
+/*
+ * Check if a node is a proper /memory node according to Devicetree
+ * Specification v0.4, chapter 3.4.
+ */
+static bool __init device_tree_is_memory_node(const void *fdt, int node,
+                                              int depth)
+{
+    const char *type;
+    int len;
+
+    if ( depth != 1 )
+        return false;
+
+    if ( !device_tree_node_matches(fdt, node, "memory") )
+        return false;
+
+    type = fdt_getprop(fdt, node, "device_type", &len);
+    if ( !type )
+        return false;
+
+    if ( (len <= 0) || strcmp(type, "memory") )
+        return false;
+
+    return true;
+}
+
 void __init device_tree_get_reg(const __be32 **cell, uint32_t address_cells,
                                 uint32_t size_cells, paddr_t *start,
                                 paddr_t *size)
@@ -448,7 +474,7 @@  static int __init early_scan_node(const void *fdt,
      * populated. So we should skip the parsing.
      */
     if ( !efi_enabled(EFI_BOOT) &&
-         device_tree_node_matches(fdt, node, "memory") )
+         device_tree_is_memory_node(fdt, node, depth) )
         rc = process_memory_node(fdt, node, name, depth,
                                  address_cells, size_cells, bootinfo_get_mem());
     else if ( depth == 1 && !dt_node_cmp(name, "reserved-memory") )