Message ID | 6b2db92de764b6031647926d27cb14dd455eff7d.1704809355.git.javi.merino@cloud.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [for-4.18,v1] xen/common: Don't dereference overlay_node after checking that it is NULL | expand |
Hi Javi, Title: Any reason this is titled for-4.18? Shouldn't this patch also be merged in staging? On 09/01/2024 14:19, Javi Merino wrote: > In remove_nodes(), overlay_node is dereferenced when printing the > error message even though it is known to be NULLL. Fix the error Typo: s/NULLL/NULL/ This can be fixed on commit if there is nothing else. > message to avoid dereferencing a NULL pointer. > > The semantic patch that spots this code is available in > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/null/deref_null.cocci?id=1f874787ed9a2d78ed59cb21d0d90ac0178eceb0 Good catch and glad to see that coccinelle can work on Xen. I am looking forward for more work in that area :). > > Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities") > Signed-off-by: Javi Merino <javi.merino@cloud.com> c> --- > CC: Vikram Garhwal <vikram.garhwal@amd.com> > > Vikram, I didn't know what to put in the error message. Feel free to > suggest something more appropriate than "Device not present in the > tree". More questions for Vikram, looking at the code, it is not 100% clear in which condition overlay_node could be NULL. Is this a programming error? if so, maybe this should be an ASSERT_UNREACHABLE() (could be added separately) and it would be fine to print nothing. Cheers,
On Tue, Jan 09, 2024 at 03:31:55PM +0000, Julien Grall wrote: > Hi Javi, > > Title: Any reason this is titled for-4.18? Shouldn't this patch also be > merged in staging? This is for staging and 4.18. If the tag "for-4.18" meant "this is only for 4.18", then that's not what I meant. Sorry for that. > On 09/01/2024 14:19, Javi Merino wrote: > > In remove_nodes(), overlay_node is dereferenced when printing the > > error message even though it is known to be NULLL. Fix the error > > Typo: s/NULLL/NULL/ > > This can be fixed on commit if there is nothing else. > > > message to avoid dereferencing a NULL pointer. > > > > The semantic patch that spots this code is available in > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/null/deref_null.cocci?id=1f874787ed9a2d78ed59cb21d0d90ac0178eceb0 > > Good catch and glad to see that coccinelle can work on Xen. I am looking > forward for more work in that area :). I'm working on a series to add a "make coccicheck" to Xen, similar to the coccicheck in the kernel. All the other semantic patches that I think are relevant to us only give false positives, but I think it would still be beneficial to have them. Cheers, Javi
Hi Javi, On 09/01/2024 15:42, Javi Merino wrote: > On Tue, Jan 09, 2024 at 03:31:55PM +0000, Julien Grall wrote: >> Hi Javi, >> >> Title: Any reason this is titled for-4.18? Shouldn't this patch also be >> merged in staging? > > This is for staging and 4.18. If the tag "for-4.18" meant "this is > only for 4.18", then that's not what I meant. Sorry for that. We usually use "for-4.XX" during code freeze to indicate whether a patch should be part of the upcoming relase of the next week. Hence my confusion. Outside of the code freeze, we sometimes add the tag "Backport: 4.XX+" just above the Signed-off-by in addition to the Fixes tag to indicate how far the backport should go. The Fixes tag is also sufficient. As a side node, this is fixing experimental code. So in general we would not backport such patch (we only do backport for supported features). This is because there are no guarantee that an experimental would not crash Xen. Although, the tag is still useful for downstream that may have decided to take the patch in (I think AMD/Xilinx is using them) and accept some of the risks. Stefano is the person doing the backport for Arm. So I will let him decide whether he wants to backport it. Cheers,
On Tue, Jan 09, 2024 at 03:50:38PM +0000, Julien Grall wrote: > Hi Javi, > > On 09/01/2024 15:42, Javi Merino wrote: > > On Tue, Jan 09, 2024 at 03:31:55PM +0000, Julien Grall wrote: > > > Hi Javi, > > > > > > Title: Any reason this is titled for-4.18? Shouldn't this patch also be > > > merged in staging? > > > > This is for staging and 4.18. If the tag "for-4.18" meant "this is > > only for 4.18", then that's not what I meant. Sorry for that. > > We usually use "for-4.XX" during code freeze to indicate whether a patch > should be part of the upcoming relase of the next week. Hence my confusion. Ok, I know for next time. Thanks for the clarification. > Outside of the code freeze, we sometimes add the tag "Backport: 4.XX+" just > above the Signed-off-by in addition to the Fixes tag to indicate how far the > backport should go. The Fixes tag is also sufficient. > > As a side node, this is fixing experimental code. So in general we would not > backport such patch (we only do backport for supported features). This is > because there are no guarantee that an experimental would not crash Xen. > > Although, the tag is still useful for downstream that may have decided to > take the patch in (I think AMD/Xilinx is using them) and accept some of the > risks. > > Stefano is the person doing the backport for Arm. So I will let him decide > whether he wants to backport it. Fair enough. Cheers, Javi
Hi Javi, Thank you for spotting and fixing this. On Tue, Jan 09, 2024 at 03:31:55PM +0000, Julien Grall wrote: > Hi Javi, > > Title: Any reason this is titled for-4.18? Shouldn't this patch also be > merged in staging? > > On 09/01/2024 14:19, Javi Merino wrote: > > In remove_nodes(), overlay_node is dereferenced when printing the > > error message even though it is known to be NULLL. Fix the error > > Typo: s/NULLL/NULL/ > > This can be fixed on commit if there is nothing else. > > > message to avoid dereferencing a NULL pointer. > > > > The semantic patch that spots this code is available in > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/null/deref_null.cocci?id=1f874787ed9a2d78ed59cb21d0d90ac0178eceb0 > > Good catch and glad to see that coccinelle can work on Xen. I am looking > forward for more work in that area :). > > > > > Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities") > > Signed-off-by: Javi Merino <javi.merino@cloud.com> > c> --- > > CC: Vikram Garhwal <vikram.garhwal@amd.com> > > > > Vikram, I didn't know what to put in the error message. Feel free to > > suggest something more appropriate than "Device not present in the > > tree". > > More questions for Vikram, looking at the code, it is not 100% clear in > which condition overlay_node could be NULL. Is this a programming error? if > so, maybe this should be an ASSERT_UNREACHABLE() (could be added separately) > and it would be fine to print nothing. > This can happen with failures in add_nodes() function. add_nodes() failure will try to call remove_nodes function. Depending on where add_nodes() is failed, nodes_address may or may not be NULL. We also added a detailed comment on this: https://github.com/xen-project/xen/blob/5a3ace21f3d779b291a2d305824b2820d88de7f1/xen/common/dt-overlay.c#L816 For now, we can return from here without printing anything as error message will be printed by the caller of remove_nodes() anyway. > Cheers, > > -- > Julien Grall
On Wed, Jan 10, 2024 at 12:25:57PM -0800, Vikram Garhwal wrote: > Hi Javi, > Thank you for spotting and fixing this. Hi Vikram, > On Tue, Jan 09, 2024 at 03:31:55PM +0000, Julien Grall wrote: > > On 09/01/2024 14:19, Javi Merino wrote: > > > In remove_nodes(), overlay_node is dereferenced when printing the > > > error message even though it is known to be NULLL. Fix the error > > > > Typo: s/NULLL/NULL/ > > > > This can be fixed on commit if there is nothing else. > > > > > message to avoid dereferencing a NULL pointer. > > > > > > The semantic patch that spots this code is available in > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/null/deref_null.cocci?id=1f874787ed9a2d78ed59cb21d0d90ac0178eceb0 > > > > Good catch and glad to see that coccinelle can work on Xen. I am looking > > forward for more work in that area :). > > > > > > > > Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities") > > > Signed-off-by: Javi Merino <javi.merino@cloud.com> > > c> --- > > > CC: Vikram Garhwal <vikram.garhwal@amd.com> > > > > > > Vikram, I didn't know what to put in the error message. Feel free to > > > suggest something more appropriate than "Device not present in the > > > tree". > > > > More questions for Vikram, looking at the code, it is not 100% clear in > > which condition overlay_node could be NULL. Is this a programming error? if > > so, maybe this should be an ASSERT_UNREACHABLE() (could be added separately) > > and it would be fine to print nothing. > > > This can happen with failures in add_nodes() function. add_nodes() failure will > try to call remove_nodes function. Depending on where add_nodes() is failed, > nodes_address may or may not be NULL. > > We also added a detailed comment on this: > https://github.com/xen-project/xen/blob/5a3ace21f3d779b291a2d305824b2820d88de7f1/xen/common/dt-overlay.c#L816 > > For now, we can return from here without printing anything as error message will > be printed by the caller of remove_nodes() anyway. Ok, I'll send a v2 without the printk and add this explanation to the commit message. Thanks! Javi
diff --git a/xen/common/dt-overlay.c b/xen/common/dt-overlay.c index 5663a049e90a..f04b0c276eea 100644 --- a/xen/common/dt-overlay.c +++ b/xen/common/dt-overlay.c @@ -428,8 +428,7 @@ static int remove_nodes(const struct overlay_track *tracker) overlay_node = (struct dt_device_node *)tracker->nodes_address[j]; if ( overlay_node == NULL ) { - printk(XENLOG_ERR "Device %s is not present in the tree. Removing nodes failed\n", - overlay_node->full_name); + printk(XENLOG_ERR "Device not present in the tree. Removing nodes failed\n"); return -EINVAL; }
In remove_nodes(), overlay_node is dereferenced when printing the error message even though it is known to be NULLL. Fix the error message to avoid dereferencing a NULL pointer. The semantic patch that spots this code is available in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/coccinelle/null/deref_null.cocci?id=1f874787ed9a2d78ed59cb21d0d90ac0178eceb0 Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities") Signed-off-by: Javi Merino <javi.merino@cloud.com> --- CC: Vikram Garhwal <vikram.garhwal@amd.com> Vikram, I didn't know what to put in the error message. Feel free to suggest something more appropriate than "Device not present in the tree". --- xen/common/dt-overlay.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)