Message ID | 20220523171559.472112-1-cristian.marussi@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list | expand |
Hello Cristian, On Mon, 23 May 2022 at 19:17, Cristian Marussi <cristian.marussi@arm.com> wrote: > > Even though malformed replies from firmware must be treated carefully to > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > be tolerated to avoid breaking existing deployed system, as long as they > won't cause memory issues. > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > Cc: Etienne Carriere <etienne.carriere@linaro.org> > Cc: Sudeep Holla <sudeep.holla@arm.com> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Acked-by: Etienne Carriere <etienne.carriere@linaro.org> Best regards, etienne > --- > drivers/firmware/arm_scmi/base.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c > index 20fba7370f4e..d0ac96da1ddf 100644 > --- a/drivers/firmware/arm_scmi/base.c > +++ b/drivers/firmware/arm_scmi/base.c > @@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph, > calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) * > sizeof(u32); > if (calc_list_sz != real_list_sz) { > - dev_err(dev, > - "Malformed reply - real_sz:%zd calc_sz:%u\n", > - real_list_sz, calc_list_sz); > - ret = -EPROTO; > - break; > + dev_warn(dev, > + "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n", > + real_list_sz, calc_list_sz, loop_num_ret); > + /* > + * Bail out if the expected list size is bigger than the > + * total payload size of the received reply. > + */ > + if (calc_list_sz > real_list_sz) { > + ret = -EPROTO; > + break; > + } > } > > for (loop = 0; loop < loop_num_ret; loop++) > -- > 2.36.1 >
Hi Cristian, On 5/23/22 19:15, Cristian Marussi wrote: > Even though malformed replies from firmware must be treated carefully to > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > be tolerated to avoid breaking existing deployed system, as long as they > won't cause memory issues. > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > Cc: Etienne Carriere <etienne.carriere@linaro.org> > Cc: Sudeep Holla <sudeep.holla@arm.com> > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 EVB1 in vanilla v5.19-rc1. I guess this patch should have a Fixes: tag, right? Would be great to have this in v5.19. AFAIC: Acked-by: Michael Riesch <michael.riesch@wolfvision.net> Best regards, Michael > --- > drivers/firmware/arm_scmi/base.c | 16 +++++++++++----- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c > index 20fba7370f4e..d0ac96da1ddf 100644 > --- a/drivers/firmware/arm_scmi/base.c > +++ b/drivers/firmware/arm_scmi/base.c > @@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph, > calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) * > sizeof(u32); > if (calc_list_sz != real_list_sz) { > - dev_err(dev, > - "Malformed reply - real_sz:%zd calc_sz:%u\n", > - real_list_sz, calc_list_sz); > - ret = -EPROTO; > - break; > + dev_warn(dev, > + "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n", > + real_list_sz, calc_list_sz, loop_num_ret); > + /* > + * Bail out if the expected list size is bigger than the > + * total payload size of the received reply. > + */ > + if (calc_list_sz > real_list_sz) { > + ret = -EPROTO; > + break; > + } > } > > for (loop = 0; loop < loop_num_ret; loop++)
On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote: > Hi Cristian, > Hi Michael, > On 5/23/22 19:15, Cristian Marussi wrote: > > Even though malformed replies from firmware must be treated carefully to > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > > be tolerated to avoid breaking existing deployed system, as long as they > > won't cause memory issues. > > > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > > Cc: Etienne Carriere <etienne.carriere@linaro.org> > > Cc: Sudeep Holla <sudeep.holla@arm.com> > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 > EVB1 in vanilla v5.19-rc1. > Yes, the break was reported on -next and today it appeared in 5.19-rc1. A proper FW fix is also up for review by Etienne but in the meantime this tries to limit damages relaxing a bit the checks. > I guess this patch should have a Fixes: tag, right? > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it will be solved within the v5.19 cycle and I thought the Fixes tag was not needed in this case (I could be wrong...) > Would be great to have this in v5.19. AFAIC: > > Acked-by: Michael Riesch <michael.riesch@wolfvision.net> Thanks for testing it. Cristian
Hi > Gesendet: Montag, 06. Juni 2022 um 14:59 Uhr > Von: "Michael Riesch" <michael.riesch@wolfvision.net> > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 > EVB1 in vanilla v5.19-rc1. > > I guess this patch should have a Fixes: tag, right? > > Would be great to have this in v5.19. AFAIC: > > Acked-by: Michael Riesch <michael.riesch@wolfvision.net> Thanks for the Patch, had also this on my rk3568 bpi-r2 pro board: panfrost fde60000.gpu: get clock failed -517 fixed by this patch Tested-By: Frank Wunderlich <frank-w@public-files.de> regards Frank
On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote: > On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote: > > Hi Cristian, > > > Hi Michael, > > > On 5/23/22 19:15, Cristian Marussi wrote: > > > Even though malformed replies from firmware must be treated carefully to > > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > > > be tolerated to avoid breaking existing deployed system, as long as they > > > won't cause memory issues. > > > > > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > > > Cc: Etienne Carriere <etienne.carriere@linaro.org> > > > Cc: Sudeep Holla <sudeep.holla@arm.com> > > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > > > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 > > EVB1 in vanilla v5.19-rc1. > > > > Yes, the break was reported on -next and today it appeared in 5.19-rc1. > A proper FW fix is also up for review by Etienne but in the meantime > this tries to limit damages relaxing a bit the checks. > > > I guess this patch should have a Fixes: tag, right? > > > > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the > fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it > will be solved within the v5.19 cycle and I thought the Fixes tag was > not needed in this case (I could be wrong...) Correct, if for some reason, we can't push this before v5.19, then fixes tag needs to added. I will add that then, but for now let us target getting it in before v5.19 -- Regards, Sudeep
On Mon, 23 May 2022 18:15:59 +0100, Cristian Marussi wrote: > Even though malformed replies from firmware must be treated carefully to > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > be tolerated to avoid breaking existing deployed system, as long as they > won't cause memory issues. > Applied to sudeep.holla/linux (for-next/scmi), thanks! [1/1] firmware: arm_scmi: Relax BASE protocol sanity checks on protocol list https://git.kernel.org/sudeep.holla/c/122839b58a -- Regards, Sudeep
Am Montag, 6. Juni 2022, 16:43:05 CEST schrieb Sudeep Holla: > On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote: > > On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote: > > > Hi Cristian, > > > > > Hi Michael, > > > > > On 5/23/22 19:15, Cristian Marussi wrote: > > > > Even though malformed replies from firmware must be treated carefully to > > > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > > > > be tolerated to avoid breaking existing deployed system, as long as they > > > > won't cause memory issues. > > > > > > > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > > > > Cc: Etienne Carriere <etienne.carriere@linaro.org> > > > > Cc: Sudeep Holla <sudeep.holla@arm.com> > > > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > > > > > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 > > > EVB1 in vanilla v5.19-rc1. > > > > > > > Yes, the break was reported on -next and today it appeared in 5.19-rc1. > > A proper FW fix is also up for review by Etienne but in the meantime > > this tries to limit damages relaxing a bit the checks. > > > > > I guess this patch should have a Fixes: tag, right? > > > > > > > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the > > fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it > > will be solved within the v5.19 cycle and I thought the Fixes tag was > > not needed in this case (I could be wrong...) > > Correct, if for some reason, we can't push this before v5.19, then fixes > tag needs to added. I will add that then, but for now let us target getting > it in before v5.19 hmm, I'd disagree for the generalization. While true that is not 100% necessary to be present in all cases, so definitly no reason for a new version when applied to the same -rc series, having the Fixes tag not only clearly marks the patch as such, but also allows people reading either mailing lists or the later the git history to actually see where the issue started. So I really think it is a nice-to-have in most cases. Heiko
On Mon, Jun 06, 2022 at 04:55:10PM +0200, Heiko Stübner wrote: > Am Montag, 6. Juni 2022, 16:43:05 CEST schrieb Sudeep Holla: > > On Mon, Jun 06, 2022 at 02:31:48PM +0100, Cristian Marussi wrote: > > > On Mon, Jun 06, 2022 at 02:59:10PM +0200, Michael Riesch wrote: > > > > Hi Cristian, > > > > > > > Hi Michael, > > > > > > > On 5/23/22 19:15, Cristian Marussi wrote: > > > > > Even though malformed replies from firmware must be treated carefully to > > > > > avoid memory corruption Kernel side, some out-of-spec SCMI replies can > > > > > be tolerated to avoid breaking existing deployed system, as long as they > > > > > won't cause memory issues. > > > > > > > > > > Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> > > > > > Cc: Etienne Carriere <etienne.carriere@linaro.org> > > > > > Cc: Sudeep Holla <sudeep.holla@arm.com> > > > > > Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> > > > > > > > > Thanks a lot, without this fix the Mali G52 GPU won't probe on my RK3568 > > > > EVB1 in vanilla v5.19-rc1. > > > > > > > > > > Yes, the break was reported on -next and today it appeared in 5.19-rc1. > > > A proper FW fix is also up for review by Etienne but in the meantime > > > this tries to limit damages relaxing a bit the checks. > > > > > > > I guess this patch should have a Fixes: tag, right? > > > > > > > > > > It has not a Fixes tag because the issue was introduced in 5.19-rc1 and the > > > fix will go in with the next round of v5.19 fixes by Sudeep (AFAIU) so it > > > will be solved within the v5.19 cycle and I thought the Fixes tag was > > > not needed in this case (I could be wrong...) > > > > Correct, if for some reason, we can't push this before v5.19, then fixes > > tag needs to added. I will add that then, but for now let us target getting > > it in before v5.19 > > hmm, I'd disagree for the generalization. > > While true that is not 100% necessary to be present in all cases, so > definitly no reason for a new version when applied to the same -rc series, > having the Fixes tag not only clearly marks the patch as such, but also > allows people reading either mailing lists or the later the git history > to actually see where the issue started. So I really think it is a > nice-to-have in most cases. Absolutely agreed in general. But I asked Cristian to not add as we are work around the firmware bug by relaxing the checks in the kernel. This is not fixing anything in the original commit IMO, it is just that the original commit highlighted the firmware issue on that system.
diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c index 20fba7370f4e..d0ac96da1ddf 100644 --- a/drivers/firmware/arm_scmi/base.c +++ b/drivers/firmware/arm_scmi/base.c @@ -221,11 +221,17 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph, calc_list_sz = (1 + (loop_num_ret - 1) / sizeof(u32)) * sizeof(u32); if (calc_list_sz != real_list_sz) { - dev_err(dev, - "Malformed reply - real_sz:%zd calc_sz:%u\n", - real_list_sz, calc_list_sz); - ret = -EPROTO; - break; + dev_warn(dev, + "Malformed reply - real_sz:%zd calc_sz:%u (loop_num_ret:%d)\n", + real_list_sz, calc_list_sz, loop_num_ret); + /* + * Bail out if the expected list size is bigger than the + * total payload size of the received reply. + */ + if (calc_list_sz > real_list_sz) { + ret = -EPROTO; + break; + } } for (loop = 0; loop < loop_num_ret; loop++)
Even though malformed replies from firmware must be treated carefully to avoid memory corruption Kernel side, some out-of-spec SCMI replies can be tolerated to avoid breaking existing deployed system, as long as they won't cause memory issues. Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> Cc: Etienne Carriere <etienne.carriere@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> --- drivers/firmware/arm_scmi/base.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)