Message ID | 20240103-topic-battmgr2-v2-1-c07b9206a2a5@linaro.org (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | [v2] power: supply: qcom_battmgr: Ignore notifications before initialization |
On 03/01/2024 13:36, Konrad Dybcio wrote:
> Commit b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power
> supplies after PDR is up") moved the devm_power_supply_register() calls
> so that the power supply devices are not registered before we go through
> the entire initialization sequence (power up the ADSP remote processor,
> wait for it to come online, coordinate with userspace..).
>
> Some firmware versions (e.g. on SM8550) seem to leave battmgr at least
> partly initialized when exiting the bootloader and loading Linux. Check
> if the power supply devices are registered before consuming the battmgr
> notifications.
>
> Fixes: b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power supplies after PDR is up")
> Reported-by: Xilin Wu <wuxilin123@gmail.com>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
> Changes in v2:
> - Fix the commit title
> - Link to v1: https://lore.kernel.org/linux-arm-msm/d9cf7d9d-60d9-4637-97bf-c9840452899e@linaro.org/T/#t
> ---
>  drivers/power/supply/qcom_battmgr.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/power/supply/qcom_battmgr.c b/drivers/power/supply/qcom_battmgr.c
> index a12e2a66d516..7d85292eb839 100644
> --- a/drivers/power/supply/qcom_battmgr.c
> +++ b/drivers/power/supply/qcom_battmgr.c
> @@ -1271,6 +1271,10 @@ static void qcom_battmgr_callback(const void *data, size_t len, void *priv)
>  	struct qcom_battmgr *battmgr = priv;
>  	unsigned int opcode = le32_to_cpu(hdr->opcode);
>
> +	/* Ignore the pings that come before Linux cleanly initializes the battmgr stack */
> +	if (!battmgr->bat_psy)
> +		return;
> +
>  	if (opcode == BATTMGR_NOTIFICATION)
>  		qcom_battmgr_notification(battmgr, data, len);
>  	else if (battmgr->variant == QCOM_BATTMGR_SC8280XP)
>
> ---
> base-commit: 0fef202ac2f8e6d9ad21aead648278f1226b9053
> change-id: 20240103-topic-battmgr2-15c17fac6d35
>
> Best regards,

Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD

On Wed, Jan 03, 2024 at 01:36:08PM +0100, Konrad Dybcio wrote:
> Commit b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power
> supplies after PDR is up") moved the devm_power_supply_register() calls
> so that the power supply devices are not registered before we go through
> the entire initialization sequence (power up the ADSP remote processor,
> wait for it to come online, coordinate with userspace..).
>
> Some firmware versions (e.g. on SM8550) seem to leave battmgr at least
> partly initialized when exiting the bootloader and loading Linux. Check
> if the power supply devices are registered before consuming the battmgr
> notifications.

So this clearly was not tested properly as the offending commit breaks
both the Lenovo ThinkPad X13s and the SC8280XP CRD.

I spent some time this afternoon tracking down and considering the best
way to address this before I checked lore and found this proposed fix
(why was I not CCed?).

> Fixes: b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power supplies after PDR is up")
> Reported-by: Xilin Wu <wuxilin123@gmail.com>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
> Changes in v2:
> - Fix the commit title
> - Link to v1: https://lore.kernel.org/linux-arm-msm/d9cf7d9d-60d9-4637-97bf-c9840452899e@linaro.org/T/#t
> ---
>  drivers/power/supply/qcom_battmgr.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/power/supply/qcom_battmgr.c b/drivers/power/supply/qcom_battmgr.c
> index a12e2a66d516..7d85292eb839 100644
> --- a/drivers/power/supply/qcom_battmgr.c
> +++ b/drivers/power/supply/qcom_battmgr.c
> @@ -1271,6 +1271,10 @@ static void qcom_battmgr_callback(const void *data, size_t len, void *priv)
>  	struct qcom_battmgr *battmgr = priv;
>  	unsigned int opcode = le32_to_cpu(hdr->opcode);
>
> +	/* Ignore the pings that come before Linux cleanly initializes the battmgr stack */

Nit: I know you have a wide-screen monitor but please follow the coding
style and break your lines at 80 columns for readability. ;)

> +	if (!battmgr->bat_psy)
> +		return;

This is not a proper fix. You register 3-4 class devices and only check
one. Even if you checked the last one, there's no locking or barriers
in place to prevent this from breaking.

Deferred registration of the class devices also risks missing
notifications as you'll be spending time on registration after the
service has gone live.

I'm sure all of this can be handled but as it is non-trivial and the
motivation for the offending commit is questionable to begin with, I
suggest reverting for now.

I'll send a revert for Sebastian to consider.

> +
>  	if (opcode == BATTMGR_NOTIFICATION)
>  		qcom_battmgr_notification(battmgr, data, len);
>  	else if (battmgr->variant == QCOM_BATTMGR_SC8280XP)

Johan

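Editorial sketch of the "locking or barriers" point above. This is a userspace C11 analogue, not driver code: `fake_battmgr`, `fake_register_all()` and `fake_callback()` are hypothetical names, and three supplies are assumed only for illustration. It shows one way a callback could refuse work until every class device pointer is set, by publishing a single readiness flag with release semantics after the last registration and reading it with acquire semantics. In the kernel the corresponding primitives would presumably be smp_store_release() and smp_load_acquire(). It does not address the separate concern about notifications missed while registration is still in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SUPPLIES 3	/* assumption for this sketch only */

struct fake_battmgr {
	void *psy[NUM_SUPPLIES];	/* stand-ins for the class device pointers */
	atomic_bool ready;		/* set only after ALL supplies are registered */
	int handled;			/* notifications consumed */
	int dropped;			/* notifications ignored as "too early" */
};

static int dummy_dev[NUM_SUPPLIES];

/*
 * Register every supply; fail_at selects an index that fails
 * (pass NUM_SUPPLIES for "no failure"). The flag is published with
 * release ordering so the pointer stores above it are visible to any
 * reader that observes ready == true.
 */
static bool fake_register_all(struct fake_battmgr *b, size_t fail_at)
{
	for (size_t i = 0; i < NUM_SUPPLIES; i++) {
		if (i == fail_at)
			return false;	/* partial registration: never publish */
		b->psy[i] = &dummy_dev[i];
	}
	atomic_store_explicit(&b->ready, true, memory_order_release);
	return true;
}

/* Notification path: check the one flag, not an individual pointer. */
static void fake_callback(struct fake_battmgr *b)
{
	if (!atomic_load_explicit(&b->ready, memory_order_acquire)) {
		b->dropped++;
		return;
	}
	/* After the acquire load, every pointer is guaranteed to be set. */
	for (size_t i = 0; i < NUM_SUPPLIES; i++)
		assert(b->psy[i] != NULL);
	b->handled++;
}
```

Checking only the first pointer, as the patch does with bat_psy, corresponds to dropping the flag and testing `psy[0]`, which says nothing about `psy[1]` and `psy[2]`.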
On 1/23/24 16:59, Johan Hovold wrote:
> On Wed, Jan 03, 2024 at 01:36:08PM +0100, Konrad Dybcio wrote:
>> Commit b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power
>> supplies after PDR is up") moved the devm_power_supply_register() calls
>> so that the power supply devices are not registered before we go through
>> the entire initialization sequence (power up the ADSP remote processor,
>> wait for it to come online, coordinate with userspace..).
>>
>> Some firmware versions (e.g. on SM8550) seem to leave battmgr at least
>> partly initialized when exiting the bootloader and loading Linux. Check
>> if the power supply devices are registered before consuming the battmgr
>> notifications.
>
> So this clearly was not tested properly as the offending commit breaks
> both the Lenovo ThinkPad X13s and the SC8280XP CRD.
>
> I spent some time this afternoon tracking down and considering the best
> way to address this before I checked lore and found this proposed fix
> (why was I not CCed?).

I didn't give the offending commit a spin on the laptops, as I simply
assumed the interface is generic enough to behave similarly across
the platforms. With this, I didn't imagine the DSP firmwares aren't
unloaded on these..

[...]

>
>> +	if (!battmgr->bat_psy)
>> +		return;
>
> This is not a proper fix. You register 3-4 class devices and only check
> one. Even if you checked the last one, there's no locking or barriers
> in place to prevent this from breaking.
>
> Deferred registration of the class devices also risks missing
> notifications as you'll be spending time on registration after the
> service has gone live.
>
> I'm sure all of this can be handled but as it is non-trivial and the
> motivation for the offending commit is questionable to begin with, I
> suggest reverting for now.
>
> I'll send a revert for Sebastian to consider.

What you're saying is valid, but a "battery" device is always expected
to be present. If devm_power_supply_register fails, things would go very
south very fast anyway.

I personally don't see this being a terribly bad fix, but I'm open to
different propositions.

Konrad

On Tue, Jan 23, 2024 at 06:53:46PM +0100, Konrad Dybcio wrote:
> On 1/23/24 16:59, Johan Hovold wrote:
> > On Wed, Jan 03, 2024 at 01:36:08PM +0100, Konrad Dybcio wrote:
> >> Commit b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power
> >> supplies after PDR is up") moved the devm_power_supply_register() calls
> >> so that the power supply devices are not registered before we go through
> >> the entire initialization sequence (power up the ADSP remote processor,
> >> wait for it to come online, coordinate with userspace..).
> >>
> >> Some firmware versions (e.g. on SM8550) seem to leave battmgr at least
> >> partly initialized when exiting the bootloader and loading Linux. Check
> >> if the power supply devices are registered before consuming the battmgr
> >> notifications.

> >> +	if (!battmgr->bat_psy)
> >> +		return;
> >
> > This is not a proper fix. You register 3-4 class devices and only check
> > one. Even if you checked the last one, there's no locking or barriers
> > in place to prevent this from breaking.
> >
> > Deferred registration of the class devices also risks missing
> > notifications as you'll be spending time on registration after the
> > service has gone live.
> >
> > I'm sure all of this can be handled but as it is non-trivial and the
> > motivation for the offending commit is questionable to begin with, I
> > suggest reverting for now.
> >
> > I'll send a revert for Sebastian to consider.
>
> What you're saying is valid, but a "battery" device is always expected
> to be present.

Yes, but that's not the point. battmgr->bat_psy is the first class
device pointer to be initialised, but that being set does not mean that
the other pointers are not still NULL when you hit this callback.

> If devm_power_supply_register fails, things would go very
> south very fast anyway.

Eh, no. Before the offending commit, if registration fails, we bail out
from probe() before registering the PMIC GLINK client (and callbacks) so
all is good.

That is no longer the case since b43f7ddc2b7a ("power: supply:
qcom_battmgr: Register the power supplies after PDR is up") which
happily ignores errors and could theoretically result in all but the
first class device being registered leading to further NULL derefs on
notifications.

I could have pointed this out in the commit message for the revert.

> I personally don't see this being a terribly bad fix, but I'm open to
> different propositions.

It's not a correct fix, only a band-aid that papers over the immediate
issue, I'm afraid.

Let's revert and if you care deeply about this you can possibly propose
a complete patch that addresses the above issues, even if I'm more
inclined to leave things as they were and not spend more time on this.

Johan

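Editorial sketch of the probe ordering Johan describes for the code before b43f7ddc2b7a. This is a simplified userspace model with hypothetical names (`fake_probe()`, `fake_register_supply()`, `fake_probe_state`), not the driver's actual probe function. The only property it demonstrates is that when all supplies are registered first and any error aborts probe, the client (and therefore the notification callback) is never registered against a partially initialized state. Cleanup of already-registered devices, which devm would handle in the kernel, is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_probe_state {
	int supplies_registered;	/* class devices registered so far */
	bool client_registered;		/* true once callbacks can fire */
};

/* Hypothetical registration helper: fails for index fail_idx (-1: never). */
static int fake_register_supply(int idx, int fail_idx)
{
	return idx == fail_idx ? -1 : 0;
}

/*
 * Register all supplies first and bail out on the first error, so the
 * client is only registered when every class device exists.
 */
static int fake_probe(struct fake_probe_state *s, int nsupplies, int fail_idx)
{
	assert(s != NULL);

	for (int i = 0; i < nsupplies; i++) {
		int ret = fake_register_supply(i, fail_idx);

		if (ret)
			return ret;	/* no client registered, so no callbacks */
		s->supplies_registered++;
	}

	s->client_registered = true;	/* callbacks may fire from here on */
	return 0;
}
```

With this ordering the callback needs no NULL check at all, which is the state the proposed revert returns to.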
diff --git a/drivers/power/supply/qcom_battmgr.c b/drivers/power/supply/qcom_battmgr.c
index a12e2a66d516..7d85292eb839 100644
--- a/drivers/power/supply/qcom_battmgr.c
+++ b/drivers/power/supply/qcom_battmgr.c
@@ -1271,6 +1271,10 @@ static void qcom_battmgr_callback(const void *data, size_t len, void *priv)
 	struct qcom_battmgr *battmgr = priv;
 	unsigned int opcode = le32_to_cpu(hdr->opcode);
 
+	/* Ignore the pings that come before Linux cleanly initializes the battmgr stack */
+	if (!battmgr->bat_psy)
+		return;
+
 	if (opcode == BATTMGR_NOTIFICATION)
 		qcom_battmgr_notification(battmgr, data, len);
 	else if (battmgr->variant == QCOM_BATTMGR_SC8280XP)

Commit b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power
supplies after PDR is up") moved the devm_power_supply_register() calls
so that the power supply devices are not registered before we go through
the entire initialization sequence (power up the ADSP remote processor,
wait for it to come online, coordinate with userspace..).

Some firmware versions (e.g. on SM8550) seem to leave battmgr at least
partly initialized when exiting the bootloader and loading Linux. Check
if the power supply devices are registered before consuming the battmgr
notifications.

Fixes: b43f7ddc2b7a ("power: supply: qcom_battmgr: Register the power supplies after PDR is up")
Reported-by: Xilin Wu <wuxilin123@gmail.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
Changes in v2:
- Fix the commit title
- Link to v1: https://lore.kernel.org/linux-arm-msm/d9cf7d9d-60d9-4637-97bf-c9840452899e@linaro.org/T/#t
---
 drivers/power/supply/qcom_battmgr.c | 4 ++++
 1 file changed, 4 insertions(+)

---
base-commit: 0fef202ac2f8e6d9ad21aead648278f1226b9053
change-id: 20240103-topic-battmgr2-15c17fac6d35

Best regards,