Message ID | e0459058-5ca5-1c1a-c06a-47100c176ba2@omp.ru (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] usb: host: xhci-plat: fix possible kernel oops while resuming | expand |
On 7/11/23 10:18 PM, Sergey Shtylyov wrote: > If this driver enables the xHC clocks while resuming from sleep, it calls > clk_prepare_enable() without checking for errors and blithely goes on to > read/write the xHC's registers -- which, with the xHC not being clocked, > at least on ARM32 usually causes an imprecise external abort exceptions > which cause kernel oops. Currently, the chips for which the driver does > the clock dance on suspend/resume seem to be the Broadcom STB SoCs, based > on ARM32 CPUs, as it seems... > > In order to fix this issue, add the result checks for clk_prepare_enable() > calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls > on the error path of xhci_plat_resume(); then factor out the common clock > disabling code from the suspend() and resume() driver PM methods into a > separate function to avoid code duplication. > > Found by Linux Verification Center (linuxtesting.org) with the Svace static > analysis tool. > > Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks") > Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> > > --- > This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo... The v4 of the patch was posted almost 2 months ago... What's going on, is Mathias still opn vacations? [...] MBR, Sergey
Hi Sorry about the delay On 11.7.2023 22.18, Sergey Shtylyov wrote: > If this driver enables the xHC clocks while resuming from sleep, it calls > clk_prepare_enable() without checking for errors and blithely goes on to > read/write the xHC's registers -- which, with the xHC not being clocked, > at least on ARM32 usually causes an imprecise external abort exceptions > which cause kernel oops. Currently, the chips for which the driver does > the clock dance on suspend/resume seem to be the Broadcom STB SoCs, based > on ARM32 CPUs, as it seems... > > In order to fix this issue, add the result checks for clk_prepare_enable() > calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls > on the error path of xhci_plat_resume(); then factor out the common clock > disabling code from the suspend() and resume() driver PM methods into a > separate function to avoid code duplication. Minor nitpick, but not sure a separate function is helpful here. It's two lines of code called twice. > > Found by Linux Verification Center (linuxtesting.org) with the Svace static > analysis tool. > > Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks") > Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> If I understood correctly this issue hasn't been seen in real life, and this patch only changes how we fail? So I guess this would be more suitable for usb-next than usb-linus. > > --- > This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo... > > Changes in version 4: > - resolved reject in xhci_plat_resume() due to the changed xhci_resume() call; > - added the __maybe_unused attribute to xhci_plat_disable_clocks(). > > Changes in version 3: > - sanitized the clock enabling error paths in xhci_plat_resume() WRT the > applicability checks; > - factored out the common clock disabling code from the suspend() and resume() > driver PM methods; > - added to the patch sescriptiun a passage describing the change being done. > > Changes in version 2: > - fixed up the error path for clk_prepare_enable() calls in xhci_plat_resume(). > > drivers/usb/host/xhci-plat.c | 35 +++++++++++++++++++++++++++-------- > 1 file changed, 27 insertions(+), 8 deletions(-) > > Index: usb/drivers/usb/host/xhci-plat.c > =================================================================== > --- usb.orig/drivers/usb/host/xhci-plat.c > +++ usb/drivers/usb/host/xhci-plat.c > @@ -424,6 +424,14 @@ void xhci_plat_remove(struct platform_de > } > EXPORT_SYMBOL_GPL(xhci_plat_remove); > > +static void __maybe_unused xhci_plat_disable_clocks(struct xhci_hcd *xhci) > +{ > + if (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS) { > + clk_disable_unprepare(xhci->clk); > + clk_disable_unprepare(xhci->reg_clk); > + } > +} > + xhci_plat_disable_clocks() name is a bit misleading, it only disables the clocks if clocks are set to be disabled/enabled during suspend/resume. > static int __maybe_unused xhci_plat_suspend(struct device *dev) > { > struct usb_hcd *hcd = dev_get_drvdata(dev); > @@ -444,10 +452,8 @@ static int __maybe_unused xhci_plat_susp > if (ret) > return ret; > > - if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { > - clk_disable_unprepare(xhci->clk); > - clk_disable_unprepare(xhci->reg_clk); > - } > + if (!device_may_wakeup(dev)) > + xhci_plat_disable_clocks(xhci); Not sure this change improves things > > return 0; > } > @@ -459,23 +465,36 @@ static int __maybe_unused xhci_plat_resu > int ret; > > if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { > - clk_prepare_enable(xhci->clk); > - clk_prepare_enable(xhci->reg_clk); > + ret = clk_prepare_enable(xhci->clk); > + if (ret) > + return ret; > + > + ret = clk_prepare_enable(xhci->reg_clk); > + if (ret) { > + clk_disable_unprepare(xhci->clk); > + return ret; > + } > } > > ret = xhci_priv_resume_quirk(hcd); > if (ret) > - return ret; > + goto disable_clks; > > ret = xhci_resume(xhci, PMSG_RESUME); > if (ret) > - return ret; > + goto disable_clks; > > pm_runtime_disable(dev); > pm_runtime_set_active(dev); > pm_runtime_enable(dev); > > return 0; > + > +disable_clks: > + if (!device_may_wakeup(dev)) > + xhci_plat_disable_clocks(xhci); I'd skip the helper and just do: if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { clk_disable_unprepare(xhci->clk); clk_disable_unprepare(xhci->reg_clk); } It better matches the if condition when enabling the clocks: if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { ret = clk_prepare_enable(xhci->clk); ... We also don't save any lines of code by adding the helper. Thanks Mathias
On 9/11/23 3:53 PM, Mathias Nyman wrote: [...] >> If this driver enables the xHC clocks while resuming from sleep, it calls >> clk_prepare_enable() without checking for errors and blithely goes on to >> read/write the xHC's registers -- which, with the xHC not being clocked, >> at least on ARM32 usually causes an imprecise external abort exceptions >> which cause kernel oops. Currently, the chips for which the driver does >> the clock dance on suspend/resume seem to be the Broadcom STB SoCs, based >> on ARM32 CPUs, as it seems... >> >> In order to fix this issue, add the result checks for clk_prepare_enable() >> calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls >> on the error path of xhci_plat_resume(); then factor out the common clock >> disabling code from the suspend() and resume() driver PM methods into a >> separate function to avoid code duplication. > Minor nitpick, but not sure a separate function is helpful here. > It's two lines of code called twice. Tried to save on the object code size... >> Found by Linux Verification Center (linuxtesting.org) with the Svace static >> analysis tool. >> >> Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks") >> Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> > > If I understood correctly this issue hasn't been seen in real life, Not yet? > and this patch only changes how we fail? Yes, failing gracefully instead of a kernel oops... > So I guess this would be more suitable for usb-next than usb-linus. Maybe. I'll recast and rebase... [...] > Thanks > Mathias MBR, Sergey
Index: usb/drivers/usb/host/xhci-plat.c =================================================================== --- usb.orig/drivers/usb/host/xhci-plat.c +++ usb/drivers/usb/host/xhci-plat.c @@ -424,6 +424,14 @@ void xhci_plat_remove(struct platform_de } EXPORT_SYMBOL_GPL(xhci_plat_remove); +static void __maybe_unused xhci_plat_disable_clocks(struct xhci_hcd *xhci) +{ + if (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS) { + clk_disable_unprepare(xhci->clk); + clk_disable_unprepare(xhci->reg_clk); + } +} + static int __maybe_unused xhci_plat_suspend(struct device *dev) { struct usb_hcd *hcd = dev_get_drvdata(dev); @@ -444,10 +452,8 @@ static int __maybe_unused xhci_plat_susp if (ret) return ret; - if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { - clk_disable_unprepare(xhci->clk); - clk_disable_unprepare(xhci->reg_clk); - } + if (!device_may_wakeup(dev)) + xhci_plat_disable_clocks(xhci); return 0; } @@ -459,23 +465,36 @@ static int __maybe_unused xhci_plat_resu int ret; if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) { - clk_prepare_enable(xhci->clk); - clk_prepare_enable(xhci->reg_clk); + ret = clk_prepare_enable(xhci->clk); + if (ret) + return ret; + + ret = clk_prepare_enable(xhci->reg_clk); + if (ret) { + clk_disable_unprepare(xhci->clk); + return ret; + } } ret = xhci_priv_resume_quirk(hcd); if (ret) - return ret; + goto disable_clks; ret = xhci_resume(xhci, PMSG_RESUME); if (ret) - return ret; + goto disable_clks; pm_runtime_disable(dev); pm_runtime_set_active(dev); pm_runtime_enable(dev); return 0; + +disable_clks: + if (!device_may_wakeup(dev)) + xhci_plat_disable_clocks(xhci); + + return ret; } static int __maybe_unused xhci_plat_runtime_suspend(struct device *dev)
If this driver enables the xHC clocks while resuming from sleep, it calls clk_prepare_enable() without checking for errors and blithely goes on to read/write the xHC's registers -- which, with the xHC not being clocked, at least on ARM32 usually causes an imprecise external abort exceptions which cause kernel oops. Currently, the chips for which the driver does the clock dance on suspend/resume seem to be the Broadcom STB SoCs, based on ARM32 CPUs, as it seems... In order to fix this issue, add the result checks for clk_prepare_enable() calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls on the error path of xhci_plat_resume(); then factor out the common clock disabling code from the suspend() and resume() driver PM methods into a separate function to avoid code duplication. Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> --- This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo... Changes in version 4: - resolved reject in xhci_plat_resume() due to the changed xhci_resume() call; - added the __maybe_unused attribute to xhci_plat_disable_clocks(). Changes in version 3: - sanitized the clock enabling error paths in xhci_plat_resume() WRT the applicability checks; - factored out the common clock disabling code from the suspend() and resume() driver PM methods; - added to the patch sescriptiun a passage describing the change being done. Changes in version 2: - fixed up the error path for clk_prepare_enable() calls in xhci_plat_resume(). drivers/usb/host/xhci-plat.c | 35 +++++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 8 deletions(-)