
[v4] usb: host: xhci-plat: fix possible kernel oops while resuming

Message ID e0459058-5ca5-1c1a-c06a-47100c176ba2@omp.ru (mailing list archive)
State New, archived

Commit Message

Sergey Shtylyov July 11, 2023, 7:18 p.m. UTC
If this driver enables the xHC clocks while resuming from sleep, it calls
clk_prepare_enable() without checking for errors and blithely goes on to
read/write the xHC's registers -- which, with the xHC not being clocked,
at least on ARM32 usually causes imprecise external abort exceptions
which in turn cause a kernel oops.  Currently, the chips for which the
driver does the clock dance on suspend/resume seem to be the Broadcom STB
SoCs, which are based on ARM32 CPUs...

In order to fix this issue, add the result checks for clk_prepare_enable()
calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls
on the error path of xhci_plat_resume(); then factor out the common clock
disabling code from the suspend() and resume() driver PM methods into a
separate function to avoid code duplication.

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>

---
This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo...

Changes in version 4:
- resolved reject in xhci_plat_resume() due to the changed xhci_resume() call;
- added the __maybe_unused attribute to xhci_plat_disable_clocks().

Changes in version 3:
- sanitized the clock enabling error paths in xhci_plat_resume() WRT the
  applicability checks;
- factored out the common clock disabling code from the suspend() and resume()
  driver PM methods;
- added to the patch description a passage describing the change being done.

Changes in version 2:
- fixed up the error path for clk_prepare_enable() calls in xhci_plat_resume().

 drivers/usb/host/xhci-plat.c |   35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)
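
The error handling described in the commit message can be tried out in user space. What follows is only a rough sketch under stated assumptions, not the driver code: every mock_* name is hypothetical, the clock calls are replaced by counters, and xhci_resume() is reduced to a preset return value. It mirrors the shape of the patch (check each enable, undo the first clock if the second fails, and unwind both clocks if a later resume step fails).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* All mock_* names are hypothetical stand-ins, not kernel APIs. */
struct mock_clk {
	int enable_count;	/* enables that are currently outstanding */
	int fail_with;		/* 0, or a negative errno that enabling returns */
};

struct mock_xhci {
	struct mock_clk clk;
	struct mock_clk reg_clk;
	bool suspend_resume_clks;	/* stands in for XHCI_SUSPEND_RESUME_CLKS */
	bool may_wakeup;		/* stands in for device_may_wakeup(dev) */
	int core_resume_result;		/* stands in for xhci_resume()'s result */
};

static int mock_clk_prepare_enable(struct mock_clk *clk)
{
	if (clk->fail_with)
		return clk->fail_with;
	clk->enable_count++;
	return 0;
}

static void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	assert(clk->enable_count > 0);	/* catch an unbalanced disable */
	clk->enable_count--;
}

/* Same shape as the helper in the patch: a no-op unless the quirk is set. */
static void mock_plat_disable_clocks(struct mock_xhci *xhci)
{
	if (xhci->suspend_resume_clks) {
		mock_clk_disable_unprepare(&xhci->clk);
		mock_clk_disable_unprepare(&xhci->reg_clk);
	}
}

static int mock_plat_resume(struct mock_xhci *xhci)
{
	int ret;

	if (!xhci->may_wakeup && xhci->suspend_resume_clks) {
		ret = mock_clk_prepare_enable(&xhci->clk);
		if (ret)
			return ret;

		ret = mock_clk_prepare_enable(&xhci->reg_clk);
		if (ret) {
			/* Only the first clock was enabled, so undo just that one. */
			mock_clk_disable_unprepare(&xhci->clk);
			return ret;
		}
	}

	ret = xhci->core_resume_result;
	if (ret)
		goto disable_clks;

	return 0;

disable_clks:
	if (!xhci->may_wakeup)
		mock_plat_disable_clocks(xhci);

	return ret;
}
```

Calling mock_plat_resume() on a struct whose reg_clk.fail_with or core_resume_result is set to a negative errno should return that value and leave both enable counts at zero, which is the balanced behaviour the patch aims for.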

Comments

Sergey Shtylyov Sept. 8, 2023, 6:58 p.m. UTC | #1
On 7/11/23 10:18 PM, Sergey Shtylyov wrote:

> If this driver enables the xHC clocks while resuming from sleep, it calls
> clk_prepare_enable() without checking for errors and blithely goes on to
> read/write the xHC's registers -- which, with the xHC not being clocked,
> at least on ARM32 usually causes imprecise external abort exceptions
> which in turn cause a kernel oops.  Currently, the chips for which the
> driver does the clock dance on suspend/resume seem to be the Broadcom STB
> SoCs, which are based on ARM32 CPUs...
> 
> In order to fix this issue, add the result checks for clk_prepare_enable()
> calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls
> on the error path of xhci_plat_resume(); then factor out the common clock
> disabling code from the suspend() and resume() driver PM methods into a
> separate function to avoid code duplication.
> 
> Found by Linux Verification Center (linuxtesting.org) with the Svace static
> analysis tool.
> 
> Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks")
> Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
> 
> ---
> This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo...

   The v4 of the patch was posted almost 2 months ago... What's going on,
is Mathias still on vacation?

[...]

MBR, Sergey

Mathias Nyman Sept. 11, 2023, 12:53 p.m. UTC | #2
Hi

Sorry about the delay

On 11.7.2023 22.18, Sergey Shtylyov wrote:
> If this driver enables the xHC clocks while resuming from sleep, it calls
> clk_prepare_enable() without checking for errors and blithely goes on to
> read/write the xHC's registers -- which, with the xHC not being clocked,
> at least on ARM32 usually causes imprecise external abort exceptions
> which in turn cause a kernel oops.  Currently, the chips for which the
> driver does the clock dance on suspend/resume seem to be the Broadcom STB
> SoCs, which are based on ARM32 CPUs...
> 
> In order to fix this issue, add the result checks for clk_prepare_enable()
> calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls
> on the error path of xhci_plat_resume(); then factor out the common clock
> disabling code from the suspend() and resume() driver PM methods into a
> separate function to avoid code duplication.

Minor nitpick, but not sure a separate function is helpful here.
It's two lines of code called twice.

> 
> Found by Linux Verification Center (linuxtesting.org) with the Svace static
> analysis tool.
> 
> Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks")
> Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>

If I understood correctly this issue hasn't been seen in real life,
and this patch only changes how we fail?

So I guess this would be more suitable for usb-next than usb-linus.

> 
> ---
> This patch is against the 'usb-linus' branch of Greg KH's 'usb.git' repo...
> 
> Changes in version 4:
> - resolved reject in xhci_plat_resume() due to the changed xhci_resume() call;
> - added the __maybe_unused attribute to xhci_plat_disable_clocks().
> 
> Changes in version 3:
> - sanitized the clock enabling error paths in xhci_plat_resume() WRT the
>    applicability checks;
> - factored out the common clock disabling code from the suspend() and resume()
>    driver PM methods;
> - added to the patch description a passage describing the change being done.
> 
> Changes in version 2:
> - fixed up the error path for clk_prepare_enable() calls in xhci_plat_resume().
> 
>   drivers/usb/host/xhci-plat.c |   35 +++++++++++++++++++++++++++--------
>   1 file changed, 27 insertions(+), 8 deletions(-)
> 
> Index: usb/drivers/usb/host/xhci-plat.c
> ===================================================================
> --- usb.orig/drivers/usb/host/xhci-plat.c
> +++ usb/drivers/usb/host/xhci-plat.c
> @@ -424,6 +424,14 @@ void xhci_plat_remove(struct platform_de
>   }
>   EXPORT_SYMBOL_GPL(xhci_plat_remove);
>   
> +static void __maybe_unused xhci_plat_disable_clocks(struct xhci_hcd *xhci)
> +{
> +	if (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS) {
> +		clk_disable_unprepare(xhci->clk);
> +		clk_disable_unprepare(xhci->reg_clk);
> +	}
> +}
> +

The xhci_plat_disable_clocks() name is a bit misleading: it only disables the
clocks if they are set to be disabled/enabled during suspend/resume.

>   static int __maybe_unused xhci_plat_suspend(struct device *dev)
>   {
>   	struct usb_hcd	*hcd = dev_get_drvdata(dev);
> @@ -444,10 +452,8 @@ static int __maybe_unused xhci_plat_susp
>   	if (ret)
>   		return ret;
>   
> -	if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
> -		clk_disable_unprepare(xhci->clk);
> -		clk_disable_unprepare(xhci->reg_clk);
> -	}
> +	if (!device_may_wakeup(dev))
> +		xhci_plat_disable_clocks(xhci);

Not sure this change improves things

>   
>   	return 0;
>   }
> @@ -459,23 +465,36 @@ static int __maybe_unused xhci_plat_resu
>   	int ret;
>   
>   	if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
> -		clk_prepare_enable(xhci->clk);
> -		clk_prepare_enable(xhci->reg_clk);
> +		ret = clk_prepare_enable(xhci->clk);
> +		if (ret)
> +			return ret;
> +
> +		ret = clk_prepare_enable(xhci->reg_clk);
> +		if (ret) {
> +			clk_disable_unprepare(xhci->clk);
> +			return ret;
> +		}
>   	}
>   
>   	ret = xhci_priv_resume_quirk(hcd);
>   	if (ret)
> -		return ret;
> +		goto disable_clks;
>   
>   	ret = xhci_resume(xhci, PMSG_RESUME);
>   	if (ret)
> -		return ret;
> +		goto disable_clks;
>   
>   	pm_runtime_disable(dev);
>   	pm_runtime_set_active(dev);
>   	pm_runtime_enable(dev);
>   
>   	return 0;
> +
> +disable_clks:
> +	if (!device_may_wakeup(dev))
> +		xhci_plat_disable_clocks(xhci);

I'd skip the helper and just do:

if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
	clk_disable_unprepare(xhci->clk);
	clk_disable_unprepare(xhci->reg_clk);
}

It better matches the if condition when enabling the clocks:
if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
	ret = clk_prepare_enable(xhci->clk);
	...

We also don't save any lines of code by adding the helper.

Thanks
Mathias

Sergey Shtylyov Sept. 19, 2023, 8:53 p.m. UTC | #3
On 9/11/23 3:53 PM, Mathias Nyman wrote:
[...]

>> If this driver enables the xHC clocks while resuming from sleep, it calls
>> clk_prepare_enable() without checking for errors and blithely goes on to
>> read/write the xHC's registers -- which, with the xHC not being clocked,
>> at least on ARM32 usually causes imprecise external abort exceptions
>> which in turn cause a kernel oops.  Currently, the chips for which the
>> driver does the clock dance on suspend/resume seem to be the Broadcom STB
>> SoCs, which are based on ARM32 CPUs...
>>
>> In order to fix this issue, add the result checks for clk_prepare_enable()
>> calls in xhci_plat_resume(), add conditional clk_disable_unprepare() calls
>> on the error path of xhci_plat_resume(); then factor out the common clock
>> disabling code from the suspend() and resume() driver PM methods into a
>> separate function to avoid code duplication.

> Minor nitpick, but not sure a separate function is helpful here.
> It's two lines of code called twice.

   Tried to save on the object code size...

>> Found by Linux Verification Center (linuxtesting.org) with the Svace static
>> analysis tool.
>>
>> Fixes: 8bd954c56197 ("usb: host: xhci-plat: suspend and resume clocks")
>> Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
> 
> If I understood correctly this issue hasn't been seen in real life,

   Not yet?

> and this patch only changes how we fail?

   Yes, failing gracefully instead of a kernel oops...

> So I guess this would be more suitable for usb-next than usb-linus.

   Maybe. I'll recast and rebase...

[...]

> Thanks
> Mathias

MBR, Sergey

Patch

Index: usb/drivers/usb/host/xhci-plat.c
===================================================================
--- usb.orig/drivers/usb/host/xhci-plat.c
+++ usb/drivers/usb/host/xhci-plat.c
@@ -424,6 +424,14 @@  void xhci_plat_remove(struct platform_de
 }
 EXPORT_SYMBOL_GPL(xhci_plat_remove);
 
+static void __maybe_unused xhci_plat_disable_clocks(struct xhci_hcd *xhci)
+{
+	if (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS) {
+		clk_disable_unprepare(xhci->clk);
+		clk_disable_unprepare(xhci->reg_clk);
+	}
+}
+
 static int __maybe_unused xhci_plat_suspend(struct device *dev)
 {
 	struct usb_hcd	*hcd = dev_get_drvdata(dev);
@@ -444,10 +452,8 @@  static int __maybe_unused xhci_plat_susp
 	if (ret)
 		return ret;
 
-	if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
-		clk_disable_unprepare(xhci->clk);
-		clk_disable_unprepare(xhci->reg_clk);
-	}
+	if (!device_may_wakeup(dev))
+		xhci_plat_disable_clocks(xhci);
 
 	return 0;
 }
@@ -459,23 +465,36 @@  static int __maybe_unused xhci_plat_resu
 	int ret;
 
 	if (!device_may_wakeup(dev) && (xhci->quirks & XHCI_SUSPEND_RESUME_CLKS)) {
-		clk_prepare_enable(xhci->clk);
-		clk_prepare_enable(xhci->reg_clk);
+		ret = clk_prepare_enable(xhci->clk);
+		if (ret)
+			return ret;
+
+		ret = clk_prepare_enable(xhci->reg_clk);
+		if (ret) {
+			clk_disable_unprepare(xhci->clk);
+			return ret;
+		}
 	}
 
 	ret = xhci_priv_resume_quirk(hcd);
 	if (ret)
-		return ret;
+		goto disable_clks;
 
 	ret = xhci_resume(xhci, PMSG_RESUME);
 	if (ret)
-		return ret;
+		goto disable_clks;
 
 	pm_runtime_disable(dev);
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
 
 	return 0;
+
+disable_clks:
+	if (!device_may_wakeup(dev))
+		xhci_plat_disable_clocks(xhci);
+
+	return ret;
 }
 
 static int __maybe_unused xhci_plat_runtime_suspend(struct device *dev)