Message ID | 20200521031355.7022-1-dinghao.liu@zju.edu.cn (mailing list archive) |
---|---|
State | Mainlined, archived |
Commit | 1c1dbb2c02623db18a50c61b175f19aead800b4e |
Delegated to: | Lorenzo Pieralisi |
Series | [v2] PCI: tegra194: Fix runtime PM imbalance on error |
[+cc Rafael, linux-pm]

On Thu, May 21, 2020 at 11:13:49AM +0800, Dinghao Liu wrote:
> pm_runtime_get_sync() increments the runtime PM usage counter even
> when it returns an error code. Thus a pairing decrement is needed on
> the error handling path to keep the counter balanced.

I didn't realize there were so many drivers with the exact same issue.
Can we just squash these all into a single patch so we can see them
all together?

Hmm. There are over 1300 callers of pm_runtime_get_sync(), and it
looks like many of them have similar issues, i.e., they have a pattern
like this

  ret = pm_runtime_get_sync(dev);
  if (ret < 0)
          return;

  pm_runtime_put(dev);

where there is not a pm_runtime_put() to match every
pm_runtime_get_sync(). Random sample:

  nds32_pmu_reserve_hardware
  sata_rcar_probe
  exynos_trng_probe
  ks_sa_rng_probe
  omap_aes_probe
  sun8i_ss_probe
  omap_aes_probe
  zynq_gpio_probe
  amdgpu_hwmon_show_power_avg
  mtk_crtc_ddp_hw_init
  ...

Surely I'm missing something and these aren't all broken, right?

Maybe we could put together a coccinelle script to scan the tree for
this issue?
On Thu, May 21, 2020 at 5:16 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael, linux-pm]
>
> On Thu, May 21, 2020 at 11:13:49AM +0800, Dinghao Liu wrote:
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > when it returns an error code. Thus a pairing decrement is needed on
> > the error handling path to keep the counter balanced.
>
> I didn't realize there were so many drivers with the exact same issue.
> Can we just squash these all into a single patch so we can see them
> all together?
>
> Hmm. There are over 1300 callers of pm_runtime_get_sync(), and it
> looks like many of them have similar issues, i.e., they have a pattern
> like this
>
>   ret = pm_runtime_get_sync(dev);
>   if (ret < 0)
>           return;
>
>   pm_runtime_put(dev);
>
> where there is not a pm_runtime_put() to match every
> pm_runtime_get_sync(). Random sample:
>
>   nds32_pmu_reserve_hardware
>   sata_rcar_probe
>   exynos_trng_probe
>   ks_sa_rng_probe
>   omap_aes_probe
>   sun8i_ss_probe
>   omap_aes_probe
>   zynq_gpio_probe
>   amdgpu_hwmon_show_power_avg
>   mtk_crtc_ddp_hw_init
>   ...
>
> Surely I'm missing something and these aren't all broken, right?

If they do what you've said, they are all broken I'm afraid.

They should all be doing something like

    ret = pm_runtime_get_sync(dev);
    if (ret < 0)
            goto out;

    ...

out:
    pm_runtime_put(dev);

> Maybe we could put together a coccinelle script to scan the tree for
> this issue?
>
> > Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> > ---
> >  drivers/pci/controller/dwc/pcie-tegra194.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
> > index ae30a2fd3716..2c0d2ce16b47 100644
> > --- a/drivers/pci/controller/dwc/pcie-tegra194.c
> > +++ b/drivers/pci/controller/dwc/pcie-tegra194.c
> > @@ -1623,7 +1623,7 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
> >  	ret = pinctrl_pm_select_default_state(dev);
> >  	if (ret < 0) {
> >  		dev_err(dev, "Failed to configure sideband pins: %d\n", ret);
> > -		goto fail_pinctrl;
> > +		goto fail_pm_get_sync;
> >  	}
> >
> >  	tegra_pcie_init_controller(pcie);
> > @@ -1650,9 +1650,8 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
> >
> >  fail_host_init:
> >  	tegra_pcie_deinit_controller(pcie);
> > -fail_pinctrl:
> > -	pm_runtime_put_sync(dev);
> >  fail_pm_get_sync:
> > +	pm_runtime_put_sync(dev);

Why not pm_runtime_put()?

> >  	pm_runtime_disable(dev);
> >  	return ret;
> >  }
> > --
> > 2.17.1
> >
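To make the imbalance discussed above concrete, here is a minimal userspace sketch of the two shapes: the early return that Bjorn spotted and the `goto out` form Rafael recommends. It does not use the kernel API. `struct mock_dev`, `mock_get_sync()`, `mock_put()`, `config_leaky()` and `config_balanced()` are hypothetical names invented for illustration, and the only behaviour modelled is the one stated in the patch description: the usage counter is incremented even when the call returns an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for a device; NOT the kernel's struct device. */
struct mock_dev {
	int usage_count;	/* models the runtime PM usage counter */
	bool resume_fails;	/* forces the modelled resume step to fail */
};

/*
 * Models the documented behaviour of pm_runtime_get_sync(): the counter
 * is incremented first and is not dropped again if resuming fails.
 */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models the decrement half of a put call; idle handling is ignored. */
static void mock_put(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Buggy shape: the error path returns without dropping the reference. */
static int config_leaky(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0)
		return ret;	/* usage_count stays elevated */

	mock_put(dev);
	return 0;
}

/* Balanced shape: success and failure both pass through the put. */
static int config_balanced(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0)
		goto out;

	/* ... work that needs the device resumed would go here ... */

out:
	mock_put(dev);
	return ret;
}
```

With `resume_fails` set, `config_leaky()` leaves `usage_count` at 1, while `config_balanced()` returns it to 0. Both propagate the error code.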
Hi Bjorn,

In fact, most usage of pm_runtime_get_sync() is correct. I made a static
analysis tool to check this imbalance in the kernel and found about 80
bugs in drivers. Some of my patches have been accepted and I'm trying to
patch the rest as soon as possible.

Regards,
Dinghao
> > > @@ -1650,9 +1650,8 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
> > >
> > >  fail_host_init:
> > >  	tegra_pcie_deinit_controller(pcie);
> > > -fail_pinctrl:
> > > -	pm_runtime_put_sync(dev);
> > >  fail_pm_get_sync:
> > > +	pm_runtime_put_sync(dev);
>
> Why not pm_runtime_put()?

Good question. For functions with PM decrement API somewhere, I will adopt
it. If this API is not suitable here, please tell me.

> > >  	pm_runtime_disable(dev);
> > >  	return ret;
> > >  }
> > > --
> > > 2.17.1
On Thu, May 21, 2020 at 11:13:49AM +0800, Dinghao Liu wrote:
> pm_runtime_get_sync() increments the runtime PM usage counter even
> when it returns an error code. Thus a pairing decrement is needed on
> the error handling path to keep the counter balanced.
>
> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> ---
>  drivers/pci/controller/dwc/pcie-tegra194.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Applied to pci/tegra, thanks.

Lorenzo
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index ae30a2fd3716..2c0d2ce16b47 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1623,7 +1623,7 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
 	ret = pinctrl_pm_select_default_state(dev);
 	if (ret < 0) {
 		dev_err(dev, "Failed to configure sideband pins: %d\n", ret);
-		goto fail_pinctrl;
+		goto fail_pm_get_sync;
 	}
 
 	tegra_pcie_init_controller(pcie);
@@ -1650,9 +1650,8 @@ static int tegra_pcie_config_rp(struct tegra_pcie_dw *pcie)
 
 fail_host_init:
 	tegra_pcie_deinit_controller(pcie);
-fail_pinctrl:
-	pm_runtime_put_sync(dev);
 fail_pm_get_sync:
+	pm_runtime_put_sync(dev);
 	pm_runtime_disable(dev);
 	return ret;
 }
pm_runtime_get_sync() increments the runtime PM usage counter even
when it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
---
 drivers/pci/controller/dwc/pcie-tegra194.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
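The label layout after the patch can be sketched as a small userspace model to check that every failure point unwinds to a balanced counter. This is only an illustration under stated assumptions: `struct mock_pcie`, `mock_step()` and `mock_config_rp()` are hypothetical, the real function does more than three steps, and the sketch assumes (without the diff showing it) that runtime PM is enabled at the top of the function and that the success path keeps its reference.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these are kernel symbols. */
struct mock_pcie {
	int usage_count;	/* models the runtime PM usage counter */
	bool pm_enabled;	/* models runtime PM being enabled/disabled */
	bool controller_up;	/* models controller init/deinit */
	int fail_step;		/* 0 = none, 1 = get_sync, 2 = pinctrl, 3 = host init */
};

/* Returns -EIO if this step was chosen to fail, 0 otherwise. */
static int mock_step(const struct mock_pcie *p, int step)
{
	return p->fail_step == step ? -EIO : 0;
}

/* Mirrors the error-label ordering shown in the patched hunk. */
static int mock_config_rp(struct mock_pcie *p)
{
	int ret;

	p->pm_enabled = true;		/* assumed: enabled before get_sync */

	p->usage_count++;		/* get_sync counts even on failure */
	ret = mock_step(p, 1);
	if (ret < 0)
		goto fail_pm_get_sync;

	ret = mock_step(p, 2);		/* sideband pin configuration */
	if (ret < 0)
		goto fail_pm_get_sync;	/* was fail_pinctrl before the patch */

	p->controller_up = true;
	ret = mock_step(p, 3);		/* host initialisation */
	if (ret < 0)
		goto fail_host_init;

	return 0;			/* assumed: success keeps the reference */

fail_host_init:
	p->controller_up = false;
fail_pm_get_sync:
	assert(p->usage_count > 0);
	p->usage_count--;		/* the put now covers every failure */
	p->pm_enabled = false;
	return ret;
}
```

Because the put sits under the last label, a failure in any of the three modelled steps falls through it, which is why the separate `fail_pinctrl` label is no longer needed.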