Message ID | 20220323084725.65864-1-jan.sondhauss@wago.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drivers: ethernet: cpsw: fix panic when intrrupt coaleceing is set via ethtool | expand |
On 23/03/22 2:17 pm, Sondhauß, Jan wrote: > cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync > when successful. > pm_runtime_get_sync returns -error code on failure and 0 on successful > resume but also 1 when the device is already active. So the common case > for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls > to pm_runtime_put in the call-chain so that pm_runtime_put is called > one too many times and as result leaving the cpsw dev behind suspended. > > The suspended cpsw dev leads to an access violation later on by > different parts of the cpsw driver. > > Fix this by calling the return-friendly pm_runtime_resume_and_get > function. > > Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com> > --- > drivers/net/ethernet/ti/cpsw_ethtool.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c > index 158c8d3793f4..b5bae6324970 100644 > --- a/drivers/net/ethernet/ti/cpsw_ethtool.c > +++ b/drivers/net/ethernet/ti/cpsw_ethtool.c > @@ -364,11 +364,9 @@ int cpsw_ethtool_op_begin(struct net_device *ndev) > struct cpsw_common *cpsw = priv->cpsw; > int ret; > > - ret = pm_runtime_get_sync(cpsw->dev); > - if (ret < 0) { > + ret = pm_runtime_resume_and_get(cpsw->dev); > + if (ret < 0) > cpsw_err(priv, drv, "ethtool begin failed %d\n", ret); > - pm_runtime_put_noidle(cpsw->dev); > - } > > return ret; > } Thanks for the fix! Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> I am guessing this issue started with d43c65b05b84 (ethtool: runtime-resume netdev parent in ethnl_ops_begin) ? If so, could you add a Fixes tag so that patch gets backported to all affected stable kernels? Regards Vignesh
On 23/03/2022 12:10, Vignesh Raghavendra wrote: > On 23/03/22 2:17 pm, Sondhauß, Jan wrote: > cpsw_ethtool_begin directly > returns the result of pm_runtime_get_sync > when successful. > > pm_runtime_get_sync returns -error code on failure and 0 on successful > > resume but also 1 when ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > Please use caution when clicking on links or opening attachments! > ZjQcmQRYFpfptBannerEnd > > On 23/03/22 2:17 pm, Sondhauß, Jan wrote: >> cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync >> when successful. >> pm_runtime_get_sync returns -error code on failure and 0 on successful >> resume but also 1 when the device is already active. So the common case >> for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls >> to pm_runtime_put in the call-chain so that pm_runtime_put is called >> one too many times and as result leaving the cpsw dev behind suspended. >> >> The suspended cpsw dev leads to an access violation later on by >> different parts of the cpsw driver. >> >> Fix this by calling the return-friendly pm_runtime_resume_and_get >> function. >> >> Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com> >> --- >> drivers/net/ethernet/ti/cpsw_ethtool.c | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c >> index 158c8d3793f4..b5bae6324970 100644 >> --- a/drivers/net/ethernet/ti/cpsw_ethtool.c >> +++ b/drivers/net/ethernet/ti/cpsw_ethtool.c >> @@ -364,11 +364,9 @@ int cpsw_ethtool_op_begin(struct net_device *ndev) >> struct cpsw_common *cpsw = priv->cpsw; >> int ret; >> >> - ret = pm_runtime_get_sync(cpsw->dev); >> - if (ret < 0) { >> + ret = pm_runtime_resume_and_get(cpsw->dev); >> + if (ret < 0) >> cpsw_err(priv, drv, "ethtool begin failed %d\n", ret); >> - pm_runtime_put_noidle(cpsw->dev); >> - } >> >> return ret; >> } > > Thanks for the fix! > > Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com> > > I am guessing this issue started with d43c65b05b84 (ethtool: > runtime-resume netdev parent in ethnl_ops_begin) ? If so, could you add > a Fixes tag so that patch gets backported to all affected stable kernels? Yes it started with that commit. > > Regards > Vignesh >
Hello: This patch was applied to netdev/net-next.git (master) by Jakub Kicinski <kuba@kernel.org>: On Wed, 23 Mar 2022 08:47:33 +0000 you wrote: > cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync > when successful. > pm_runtime_get_sync returns -error code on failure and 0 on successful > resume but also 1 when the device is already active. So the common case > for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls > to pm_runtime_put in the call-chain so that pm_runtime_put is called > one too many times and as result leaving the cpsw dev behind suspended. > > [...] Here is the summary with links: - [v2] drivers: ethernet: cpsw: fix panic when intrrupt coaleceing is set via ethtool https://git.kernel.org/netdev/net-next/c/2844e2434385 You are awesome, thank you!
diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c index 158c8d3793f4..b5bae6324970 100644 --- a/drivers/net/ethernet/ti/cpsw_ethtool.c +++ b/drivers/net/ethernet/ti/cpsw_ethtool.c @@ -364,11 +364,9 @@ int cpsw_ethtool_op_begin(struct net_device *ndev) struct cpsw_common *cpsw = priv->cpsw; int ret; - ret = pm_runtime_get_sync(cpsw->dev); - if (ret < 0) { + ret = pm_runtime_resume_and_get(cpsw->dev); + if (ret < 0) cpsw_err(priv, drv, "ethtool begin failed %d\n", ret); - pm_runtime_put_noidle(cpsw->dev); - } return ret; }
cpsw_ethtool_begin directly returns the result of pm_runtime_get_sync when successful. pm_runtime_get_sync returns -error code on failure and 0 on successful resume but also 1 when the device is already active. So the common case for cpsw_ethtool_begin is to return 1. That leads to inconsistent calls to pm_runtime_put in the call-chain so that pm_runtime_put is called one too many times and as result leaving the cpsw dev behind suspended. The suspended cpsw dev leads to an access violation later on by different parts of the cpsw driver. Fix this by calling the return-friendly pm_runtime_resume_and_get function. Signed-off-by: Jan Sondhauss <jan.sondhauss@wago.com> --- drivers/net/ethernet/ti/cpsw_ethtool.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)