Message ID | a87f05e60ea1a94b571c9c87b69cc5b0e94943f2.1669999089.git.lorenzo@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | [net-next] net: ethernet: mtk_wed: fix possible deadlock if mtk_wed_wo_init fails |
On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> mtk_wed_attach routine if mtk_wed_wo_init fails.
>
> Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
>  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
>  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
>  3 files changed, 26 insertions(+), 11 deletions(-)

<...>

> diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> index f9539e6233c9..b084009a32f9 100644
> --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
>  	u16 seq;
>  	int ret;
>
> +	if (!wo)
> +		return -ENODEV;

<...>

>  static void
>  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
>  {
> +	if (!wo)
> +		return;

How are these changes related to the deadlock described in the commit message?
How is it possible to reach internal mtk functions without a valid wo?

Thanks

> +
>  	/* disable interrupts */
>  	mtk_wed_wo_set_isr(wo, 0);
>
> --
> 2.38.1
>

> On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > mtk_wed_attach routine if mtk_wed_wo_init fails.
> >
> > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> >  3 files changed, 26 insertions(+), 11 deletions(-)
>
> <...>
>
> > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > index f9539e6233c9..b084009a32f9 100644
> > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> >  	u16 seq;
> >  	int ret;
> >
> > +	if (!wo)
> > +		return -ENODEV;
>
> <...>
>
> >  static void
> >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> >  {
> > +	if (!wo)
> > +		return;
>
> How are these changes related to the deadlock described in the commit message?
> How is it possible to reach internal mtk functions without a valid wo?

Hi Leon,

if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
__mtk_wed_detach() when the wo struct is not allocated yet (wo is allocated in
mtk_wed_wo_init()).
Moreover, __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit(),
so we need to check that the wo pointer is properly set. We will face the same
issue if the wo allocation fails in the mtk_wed_wo_init() routine.
Once we remove the deadlock, we need to take these conditions into account as well.

Regards,
Lorenzo

>
> Thanks
>
> > +
> >  	/* disable interrupts */
> >  	mtk_wed_wo_set_isr(wo, 0);
> >
> > --
> > 2.38.1
> >

On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > >
> > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > ---
> > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > >  3 files changed, 26 insertions(+), 11 deletions(-)
> >
> > <...>
> >
> > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > index f9539e6233c9..b084009a32f9 100644
> > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > >  	u16 seq;
> > >  	int ret;
> > >
> > > +	if (!wo)
> > > +		return -ENODEV;
> >
> > <...>
> >
> > >  static void
> > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > >  {
> > > +	if (!wo)
> > > +		return;
> >
> > How are these changes related to the deadlock described in the commit message?
> > How is it possible to reach internal mtk functions without a valid wo?
>
> Hi Leon,
>
> if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> __mtk_wed_detach() when the wo struct is not allocated yet (wo is allocated in
> mtk_wed_wo_init()).

IMHO, that is the culprit: proper error unwinding means that you don't call
uninit functions for something that is not initialized yet. It is better
to fix that instead of adding "if (!wo) ..." checks.

> Moreover, __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()

This is the other side of the same coin. If you can run them in parallel, you
need locking protection and the ability to cancel work, so that nothing is
executed once cleanup has succeeded.

These were my 2 cents, totally IMHO.

Thanks

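The unwind ordering Leon describes is commonly written in kernel C as a ladder of goto labels, so that each failure point releases only what was already set up. The following is a minimal user-space sketch of that idea, not the driver's code: `demo_dev`, `demo_init_step()`, `demo_undo_step()`, `demo_attach()`, `demo_bad_undo` and the `fail_at` parameter are all hypothetical names invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state: one flag per resource that attach sets up. */
struct demo_dev {
	bool rings;
	bool rro;
	bool wo;
};

/* Counts undo calls made on a resource that was never initialized. */
static int demo_bad_undo;

/* Pretend to allocate one resource; fail on request. */
static int demo_init_step(bool *res, bool fail)
{
	if (fail)
		return -ENOMEM;
	*res = true;
	return 0;
}

/* Release one resource; record it if the resource was not initialized. */
static void demo_undo_step(bool *res)
{
	if (!*res)
		demo_bad_undo++;
	*res = false;
}

/*
 * fail_at selects which step fails (1..3); 0 means every step succeeds.
 * Each label undoes exactly the steps that completed before the failure.
 */
static int demo_attach(struct demo_dev *dev, int fail_at)
{
	int ret;

	ret = demo_init_step(&dev->rings, fail_at == 1);
	if (ret)
		return ret;

	ret = demo_init_step(&dev->rro, fail_at == 2);
	if (ret)
		goto err_free_rings;

	ret = demo_init_step(&dev->wo, fail_at == 3);
	if (ret)
		goto err_free_rro;

	return 0;

err_free_rro:
	demo_undo_step(&dev->rro);
err_free_rings:
	demo_undo_step(&dev->rings);
	return ret;
}
```

With this shape, no teardown helper ever sees a resource that was not set up, so the helpers themselves need no NULL checks. Whether the real mtk_wed_attach() can be restructured this way is a separate question that the thread leaves open.
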
On Dec 05, Leon Romanovsky wrote:
> On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > > >
> > > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > ---
> > > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > > >  3 files changed, 26 insertions(+), 11 deletions(-)
> > >
> > > <...>
> > >
> > > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > index f9539e6233c9..b084009a32f9 100644
> > > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > > >  	u16 seq;
> > > >  	int ret;
> > > >
> > > > +	if (!wo)
> > > > +		return -ENODEV;
> > >
> > > <...>
> > >
> > > >  static void
> > > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > > >  {
> > > > +	if (!wo)
> > > > +		return;
> > >
> > > How are these changes related to the deadlock described in the commit message?
> > > How is it possible to reach internal mtk functions without a valid wo?
> >
> > Hi Leon,
> >
> > if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> > __mtk_wed_detach() when the wo struct is not allocated yet (wo is allocated in
> > mtk_wed_wo_init()).
>
> IMHO, that is the culprit: proper error unwinding means that you don't call
> uninit functions for something that is not initialized yet. It is better
> to fix that instead of adding "if (!wo) ..." checks.

So, iiuc, you would prefer to do something like:

__mtk_wed_detach()
{
	...
	if (mtk_wed_get_rx_capa(dev) && wo) {
		mtk_wed_wo_reset(dev);
		mtk_wed_free_rx_rings(dev);
		mtk_wed_wo_deinit(hw);
	}
	...

Right? I am fine both ways :)

>
> > Moreover, __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()
>
> This is the other side of the same coin. If you can run them in parallel, you
> need locking protection and the ability to cancel work, so that nothing is
> executed once cleanup has succeeded.

Sorry, I did not get what you mean here by 'in parallel'. __mtk_wed_detach()
always runs with the hw_lock mutex held, in both mtk_wed_attach() and
mtk_wed_detach().

Regards,
Lorenzo

>
> These were my 2 cents, totally IMHO.
>
> Thanks

On Mon, Dec 05, 2022 at 10:04:07AM +0100, Lorenzo Bianconi wrote:
> On Dec 05, Leon Romanovsky wrote:
> > On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > > > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > > > >
> > > > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > > ---
> > > > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > > > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > > > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > > > >  3 files changed, 26 insertions(+), 11 deletions(-)
> > > >
> > > > <...>
> > > >
> > > > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > index f9539e6233c9..b084009a32f9 100644
> > > > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > > > >  	u16 seq;
> > > > >  	int ret;
> > > > >
> > > > > +	if (!wo)
> > > > > +		return -ENODEV;
> > > >
> > > > <...>
> > > >
> > > > >  static void
> > > > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > > > >  {
> > > > > +	if (!wo)
> > > > > +		return;
> > > >
> > > > How are these changes related to the deadlock described in the commit message?
> > > > How is it possible to reach internal mtk functions without a valid wo?
> > >
> > > Hi Leon,
> > >
> > > if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> > > __mtk_wed_detach() when the wo struct is not allocated yet (wo is allocated in
> > > mtk_wed_wo_init()).
> >
> > IMHO, that is the culprit: proper error unwinding means that you don't call
> > uninit functions for something that is not initialized yet. It is better
> > to fix that instead of adding "if (!wo) ..." checks.
>
> So, iiuc, you would prefer to do something like:
>
> __mtk_wed_detach()
> {
> 	...
> 	if (mtk_wed_get_rx_capa(dev) && wo) {
> 		mtk_wed_wo_reset(dev);
> 		mtk_wed_free_rx_rings(dev);
> 		mtk_wed_wo_deinit(hw);
> 	}
> 	...
>
> Right? I am fine both ways :)

Yes

> >
> > > Moreover, __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()
> >
> > This is the other side of the same coin. If you can run them in parallel, you
> > need locking protection and the ability to cancel work, so that nothing is
> > executed once cleanup has succeeded.
>
> Sorry, I did not get what you mean here by 'in parallel'. __mtk_wed_detach()
> always runs with the hw_lock mutex held, in both mtk_wed_attach() and
> mtk_wed_detach().

A lock is not enough; you need to make sure that no underlying code is called
without wo. Your suggestion above is fine. The fewer "if (!wo) ..." checks
the low-level code has, the better.

Thanks

>
> Regards,
> Lorenzo
>
> >
> > These were my 2 cents, totally IMHO.
> >
> > Thanks

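The agreed direction, one guard in the caller rather than a NULL check in every low-level helper, can be illustrated with a small compilable sketch. This is only an assumption-laden model of Lorenzo's pseudocode: `demo_hw`, `demo_wo`, `demo_wo_reset()`, `demo_wo_deinit()`, `demo_detach()`, `demo_reset_calls` and the `rx_capable` field (standing in for mtk_wed_get_rx_capa()) are hypothetical names, not driver symbols.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_wo {
	int state;
};

struct demo_hw {
	bool rx_capable;	/* stands in for mtk_wed_get_rx_capa() */
	struct demo_wo *wo;	/* NULL until the init step allocates it */
};

/* Counts how often the low-level reset helper actually ran. */
static int demo_reset_calls;

/* Low-level helpers: they assume a valid wo and do not check for NULL. */
static void demo_wo_reset(struct demo_wo *wo)
{
	wo->state = 0;
	demo_reset_calls++;
}

static void demo_wo_deinit(struct demo_hw *hw)
{
	free(hw->wo);
	hw->wo = NULL;
}

/* A single guard in the caller decides whether the wo teardown runs at all. */
static void demo_detach(struct demo_hw *hw)
{
	if (hw->rx_capable && hw->wo) {
		demo_wo_reset(hw->wo);
		demo_wo_deinit(hw);
	}
}
```

Calling `demo_detach()` on a `demo_hw` whose `wo` is still NULL skips the whole block, so the helpers can keep assuming a valid pointer.
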
On Mon, 5 Dec 2022 10:04:07 +0100 Lorenzo Bianconi wrote:
> > IMHO, that is the culprit: proper error unwinding means that you don't call
> > uninit functions for something that is not initialized yet. It is better
> > to fix that instead of adding "if (!wo) ..." checks.
>
> So, iiuc, you would prefer to do something like:
>
> __mtk_wed_detach()
> {
> 	...
> 	if (mtk_wed_get_rx_capa(dev) && wo) {
> 		mtk_wed_wo_reset(dev);
> 		mtk_wed_free_rx_rings(dev);
> 		mtk_wed_wo_deinit(hw);
> 	}
> 	...
>
> Right? I am fine both ways :)

FWIW, that does seem slightly better to me as well.
Also - aren't you really fixing multiple issues here
(even if on the same error path)? The locking,
the null-checking and the change in mtk_wed_wo_reset()?

> > > IMHO, that is the culprit: proper error unwinding means that you don't call
> > > uninit functions for something that is not initialized yet. It is better
> > > to fix that instead of adding "if (!wo) ..." checks.
> >
> > So, iiuc, you would prefer to do something like:
> >
> > __mtk_wed_detach()
> > {
> > 	...
> > 	if (mtk_wed_get_rx_capa(dev) && wo) {
> > 		mtk_wed_wo_reset(dev);
> > 		mtk_wed_free_rx_rings(dev);
> > 		mtk_wed_wo_deinit(hw);
> > 	}
> > 	...
> >
> > Right? I am fine both ways :)
>
> FWIW, that does seem slightly better to me as well.
> Also - aren't you really fixing multiple issues here
> (even if on the same error path)? The locking,
> the null-checking and the change in mtk_wed_wo_reset()?

The wo NULL pointer issue was not hit before because of the deadlock one (so I
fixed them in the same patch).
Do you prefer to split them into two patches? (wo NULL pointer fix first).
I have posted v2 addressing Leon's comments but I need to post a v3 to add the
missing WARN_ON.

Regards,
Lorenzo

On Wed, 7 Dec 2022 00:52:28 +0100 Lorenzo Bianconi wrote:
> > FWIW, that does seem slightly better to me as well.
> > Also - aren't you really fixing multiple issues here
> > (even if on the same error path)? The locking,
> > the null-checking and the change in mtk_wed_wo_reset()?
>
> The wo NULL pointer issue was not hit before because of the deadlock one (so I
> fixed them in the same patch).
> Do you prefer to split them into two patches? (wo NULL pointer fix first).

Yes, I think they are different issues even if one "covers" the other.
I think it'd make the review / judgment easier.

> I have posted v2 addressing Leon's comments but I need to post a v3 to add the
> missing WARN_ON.

> On Wed, 7 Dec 2022 00:52:28 +0100 Lorenzo Bianconi wrote:
> > > FWIW, that does seem slightly better to me as well.
> > > Also - aren't you really fixing multiple issues here
> > > (even if on the same error path)? The locking,
> > > the null-checking and the change in mtk_wed_wo_reset()?
> >
> > The wo NULL pointer issue was not hit before because of the deadlock one (so I
> > fixed them in the same patch).
> > Do you prefer to split them into two patches? (wo NULL pointer fix first).
>
> Yes, I think they are different issues even if one "covers" the other.
> I think it'd make the review / judgment easier.

ok, I will post v3 splitting them.

Regards,
Lorenzo

>
> > I have posted v2 addressing Leon's comments but I need to post a v3 to add the
> > missing WARN_ON.
>

diff --git a/drivers/net/ethernet/mediatek/mtk_wed.c b/drivers/net/ethernet/mediatek/mtk_wed.c
index d041615b2bac..6352abd4157e 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed.c
@@ -174,9 +174,10 @@ mtk_wed_wo_reset(struct mtk_wed_device *dev)
 	mtk_wdma_tx_reset(dev);
 	mtk_wed_reset(dev, MTK_WED_RESET_WED);
 
-	mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO,
-			     MTK_WED_WO_CMD_CHANGE_STATE, &state,
-			     sizeof(state), false);
+	if (mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO,
+				 MTK_WED_WO_CMD_CHANGE_STATE, &state,
+				 sizeof(state), false))
+		return;
 
 	if (readx_poll_timeout(mtk_wed_wo_read_status, dev, val,
 			       val == MTK_WED_WOIF_DISABLE_DONE,
@@ -576,12 +577,10 @@ mtk_wed_deinit(struct mtk_wed_device *dev)
 }
 
 static void
-mtk_wed_detach(struct mtk_wed_device *dev)
+__mtk_wed_detach(struct mtk_wed_device *dev)
 {
 	struct mtk_wed_hw *hw = dev->hw;
 
-	mutex_lock(&hw_lock);
-
 	mtk_wed_deinit(dev);
 
 	mtk_wdma_rx_reset(dev);
@@ -612,6 +611,13 @@ mtk_wed_detach(struct mtk_wed_device *dev)
 	module_put(THIS_MODULE);
 
 	hw->wed_dev = NULL;
+}
+
+static void
+mtk_wed_detach(struct mtk_wed_device *dev)
+{
+	mutex_lock(&hw_lock);
+	__mtk_wed_detach(dev);
 	mutex_unlock(&hw_lock);
 }
 
@@ -1490,8 +1496,10 @@ mtk_wed_attach(struct mtk_wed_device *dev)
 		ret = mtk_wed_wo_init(hw);
 	}
 out:
-	if (ret)
-		mtk_wed_detach(dev);
+	if (ret) {
+		dev_err(dev->hw->dev, "failed to attach wed device\n");
+		__mtk_wed_detach(dev);
+	}
 unlock:
 	mutex_unlock(&hw_lock);
 
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
index f9539e6233c9..b084009a32f9 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
@@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
 	u16 seq;
 	int ret;
 
+	if (!wo)
+		return -ENODEV;
+
 	skb = mtk_wed_mcu_msg_alloc(data, len);
 	if (!skb)
 		return -ENOMEM;
@@ -202,13 +205,14 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
 int mtk_wed_mcu_msg_update(struct mtk_wed_device *dev, int id, void *data,
 			   int len)
 {
-	struct mtk_wed_wo *wo = dev->hw->wed_wo;
+	if (!dev->hw || !dev->hw->wed_wo)
+		return 0;
 
 	if (dev->hw->version == 1)
 		return 0;
 
-	return mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO, id, data, len,
-				    true);
+	return mtk_wed_mcu_send_msg(dev->hw->wed_wo, MTK_WED_MODULE_ID_WO, id,
+				    data, len, true);
 }
 
 static int
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
index a219da85f4db..92440d62e01c 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
@@ -464,6 +464,9 @@ mtk_wed_wo_hardware_init(struct mtk_wed_wo *wo)
 static void
 mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
 {
+	if (!wo)
+		return;
+
 	/* disable interrupts */
 	mtk_wed_wo_set_isr(wo, 0);
 

Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
mtk_wed_attach routine if mtk_wed_wo_init fails.

Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
 drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
 drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
 3 files changed, 26 insertions(+), 11 deletions(-)
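
The deadlock named in the commit message comes from an error path that already holds hw_lock calling a function that takes hw_lock again. Kernel mutexes are not recursive, so the second lock attempt never returns. The sketch below models this in user space with a toy lock that reports the relock instead of hanging. Everything in it is hypothetical and for illustration only: `demo_mutex`, `demo_hw_lock`, `demo_detach_locked()` (playing the role of `__mtk_wed_detach()`), `demo_detach()`, and the two `demo_attach_fail_*()` variants.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: reports a self-relock instead of hanging. */
struct demo_mutex {
	bool held;
};

static struct demo_mutex demo_hw_lock;
static bool attached;

static int demo_lock(struct demo_mutex *m)
{
	if (m->held)
		return -EDEADLK;	/* a real kernel mutex would block forever */
	m->held = true;
	return 0;
}

static void demo_unlock(struct demo_mutex *m)
{
	m->held = false;
}

/* Teardown body; the caller must already hold demo_hw_lock. */
static void demo_detach_locked(void)
{
	attached = false;
}

/* Public entry point: takes demo_hw_lock around the teardown. */
static int demo_detach(void)
{
	int ret = demo_lock(&demo_hw_lock);

	if (ret)
		return ret;
	demo_detach_locked();
	demo_unlock(&demo_hw_lock);
	return 0;
}

/* Old-style error path: calls the locking wrapper while holding the lock. */
static int demo_attach_fail_old(void)
{
	int ret = demo_lock(&demo_hw_lock);

	if (ret)
		return ret;
	attached = true;
	ret = demo_detach();	/* relock attempt: -EDEADLK in this model */
	demo_unlock(&demo_hw_lock);
	return ret;
}

/* New-style error path: calls the lock-free helper instead. */
static int demo_attach_fail_new(void)
{
	int ret = demo_lock(&demo_hw_lock);

	if (ret)
		return ret;
	attached = true;
	demo_detach_locked();
	demo_unlock(&demo_hw_lock);
	return 0;
}
```

In this model `demo_attach_fail_old()` returns `-EDEADLK` and leaves `attached` set, while `demo_attach_fail_new()` completes the teardown. The toy lock only makes the failure observable; it does not reproduce real mutex semantics such as blocking or ownership tracking.
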