Message ID | cover.1632916329.git.leonro@nvidia.com (mailing list archive) |
---|---|
Series | Devlink reload and missed notifications fix |
On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> This series starts from the fixing the bug introduced by implementing
> devlink delayed notifications logic, where I missed some of the
> notifications functions.
>
> The rest series provides a way to dynamically set devlink ops that is
> needed for mlx5 multiport device and starts cleanup by removing
> not-needed logic.
>
> In the next series, we will delete various publish API, drop general
> lock, annotate the code and rework logic around devlink->lock.
>
> All this is possible because driver initialization is separated from the
> user input now.

Swapping ops is a nasty hack in my book.

And all that to avoid having two op structures in one driver.
Or to avoid having counters which are always 0?

Sorry, at the very least you need better explanation for this.
On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > This series starts from the fixing the bug introduced by implementing
> > devlink delayed notifications logic, where I missed some of the
> > notifications functions.
> >
> > The rest series provides a way to dynamically set devlink ops that is
> > needed for mlx5 multiport device and starts cleanup by removing
> > not-needed logic.
> >
> > In the next series, we will delete various publish API, drop general
> > lock, annotate the code and rework logic around devlink->lock.
> >
> > All this is possible because driver initialization is separated from the
> > user input now.
>
> Swapping ops is a nasty hack in my book.
>
> And all that to avoid having two op structures in one driver.
> Or to avoid having counters which are always 0?
>
> Sorry, at the very least you need better explanation for this.

Leon, while the discussion about this unfolds, can you please repost
patch 1 separately? :)

Thanks.
On Wed, 29 Sep 2021 13:46:38 +0000 Vladimir Oltean wrote:
> On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > Swapping ops is a nasty hack in my book.
> >
> > And all that to avoid having two op structures in one driver.
> > Or to avoid having counters which are always 0?
> >
> > Sorry, at the very least you need better explanation for this.
>
> Leon, while the discussion about this unfolds, can you please repost
> patch 1 separately? :)

Yes, please and thanks! :)
On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > This series starts from the fixing the bug introduced by implementing
> > devlink delayed notifications logic, where I missed some of the
> > notifications functions.
> >
> > The rest series provides a way to dynamically set devlink ops that is
> > needed for mlx5 multiport device and starts cleanup by removing
> > not-needed logic.
> >
> > In the next series, we will delete various publish API, drop general
> > lock, annotate the code and rework logic around devlink->lock.
> >
> > All this is possible because driver initialization is separated from the
> > user input now.
>
> Swapping ops is a nasty hack in my book.
>
> And all that to avoid having two op structures in one driver.
> Or to avoid having counters which are always 0?

We don't need to advertise counters for feature that is not supported.
In multiport mlx5 devices, the reload functionality is not supported, so
this change at least make that device to behave like all other netdev
devices that don't support devlink reload.

The ops structure is set very early to make sure that internal devlink
routines will be able access driver back during initialization (btw very
questionable design choice), and at that stage the driver doesn't know
yet which device type it is going to drive.

So the answer is:
1. Can't have two structures.
2. Same behaviour across all netdev devices.

>
> Sorry, at the very least you need better explanation for this.

Was it better explained now?
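The argument in the message above is that reload counters should appear only when reload itself is supported. A minimal userspace sketch of that rule follows, assuming that "supported" means both reload callbacks are present. The `demo_` types and functions are hypothetical stand-ins for illustration and are not the kernel's `struct devlink_ops` or any devlink API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a devlink ops table. */
struct demo_devlink_ops {
	int (*reload_down)(void *priv);
	int (*reload_up)(void *priv);
};

/* Assumption: reload counts as supported only with both callbacks set. */
static bool demo_reload_supported(const struct demo_devlink_ops *ops)
{
	assert(ops != NULL);
	return ops->reload_down && ops->reload_up;
}

/* Reload counters are reported only for devices that can reload. */
static bool demo_show_reload_stats(const struct demo_devlink_ops *ops)
{
	return demo_reload_supported(ops);
}

static int demo_reload_down(void *priv) { (void)priv; return 0; }
static int demo_reload_up(void *priv) { (void)priv; return 0; }

/* A device that supports reload. */
static const struct demo_devlink_ops demo_ops_with_reload = {
	.reload_down = demo_reload_down,
	.reload_up = demo_reload_up,
};

/* A device without reload, such as the multiport case discussed above. */
static const struct demo_devlink_ops demo_ops_without_reload = { 0 };
```

Under this assumption, a driver that cannot choose its ops table before learning the device type ends up advertising counters for every device, which is the behaviour the message objects to.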
On Wed, Sep 29, 2021 at 06:56:21AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 13:46:38 +0000 Vladimir Oltean wrote:
> > On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > > Swapping ops is a nasty hack in my book.
> > >
> > > And all that to avoid having two op structures in one driver.
> > > Or to avoid having counters which are always 0?
> > >
> > > Sorry, at the very least you need better explanation for this.
> >
> > Leon, while the discussion about this unfolds, can you please repost
> > patch 1 separately? :)
>
> Yes, please and thanks! :)

Done, thanks
https://lore.kernel.org/netdev/2ed1159291f2a589b013914f2b60d8172fc525c1.1632925030.git.leonro@nvidia.com/T/#u
On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > > This series starts from the fixing the bug introduced by implementing
> > > devlink delayed notifications logic, where I missed some of the
> > > notifications functions.
> > >
> > > The rest series provides a way to dynamically set devlink ops that is
> > > needed for mlx5 multiport device and starts cleanup by removing
> > > not-needed logic.
> > >
> > > In the next series, we will delete various publish API, drop general
> > > lock, annotate the code and rework logic around devlink->lock.
> > >
> > > All this is possible because driver initialization is separated from the
> > > user input now.
> >
> > Swapping ops is a nasty hack in my book.
> >
> > And all that to avoid having two op structures in one driver.
> > Or to avoid having counters which are always 0?
>
> We don't need to advertise counters for feature that is not supported.
> In multiport mlx5 devices, the reload functionality is not supported, so
> this change at least make that device to behave like all other netdev
> devices that don't support devlink reload.
>
> The ops structure is set very early to make sure that internal devlink
> routines will be able access driver back during initialization (btw very
> questionable design choice)

Indeed, is this fixable? Or now that devlink_register() was moved to
the end of probe netdev can call ops before instance is registered?

> and at that stage the driver doesn't know
> yet which device type it is going to drive.
>
> So the answer is:
> 1. Can't have two structures.

I still don't understand why. To be clear - swapping full op structures
is probably acceptable if it's a pure upgrade (existing pointers match).
Poking new ops into a structure (in alphabetical order if I understand
your reply to Greg, not destructor-before-contructor) is what I deem
questionable.

> 2. Same behaviour across all netdev devices.

Unclear what this is referring to.
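One way to read "pure upgrade (existing pointers match)" is that a replacement ops table may add callbacks but must leave every callback that was already set unchanged. The sketch below checks that property in userspace C. The reduced `demo_ops` layout and all `demo_` names are hypothetical, and the check is an interpretation of the remark above rather than anything taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced ops table used only for this illustration. */
struct demo_ops {
	int (*info_get)(void *priv);
	int (*reload_down)(void *priv);
	int (*reload_up)(void *priv);
};

/* True when every callback set in old_ops is identical in new_ops. */
static bool demo_is_pure_upgrade(const struct demo_ops *old_ops,
				 const struct demo_ops *new_ops)
{
	assert(old_ops != NULL && new_ops != NULL);
	if (old_ops->info_get && old_ops->info_get != new_ops->info_get)
		return false;
	if (old_ops->reload_down && old_ops->reload_down != new_ops->reload_down)
		return false;
	if (old_ops->reload_up && old_ops->reload_up != new_ops->reload_up)
		return false;
	return true;
}

static int demo_info_get(void *priv) { (void)priv; return 0; }
static int demo_other_info_get(void *priv) { (void)priv; return 1; }
static int demo_reload_down(void *priv) { (void)priv; return 0; }
static int demo_reload_up(void *priv) { (void)priv; return 0; }

/* Two complete, constant tables: without and with reload support. */
static const struct demo_ops demo_base_ops = {
	.info_get = demo_info_get,
};
static const struct demo_ops demo_reload_ops = {
	.info_get = demo_info_get,
	.reload_down = demo_reload_down,
	.reload_up = demo_reload_up,
};
/* A table that changes an existing callback, so it is not an upgrade. */
static const struct demo_ops demo_other_ops = {
	.info_get = demo_other_info_get,
};
```

With two complete constant tables, moving from `demo_base_ops` to `demo_reload_ops` passes the check, while the reverse direction or a table that replaces `info_get` does not.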
On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> > On Wed, Sep 29, 2021 at 06:40:04AM -0700, Jakub Kicinski wrote:
> > > On Wed, 29 Sep 2021 15:00:41 +0300 Leon Romanovsky wrote:
> > > > This series starts from the fixing the bug introduced by implementing
> > > > devlink delayed notifications logic, where I missed some of the
> > > > notifications functions.
> > > >
> > > > The rest series provides a way to dynamically set devlink ops that is
> > > > needed for mlx5 multiport device and starts cleanup by removing
> > > > not-needed logic.
> > > >
> > > > In the next series, we will delete various publish API, drop general
> > > > lock, annotate the code and rework logic around devlink->lock.
> > > >
> > > > All this is possible because driver initialization is separated from the
> > > > user input now.
> > >
> > > Swapping ops is a nasty hack in my book.
> > >
> > > And all that to avoid having two op structures in one driver.
> > > Or to avoid having counters which are always 0?
> >
> > We don't need to advertise counters for feature that is not supported.
> > In multiport mlx5 devices, the reload functionality is not supported, so
> > this change at least make that device to behave like all other netdev
> > devices that don't support devlink reload.
> >
> > The ops structure is set very early to make sure that internal devlink
> > routines will be able access driver back during initialization (btw very
> > questionable design choice)
>
> Indeed, is this fixable? Or now that devlink_register() was moved to
> the end of probe netdev can call ops before instance is registered?
>
> > and at that stage the driver doesn't know
> > yet which device type it is going to drive.
> >
> > So the answer is:
> > 1. Can't have two structures.
>
> I still don't understand why. To be clear - swapping full op structures
> is probably acceptable if it's a pure upgrade (existing pointers match).
> Poking new ops into a structure (in alphabetical order if I understand
> your reply to Greg, not destructor-before-contructor) is what I deem
> questionable.

It is sorted simply for readability and not for any other technical
reason.

Regarding new ops, this is how we are setting callbacks in RDMA based on
actual device support. It works like a charm.

>
> > 2. Same behaviour across all netdev devices.
>
> Unclear what this is referring to.

If your device doesn't support devlink reload, it won't print any
reload counters at all. It is not the case for the multiport mlx5
device. It doesn't support, but still present these counters.

Thanks
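The per-callback approach described above (setting callbacks based on actual device support) can be sketched as a set-if-unset helper that fills a per-device table before the device is exposed. This is a userspace illustration only. `DEMO_SET_OP`, `demo_set_ops`, `demo_build_ops` and the reduced `demo_ops` layout are hypothetical names, and the sketch is not the macro proposed in this series or the RDMA implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced ops table; not the kernel's struct devlink_ops. */
struct demo_ops {
	int (*info_get)(void *priv);
	int (*reload_down)(void *priv);
	int (*reload_up)(void *priv);
};

/* Copy one callback from src into dst, never overwriting a set pointer. */
#define DEMO_SET_OP(dst, src, name)                \
	do {                                       \
		if ((src)->name && !(dst)->name)   \
			(dst)->name = (src)->name; \
	} while (0)

/* Merge the callbacks that src provides into dst. */
static void demo_set_ops(struct demo_ops *dst, const struct demo_ops *src)
{
	assert(dst != NULL && src != NULL);
	DEMO_SET_OP(dst, src, info_get);
	DEMO_SET_OP(dst, src, reload_down);
	DEMO_SET_OP(dst, src, reload_up);
}

static int demo_info_get(void *priv) { (void)priv; return 0; }
static int demo_reload_down(void *priv) { (void)priv; return 0; }
static int demo_reload_up(void *priv) { (void)priv; return 0; }

static const struct demo_ops demo_base_ops = {
	.info_get = demo_info_get,
};
static const struct demo_ops demo_reload_ops = {
	.reload_down = demo_reload_down,
	.reload_up = demo_reload_up,
};

/* Build the per-device table before the device becomes visible to users. */
static struct demo_ops demo_build_ops(bool is_multiport)
{
	struct demo_ops ops = { 0 };

	demo_set_ops(&ops, &demo_base_ops);
	if (!is_multiport)
		demo_set_ops(&ops, &demo_reload_ops);
	return ops;
}
```

The table is mutable in this design, so the sketch assumes all setting happens during initialization, before any user request can reach the callbacks.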
On Wed, 29 Sep 2021 18:31:51 +0300 Leon Romanovsky wrote:
> On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> > On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> > > We don't need to advertise counters for feature that is not supported.
> > > In multiport mlx5 devices, the reload functionality is not supported, so
> > > this change at least make that device to behave like all other netdev
> > > devices that don't support devlink reload.
> > >
> > > The ops structure is set very early to make sure that internal devlink
> > > routines will be able access driver back during initialization (btw very
> > > questionable design choice)
> >
> > Indeed, is this fixable? Or now that devlink_register() was moved to
> > the end of probe netdev can call ops before instance is registered?
> >
> > > and at that stage the driver doesn't know
> > > yet which device type it is going to drive.
> > >
> > > So the answer is:
> > > 1. Can't have two structures.
> >
> > I still don't understand why. To be clear - swapping full op structures
> > is probably acceptable if it's a pure upgrade (existing pointers match).
> > Poking new ops into a structure (in alphabetical order if I understand
> > your reply to Greg, not destructor-before-contructor) is what I deem
> > questionable.
>
> It is sorted simply for readability and not for any other technical
> reason.
>
> Regarding new ops, this is how we are setting callbacks in RDMA based on
> actual device support. It works like a charm.
>
> > > 2. Same behaviour across all netdev devices.
> >
> > Unclear what this is referring to.
>
> If your device doesn't support devlink reload, it won't print any
> reload counters at all. It is not the case for the multiport mlx5
> device. It doesn't support, but still present these counters.

There's myriad ways you can hide features.

Swapping ops is heavy handed and prone to data races, I don't like it.
On Wed, Sep 29, 2021 at 10:55:37AM -0700, Jakub Kicinski wrote:
> On Wed, 29 Sep 2021 18:31:51 +0300 Leon Romanovsky wrote:
> > On Wed, Sep 29, 2021 at 07:39:40AM -0700, Jakub Kicinski wrote:
> > > On Wed, 29 Sep 2021 17:13:28 +0300 Leon Romanovsky wrote:
> > > > We don't need to advertise counters for feature that is not supported.
> > > > In multiport mlx5 devices, the reload functionality is not supported, so
> > > > this change at least make that device to behave like all other netdev
> > > > devices that don't support devlink reload.
> > > >
> > > > The ops structure is set very early to make sure that internal devlink
> > > > routines will be able access driver back during initialization (btw very
> > > > questionable design choice)
> > >
> > > Indeed, is this fixable? Or now that devlink_register() was moved to
> > > the end of probe netdev can call ops before instance is registered?
> > >
> > > > and at that stage the driver doesn't know
> > > > yet which device type it is going to drive.
> > > >
> > > > So the answer is:
> > > > 1. Can't have two structures.
> > >
> > > I still don't understand why. To be clear - swapping full op structures
> > > is probably acceptable if it's a pure upgrade (existing pointers match).
> > > Poking new ops into a structure (in alphabetical order if I understand
> > > your reply to Greg, not destructor-before-contructor) is what I deem
> > > questionable.
> >
> > It is sorted simply for readability and not for any other technical
> > reason.
> >
> > Regarding new ops, this is how we are setting callbacks in RDMA based on
> > actual device support. It works like a charm.
> >
> > > > 2. Same behaviour across all netdev devices.
> > >
> > > Unclear what this is referring to.
> >
> > If your device doesn't support devlink reload, it won't print any
> > reload counters at all. It is not the case for the multiport mlx5
> > device. It doesn't support, but still present these counters.
>
> There's myriad ways you can hide features.
>
> Swapping ops is heavy handed and prone to data races, I don't like it.

I'm not swapping, but setting only in supported devices.

Anyway, please give me a chance to present improved version of this
mechanism and we will continue from there.

Thanks
From: Leon Romanovsky <leonro@nvidia.com>

Changelog:
v1:
 * Missed removal of extra WARN_ON
 * Added "ops parameter to macro as Dan suggested.
v0: https://lore.kernel.org/all/cover.1632909221.git.leonro@nvidia.com

-------------------------------------------------------------------
Hi,

This series starts from the fixing the bug introduced by implementing
devlink delayed notifications logic, where I missed some of the
notifications functions.

The rest series provides a way to dynamically set devlink ops that is
needed for mlx5 multiport device and starts cleanup by removing
not-needed logic.

In the next series, we will delete various publish API, drop general
lock, annotate the code and rework logic around devlink->lock.

All this is possible because driver initialization is separated from the
user input now.

Thanks

Leon Romanovsky (5):
  devlink: Add missed notifications iterators
  devlink: Allow modification of devlink ops
  devlink: Allow set specific ops callbacks dynamically
  net/mlx5: Register separate reload devlink ops for multiport device
  devlink: Delete reload enable/disable interface

 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   6 +-
 .../net/ethernet/cavium/liquidio/lio_main.c   |   2 +-
 .../freescale/dpaa2/dpaa2-eth-devlink.c       |   2 +-
 .../hisilicon/hns3/hns3pf/hclge_devlink.c     |   5 +-
 .../hisilicon/hns3/hns3vf/hclgevf_devlink.c   |   5 +-
 .../net/ethernet/huawei/hinic/hinic_devlink.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_devlink.c  |   2 +-
 .../marvell/octeontx2/af/rvu_devlink.c        |   2 +-
 .../marvell/prestera/prestera_devlink.c       |   2 +-
 drivers/net/ethernet/mellanox/mlx4/main.c     |   4 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.c |  15 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |   3 -
 .../mellanox/mlx5/core/sf/dev/driver.c        |   5 +-
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  12 +-
 drivers/net/ethernet/mscc/ocelot.h            |   2 +-
 drivers/net/ethernet/mscc/ocelot_net.c        |   2 +-
 .../net/ethernet/netronome/nfp/nfp_devlink.c  |   2 +-
 drivers/net/ethernet/netronome/nfp/nfp_main.h |   2 +-
 .../ethernet/pensando/ionic/ionic_devlink.c   |   2 +-
 drivers/net/ethernet/qlogic/qed/qed_devlink.c |   2 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |   2 +-
 drivers/net/ethernet/ti/cpsw_new.c            |   2 +-
 drivers/net/netdevsim/dev.c                   |   5 +-
 drivers/ptp/ptp_ocp.c                         |   2 +-
 drivers/staging/qlge/qlge_main.c              |   2 +-
 include/net/devlink.h                         |  15 +-
 net/core/devlink.c                            | 156 ++++++++++--------
 net/dsa/dsa2.c                                |   2 +-
 28 files changed, 131 insertions(+), 134 deletions(-)