Message ID | 20231115115537.420374-1-shaozhengchao@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next,v3] bonding: use WARN_ON instead of BUG in alb_upper_dev_walk | expand |
Wed, Nov 15, 2023 at 12:55:37PM CET, shaozhengchao@huawei.com wrote: >If failed to allocate "tags" or could not find the final upper device from >start_dev's upper list in bond_verify_device_path(), only the loopback >detection of the current upper device should be affected, and the system is >no need to be panic. > >Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >--- >v3: return -ENOMEM instead of zero to stop walk >v2: use WARN_ON_ONCE instead of WARN_ON Yet the WARN_ON is back :O >--- > drivers/net/bonding/bond_alb.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > >diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c >index dc2c7b979656..21f1cb8e453b 100644 >--- a/drivers/net/bonding/bond_alb.c >+++ b/drivers/net/bonding/bond_alb.c >@@ -984,8 +984,10 @@ static int alb_upper_dev_walk(struct net_device *upper, > */ > if (netif_is_macvlan(upper) && !strict_match) { > tags = bond_verify_device_path(bond->dev, upper, 0); >- if (IS_ERR_OR_NULL(tags)) >- BUG(); >+ if (IS_ERR_OR_NULL(tags)) { >+ WARN_ON(1); >+ return -ENOMEM; >+ } > alb_send_lp_vid(slave, upper->dev_addr, > tags[0].vlan_proto, tags[0].vlan_id); > kfree(tags); >-- >2.34.1 > >
On Wed, Nov 15, 2023 at 03:22:39PM +0100, Jiri Pirko wrote: > Wed, Nov 15, 2023 at 12:55:37PM CET, shaozhengchao@huawei.com wrote: > >If failed to allocate "tags" or could not find the final upper device from > >start_dev's upper list in bond_verify_device_path(), only the loopback > >detection of the current upper device should be affected, and the system is > >no need to be panic. > > > >Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> > >--- > >v3: return -ENOMEM instead of zero to stop walk > >v2: use WARN_ON_ONCE instead of WARN_ON > > Yet the WARN_ON is back :O Hi Jiri, I think the suggestion was to either: 1. WARN_ON_ONCE(); return 0; <= this was v2 2. WARN_ON(); return -ESOMETHING; <= this is v3 (But not, WARN_ON(); return 0; <= this was v1) And after v2 it was determined that the approach taken here in v3 is preferred. So I think this patch is consistent with the feedback given by Jay in his reviews so far. > > > >--- > > drivers/net/bonding/bond_alb.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > >diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c > >index dc2c7b979656..21f1cb8e453b 100644 > >--- a/drivers/net/bonding/bond_alb.c > >+++ b/drivers/net/bonding/bond_alb.c > >@@ -984,8 +984,10 @@ static int alb_upper_dev_walk(struct net_device *upper, > > */ > > if (netif_is_macvlan(upper) && !strict_match) { > > tags = bond_verify_device_path(bond->dev, upper, 0); > >- if (IS_ERR_OR_NULL(tags)) > >- BUG(); > >+ if (IS_ERR_OR_NULL(tags)) { > >+ WARN_ON(1); > >+ return -ENOMEM; > >+ } > > alb_send_lp_vid(slave, upper->dev_addr, > > tags[0].vlan_proto, tags[0].vlan_id); > > kfree(tags); > >-- > >2.34.1 > > > > >
Simon Horman <horms@kernel.org> wrote: >On Wed, Nov 15, 2023 at 03:22:39PM +0100, Jiri Pirko wrote: >> Wed, Nov 15, 2023 at 12:55:37PM CET, shaozhengchao@huawei.com wrote: >> >If failed to allocate "tags" or could not find the final upper device from >> >start_dev's upper list in bond_verify_device_path(), only the loopback >> >detection of the current upper device should be affected, and the system is >> >no need to be panic. >> > >> >Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >> >--- >> >v3: return -ENOMEM instead of zero to stop walk >> >v2: use WARN_ON_ONCE instead of WARN_ON >> >> Yet the WARN_ON is back :O > >Hi Jiri, > >I think the suggestion was to either: > >1. WARN_ON_ONCE(); return 0; <= this was v2 >2. WARN_ON(); return -ESOMETHING; <= this is v3 >(But not, WARN_ON(); return 0; <= this was v1) > >And after v2 it was determined that the approach taken here in v3 is >preferred. > >So I think this patch is consistent with the feedback given by Jay >in his reviews so far. Sigh, the more I look the more complicated this gets. Anyway, I was previously thinking we're ok with WARN_ON if the return is non-zero to terminate the device walk. The bond itself will automatically call alb_upper_dev_walk at most once per second. However, user space could do something like continuously change the MAC address of the bond or initiate a failover in order to trigger a call to alb_upper_dev_walk. This won't be rate limited, and if the allocations there repeatedly fail, it would always trigger the WARN_ON. So, I'm thinking now that instead of WARN_anything, we should instead use something like net_err_ratelimited("%s: %s: allocation failure\n", start_dev->name, __func__); in bond_verify_device_path, and alb_upper_dev_walk doesn't do any WARN at all, and returns failure (non-zero). This is consistent with other similar allocation failures. -J >> >--- >> > drivers/net/bonding/bond_alb.c | 6 ++++-- >> > 1 file changed, 4 insertions(+), 2 deletions(-) >> > >> >diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c >> >index dc2c7b979656..21f1cb8e453b 100644 >> >--- a/drivers/net/bonding/bond_alb.c >> >+++ b/drivers/net/bonding/bond_alb.c >> >@@ -984,8 +984,10 @@ static int alb_upper_dev_walk(struct net_device *upper, >> > */ >> > if (netif_is_macvlan(upper) && !strict_match) { >> > tags = bond_verify_device_path(bond->dev, upper, 0); >> >- if (IS_ERR_OR_NULL(tags)) >> >- BUG(); >> >+ if (IS_ERR_OR_NULL(tags)) { >> >+ WARN_ON(1); >> >+ return -ENOMEM; >> >+ } >> > alb_send_lp_vid(slave, upper->dev_addr, >> > tags[0].vlan_proto, tags[0].vlan_id); >> > kfree(tags); >> >-- >> >2.34.1 >> > >> > >> > --- -Jay Vosburgh, jay.vosburgh@canonical.com
On 2023/11/16 4:34, Jay Vosburgh wrote: > Simon Horman <horms@kernel.org> wrote: > >> On Wed, Nov 15, 2023 at 03:22:39PM +0100, Jiri Pirko wrote: >>> Wed, Nov 15, 2023 at 12:55:37PM CET, shaozhengchao@huawei.com wrote: >>>> If failed to allocate "tags" or could not find the final upper device from >>>> start_dev's upper list in bond_verify_device_path(), only the loopback >>>> detection of the current upper device should be affected, and the system is >>>> no need to be panic. >>>> >>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> >>>> --- >>>> v3: return -ENOMEM instead of zero to stop walk >>>> v2: use WARN_ON_ONCE instead of WARN_ON >>> >>> Yet the WARN_ON is back :O >> >> Hi Jiri, >> >> I think the suggestion was to either: >> >> 1. WARN_ON_ONCE(); return 0; <= this was v2 >> 2. WARN_ON(); return -ESOMETHING; <= this is v3 >> (But not, WARN_ON(); return 0; <= this was v1) >> >> And after v2 it was determined that the approach taken here in v3 is >> preferred. >> >> So I think this patch is consistent with the feedback given by Jay >> in his reviews so far. > > Sigh, the more I look the more complicated this gets. > > Anyway, I was previously thinking we're ok with WARN_ON if the > return is non-zero to terminate the device walk. The bond itself will > automatically call alb_upper_dev_walk at most once per second. > > However, user space could do something like continuously change > the MAC address of the bond or initiate a failover in order to trigger a > call to alb_upper_dev_walk. This won't be rate limited, and if the > allocations there repeatedly fail, it would always trigger the WARN_ON. > Yes, it will be bad. > So, I'm thinking now that instead of WARN_anything, we should > instead use something like > > net_err_ratelimited("%s: %s: allocation failure\n", start_dev->name, __func__); > > in bond_verify_device_path, and alb_upper_dev_walk doesn't do > any WARN at all, and returns failure (non-zero). > > This is consistent with other similar allocation failures. > Maybe you are right here. Thanks Zhengchao Shao > -J > >>>> --- >>>> drivers/net/bonding/bond_alb.c | 6 ++++-- >>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c >>>> index dc2c7b979656..21f1cb8e453b 100644 >>>> --- a/drivers/net/bonding/bond_alb.c >>>> +++ b/drivers/net/bonding/bond_alb.c >>>> @@ -984,8 +984,10 @@ static int alb_upper_dev_walk(struct net_device *upper, >>>> */ >>>> if (netif_is_macvlan(upper) && !strict_match) { >>>> tags = bond_verify_device_path(bond->dev, upper, 0); >>>> - if (IS_ERR_OR_NULL(tags)) >>>> - BUG(); >>>> + if (IS_ERR_OR_NULL(tags)) { >>>> + WARN_ON(1); >>>> + return -ENOMEM; >>>> + } >>>> alb_send_lp_vid(slave, upper->dev_addr, >>>> tags[0].vlan_proto, tags[0].vlan_id); >>>> kfree(tags); >>>> -- >>>> 2.34.1 >>>> >>>> >>> >> > > --- > -Jay Vosburgh, jay.vosburgh@canonical.com
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c index dc2c7b979656..21f1cb8e453b 100644 --- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -984,8 +984,10 @@ static int alb_upper_dev_walk(struct net_device *upper, */ if (netif_is_macvlan(upper) && !strict_match) { tags = bond_verify_device_path(bond->dev, upper, 0); - if (IS_ERR_OR_NULL(tags)) - BUG(); + if (IS_ERR_OR_NULL(tags)) { + WARN_ON(1); + return -ENOMEM; + } alb_send_lp_vid(slave, upper->dev_addr, tags[0].vlan_proto, tags[0].vlan_id); kfree(tags);
If failed to allocate "tags" or could not find the final upper device from start_dev's upper list in bond_verify_device_path(), only the loopback detection of the current upper device should be affected, and the system is no need to be panic. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> --- v3: return -ENOMEM instead of zero to stop walk v2: use WARN_ON_ONCE instead of WARN_ON --- drivers/net/bonding/bond_alb.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)