Message ID | 20230420082230.2968883-2-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | bonding: fix send_peer_notif overflow | expand |
Hi Hangbin,
kernel test robot noticed the following build errors:
[auto build test ERROR on net/main]
url: https://github.com/intel-lab-lkp/linux/commits/Hangbin-Liu/bonding-fix-send_peer_notif-overflow/20230420-162411
patch link: https://lore.kernel.org/r/20230420082230.2968883-2-liuhangbin%40gmail.com
patch subject: [PATCH net 1/4] bonding: fix send_peer_notif overflow
config: parisc-defconfig (https://download.01.org/0day-ci/archive/20230420/202304202222.eUq4Xfv8-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/5bf7296696ea0aa3997bf310fae2aa5cf62a3af5
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Hangbin-Liu/bonding-fix-send_peer_notif-overflow/20230420-162411
git checkout 5bf7296696ea0aa3997bf310fae2aa5cf62a3af5
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304202222.eUq4Xfv8-lkp@intel.com/
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined!
kernel test robot <lkp@intel.com> wrote: >Hi Hangbin, > >kernel test robot noticed the following build errors: > >[auto build test ERROR on net/main] > >url: https://github.com/intel-lab-lkp/linux/commits/Hangbin-Liu/bonding-fix-send_peer_notif-overflow/20230420-162411 >patch link: https://lore.kernel.org/r/20230420082230.2968883-2-liuhangbin%40gmail.com >patch subject: [PATCH net 1/4] bonding: fix send_peer_notif overflow >config: parisc-defconfig (https://download.01.org/0day-ci/archive/20230420/202304202222.eUq4Xfv8-lkp@intel.com/config) >compiler: hppa-linux-gcc (GCC) 12.1.0 >reproduce (this is a W=1 build): > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # https://github.com/intel-lab-lkp/linux/commit/5bf7296696ea0aa3997bf310fae2aa5cf62a3af5 > git remote add linux-review https://github.com/intel-lab-lkp/linux > git fetch --no-tags linux-review Hangbin-Liu/bonding-fix-send_peer_notif-overflow/20230420-162411 > git checkout 5bf7296696ea0aa3997bf310fae2aa5cf62a3af5 > # save the config file > mkdir build_dir && cp config build_dir/.config > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc olddefconfig > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc SHELL=/bin/bash > >If you fix the issue, kindly add following tag where applicable >| Reported-by: kernel test robot <lkp@intel.com> >| Link: https://lore.kernel.org/oe-kbuild-all/202304202222.eUq4Xfv8-lkp@intel.com/ > >All errors (new ones prefixed by >>, old ones prefixed by <<): > >>> ERROR: modpost: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined! I assume this is related to send_peer_notif now being u64 in the modulus at: static bool bond_should_notify_peers(struct bonding *bond) { [...] if (!slave || !bond->send_peer_notif || bond->send_peer_notif % max(1, bond->params.peer_notif_delay) != 0 || but I'm unsure if this is a real coding error, or some issue with the parisc arch specifically? -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Thu, 20 Apr 2023 08:59:40 -0700 Jay Vosburgh wrote: > >All errors (new ones prefixed by >>, old ones prefixed by <<): > > > >>> ERROR: modpost: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined! > > I assume this is related to send_peer_notif now being u64 in the > modulus at: > > static bool bond_should_notify_peers(struct bonding *bond) > { > [...] > if (!slave || !bond->send_peer_notif || > bond->send_peer_notif % > max(1, bond->params.peer_notif_delay) != 0 || > > but I'm unsure if this is a real coding error, or some issue > with the parisc arch specifically? Coding error, I think. An appropriate helper from linux/math64.h should be used.
On Thu, Apr 20, 2023 at 04:21:39PM -0700, Jakub Kicinski wrote: > On Thu, 20 Apr 2023 08:59:40 -0700 Jay Vosburgh wrote: > > >All errors (new ones prefixed by >>, old ones prefixed by <<): > > > > > >>> ERROR: modpost: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined! > > > > I assume this is related to send_peer_notif now being u64 in the > > modulus at: > > > > static bool bond_should_notify_peers(struct bonding *bond) > > { > > [...] > > if (!slave || !bond->send_peer_notif || > > bond->send_peer_notif % > > max(1, bond->params.peer_notif_delay) != 0 || > > > > but I'm unsure if this is a real coding error, or some issue > > with the parisc arch specifically? > > Coding error, I think. > An appropriate helper from linux/math64.h should be used. It looks define send_peer_notif to u64 is a bit too large, which introduce complex conversion for 32bit arch. For the remainder operation, bond->send_peer_notif % max(1, bond->params.peer_notif_delay). u32 % u32 look OK. But for multiplication operation, bond->send_peer_notif = bond->params.num_peer_notif * max(1, bond->params.peer_notif_delay); It's u8 * u32. How about let's limit the peer_notif_delay to less than max(u32 / u8), then we can just use u32 for send_peer_notif. Is there any realistic meaning to set peer_notif_delay to max(u32)? I don't think so. Jay, what do you think? Thanks Hangbin
Hangbin Liu <liuhangbin@gmail.com> wrote: >On Thu, Apr 20, 2023 at 04:21:39PM -0700, Jakub Kicinski wrote: >> On Thu, 20 Apr 2023 08:59:40 -0700 Jay Vosburgh wrote: >> > >All errors (new ones prefixed by >>, old ones prefixed by <<): >> > > >> > >>> ERROR: modpost: "__umoddi3" [drivers/net/bonding/bonding.ko] undefined! >> > >> > I assume this is related to send_peer_notif now being u64 in the >> > modulus at: >> > >> > static bool bond_should_notify_peers(struct bonding *bond) >> > { >> > [...] >> > if (!slave || !bond->send_peer_notif || >> > bond->send_peer_notif % >> > max(1, bond->params.peer_notif_delay) != 0 || >> > >> > but I'm unsure if this is a real coding error, or some issue >> > with the parisc arch specifically? >> >> Coding error, I think. >> An appropriate helper from linux/math64.h should be used. > >It looks define send_peer_notif to u64 is a bit too large, which introduce >complex conversion for 32bit arch. > >For the remainder operation, >bond->send_peer_notif % max(1, bond->params.peer_notif_delay). u32 % u32 look OK. > >But for multiplication operation, >bond->send_peer_notif = bond->params.num_peer_notif * max(1, bond->params.peer_notif_delay); >It's u8 * u32. How about let's limit the peer_notif_delay to less than max(u32 / u8), >then we can just use u32 for send_peer_notif. Is there any realistic meaning >to set peer_notif_delay to max(u32)? I don't think so. > >Jay, what do you think? I'm fine to limit the peerf_notif_delay range and then use a smaller type. num_peer_notif is already limited to 255; I'm going to suggest a limit to the delay of 300 seconds. That seems like an absurdly long time for this; I didn't do any kind of science to come up with that number. As peer_notif_delay is stored in units of miimon intervals, that gives a worst case peer_notif_delay value of 300000 if miimon is 1, and 255 * 300000 fits easily in a u32 for send_peer_notif. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Thu, Apr 20, 2023 at 10:13:17PM -0700, Jay Vosburgh wrote: > >It looks define send_peer_notif to u64 is a bit too large, which introduce > >complex conversion for 32bit arch. > > > >For the remainder operation, > >bond->send_peer_notif % max(1, bond->params.peer_notif_delay). u32 % u32 look OK. > > > >But for multiplication operation, > >bond->send_peer_notif = bond->params.num_peer_notif * max(1, bond->params.peer_notif_delay); > >It's u8 * u32. How about let's limit the peer_notif_delay to less than max(u32 / u8), > >then we can just use u32 for send_peer_notif. Is there any realistic meaning > >to set peer_notif_delay to max(u32)? I don't think so. > > > >Jay, what do you think? > > I'm fine to limit the peerf_notif_delay range and then use a > smaller type. > > num_peer_notif is already limited to 255; I'm going to suggest a > limit to the delay of 300 seconds. That seems like an absurdly long > time for this; I didn't do any kind of science to come up with that > number. > > As peer_notif_delay is stored in units of miimon intervals, that > gives a worst case peer_notif_delay value of 300000 if miimon is 1, and > 255 * 300000 fits easily in a u32 for send_peer_notif. OK, I just found another overflow. In bond_fill_info(), or bond_option_miimon_set(): if (nla_put_u32(skb, IFLA_BOND_PEER_NOTIF_DELAY, bond->params.peer_notif_delay * bond->params.miimon)) goto nla_put_failure; Since both peer_notif_delay and miimon are defined as int, there is a possibility that the fill in number got overflowed. The same with up/down delay. Even we limit the peer_notif_delay to 300s, which is 30000, there is still has possibility got overflowed if we set miimon large enough. This overflow should only has effect on use space shown since it's a multiplication result. The kernel part works fine. I'm not sure if we should also limit the miimon, up/down delay values.. Thanks Hangbin
On Fri, Apr 21, 2023 at 05:55:16PM +0800, Hangbin Liu wrote: > > I'm fine to limit the peerf_notif_delay range and then use a > > smaller type. > > > > num_peer_notif is already limited to 255; I'm going to suggest a > > limit to the delay of 300 seconds. That seems like an absurdly long > > time for this; I didn't do any kind of science to come up with that > > number. > > > > As peer_notif_delay is stored in units of miimon intervals, that > > gives a worst case peer_notif_delay value of 300000 if miimon is 1, and > > 255 * 300000 fits easily in a u32 for send_peer_notif. > > OK, I just found another overflow. In bond_fill_info(), > or bond_option_miimon_set(): > > if (nla_put_u32(skb, IFLA_BOND_PEER_NOTIF_DELAY, > bond->params.peer_notif_delay * bond->params.miimon)) > goto nla_put_failure; > > Since both peer_notif_delay and miimon are defined as int, there is a > possibility that the fill in number got overflowed. The same with up/down delay. > > Even we limit the peer_notif_delay to 300s, which is 30000, there is still has > possibility got overflowed if we set miimon large enough. > > This overflow should only has effect on use space shown since it's a > multiplication result. The kernel part works fine. I'm not sure if we should > also limit the miimon, up/down delay values.. Hi Jay, Any comments for this issue? Should I send the send_peer_notif fix first and discuss the miimon, up/down delay userspace overflow issue later? Thanks Hangbin
Hangbin Liu <liuhangbin@gmail.com> wrote: >On Fri, Apr 21, 2023 at 05:55:16PM +0800, Hangbin Liu wrote: >> > I'm fine to limit the peerf_notif_delay range and then use a >> > smaller type. >> > >> > num_peer_notif is already limited to 255; I'm going to suggest a >> > limit to the delay of 300 seconds. That seems like an absurdly long >> > time for this; I didn't do any kind of science to come up with that >> > number. >> > >> > As peer_notif_delay is stored in units of miimon intervals, that >> > gives a worst case peer_notif_delay value of 300000 if miimon is 1, and >> > 255 * 300000 fits easily in a u32 for send_peer_notif. >> >> OK, I just found another overflow. In bond_fill_info(), >> or bond_option_miimon_set(): >> >> if (nla_put_u32(skb, IFLA_BOND_PEER_NOTIF_DELAY, >> bond->params.peer_notif_delay * bond->params.miimon)) >> goto nla_put_failure; >> >> Since both peer_notif_delay and miimon are defined as int, there is a >> possibility that the fill in number got overflowed. The same with up/down delay. >> >> Even we limit the peer_notif_delay to 300s, which is 30000, there is still has >> possibility got overflowed if we set miimon large enough. >> >> This overflow should only has effect on use space shown since it's a >> multiplication result. The kernel part works fine. I'm not sure if we should >> also limit the miimon, up/down delay values.. > >Hi Jay, > >Any comments for this issue? Should I send the send_peer_notif fix first and >discuss the miimon, up/down delay userspace overflow issue later? Let's sort out the current send_peer_notif problems first. I don't see that the lack of upper bounds for miimon or up/down delay causes issues for any reasonable configuration, so it can wait a bit. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
diff --git a/include/net/bonding.h b/include/net/bonding.h index c3843239517d..f4d376420e4d 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -233,7 +233,7 @@ struct bonding { */ spinlock_t mode_lock; spinlock_t stats_lock; - u8 send_peer_notif; + u64 send_peer_notif; u8 igmp_retrans; #ifdef CONFIG_PROC_FS struct proc_dir_entry *proc_entry;
Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications"). the bond->send_peer_notif will be num_peer_notif multiplied by peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif overflow easily. e.g. ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000 To fix the overflow, we have to set the send_peer_notif large enough. Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Reported-by: Liang Li <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- include/net/bonding.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)