Message ID | 20230420-bonding-be-vlan-proto-v1-1-754399f51d01@kernel.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | bonding: Always assign be16 value to vlan_proto | expand |
Simon Horman <horms@kernel.org> wrote: >The type of the vlan_proto field is __be16. >And most users of the field use it as such. > >In the case of setting or testing the field for the >special VLAN_N_VID value, host byte order is used. >Which seems incorrect. > >Address this issue by converting VLAN_N_VID to __be16. > >I don't believe this is a bug because VLAN_N_VID in >both little-endian (and big-endian) byte order does >not conflict with any valid values (0 through VLAN_N_VID - 1) >in big-endian byte order. Is that true for all cases, or am I just confused? Doesn't VLAN ID 16 match VLAN_N_VID (which is 4096) if byte swapped? I.e., on a little endian host, VLAN_N_VID is 0x1000 natively, and network byte order (big endian) of VLAN ID 16 is also 0x1000. Either way, I think the change is fine; VLAN_N_VID is being used as a sentinel value here, so the only real requirement is that it not match an actual VLAN ID in network byte order. -J >Reported by sparse as: > > .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer > .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer > .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types) > .../bond_main.c:2939:40: expected restricted __be16 [usertype] vlan_proto > .../bond_main.c:2939:40: got int > >No functional changes intended. >Compile tested only. > >Signed-off-by: Simon Horman <horms@kernel.org> >--- > drivers/net/bonding/bond_main.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index db7e650d9ebb..7f4c75fe58e1 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -2854,13 +2854,13 @@ static bool bond_handle_vlan(struct slave *slave, struct bond_vlan_tag *tags, > struct net_device *slave_dev = slave->dev; > struct bond_vlan_tag *outer_tag = tags; > >- if (!tags || tags->vlan_proto == VLAN_N_VID) >+ if (!tags || tags->vlan_proto == cpu_to_be16(VLAN_N_VID)) > return true; > > tags++; > > /* Go through all the tags backwards and add them to the packet */ >- while (tags->vlan_proto != VLAN_N_VID) { >+ while (tags->vlan_proto != cpu_to_be16(VLAN_N_VID)) { > if (!tags->vlan_id) { > tags++; > continue; >@@ -2936,7 +2936,7 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev, > tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC); > if (!tags) > return ERR_PTR(-ENOMEM); >- tags[level].vlan_proto = VLAN_N_VID; >+ tags[level].vlan_proto = cpu_to_be16(VLAN_N_VID); > return tags; > } --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Thu, Apr 20, 2023 at 12:47:33PM -0700, Jay Vosburgh wrote: > Simon Horman <horms@kernel.org> wrote: > > >The type of the vlan_proto field is __be16. > >And most users of the field use it as such. > > > >In the case of setting or testing the field for the > >special VLAN_N_VID value, host byte order is used. > >Which seems incorrect. > > > >Address this issue by converting VLAN_N_VID to __be16. > > > >I don't believe this is a bug because VLAN_N_VID in > >both little-endian (and big-endian) byte order does > >not conflict with any valid values (0 through VLAN_N_VID - 1) > >in big-endian byte order. > > Is that true for all cases, or am I just confused? Doesn't VLAN > ID 16 match VLAN_N_VID (which is 4096) if byte swapped? > > I.e., on a little endian host, VLAN_N_VID is 0x1000 natively, > and network byte order (big endian) of VLAN ID 16 is also 0x1000. > > Either way, I think the change is fine; VLAN_N_VID is being used > as a sentinel value here, so the only real requirement is that it not > match an actual VLAN ID in network byte order. > > -J In a strange twist of events, VLAN_N_VID is assigned as a sentinel value to a variable which usually holds the output of vlan_dev_vlan_proto(), or i.o.w. values like htons(ETH_P_8021Q), htons(ETH_P_8021AD). It is certainly a confusion of types to assign VLAN_N_VID to it, but at least it's not a valid VLAN protocol. To answer your question, tags->vlan_proto is never compared against a VLAN ID.
Vladimir Oltean <olteanv@gmail.com> wrote: >On Thu, Apr 20, 2023 at 12:47:33PM -0700, Jay Vosburgh wrote: >> Simon Horman <horms@kernel.org> wrote: >> >> >The type of the vlan_proto field is __be16. >> >And most users of the field use it as such. >> > >> >In the case of setting or testing the field for the >> >special VLAN_N_VID value, host byte order is used. >> >Which seems incorrect. >> > >> >Address this issue by converting VLAN_N_VID to __be16. >> > >> >I don't believe this is a bug because VLAN_N_VID in >> >both little-endian (and big-endian) byte order does >> >not conflict with any valid values (0 through VLAN_N_VID - 1) >> >in big-endian byte order. >> >> Is that true for all cases, or am I just confused? Doesn't VLAN >> ID 16 match VLAN_N_VID (which is 4096) if byte swapped? >> >> I.e., on a little endian host, VLAN_N_VID is 0x1000 natively, >> and network byte order (big endian) of VLAN ID 16 is also 0x1000. >> >> Either way, I think the change is fine; VLAN_N_VID is being used >> as a sentinel value here, so the only real requirement is that it not >> match an actual VLAN ID in network byte order. >> >> -J > >In a strange twist of events, VLAN_N_VID is assigned as a sentinel value >to a variable which usually holds the output of vlan_dev_vlan_proto(), >or i.o.w. values like htons(ETH_P_8021Q), htons(ETH_P_8021AD). It is >certainly a confusion of types to assign VLAN_N_VID to it, but at least >it's not a valid VLAN protocol. > >To answer your question, tags->vlan_proto is never compared against a >VLAN ID. Yah, looking again I see that now; I was checking the math on Simon's statement about "0 through VLAN_N_VID - 1". So, I think the patch is correct, but the commit message should really explain the reality. And, perhaps we should use 0 or 0xffff for the sentinel, since neither are valid Ethernet protocol IDs. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Thu, Apr 20, 2023 at 02:23:32PM -0700, Jay Vosburgh wrote: > Vladimir Oltean <olteanv@gmail.com> wrote: > > >On Thu, Apr 20, 2023 at 12:47:33PM -0700, Jay Vosburgh wrote: > >> Simon Horman <horms@kernel.org> wrote: > >> > >> >The type of the vlan_proto field is __be16. > >> >And most users of the field use it as such. > >> > > >> >In the case of setting or testing the field for the > >> >special VLAN_N_VID value, host byte order is used. > >> >Which seems incorrect. > >> > > >> >Address this issue by converting VLAN_N_VID to __be16. > >> > > >> >I don't believe this is a bug because VLAN_N_VID in > >> >both little-endian (and big-endian) byte order does > >> >not conflict with any valid values (0 through VLAN_N_VID - 1) > >> >in big-endian byte order. > >> > >> Is that true for all cases, or am I just confused? Doesn't VLAN > >> ID 16 match VLAN_N_VID (which is 4096) if byte swapped? > >> > >> I.e., on a little endian host, VLAN_N_VID is 0x1000 natively, > >> and network byte order (big endian) of VLAN ID 16 is also 0x1000. > >> > >> Either way, I think the change is fine; VLAN_N_VID is being used > >> as a sentinel value here, so the only real requirement is that it not > >> match an actual VLAN ID in network byte order. > >> > >> -J > > > >In a strange twist of events, VLAN_N_VID is assigned as a sentinel value > >to a variable which usually holds the output of vlan_dev_vlan_proto(), > >or i.o.w. values like htons(ETH_P_8021Q), htons(ETH_P_8021AD). It is > >certainly a confusion of types to assign VLAN_N_VID to it, but at least > >it's not a valid VLAN protocol. > > > >To answer your question, tags->vlan_proto is never compared against a > >VLAN ID. > > Yah, looking again I see that now; I was checking the math on > Simon's statement about "0 through VLAN_N_VID - 1". > > So, I think the patch is correct, but the commit message should > really explain the reality. And, perhaps we should use 0 or 0xffff for > the sentinel, since neither are valid Ethernet protocol IDs. Hi Jay and Vladimir, Thanks for your review. Firstly, sorry for the distraction about the VLAN_N_VID math. I agree it was incorrect. I had an out by one bug in my thought process which was about 0x0fff instead of 0x1000. Secondly, sorry for missing the central issue that it is a bit weird to use a VID related value as a sentinel for a protocol field. I agree it would be best to chose a different value. In reference to the list of EtherTypes [1]. I think 0 might be ok, but perhaps not ideal as technically it means a value of 0 for the IEEE802.3 Length Field (although perhaps it can never mean that in this context). OTOH, 0xffff, is 'reserved' ([1] references RFC1701 [2]), so perhaps it is a good choice. In any case, I'm open to suggestions. I'll probably hold off until the v6.5 cycle before reposting, unless -rc8 appears next week. I'd rather not rush this one given that I seem to have already got it wrong once. [1] https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml#ieee-802-numbers-1 [2] https://www.rfc-editor.org/rfc/rfc1701.html
On Fri, Apr 21, 2023 at 09:01:34AM +0200, Simon Horman wrote: > Hi Jay and Vladimir, > > Thanks for your review. > > Firstly, sorry for the distraction about the VLAN_N_VID math. I agree it > was incorrect. I had an out by one bug in my thought process which was > about 0x0fff instead of 0x1000. > > Secondly, sorry for missing the central issue that it is a bit weird > to use a VID related value as a sentinel for a protocol field. > I agree it would be best to chose a different value. > > In reference to the list of EtherTypes [1]. I think 0 might be ok, > but perhaps not ideal as technically it means a value of 0 for the > IEEE802.3 Length Field (although perhaps it can never mean that in this > context). > > OTOH, 0xffff, is 'reserved' ([1] references RFC1701 [2]), > so perhaps it is a good choice. > > In any case, I'm open to suggestions. > I'll probably hold off until the v6.5 cycle before reposting, > unless -rc8 appears next week. I'd rather not rush this one > given that I seem to have already got it wrong once. > > [1] https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml#ieee-802-numbers-1 > [2] https://www.rfc-editor.org/rfc/rfc1701.html Any value would work as long as it's not a valid VLAN protocol. I would #define BOND_VLAN_PROTO_NONE htons(0xffff) and use that.
Vladimir Oltean <olteanv@gmail.com> wrote: >On Fri, Apr 21, 2023 at 09:01:34AM +0200, Simon Horman wrote: >> Hi Jay and Vladimir, >> >> Thanks for your review. >> >> Firstly, sorry for the distraction about the VLAN_N_VID math. I agree it >> was incorrect. I had an out by one bug in my thought process which was >> about 0x0fff instead of 0x1000. >> >> Secondly, sorry for missing the central issue that it is a bit weird >> to use a VID related value as a sentinel for a protocol field. >> I agree it would be best to chose a different value. >> >> In reference to the list of EtherTypes [1]. I think 0 might be ok, >> but perhaps not ideal as technically it means a value of 0 for the >> IEEE802.3 Length Field (although perhaps it can never mean that in this >> context). >> >> OTOH, 0xffff, is 'reserved' ([1] references RFC1701 [2]), >> so perhaps it is a good choice. >> >> In any case, I'm open to suggestions. >> I'll probably hold off until the v6.5 cycle before reposting, >> unless -rc8 appears next week. I'd rather not rush this one >> given that I seem to have already got it wrong once. >> >> [1] https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml#ieee-802-numbers-1 >> [2] https://www.rfc-editor.org/rfc/rfc1701.html > >Any value would work as long as it's not a valid VLAN protocol. >I would #define BOND_VLAN_PROTO_NONE htons(0xffff) and use that. All of the above is fine with me; this isn't an urgent change. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index db7e650d9ebb..7f4c75fe58e1 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2854,13 +2854,13 @@ static bool bond_handle_vlan(struct slave *slave, struct bond_vlan_tag *tags, struct net_device *slave_dev = slave->dev; struct bond_vlan_tag *outer_tag = tags; - if (!tags || tags->vlan_proto == VLAN_N_VID) + if (!tags || tags->vlan_proto == cpu_to_be16(VLAN_N_VID)) return true; tags++; /* Go through all the tags backwards and add them to the packet */ - while (tags->vlan_proto != VLAN_N_VID) { + while (tags->vlan_proto != cpu_to_be16(VLAN_N_VID)) { if (!tags->vlan_id) { tags++; continue; @@ -2936,7 +2936,7 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev, tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC); if (!tags) return ERR_PTR(-ENOMEM); - tags[level].vlan_proto = VLAN_N_VID; + tags[level].vlan_proto = cpu_to_be16(VLAN_N_VID); return tags; }
The type of the vlan_proto field is __be16. And most users of the field use it as such. In the case of setting or testing the field for the special VLAN_N_VID value, host byte order is used. Which seems incorrect. Address this issue by converting VLAN_N_VID to __be16. I don't believe this is a bug because VLAN_N_VID in both little-endian (and big-endian) byte order does not conflict with any valid values (0 through VLAN_N_VID - 1) in big-endian byte order. Reported by sparse as: .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types) .../bond_main.c:2939:40: expected restricted __be16 [usertype] vlan_proto .../bond_main.c:2939:40: got int No functional changes intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> --- drivers/net/bonding/bond_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)