Message ID | 20221027113248.420216-1-michael@walle.cc (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Revert "arm64: dts: ls1028a: sl28: use ocelot-8021q tagging by default" | expand |
On Thu, Oct 27, 2022 at 01:32:48PM +0200, Michael Walle wrote: > This reverts commit be0b178c50c37a666d54f435da71cf9f008362a0. > > This commit will break networking on the sl28 boards if the tagger is > not compiled into the kernel. If a non-default tagger is used, the > kernel doesn't do a request_module(). Fixing that is also not that > trivial because the tagger modules are loaded by ids, not by name. > Thus for now, just revert to the default tagger until that is fixed. > > Fixes: be0b178c50c3 ("arm64: dts: ls1028a: sl28: use ocelot-8021q tagging by default") > Reported-by: Heiko Thiery <heiko.thiery@gmail.com> > Signed-off-by: Michael Walle <michael@walle.cc> > --- > Vladimir, I'm not sure how to fix that one. Adding aliases to the tagger > modules? Something like "MODULE_ALIAS("dsa_tag-ocelot-8021q");" and then do > a request_module() in dsa_find_tagger_by_name(), too? > > .../arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts | 8 -------- > 1 file changed, 8 deletions(-) > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > index 72429b37a8b4..771c50c7f50a 100644 > --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > @@ -324,14 +324,6 @@ &lpuart1 { > status = "okay"; > }; > > -&mscc_felix_port4 { > - dsa-tag-protocol = "ocelot-8021q"; > -}; > - > -&mscc_felix_port5 { > - dsa-tag-protocol = "ocelot-8021q"; > -}; > - > &usb0 { > status = "okay"; > }; > -- > 2.30.2 > Pretty nasty. Of all the switch drivers that support tagging protocol change, Ocelot/Felix is the only one with this bug, because in all other cases, the default and the alternative tagging protocol are part of the same .ko. Only here we have tag_ocelot.ko and tag_ocelot_8021q.ko. The problem preventing us from calling request_module() is that currently, the string identifying the tagging protocol (to which we match device tree information) is part of the tag_*.ko. I think we'd need the translation table between string and enum dsa_tag_protocol to be part of dsa_core.ko.
On Thu, Oct 27, 2022 at 03:05:19PM +0300, Vladimir Oltean wrote: > On Thu, Oct 27, 2022 at 01:32:48PM +0200, Michael Walle wrote: > > This reverts commit be0b178c50c37a666d54f435da71cf9f008362a0. > > > > This commit will break networking on the sl28 boards if the tagger is > > not compiled into the kernel. If a non-default tagger is used, the > > kernel doesn't do a request_module(). Fixing that is also not that > > trivial because the tagger modules are loaded by ids, not by name. > > Thus for now, just revert to the default tagger until that is fixed. > > > > Fixes: be0b178c50c3 ("arm64: dts: ls1028a: sl28: use ocelot-8021q tagging by default") > > Reported-by: Heiko Thiery <heiko.thiery@gmail.com> > > Signed-off-by: Michael Walle <michael@walle.cc> > > --- > > Vladimir, I'm not sure how to fix that one. Adding aliases to the tagger > > modules? Something like "MODULE_ALIAS("dsa_tag-ocelot-8021q");" and then do > > a request_module() in dsa_find_tagger_by_name(), too? > > > > .../arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts | 8 -------- > > 1 file changed, 8 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > > index 72429b37a8b4..771c50c7f50a 100644 > > --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > > +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts > > @@ -324,14 +324,6 @@ &lpuart1 { > > status = "okay"; > > }; > > > > -&mscc_felix_port4 { > > - dsa-tag-protocol = "ocelot-8021q"; > > -}; > > - > > -&mscc_felix_port5 { > > - dsa-tag-protocol = "ocelot-8021q"; > > -}; > > - > > &usb0 { > > status = "okay"; > > }; > > -- > > 2.30.2 > > > > Pretty nasty. Of all the switch drivers that support tagging protocol > change, Ocelot/Felix is the only one with this bug, because in all other > cases, the default and the alternative tagging protocol are part of the > same .ko. Only here we have tag_ocelot.ko and tag_ocelot_8021q.ko. > > The problem preventing us from calling request_module() is that currently, > the string identifying the tagging protocol (to which we match device > tree information) is part of the tag_*.ko. I think we'd need the > translation table between string and enum dsa_tag_protocol to be part of > dsa_core.ko. I think we should treat what we committed to in terms of dt-bindings with utmost respect, so I would consider your proposed revert as the absolute last option. Reverting a device tree change doesn't mean that the device trees without the revert will disappear from circulation. So far we have 3 options for fixing this within the kernel - make tag_ocelot.o and tag_ocelot_8021q.o link into the same tag_ocelot.ko - change the MODULE_ALIAS() of all tagging protocol driver modules from "dsa_tag-<number" to something containing their string name - what you proposed. I don't know why the current MODULE_ALIAS() is formatted the way it is. Maybe Andrew can comment on whether this is feasible. I think there isn't any backwards compatibility concern, since only modules compiled for a certain kernel version are expected to be loaded. - put a translation table between string and MODULE_ALIAS() inside dsa_core.ko, which potentially duplicates code. Maybe if we auto-generate it somehow?
On Thu, Oct 27, 2022 at 03:27:27PM +0300, Vladimir Oltean wrote: > I think we should treat what we committed to in terms of dt-bindings > with utmost respect, so I would consider your proposed revert as the > absolute last option. Reverting a device tree change doesn't mean that > the device trees without the revert will disappear from circulation. > > So far we have 3 options for fixing this within the kernel > > - make tag_ocelot.o and tag_ocelot_8021q.o link into the same > tag_ocelot.ko > > - change the MODULE_ALIAS() of all tagging protocol driver modules from > "dsa_tag-<number" to something containing their string name - what you > proposed. I don't know why the current MODULE_ALIAS() is formatted the > way it is. Maybe Andrew can comment on whether this is feasible. > I think there isn't any backwards compatibility concern, since only > modules compiled for a certain kernel version are expected to be > loaded. > > - put a translation table between string and MODULE_ALIAS() inside > dsa_core.ko, which potentially duplicates code. Maybe if we > auto-generate it somehow? Sorry for sending so many emails. I think the problem we should fix first and foremost is that, if there's a user protocol specified in the device tree but the kernel fails to load it, it should simply stick with the default tagging protocol, instead of failing to probe. Everything else can be dealt with as a future refinement.
Am 2022-10-27 14:40, schrieb Vladimir Oltean: > Sorry for sending so many emails. I think the problem we should fix > first and foremost is that, if there's a user protocol specified in the > device tree but the kernel fails to load it, it should simply stick > with > the default tagging protocol, instead of failing to probe. Everything > else can be dealt with as a future refinement. Sounds good to me. Should I come up with a patch or will you do it? -michael
On Thu, Oct 27, 2022 at 03:00:23PM +0200, Michael Walle wrote: > Am 2022-10-27 14:40, schrieb Vladimir Oltean: > > > Sorry for sending so many emails. I think the problem we should fix > > first and foremost is that, if there's a user protocol specified in the > > device tree but the kernel fails to load it, it should simply stick with > > the default tagging protocol, instead of failing to probe. Everything > > else can be dealt with as a future refinement. > > Sounds good to me. Should I come up with a patch or will you do it? I'll try to prepare a patch and copy you and Heiko. I hope to do that soon, but I've been running with hardirqs disabled for the past week or so, and as you can imagine, things are a bit crazy right now and there's a lot of pending work to do
Am 2022-10-27 14:27, schrieb Vladimir Oltean: >> Pretty nasty. Of all the switch drivers that support tagging protocol >> change, Ocelot/Felix is the only one with this bug, because in all >> other >> cases, the default and the alternative tagging protocol are part of >> the >> same .ko. Only here we have tag_ocelot.ko and tag_ocelot_8021q.ko. >> >> The problem preventing us from calling request_module() is that >> currently, >> the string identifying the tagging protocol (to which we match device >> tree information) is part of the tag_*.ko. I think we'd need the >> translation table between string and enum dsa_tag_protocol to be part >> of >> dsa_core.ko. > > I think we should treat what we committed to in terms of dt-bindings > with utmost respect, so I would consider your proposed revert as the > absolute last option. Reverting a device tree change doesn't mean that > the device trees without the revert will disappear from circulation. > > So far we have 3 options for fixing this within the kernel > > - make tag_ocelot.o and tag_ocelot_8021q.o link into the same > tag_ocelot.ko > > - change the MODULE_ALIAS() of all tagging protocol driver modules from > "dsa_tag-<number" to something containing their string name - what > you > proposed. I don't know why the current MODULE_ALIAS() is formatted > the > way it is. Maybe Andrew can comment on whether this is feasible. > I think there isn't any backwards compatibility concern, since only > modules compiled for a certain kernel version are expected to be > loaded. FWIW, you can have multiple aliases if we somehow need to keep the IDs, too. > - put a translation table between string and MODULE_ALIAS() inside > dsa_core.ko, which potentially duplicates code. Maybe if we > auto-generate it somehow? Yeah, I also thought of a table with of name to module alias mapping. But then you'd have two places to keep in sync (of not autogenerated). -michael
Am 2022-10-27 13:32, schrieb Michael Walle: > This reverts commit be0b178c50c37a666d54f435da71cf9f008362a0. > > This commit will break networking on the sl28 boards if the tagger is > not compiled into the kernel. If a non-default tagger is used, the > kernel doesn't do a request_module(). Fixing that is also not that > trivial because the tagger modules are loaded by ids, not by name. > Thus for now, just revert to the default tagger until that is fixed. > > Fixes: be0b178c50c3 ("arm64: dts: ls1028a: sl28: use ocelot-8021q > tagging by default") > Reported-by: Heiko Thiery <heiko.thiery@gmail.com> > Signed-off-by: Michael Walle <michael@walle.cc> Please disregard this patch. The proper fix is here: https://lore.kernel.org/netdev/20221027145439.3086017-1-vladimir.oltean@nxp.com/ -michael
On Thu, Oct 27, 2022 at 06:00:04PM +0200, Michael Walle wrote: > > - change the MODULE_ALIAS() of all tagging protocol driver modules from > > "dsa_tag-<number" to something containing their string name - what you > > proposed. I don't know why the current MODULE_ALIAS() is formatted the > > way it is. Maybe Andrew can comment on whether this is feasible. > > I think there isn't any backwards compatibility concern, since only > > modules compiled for a certain kernel version are expected to be > > loaded. > > FWIW, you can have multiple aliases if we somehow need to keep the IDs, > too. Yeah, that's worth exploring, but it means that we have 2 code paths for request_module() with different string formats. To me this is slightly undesirable, we should try to consolidate the mechanisms in the core. > > - put a translation table between string and MODULE_ALIAS() inside > > dsa_core.ko, which potentially duplicates code. Maybe if we > > auto-generate it somehow? > > Yeah, I also thought of a table with of name to module alias mapping. > But then you'd have two places to keep in sync (of not autogenerated). Well, to be fair, this is not exactly true. As far as I could find (grep for "ops->name" in net/dsa), there are only 3 instances of reading the "name" field of struct dsa_device_ops, and none of them are from a fast path. I can imagine a table along the lines of: static const char * const dsa_tag_proto_names[] = { [DSA_TAG_PROTO_NONE] = "none", [DSA_TAG_PROTO_BRCM] = "brcm", .... }; which is then used to directly replace ops->name (becomes dsa_tag_proto_names[ops->proto]). Then, we could add a new function "dsa_tag_protocol_name_to_id()" or something along those lines, and construct the modalias string based on that. No duplication necessary, since we would remove dsa_device_ops :: name.
Am 2022-10-27 20:04, schrieb Vladimir Oltean: > On Thu, Oct 27, 2022 at 06:00:04PM +0200, Michael Walle wrote: >> > - change the MODULE_ALIAS() of all tagging protocol driver modules from >> > "dsa_tag-<number" to something containing their string name - what you >> > proposed. I don't know why the current MODULE_ALIAS() is formatted the >> > way it is. Maybe Andrew can comment on whether this is feasible. >> > I think there isn't any backwards compatibility concern, since only >> > modules compiled for a certain kernel version are expected to be >> > loaded. >> >> FWIW, you can have multiple aliases if we somehow need to keep the >> IDs, >> too. > > Yeah, that's worth exploring, but it means that we have 2 code paths > for > request_module() with different string formats. To me this is slightly > undesirable, we should try to consolidate the mechanisms in the core. > >> > - put a translation table between string and MODULE_ALIAS() inside >> > dsa_core.ko, which potentially duplicates code. Maybe if we >> > auto-generate it somehow? >> >> Yeah, I also thought of a table with of name to module alias mapping. >> But then you'd have two places to keep in sync (of not autogenerated). > > Well, to be fair, this is not exactly true. As far as I could find > (grep for "ops->name" in net/dsa), there are only 3 instances of > reading > the "name" field of struct dsa_device_ops, and none of them are from a > fast path. > > I can imagine a table along the lines of: > > static const char * const dsa_tag_proto_names[] = { > [DSA_TAG_PROTO_NONE] = "none", > [DSA_TAG_PROTO_BRCM] = "brcm", > .... > }; > > which is then used to directly replace ops->name > (becomes dsa_tag_proto_names[ops->proto]). > > Then, we could add a new function "dsa_tag_protocol_name_to_id()" or > something along those lines, and construct the modalias string based on > that. > > No duplication necessary, since we would remove dsa_device_ops :: name. If one would a new tagger you'd need to add it to dsa_tag_proto_names[] as well as adding the tagger source file. Thus, two places to keep in sync. And you don't have all the information in one place, e.g. the tagger module. The name of the tagger as used in sysfs or device tree is then in the core. Just wanted to point that out. After all it's up to you as the maintainer to decide ;) -michael
On Thu, Oct 27, 2022 at 09:10:20PM +0200, Michael Walle wrote: > If one would a new tagger you'd need to add it to > dsa_tag_proto_names[] as well as adding the tagger source file. Thus, > two places to keep in sync. And you don't have all the information in > one place, e.g. the tagger module. The name of the tagger as used in > sysfs or device tree is then in the core. Just wanted to point that out. > After all it's up to you as the maintainer to decide ;) True, there are already 2 places you need to keep in sync, you already need to add the tagger id to include/net/dsa.h (twice). The header is shared between taggers, the DSA core and device drivers. A third place would indeed be a bit too much. Actually I came to like your idea with 2 modaliases. I prepared a patch set and I'm in the process of testing it (need to rebuild everything with modules, which I usually skip). If it works, I'll post it as an RFC soon.
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts index 72429b37a8b4..771c50c7f50a 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts +++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts @@ -324,14 +324,6 @@ &lpuart1 { status = "okay"; }; -&mscc_felix_port4 { - dsa-tag-protocol = "ocelot-8021q"; -}; - -&mscc_felix_port5 { - dsa-tag-protocol = "ocelot-8021q"; -}; - &usb0 { status = "okay"; };
This reverts commit be0b178c50c37a666d54f435da71cf9f008362a0. This commit will break networking on the sl28 boards if the tagger is not compiled into the kernel. If a non-default tagger is used, the kernel doesn't do a request_module(). Fixing that is also not that trivial because the tagger modules are loaded by ids, not by name. Thus for now, just revert to the default tagger until that is fixed. Fixes: be0b178c50c3 ("arm64: dts: ls1028a: sl28: use ocelot-8021q tagging by default") Reported-by: Heiko Thiery <heiko.thiery@gmail.com> Signed-off-by: Michael Walle <michael@walle.cc> --- Vladimir, I'm not sure how to fix that one. Adding aliases to the tagger modules? Something like "MODULE_ALIAS("dsa_tag-ocelot-8021q");" and then do a request_module() in dsa_find_tagger_by_name(), too? .../arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts | 8 -------- 1 file changed, 8 deletions(-)