diff mbox series

[net] net: dsa: fix panic when port leaves a bridge

Message ID 20220314153410.31744-1-kabel@kernel.org (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series [net] net: dsa: fix panic when port leaves a bridge | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 3 blamed authors not CCed: olteanv@gmail.com f.fainelli@gmail.com davem@davemloft.net; 5 maintainers not CCed: davem@davemloft.net f.fainelli@gmail.com olteanv@gmail.com kuba@kernel.org vivien.didelot@gmail.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 109 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Marek Behún March 14, 2022, 3:34 p.m. UTC
Fix a data structure breaking / NULL-pointer dereference in
dsa_switch_bridge_leave().

When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
notifier for every DSA switch that contains ports that are in the
bridge.

But the part of the code that unsets vlan_filtering expects that the ds
argument refers to the same switch that contains the leaving port.

This leads to various problems, including a NULL pointer dereference,
which was observed on Turris MOX with 2 switches (one with 8 user ports
and another with 4 user ports).

Thus we need to move the vlan_filtering change code to the non-crosschip
branch.

Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
Reported-by: Jan Bětík <hagrid@svine.us>
Signed-off-by: Marek Behún <kabel@kernel.org>
---
 net/dsa/switch.c | 97 +++++++++++++++++++++++++++---------------------
 1 file changed, 55 insertions(+), 42 deletions(-)

Comments

Vladimir Oltean March 14, 2022, 3:41 p.m. UTC | #1
Hi Marek,

On Mon, Mar 14, 2022 at 04:34:10PM +0100, Marek Behún wrote:
> Fix a data structure breaking / NULL-pointer dereference in
> dsa_switch_bridge_leave().
> 
> When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> notifier for every DSA switch that contains ports that are in the
> bridge.
> 
> But the part of the code that unsets vlan_filtering expects that the ds
> argument refers to the same switch that contains the leaving port.
> 
> This leads to various problems, including a NULL pointer dereference,
> which was observed on Turris MOX with 2 switches (one with 8 user ports
> and another with 4 user ports).
> 
> Thus we need to move the vlan_filtering change code to the non-crosschip
> branch.
> 
> Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> Reported-by: Jan Bětík <hagrid@svine.us>
> Signed-off-by: Marek Behún <kabel@kernel.org>
> ---
>  net/dsa/switch.c | 97 +++++++++++++++++++++++++++---------------------
>  1 file changed, 55 insertions(+), 42 deletions(-)
> 
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index e3c7d2627a61..38afb1e8fcb7 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -123,9 +123,61 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
>  	struct dsa_port *dp;
>  	int err;
>  
> -	if (dst->index == info->tree_index && ds->index == info->sw_index &&
> -	    ds->ops->port_bridge_leave)
> -		ds->ops->port_bridge_leave(ds, info->port, info->bridge);
> +	if (dst->index == info->tree_index && ds->index == info->sw_index) {
> +		if (ds->ops->port_bridge_leave)
> +			ds->ops->port_bridge_leave(ds, info->port,
> +						   info->bridge);
> +
> +		/* This function is called by the notifier for every DSA switch
> +		 * that has ports in the bridge we are leaving, but vlan
> +		 * filtering on the port should be changed only once. Since the
> +		 * code expects that ds is the switch containing the leaving
> +		 * port, the following code must not be called in the crosschip
> +		 * branch, only here.
> +		 */
> +
> +		if (ds->needs_standalone_vlan_filtering &&
> +		    !br_vlan_enabled(info->bridge.dev)) {
> +			change_vlan_filtering = true;
> +			vlan_filtering = true;
> +		} else if (!ds->needs_standalone_vlan_filtering &&
> +			   br_vlan_enabled(info->bridge.dev)) {
> +			change_vlan_filtering = true;
> +			vlan_filtering = false;
> +		}
> +
> +		/* If the bridge was vlan_filtering, the bridge core doesn't
> +		 * trigger an event for changing vlan_filtering setting upon
> +		 * slave ports leaving it. That is a good thing, because that
> +		 * lets us handle it and also handle the case where the switch's
> +		 * vlan_filtering setting is global (not per port). When that
> +		 * happens, the correct moment to trigger the vlan_filtering
> +		 * callback is only when the last port leaves the last
> +		 * VLAN-aware bridge.
> +		 */
> +		if (change_vlan_filtering && ds->vlan_filtering_is_global) {
> +			dsa_switch_for_each_port(dp, ds) {
> +				struct net_device *br =
> +					dsa_port_bridge_dev_get(dp);
> +
> +				if (br && br_vlan_enabled(br)) {
> +					change_vlan_filtering = false;
> +					break;
> +				}
> +			}
> +		}
> +
> +		if (change_vlan_filtering) {
> +			err = dsa_port_vlan_filtering(dsa_to_port(ds,
> +								  info->port),
> +						      vlan_filtering, &extack);
> +			if (extack._msg)
> +				dev_err(ds->dev, "port %d: %s\n", info->port,
> +					extack._msg);
> +			if (err && err != -EOPNOTSUPP)
> +				return err;
> +		}
> +	}
>  
>  	if ((dst->index != info->tree_index || ds->index != info->sw_index) &&
>  	    ds->ops->crosschip_bridge_leave)
> @@ -133,45 +185,6 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
>  						info->sw_index, info->port,
>  						info->bridge);
>  
> -	if (ds->needs_standalone_vlan_filtering &&
> -	    !br_vlan_enabled(info->bridge.dev)) {
> -		change_vlan_filtering = true;
> -		vlan_filtering = true;
> -	} else if (!ds->needs_standalone_vlan_filtering &&
> -		   br_vlan_enabled(info->bridge.dev)) {
> -		change_vlan_filtering = true;
> -		vlan_filtering = false;
> -	}
> -
> -	/* If the bridge was vlan_filtering, the bridge core doesn't trigger an
> -	 * event for changing vlan_filtering setting upon slave ports leaving
> -	 * it. That is a good thing, because that lets us handle it and also
> -	 * handle the case where the switch's vlan_filtering setting is global
> -	 * (not per port). When that happens, the correct moment to trigger the
> -	 * vlan_filtering callback is only when the last port leaves the last
> -	 * VLAN-aware bridge.
> -	 */
> -	if (change_vlan_filtering && ds->vlan_filtering_is_global) {
> -		dsa_switch_for_each_port(dp, ds) {
> -			struct net_device *br = dsa_port_bridge_dev_get(dp);
> -
> -			if (br && br_vlan_enabled(br)) {
> -				change_vlan_filtering = false;
> -				break;
> -			}
> -		}
> -	}
> -
> -	if (change_vlan_filtering) {
> -		err = dsa_port_vlan_filtering(dsa_to_port(ds, info->port),
> -					      vlan_filtering, &extack);
> -		if (extack._msg)
> -			dev_err(ds->dev, "port %d: %s\n", info->port,
> -				extack._msg);
> -		if (err && err != -EOPNOTSUPP)
> -			return err;
> -	}
> -
>  	return dsa_tag_8021q_bridge_leave(ds, info);
>  }
>  
> -- 
> 2.34.1
>

I had this patch in my tree for a while, moving the VLAN filtering reset
logic where it should really stay. Could you please check that this
placement fixes the NULL pointer dereferences too, and if it does,
combine my change with your commit message/authorship and resend?

From 5e0bbf38d35ceab0fba3f16f832bdbb7c127c167 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Mon, 21 Feb 2022 20:31:27 +0200
Subject: [PATCH] net: dsa: move reset of VLAN filtering to
 dsa_port_switchdev_unsync_attrs

In dsa_port_switchdev_unsync_attrs() there is a comment that resetting
the VLAN filtering isn't done where it is expected. And since commit
108dc8741c20 ("net: dsa: Avoid cross-chip syncing of VLAN filtering"),
there is no reason to handle this in switch.c either.

Therefore, move the logic to port.c, and adapt it slightly to the data
structures and naming conventions from there.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/dsa/port.c   | 60 +++++++++++++++++++++++++++++++++++++++++++++---
 net/dsa/switch.c | 58 ----------------------------------------------
 2 files changed, 57 insertions(+), 61 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index 58291df14cdb..d280c7a3af7b 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -172,6 +172,59 @@ void dsa_port_disable(struct dsa_port *dp)
 	rtnl_unlock();
 }
 
+static void dsa_port_reset_vlan_filtering(struct dsa_port *dp,
+					  struct dsa_bridge bridge)
+{
+	struct netlink_ext_ack extack = {0};
+	bool change_vlan_filtering = false;
+	struct dsa_switch *ds = dp->ds;
+	bool vlan_filtering;
+	int err;
+
+	if (ds->needs_standalone_vlan_filtering &&
+	    !br_vlan_enabled(bridge.dev)) {
+		change_vlan_filtering = true;
+		vlan_filtering = true;
+	} else if (!ds->needs_standalone_vlan_filtering &&
+		   br_vlan_enabled(bridge.dev)) {
+		change_vlan_filtering = true;
+		vlan_filtering = false;
+	}
+
+	/* If the bridge was vlan_filtering, the bridge core doesn't trigger an
+	 * event for changing vlan_filtering setting upon slave ports leaving
+	 * it. That is a good thing, because that lets us handle it and also
+	 * handle the case where the switch's vlan_filtering setting is global
+	 * (not per port). When that happens, the correct moment to trigger the
+	 * vlan_filtering callback is only when the last port leaves the last
+	 * VLAN-aware bridge.
+	 */
+	if (change_vlan_filtering && ds->vlan_filtering_is_global) {
+		dsa_switch_for_each_port(dp, ds) {
+			struct net_device *br = dsa_port_bridge_dev_get(dp);
+
+			if (br && br_vlan_enabled(br)) {
+				change_vlan_filtering = false;
+				break;
+			}
+		}
+	}
+
+	if (!change_vlan_filtering)
+		return;
+
+	err = dsa_port_vlan_filtering(dp, vlan_filtering, &extack);
+	if (extack._msg) {
+		dev_err(ds->dev, "port %d: %s\n", dp->index,
+			extack._msg);
+	}
+	if (err && err != -EOPNOTSUPP) {
+		dev_err(ds->dev,
+			"port %d failed to reset VLAN filtering to %d: %pe\n",
+		       dp->index, vlan_filtering, ERR_PTR(err));
+	}
+}
+
 static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
 					 struct netlink_ext_ack *extack)
 {
@@ -243,7 +296,8 @@ static int dsa_port_switchdev_sync_attrs(struct dsa_port *dp,
 	return 0;
 }
 
-static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp)
+static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp,
+					    struct dsa_bridge bridge)
 {
 	/* Configure the port for standalone mode (no address learning,
 	 * flood everything).
@@ -263,7 +317,7 @@ static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp)
 	 */
 	dsa_port_set_state_now(dp, BR_STATE_FORWARDING, true);
 
-	/* VLAN filtering is handled by dsa_switch_bridge_leave */
+	dsa_port_reset_vlan_filtering(dp, bridge);
 
 	/* Ageing time may be global to the switch chip, so don't change it
 	 * here because we have no good reason (or value) to change it to.
@@ -418,7 +472,7 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
 			"port %d failed to notify DSA_NOTIFIER_BRIDGE_LEAVE: %pe\n",
 			dp->index, ERR_PTR(err));
 
-	dsa_port_switchdev_unsync_attrs(dp);
+	dsa_port_switchdev_unsync_attrs(dp, info.bridge);
 }
 
 int dsa_port_lag_change(struct dsa_port *dp,
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index d25cd1da3eb3..d8a80cf9742c 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -115,62 +115,10 @@ static int dsa_switch_bridge_join(struct dsa_switch *ds,
 	return 0;
 }
 
-static int dsa_switch_sync_vlan_filtering(struct dsa_switch *ds,
-					  struct dsa_notifier_bridge_info *info)
-{
-	struct netlink_ext_ack extack = {0};
-	bool change_vlan_filtering = false;
-	bool vlan_filtering;
-	struct dsa_port *dp;
-	int err;
-
-	if (ds->needs_standalone_vlan_filtering &&
-	    !br_vlan_enabled(info->bridge.dev)) {
-		change_vlan_filtering = true;
-		vlan_filtering = true;
-	} else if (!ds->needs_standalone_vlan_filtering &&
-		   br_vlan_enabled(info->bridge.dev)) {
-		change_vlan_filtering = true;
-		vlan_filtering = false;
-	}
-
-	/* If the bridge was vlan_filtering, the bridge core doesn't trigger an
-	 * event for changing vlan_filtering setting upon slave ports leaving
-	 * it. That is a good thing, because that lets us handle it and also
-	 * handle the case where the switch's vlan_filtering setting is global
-	 * (not per port). When that happens, the correct moment to trigger the
-	 * vlan_filtering callback is only when the last port leaves the last
-	 * VLAN-aware bridge.
-	 */
-	if (change_vlan_filtering && ds->vlan_filtering_is_global) {
-		dsa_switch_for_each_port(dp, ds) {
-			struct net_device *br = dsa_port_bridge_dev_get(dp);
-
-			if (br && br_vlan_enabled(br)) {
-				change_vlan_filtering = false;
-				break;
-			}
-		}
-	}
-
-	if (change_vlan_filtering) {
-		err = dsa_port_vlan_filtering(dsa_to_port(ds, info->port),
-					      vlan_filtering, &extack);
-		if (extack._msg)
-			dev_err(ds->dev, "port %d: %s\n", info->port,
-				extack._msg);
-		if (err && err != -EOPNOTSUPP)
-			return err;
-	}
-
-	return 0;
-}
-
 static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 				   struct dsa_notifier_bridge_info *info)
 {
 	struct dsa_switch_tree *dst = ds->dst;
-	int err;
 
 	if (dst->index == info->tree_index && ds->index == info->sw_index &&
 	    ds->ops->port_bridge_leave)
@@ -182,12 +130,6 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 						info->sw_index, info->port,
 						info->bridge);
 
-	if (ds->dst->index == info->tree_index && ds->index == info->sw_index) {
-		err = dsa_switch_sync_vlan_filtering(ds, info);
-		if (err)
-			return err;
-	}
-
 	return 0;
 }
Vladimir Oltean March 14, 2022, 3:45 p.m. UTC | #2
On Mon, Mar 14, 2022 at 04:34:10PM +0100, Marek Behún wrote:
> Fix a data structure breaking / NULL-pointer dereference in
> dsa_switch_bridge_leave().
> 
> When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> notifier for every DSA switch that contains ports that are in the
> bridge.
> 
> But the part of the code that unsets vlan_filtering expects that the ds
> argument refers to the same switch that contains the leaving port.
> 
> This leads to various problems, including a NULL pointer dereference,
> which was observed on Turris MOX with 2 switches (one with 8 user ports
> and another with 4 user ports).
> 
> Thus we need to move the vlan_filtering change code to the non-crosschip
> branch.
> 
> Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> Reported-by: Jan Bětík <hagrid@svine.us>
> Signed-off-by: Marek Behún <kabel@kernel.org>
> ---

Ah, wait a minute, you're missing Tobias' patch
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=108dc8741c203e9d6ce4e973367f1bac20c7192b

What happened is that it was applied to "net-next" instead of "net",
despite being correctly targeted.
https://patchwork.kernel.org/project/netdevbpf/patch/20220124210944.3749235-3-tobias@waldekranz.com/
Hmmm...
Tobias Waldekranz March 14, 2022, 3:48 p.m. UTC | #3
On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:
> Fix a data structure breaking / NULL-pointer dereference in
> dsa_switch_bridge_leave().
>
> When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> notifier for every DSA switch that contains ports that are in the
> bridge.
>
> But the part of the code that unsets vlan_filtering expects that the ds
> argument refers to the same switch that contains the leaving port.
>
> This leads to various problems, including a NULL pointer dereference,
> which was observed on Turris MOX with 2 switches (one with 8 user ports
> and another with 4 user ports).
>
> Thus we need to move the vlan_filtering change code to the non-crosschip
> branch.
>
> Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> Reported-by: Jan Bětík <hagrid@svine.us>
> Signed-off-by: Marek Behún <kabel@kernel.org>
> ---

Hi Marek,

I ran into the same issue a while back and fixed it (or at least thought
I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
missed some tag that caused it to not be back-ported?
Marek Behún March 14, 2022, 4:05 p.m. UTC | #4
On Mon, 14 Mar 2022 16:48:47 +0100
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:
> > Fix a data structure breaking / NULL-pointer dereference in
> > dsa_switch_bridge_leave().
> >
> > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > notifier for every DSA switch that contains ports that are in the
> > bridge.
> >
> > But the part of the code that unsets vlan_filtering expects that the ds
> > argument refers to the same switch that contains the leaving port.
> >
> > This leads to various problems, including a NULL pointer dereference,
> > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > and another with 4 user ports).
> >
> > Thus we need to move the vlan_filtering change code to the non-crosschip
> > branch.
> >
> > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > Reported-by: Jan Bětík <hagrid@svine.us>
> > Signed-off-by: Marek Behún <kabel@kernel.org>
> > ---  
> 
> Hi Marek,
> 
> I ran into the same issue a while back and fixed it (or at least thought
> I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
> missed some tag that caused it to not be back-ported?

It wasn't applied because I was working with net, not net-next.

Very well. We will need to get this backported to stable, though.

Marek
Marek Behún March 14, 2022, 4:06 p.m. UTC | #5
On Mon, 14 Mar 2022 15:45:33 +0000
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> On Mon, Mar 14, 2022 at 04:34:10PM +0100, Marek Behún wrote:
> > Fix a data structure breaking / NULL-pointer dereference in
> > dsa_switch_bridge_leave().
> > 
> > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > notifier for every DSA switch that contains ports that are in the
> > bridge.
> > 
> > But the part of the code that unsets vlan_filtering expects that the ds
> > argument refers to the same switch that contains the leaving port.
> > 
> > This leads to various problems, including a NULL pointer dereference,
> > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > and another with 4 user ports).
> > 
> > Thus we need to move the vlan_filtering change code to the non-crosschip
> > branch.
> > 
> > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > Reported-by: Jan Bětík <hagrid@svine.us>
> > Signed-off-by: Marek Behún <kabel@kernel.org>
> > ---  
> 
> Ah, wait a minute, you're missing Tobias' patch
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=108dc8741c203e9d6ce4e973367f1bac20c7192b
> 
> What happened is that it was applied to "net-next" instead of "net",
> despite being correctly targeted.
> https://patchwork.kernel.org/project/netdevbpf/patch/20220124210944.3749235-3-tobias@waldekranz.com/
> Hmmm...

OK thx.

Marek
Vladimir Oltean March 14, 2022, 4:17 p.m. UTC | #6
On Mon, Mar 14, 2022 at 05:05:29PM +0100, Marek Behún wrote:
> On Mon, 14 Mar 2022 16:48:47 +0100
> Tobias Waldekranz <tobias@waldekranz.com> wrote:
> 
> > On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:
> > > Fix a data structure breaking / NULL-pointer dereference in
> > > dsa_switch_bridge_leave().
> > >
> > > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > > notifier for every DSA switch that contains ports that are in the
> > > bridge.
> > >
> > > But the part of the code that unsets vlan_filtering expects that the ds
> > > argument refers to the same switch that contains the leaving port.
> > >
> > > This leads to various problems, including a NULL pointer dereference,
> > > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > > and another with 4 user ports).
> > >
> > > Thus we need to move the vlan_filtering change code to the non-crosschip
> > > branch.
> > >
> > > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > > Reported-by: Jan Bětík <hagrid@svine.us>
> > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > ---  
> > 
> > Hi Marek,
> > 
> > I ran into the same issue a while back and fixed it (or at least thought
> > I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
> > missed some tag that caused it to not be back-ported?
> 
> It wasn't applied because I was working with net, not net-next.
> 
> Very well. We will need to get this backported to stable, though.
> 
> Marek

Who can send Tobias's 2 patches to linux-stable branches for 5.4 and higher?
Marek Behún March 14, 2022, 4:34 p.m. UTC | #7
On Mon, 14 Mar 2022 16:17:06 +0000
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> On Mon, Mar 14, 2022 at 05:05:29PM +0100, Marek Behún wrote:
> > On Mon, 14 Mar 2022 16:48:47 +0100
> > Tobias Waldekranz <tobias@waldekranz.com> wrote:
> >   
> > > On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:  
> > > > Fix a data structure breaking / NULL-pointer dereference in
> > > > dsa_switch_bridge_leave().
> > > >
> > > > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > > > notifier for every DSA switch that contains ports that are in the
> > > > bridge.
> > > >
> > > > But the part of the code that unsets vlan_filtering expects that the ds
> > > > argument refers to the same switch that contains the leaving port.
> > > >
> > > > This leads to various problems, including a NULL pointer dereference,
> > > > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > > > and another with 4 user ports).
> > > >
> > > > Thus we need to move the vlan_filtering change code to the non-crosschip
> > > > branch.
> > > >
> > > > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > > > Reported-by: Jan Bětík <hagrid@svine.us>
> > > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > > ---    
> > > 
> > > Hi Marek,
> > > 
> > > I ran into the same issue a while back and fixed it (or at least thought
> > > I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
> > > missed some tag that caused it to not be back-ported?  
> > 
> > It wasn't applied because I was working with net, not net-next.
> > 
> > Very well. We will need to get this backported to stable, though.
> > 
> > Marek  
> 
> Who can send Tobias's 2 patches to linux-stable branches for 5.4 and higher?

I can, but I thought it should only be done after it gets merged to
Linus' master.

Marek
Vladimir Oltean March 14, 2022, 4:42 p.m. UTC | #8
On Mon, Mar 14, 2022 at 05:34:33PM +0100, Marek Behún wrote:
> On Mon, 14 Mar 2022 16:17:06 +0000
> Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> 
> > On Mon, Mar 14, 2022 at 05:05:29PM +0100, Marek Behún wrote:
> > > On Mon, 14 Mar 2022 16:48:47 +0100
> > > Tobias Waldekranz <tobias@waldekranz.com> wrote:
> > >   
> > > > On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:  
> > > > > Fix a data structure breaking / NULL-pointer dereference in
> > > > > dsa_switch_bridge_leave().
> > > > >
> > > > > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > > > > notifier for every DSA switch that contains ports that are in the
> > > > > bridge.
> > > > >
> > > > > But the part of the code that unsets vlan_filtering expects that the ds
> > > > > argument refers to the same switch that contains the leaving port.
> > > > >
> > > > > This leads to various problems, including a NULL pointer dereference,
> > > > > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > > > > and another with 4 user ports).
> > > > >
> > > > > Thus we need to move the vlan_filtering change code to the non-crosschip
> > > > > branch.
> > > > >
> > > > > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > > > > Reported-by: Jan Bětík <hagrid@svine.us>
> > > > > Signed-off-by: Marek Behún <kabel@kernel.org>
> > > > > ---    
> > > > 
> > > > Hi Marek,
> > > > 
> > > > I ran into the same issue a while back and fixed it (or at least thought
> > > > I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
> > > > missed some tag that caused it to not be back-ported?  
> > > 
> > > It wasn't applied because I was working with net, not net-next.
> > > 
> > > Very well. We will need to get this backported to stable, though.
> > > 
> > > Marek  
> > 
> > Who can send Tobias's 2 patches to linux-stable branches for 5.4 and higher?
> 
> I can, but I thought it should only be done after it gets merged to
> Linus' master.

Ah, now I re-read Tobias' discussion with Jakub from the cover letter.
I've never seen that procedure in action, to be honest, let's see how it goes.
Marek Behún March 14, 2022, 4:47 p.m. UTC | #9
On Mon, 14 Mar 2022 16:48:47 +0100
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> On Mon, Mar 14, 2022 at 16:34, Marek Behún <kabel@kernel.org> wrote:
> > Fix a data structure breaking / NULL-pointer dereference in
> > dsa_switch_bridge_leave().
> >
> > When a DSA port leaves a bridge, dsa_switch_bridge_leave() is called by
> > notifier for every DSA switch that contains ports that are in the
> > bridge.
> >
> > But the part of the code that unsets vlan_filtering expects that the ds
> > argument refers to the same switch that contains the leaving port.
> >
> > This leads to various problems, including a NULL pointer dereference,
> > which was observed on Turris MOX with 2 switches (one with 8 user ports
> > and another with 4 user ports).
> >
> > Thus we need to move the vlan_filtering change code to the non-crosschip
> > branch.
> >
> > Fixes: d371b7c92d190 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
> > Reported-by: Jan Bětík <hagrid@svine.us>
> > Signed-off-by: Marek Behún <kabel@kernel.org>
> > ---  
> 
> Hi Marek,
> 
> I ran into the same issue a while back and fixed it (or at least thought
> I did) in 108dc8741c20. Has that been applied to your tree? Maybe I
> missed some tag that caused it to not be back-ported?

Tobias, I can backport these patches to 5.4+ stable. Or have you
already prepared backports?

Marek
Marek Behún March 14, 2022, 6:09 p.m. UTC | #10
On Mon, 14 Mar 2022 17:47:00 +0100
Marek Behún <kabel@kernel.org> wrote:

> Tobias, I can backport these patches to 5.4+ stable. Or have you
> already prepared backports?

OK the patches are prepared here

https://secure.nic.cz/files/mbehun/dsa-fix-queue/
Tobias Waldekranz March 14, 2022, 6:19 p.m. UTC | #11
On Mon, Mar 14, 2022 at 19:09, Marek Behún <kabel@kernel.org> wrote:
> On Mon, 14 Mar 2022 17:47:00 +0100
> Marek Behún <kabel@kernel.org> wrote:
>
>> Tobias, I can backport these patches to 5.4+ stable. Or have you
>> already prepared backports?
>
> OK the patches are prepared here
>
> https://secure.nic.cz/files/mbehun/dsa-fix-queue/

Great, thank you!

Not sure what the procedure is here, any action required on my part?
Marek Behún March 14, 2022, 7:27 p.m. UTC | #12
On Mon, 14 Mar 2022 19:19:52 +0100
Tobias Waldekranz <tobias@waldekranz.com> wrote:

> On Mon, Mar 14, 2022 at 19:09, Marek Behún <kabel@kernel.org> wrote:
> > On Mon, 14 Mar 2022 17:47:00 +0100
> > Marek Behún <kabel@kernel.org> wrote:
> >  
> >> Tobias, I can backport these patches to 5.4+ stable. Or have you
> >> already prepared backports?  
> >
> > OK the patches are prepared here
> >
> > https://secure.nic.cz/files/mbehun/dsa-fix-queue/  
> 
> Great, thank you!
> 
> Not sure what the procedure is here, any action required on my part?

Not particularly. I can just send it to stable once Linus merges it.

Pity we need to wait till it gets to Linus.

Marek
Jakub Kicinski March 15, 2022, 12:04 a.m. UTC | #13
On Mon, 14 Mar 2022 16:42:37 +0000 Vladimir Oltean wrote:
> > I can, but I thought it should only be done after it gets merged to
> > Linus' master.  
> 
> Ah, now I re-read Tobias' discussion with Jakub from the cover letter.
> I've never seen that procedure in action, to be honest, let's see how it goes.

I guess we could have applied it to both trees and dealt with 
the merge :S No point doing it now, tho, merge window is afoot.
diff mbox series

Patch

diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index e3c7d2627a61..38afb1e8fcb7 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -123,9 +123,61 @@  static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 	struct dsa_port *dp;
 	int err;
 
-	if (dst->index == info->tree_index && ds->index == info->sw_index &&
-	    ds->ops->port_bridge_leave)
-		ds->ops->port_bridge_leave(ds, info->port, info->bridge);
+	if (dst->index == info->tree_index && ds->index == info->sw_index) {
+		if (ds->ops->port_bridge_leave)
+			ds->ops->port_bridge_leave(ds, info->port,
+						   info->bridge);
+
+		/* This function is called by the notifier for every DSA switch
+		 * that has ports in the bridge we are leaving, but vlan
+		 * filtering on the port should be changed only once. Since the
+		 * code expects that ds is the switch containing the leaving
+		 * port, the following code must not be called in the crosschip
+		 * branch, only here.
+		 */
+
+		if (ds->needs_standalone_vlan_filtering &&
+		    !br_vlan_enabled(info->bridge.dev)) {
+			change_vlan_filtering = true;
+			vlan_filtering = true;
+		} else if (!ds->needs_standalone_vlan_filtering &&
+			   br_vlan_enabled(info->bridge.dev)) {
+			change_vlan_filtering = true;
+			vlan_filtering = false;
+		}
+
+		/* If the bridge was vlan_filtering, the bridge core doesn't
+		 * trigger an event for changing vlan_filtering setting upon
+		 * slave ports leaving it. That is a good thing, because that
+		 * lets us handle it and also handle the case where the switch's
+		 * vlan_filtering setting is global (not per port). When that
+		 * happens, the correct moment to trigger the vlan_filtering
+		 * callback is only when the last port leaves the last
+		 * VLAN-aware bridge.
+		 */
+		if (change_vlan_filtering && ds->vlan_filtering_is_global) {
+			dsa_switch_for_each_port(dp, ds) {
+				struct net_device *br =
+					dsa_port_bridge_dev_get(dp);
+
+				if (br && br_vlan_enabled(br)) {
+					change_vlan_filtering = false;
+					break;
+				}
+			}
+		}
+
+		if (change_vlan_filtering) {
+			err = dsa_port_vlan_filtering(dsa_to_port(ds,
+								  info->port),
+						      vlan_filtering, &extack);
+			if (extack._msg)
+				dev_err(ds->dev, "port %d: %s\n", info->port,
+					extack._msg);
+			if (err && err != -EOPNOTSUPP)
+				return err;
+		}
+	}
 
 	if ((dst->index != info->tree_index || ds->index != info->sw_index) &&
 	    ds->ops->crosschip_bridge_leave)
@@ -133,45 +185,6 @@  static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 						info->sw_index, info->port,
 						info->bridge);
 
-	if (ds->needs_standalone_vlan_filtering &&
-	    !br_vlan_enabled(info->bridge.dev)) {
-		change_vlan_filtering = true;
-		vlan_filtering = true;
-	} else if (!ds->needs_standalone_vlan_filtering &&
-		   br_vlan_enabled(info->bridge.dev)) {
-		change_vlan_filtering = true;
-		vlan_filtering = false;
-	}
-
-	/* If the bridge was vlan_filtering, the bridge core doesn't trigger an
-	 * event for changing vlan_filtering setting upon slave ports leaving
-	 * it. That is a good thing, because that lets us handle it and also
-	 * handle the case where the switch's vlan_filtering setting is global
-	 * (not per port). When that happens, the correct moment to trigger the
-	 * vlan_filtering callback is only when the last port leaves the last
-	 * VLAN-aware bridge.
-	 */
-	if (change_vlan_filtering && ds->vlan_filtering_is_global) {
-		dsa_switch_for_each_port(dp, ds) {
-			struct net_device *br = dsa_port_bridge_dev_get(dp);
-
-			if (br && br_vlan_enabled(br)) {
-				change_vlan_filtering = false;
-				break;
-			}
-		}
-	}
-
-	if (change_vlan_filtering) {
-		err = dsa_port_vlan_filtering(dsa_to_port(ds, info->port),
-					      vlan_filtering, &extack);
-		if (extack._msg)
-			dev_err(ds->dev, "port %d: %s\n", info->port,
-				extack._msg);
-		if (err && err != -EOPNOTSUPP)
-			return err;
-	}
-
 	return dsa_tag_8021q_bridge_leave(ds, info);
 }