diff mbox series

[RFC,net-next,(resend),2/4] net: bridge: send notification for roaming hosts

Message ID 20241108035546.2055996-3-elliot.ayrey@alliedtelesis.co.nz (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series Send notifications for roaming hosts

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 40 this patch: 40
netdev/build_tools              success  Errors and warnings before: 2 (+0) this patch: 2 (+0)
netdev/cc_maintainers           success  CCed 9 of 9 maintainers
netdev/build_clang              success  Errors and warnings before: 62 this patch: 62
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 4104 this patch: 4104
netdev/checkpatch               warning  WARNING: line length of 84 exceeds 80 columns
                                         WARNING: line length of 85 exceeds 80 columns
                                         WARNING: line length of 86 exceeds 80 columns
                                         WARNING: line length of 87 exceeds 80 columns
                                         WARNING: line length of 88 exceeds 80 columns
                                         WARNING: line length of 91 exceeds 80 columns
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 3 this patch: 3
netdev/source_inline            success  Was 0 now: 0

Commit Message

Elliot Ayrey Nov. 8, 2024, 3:55 a.m. UTC
When an fdb entry is configured as static and sticky it should never
roam. However, there are times when it would be useful to know that a
host has attempted to roam, so a user application can act on it. For
this reason, extend the fdb notification mechanism to send a
notification when the bridge detects a host that is attempting to roam
when it has been configured not to.

This is achieved by temporarily updating the fdb entry with the new
port, setting a new notify roaming bit, firing off a notification, and
restoring the original port immediately afterwards. The port remains
unchanged, respecting the sticky flag, but userspace is now notified
of the new port the host was seen on.

The roaming bit is cleared if the entry becomes inactive or if it is
replaced by a user entry.

Signed-off-by: Elliot Ayrey <elliot.ayrey@alliedtelesis.co.nz>
---
 include/uapi/linux/neighbour.h |  4 ++-
 net/bridge/br_fdb.c            | 64 +++++++++++++++++++++++-----------
 net/bridge/br_input.c          | 10 ++++--
 net/bridge/br_private.h        |  3 ++
 4 files changed, 58 insertions(+), 23 deletions(-)
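
The mechanism described in the commit message can be modelled outside the kernel. The following is only a minimal single-threaded sketch: every `toy_*` name is a stand-in invented for illustration, not a real bridge type or function, and the sticky check is assumed to be done by the caller, as in the patch. It shows the one-shot roaming mark, the temporary port swap around the notification, and the re-arming when the entry goes inactive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real bridge types. */
enum {
	TOY_FDB_NOTIFY  = 1u << 0,	/* activity notifications enabled */
	TOY_FDB_STICKY  = 1u << 1,	/* entry must not change port */
	TOY_FDB_ROAMING = 1u << 2,	/* roam attempt already reported */
};

struct toy_fdb {
	int dst_port;		/* port the entry is pinned to */
	unsigned int flags;
};

/* Records the port carried by the most recent notification. */
struct toy_log {
	int count;
	int last_port;
};

static void toy_notify(struct toy_log *log, const struct toy_fdb *fdb)
{
	log->count++;
	log->last_port = fdb->dst_port;
}

/* Report a roam attempt once: swap the port in, notify, swap it back. */
static bool toy_notify_roaming(struct toy_fdb *fdb, int new_port,
			       struct toy_log *log)
{
	int old_port;

	assert(fdb != NULL && log != NULL);

	/* Nothing to do if notifications are off or already reported. */
	if (!(fdb->flags & TOY_FDB_NOTIFY) || (fdb->flags & TOY_FDB_ROAMING))
		return false;

	fdb->flags |= TOY_FDB_ROAMING;
	old_port = fdb->dst_port;
	fdb->dst_port = new_port;	/* the notification sees the new port */
	toy_notify(log, fdb);
	fdb->dst_port = old_port;	/* sticky entry keeps its port */
	return true;
}

/* Entry went inactive: clear the roaming mark so a later attempt reports. */
static void toy_mark_inactive(struct toy_fdb *fdb)
{
	fdb->flags &= ~TOY_FDB_ROAMING;
}
```

In this model a second roam attempt is silently ignored until `toy_mark_inactive()` clears the mark. The model has no concurrent readers, so it does not show what another reader of the entry would observe while the port is swapped.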

Comments

Andrew Lunn Nov. 8, 2024, 1:42 p.m. UTC | #1
> This is achieved by temporarily updating the fdb entry with the new
> port, setting a new notify roaming bit, firing off a notification, and
> restoring the original port immediately afterwards. The port remains
> unchanged, respecting the sticky flag, but userspace is now notified
> of the new port the host was seen on.

This sounds a bit hacky. Could you add a new optional attribute to the
netlink message indicating the roam destination, so there is no need
to play games with the actual port?

I'm not too deep into how this all works, but I also wonder about
backwards compatibility. Old code which does not look for
FDB_NOTIFY_ROAMING_BIT is going to think it really has moved, with
your code. By using a new attribute, and not changing the port, old
code just sees a notification it is on the port it always was on,
which is less likely to cause issues?

And do we want to differentiate between a host that wants to roam but
was stopped by the sticky bit, and one that really has roamed?

	Andrew
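
The backwards-compatibility argument above can be illustrated with a toy type/value message. This is only a sketch under stated assumptions: the attribute names and numbers are hypothetical, the one-byte payload is a simplification, and none of it is real netlink code. The point is that a consumer which only looks up the attributes it knows keeps seeing the original port, while a newer consumer can also find the optional roam destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types for a toy message; not real netlink values. */
enum {
	TOY_ATTR_PORT = 1,	/* port the entry is actually on */
	TOY_ATTR_ROAM_PORT = 2,	/* optional: port a roam was attempted on */
};

/* One attribute: a type and a single-byte payload, for brevity. */
struct toy_attr {
	uint8_t type;
	uint8_t value;
};

/* Return the value of the first attribute of @type, or -1 if absent. */
static int toy_find_attr(const struct toy_attr *attrs, size_t n, uint8_t type)
{
	size_t i;

	assert(attrs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (attrs[i].type == type)
			return attrs[i].value;
	return -1;
}

/* An old consumer only knows TOY_ATTR_PORT and skips everything else. */
static int toy_old_consumer_port(const struct toy_attr *attrs, size_t n)
{
	return toy_find_attr(attrs, n, TOY_ATTR_PORT);
}

/* A new consumer also looks for the optional roam destination. */
static int toy_new_consumer_roam_port(const struct toy_attr *attrs, size_t n)
{
	return toy_find_attr(attrs, n, TOY_ATTR_ROAM_PORT);
}
```

With a message carrying both attributes, the old consumer still reports the port the entry was always on, and the absence of the optional attribute (a return of -1 here) distinguishes an ordinary notification from a roam attempt.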
Elliot Ayrey Nov. 24, 2024, 9:23 p.m. UTC | #2
On Sat, 2024-11-09 at 15:40 +0200, Nikolay Aleksandrov wrote:
> No way, this is ridiculous. Changing the port like that for a notification is not
> ok at all. It is also not the bridge's job to notify user-space for sticky fdbs
> that are trying to roam, you already have some user-space app and you can catch
> such fdbs by other means (sniffing, ebpf hooks, netfilter matching etc). Such
> change can also lead to DDoS attacks with many notifications.

Unfortunately in this case the only indication we get from the hardware of this
event happening is a switchdev notification to the bridge. All traffic is dropped
in hardware when the port is in this mode so the methods you suggest will not work.

I have changed my implementation to use Andrew's suggestion of using a new attribute
rather than messing with the port. But would this also be more appropriate if the
notification was only triggered when receiving the event from hardware? If not
then do you have any suggestions for getting these kinds of events from hardware
to userspace without going through the bridge?
Elliot Ayrey Nov. 24, 2024, 9:32 p.m. UTC | #3
On Fri, 2024-11-08 at 14:42 +0100, Andrew Lunn wrote:
> This sounds a bit hacky. Could you add a new optional attribute to the
> netlink message indicating the roam destination, so there is no need
> to play games with the actual port?

Yes that's another option.

> I'm not too deep into how this all works, but I also wonder about
> backwards compatibility. Old code which does not look for
> FDB_NOTIFY_ROAMING_BIT is going to think it really has moved, with
> your code. By using a new attribute, and not changing the port, old
> code just sees a notification it is on the port it always was on,
> which is less likely to cause issues?

Thanks, that's a good point. I'll have a look at making it an attribute
instead.

> And do we want to differentiate between a host that wants to roam but
> was stopped by the sticky bit, and one that really has roamed?

That is partly what this patch is trying to do, since actually roaming
hosts will already trigger a notification with the new port.

I will try your suggestion as it seems better than what I have here. I
also think moving away from relying on the sticky bit might be good.
This behaviour relies on the port being locked in hardware so it might
be more appropriate for this to be part of locked behaviour in the
kernel also?
Andrew Lunn Nov. 24, 2024, 9:39 p.m. UTC | #4
> I have changed my implementation to use Andrew's suggestion of using a new attribute
> rather than messing with the port. But would this also be more appropriate if the
> notification was only triggered when receiving the event from hardware?

Hardware only accelerates what the Linux network stack already does in
software. You need something which makes sense for a pure software
setup.

	Andrew
Nikolay Aleksandrov Nov. 24, 2024, 9:57 p.m. UTC | #5
On 24/11/2024 23:23, Elliot Ayrey wrote:
> On Sat, 2024-11-09 at 15:40 +0200, Nikolay Aleksandrov wrote:
>> No way, this is ridiculous. Changing the port like that for a notification is not
>> ok at all. It is also not the bridge's job to notify user-space for sticky fdbs
>> that are trying to roam, you already have some user-space app and you can catch
>> such fdbs by other means (sniffing, ebpf hooks, netfilter matching etc). Such
>> change can also lead to DDoS attacks with many notifications.
> 
> Unfortunately in this case the only indication we get from the hardware of this
> event happening is a switchdev notification to the bridge. All traffic is dropped
> in hardware when the port is in this mode so the methods you suggest will not work.
> 

I see

> I have changed my implementation to use Andrew's suggestion of using a new attribute
> rather than messing with the port. But would this also be more appropriate if the
> notification was only triggered when receiving the event from hardware? If not
> then do you have any suggestions for getting these kinds of events from hardware
> to userspace without going through the bridge?
> 
> 

We want to have the same behaviour (or as close as possible) between sw and hw.
Since this can cause many notifications to be sent up for current setups, maybe
make it optional so we'll get notifications for roam attempts only when we
explicitly enable them, with default off. You can look into bridge's bool options
for this (e.g. link-local fdb learning option).


Cheers,
 Nik
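
The opt-in behaviour suggested above might look roughly like the following model. It is a sketch only: the option name `TOY_BROPT_ROAM_NOTIFY` and the `toy_*` helpers are invented for illustration and are not the bridge's actual bool-option API. What it shows is the default-off gate, so existing setups send no extra notifications unless the option is explicitly enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option bit; the name is invented for this sketch. */
enum {
	TOY_BROPT_ROAM_NOTIFY = 1u << 0,
};

struct toy_bridge {
	unsigned int options;		/* zero-initialised: every option off */
	unsigned int roam_notifications;	/* notifications sent so far */
};

/* Turn a single option bit on or off. */
static void toy_bridge_set_option(struct toy_bridge *br, unsigned int opt,
				  bool on)
{
	assert(br != NULL);
	if (on)
		br->options |= opt;
	else
		br->options &= ~opt;
}

/* Called on a roam attempt against a sticky entry; true if reported. */
static bool toy_roam_attempt(struct toy_bridge *br)
{
	assert(br != NULL);
	if (!(br->options & TOY_BROPT_ROAM_NOTIFY))
		return false;	/* default: stay silent, as before the change */
	br->roam_notifications++;
	return true;
}
```

Because the check does not depend on where the roam attempt was detected, the same gate would apply to attempts seen in software and to those reported by hardware.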

Patch

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 5e67a7eaf4a7..e1c686268808 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -201,10 +201,12 @@  enum {
  /* FDB activity notification bits used in NFEA_ACTIVITY_NOTIFY:
   * - FDB_NOTIFY_BIT - notify on activity/expire for any entry
   * - FDB_NOTIFY_INACTIVE_BIT - mark as inactive to avoid multiple notifications
+  * - FDB_NOTIFY_ROAMING_BIT - mark as attempting to roam
   */
 enum {
 	FDB_NOTIFY_BIT		= (1 << 0),
-	FDB_NOTIFY_INACTIVE_BIT	= (1 << 1)
+	FDB_NOTIFY_INACTIVE_BIT	= (1 << 1),
+	FDB_NOTIFY_ROAMING_BIT	= (1 << 2)
 };
 
 /* embedded into NDA_FDB_EXT_ATTRS:
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 72663ca824d3..a8b841e74e15 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -145,6 +145,8 @@  static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 			goto nla_put_failure;
 		if (test_bit(BR_FDB_NOTIFY_INACTIVE, &fdb->flags))
 			notify_bits |= FDB_NOTIFY_INACTIVE_BIT;
+		if (test_bit(BR_FDB_NOTIFY_ROAMING, &fdb->flags))
+			notify_bits |= FDB_NOTIFY_ROAMING_BIT;
 
 		if (nla_put_u8(skb, NFEA_ACTIVITY_NOTIFY, notify_bits)) {
 			nla_nest_cancel(skb, nest);
@@ -554,8 +556,10 @@  void br_fdb_cleanup(struct work_struct *work)
 					work_delay = min(work_delay,
 							 this_timer - now);
 				else if (!test_and_set_bit(BR_FDB_NOTIFY_INACTIVE,
-							   &f->flags))
+							   &f->flags)) {
+					clear_bit(BR_FDB_NOTIFY_ROAMING, &f->flags);
 					fdb_notify(br, f, RTM_NEWNEIGH, false);
+				}
 			}
 			continue;
 		}
@@ -880,6 +884,19 @@  static bool __fdb_mark_active(struct net_bridge_fdb_entry *fdb)
 		  test_and_clear_bit(BR_FDB_NOTIFY_INACTIVE, &fdb->flags));
 }
 
+void br_fdb_notify_roaming(struct net_bridge *br, struct net_bridge_port *p,
+			   struct net_bridge_fdb_entry *fdb)
+{
+	struct net_bridge_port *old_p = READ_ONCE(fdb->dst);
+
+	if (test_bit(BR_FDB_NOTIFY, &fdb->flags) &&
+	    !test_and_set_bit(BR_FDB_NOTIFY_ROAMING, &fdb->flags)) {
+		WRITE_ONCE(fdb->dst, p);
+		fdb_notify(br, fdb, RTM_NEWNEIGH, false);
+		WRITE_ONCE(fdb->dst, old_p);
+	}
+}
+
 void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		   const unsigned char *addr, u16 vid, unsigned long flags)
 {
@@ -906,21 +923,24 @@  void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 			}
 
 			/* fastpath: update of existing entry */
-			if (unlikely(source != READ_ONCE(fdb->dst) &&
-				     !test_bit(BR_FDB_STICKY, &fdb->flags))) {
-				br_switchdev_fdb_notify(br, fdb, RTM_DELNEIGH);
-				WRITE_ONCE(fdb->dst, source);
-				fdb_modified = true;
-				/* Take over HW learned entry */
-				if (unlikely(test_bit(BR_FDB_ADDED_BY_EXT_LEARN,
-						      &fdb->flags)))
-					clear_bit(BR_FDB_ADDED_BY_EXT_LEARN,
-						  &fdb->flags);
-				/* Clear locked flag when roaming to an
-				 * unlocked port.
-				 */
-				if (unlikely(test_bit(BR_FDB_LOCKED, &fdb->flags)))
-					clear_bit(BR_FDB_LOCKED, &fdb->flags);
+			if (unlikely(source != READ_ONCE(fdb->dst))) {
+				if (unlikely(test_bit(BR_FDB_STICKY, &fdb->flags))) {
+					br_fdb_notify_roaming(br, source, fdb);
+				} else {
+					br_switchdev_fdb_notify(br, fdb, RTM_DELNEIGH);
+					WRITE_ONCE(fdb->dst, source);
+					fdb_modified = true;
+					/* Take over HW learned entry */
+					if (unlikely(test_bit(BR_FDB_ADDED_BY_EXT_LEARN,
+							      &fdb->flags)))
+						clear_bit(BR_FDB_ADDED_BY_EXT_LEARN,
+							  &fdb->flags);
+					/* Clear locked flag when roaming to an
+					 * unlocked port.
+					 */
+					if (unlikely(test_bit(BR_FDB_LOCKED, &fdb->flags)))
+						clear_bit(BR_FDB_LOCKED, &fdb->flags);
+				}
 			}
 
 			if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags))) {
@@ -1045,6 +1065,7 @@  static bool fdb_handle_notify(struct net_bridge_fdb_entry *fdb, u8 notify)
 		   test_and_clear_bit(BR_FDB_NOTIFY, &fdb->flags)) {
 		/* disabled activity tracking, clear notify state */
 		clear_bit(BR_FDB_NOTIFY_INACTIVE, &fdb->flags);
+		clear_bit(BR_FDB_NOTIFY_ROAMING, &fdb->flags);
 		modified = true;
 	}
 
@@ -1457,10 +1478,13 @@  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 
 		fdb->updated = jiffies;
 
-		if (READ_ONCE(fdb->dst) != p &&
-		    !test_bit(BR_FDB_STICKY, &fdb->flags)) {
-			WRITE_ONCE(fdb->dst, p);
-			modified = true;
+		if (READ_ONCE(fdb->dst) != p) {
+			if (test_bit(BR_FDB_STICKY, &fdb->flags)) {
+				br_fdb_notify_roaming(br, p, fdb);
+			} else {
+				WRITE_ONCE(fdb->dst, p);
+				modified = true;
+			}
 		}
 
 		if (test_and_set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags)) {
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index ceaa5a89b947..512ffab16f5d 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -120,8 +120,14 @@  int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 				br_fdb_update(br, p, eth_hdr(skb)->h_source,
 					      vid, BIT(BR_FDB_LOCKED));
 			goto drop;
-		} else if (READ_ONCE(fdb_src->dst) != p ||
-			   test_bit(BR_FDB_LOCAL, &fdb_src->flags)) {
+		} else if (READ_ONCE(fdb_src->dst) != p) {
+			/* FDB is trying to roam. Notify userspace and drop
+			 * the packet
+			 */
+			if (test_bit(BR_FDB_STICKY, &fdb_src->flags))
+				br_fdb_notify_roaming(br, p, fdb_src);
+			goto drop;
+		} else if (test_bit(BR_FDB_LOCAL, &fdb_src->flags)) {
 			/* FDB mismatch. Drop the packet without roaming. */
 			goto drop;
 		} else if (test_bit(BR_FDB_LOCKED, &fdb_src->flags)) {
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 041f6e571a20..18d3cb5fec0e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -277,6 +277,7 @@  enum {
 	BR_FDB_NOTIFY_INACTIVE,
 	BR_FDB_LOCKED,
 	BR_FDB_DYNAMIC_LEARNED,
+	BR_FDB_NOTIFY_ROAMING,
 };
 
 struct net_bridge_fdb_key {
@@ -874,6 +875,8 @@  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      bool swdev_notify);
 void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
 			  const unsigned char *addr, u16 vid, bool offloaded);
+void br_fdb_notify_roaming(struct net_bridge *br, struct net_bridge_port *p,
+			   struct net_bridge_fdb_entry *fdb);
 
 /* br_forward.c */
 enum br_pkt_type {
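
For a userspace consumer, the bits this RFC places in NFEA_ACTIVITY_NOTIFY could be decoded along these lines. This is a sketch, not part of the patch: the bit values are copied from the neighbour.h hunk above, but they are redefined locally under `TOY_` names because the roaming bit exists only in this RFC, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the patch; defined locally because the roaming bit
 * comes from this RFC and may not exist in installed headers.
 */
#define TOY_FDB_NOTIFY_BIT		(1 << 0)
#define TOY_FDB_NOTIFY_INACTIVE_BIT	(1 << 1)
#define TOY_FDB_NOTIFY_ROAMING_BIT	(1 << 2)

/* Write a readable form of @bits into @buf; returns the snprintf result. */
static int toy_describe_notify_bits(uint8_t bits, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s%s%s",
			(bits & TOY_FDB_NOTIFY_BIT) ? "notify " : "",
			(bits & TOY_FDB_NOTIFY_INACTIVE_BIT) ? "inactive " : "",
			(bits & TOY_FDB_NOTIFY_ROAMING_BIT) ? "roaming " : "");
}
```

A consumer that predates the roaming bit would simply never test bit 2, which is why, with this version of the patch, it would read the temporarily swapped port as a real move.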