diff mbox series

[v2,net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag

Message ID 20240905200551.4099064-1-jrife@google.com (mailing list archive)
State New
Headers show
Series [v2,net-next] wireguard: allowedips: Add WGALLOWEDIP_F_REMOVE_ME flag | expand

Commit Message

Jordan Rife Sept. 5, 2024, 8:05 p.m. UTC
With the current API the only way to remove an allowed IP is to
completely rebuild the allowed IPs set for a peer using
WGPEER_F_REPLACE_ALLOWEDIPS. In other words, if my current configuration
is such that a peer has allowed IP IPs 192.168.0.2 and 192.168.0.3 and I
want to remove 192.168.0.2 the actual transition looks like this.

[192.168.0.2, 192.168.0.3] <-- Initial state
[]                         <-- Step 1: Allowed IPs removed for peer
[192.168.0.3]              <-- Step 2: Allowed IPs added back for peer

This is true even if the allowed IP list is small and the update does
not need to be batched into multiple WG_CMD_SET_DEVICE requests, as
the removal and subsequent addition of IPs is non-atomic within a single
request. Consequently, wg_allowedips_lookup_dst and
wg_allowedips_lookup_src may return NULL while reconfiguring a peer even
for packets bound for IPs a user did not intend to remove leading to
unintended interruptions in connectivity. This presents in userspace as
failed calls to sendto and sendmsg for UDP sockets. In my case, I ran
netperf while repeatedly reconfiguring the allowed IPs for a peer with
wg.

/usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
send_data: data send error: No route to host (errno 113)
netperf: send_omni: send_data failed: No route to host

While this may not be of particular concern for environments where peers
and allowed IPs are mostly static, Cilium manages peers and allowed IPs
in a dynamic environment where peers (i.e. Kubernetes nodes) and allowed
IPs (i.e. Pods running on those nodes) can frequently change. Cilium
must continually keep its WireGuard device's configuration in sync with
its cluster state leading to unnecessary churn and packet drops.

This patch introduces a new flag called WGALLOWEDIP_F_REMOVE_ME which
in the same way that WGPEER_F_REMOVE_ME allows a user to remove a single
peer from a WireGuard device's configuration allows a user to remove an
IP from a peer's set of allowed IPs. This has two benefits. First, it
allows systems such as Cilium to avoid introducing connectivity blips
while reconfiguring a WireGuard device. Second, it allows us to more
efficiently keep the device's configuration in sync with the cluster
state, as we no longer need to do frequent rebuilds of the allowed IPs
list for each peer. Instead, the device's configuration can be
incrementally updated. This patch also bumps WG_GENL_VERSION which can
be used by clients to detect whether or not their system supports the
WGALLOWEDIP_F_REMOVE_ME flag.

=======
Changes
=======

v1->v2
------
* Fixed some Sparse warnings

Signed-off-by: Jordan Rife <jrife@google.com>
---
 drivers/net/wireguard/allowedips.c            | 103 ++++++++++----
 drivers/net/wireguard/allowedips.h            |   4 +
 drivers/net/wireguard/netlink.c               |  45 +++++--
 drivers/net/wireguard/selftest/allowedips.c   |  30 +++++
 include/uapi/linux/wireguard.h                |  11 +-
 tools/testing/selftests/wireguard/Makefile    |  18 +++
 tools/testing/selftests/wireguard/netns.sh    |  38 ++++++
 tools/testing/selftests/wireguard/remove-ip.c | 126 ++++++++++++++++++
 8 files changed, 333 insertions(+), 42 deletions(-)
 create mode 100644 tools/testing/selftests/wireguard/Makefile
 create mode 100644 tools/testing/selftests/wireguard/remove-ip.c

Comments

Jordan Rife Oct. 1, 2024, 4:26 p.m. UTC | #1
Hi Jason,

Just reaching out to see if you've had a chance to take a look at this. Thanks!

-Jordan
Jason A. Donenfeld Nov. 17, 2024, 8:50 p.m. UTC | #2
Hi Jordan,

Sorry it's taken me a bit to look at this patch. A few notes:

On Thu, Sep 05, 2024 at 03:05:41PM -0500, Jordan Rife wrote:
> With the current API the only way to remove an allowed IP is to
> completely rebuild the allowed IPs set for a peer using
> WGPEER_F_REPLACE_ALLOWEDIPS. In other words, if my current configuration
> is such that a peer has allowed IP IPs 192.168.0.2 and 192.168.0.3 and I
> want to remove 192.168.0.2 the actual transition looks like this.
> 
> [192.168.0.2, 192.168.0.3] <-- Initial state
> []                         <-- Step 1: Allowed IPs removed for peer
> [192.168.0.3]              <-- Step 2: Allowed IPs added back for peer
> 
> This is true even if the allowed IP list is small and the update does
> not need to be batched into multiple WG_CMD_SET_DEVICE requests, as
> the removal and subsequent addition of IPs is non-atomic within a single
> request. Consequently, wg_allowedips_lookup_dst and
> wg_allowedips_lookup_src may return NULL while reconfiguring a peer even
> for packets bound for IPs a user did not intend to remove leading to
> unintended interruptions in connectivity. This presents in userspace as
> failed calls to sendto and sendmsg for UDP sockets. In my case, I ran
> netperf while repeatedly reconfiguring the allowed IPs for a peer with
> wg.
> 
> /usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
> send_data: data send error: No route to host (errno 113)
> netperf: send_omni: send_data failed: No route to host

That's a worthwhile point. This indeed fixes a *bug*, beyond being just
a helpful new feature.

> incrementally updated. This patch also bumps WG_GENL_VERSION which can
> be used by clients to detect whether or not their system supports the
> WGALLOWEDIP_F_REMOVE_ME flag.

I'm actually less enthusiastic about this part, but mainly because I
haven't looked closely at what the convention for this is. I was
wondering though - this adds WGALLOWEDIP_A_FLAGS, which didn't exist
before. Shouldn't some upper layer return a relevant value in that case?
And even within the existing flags, for WGPEER_A_FLAGS, for example, old
kernels check to see if there are new flags, for this purpose, e.g.:

        if (attrs[WGPEER_A_FLAGS])
                flags = nla_get_u32(attrs[WGPEER_A_FLAGS]);
        ret = -EOPNOTSUPP;
        if (flags & ~__WGPEER_F_ALL)
                goto out;

So I think we might be able to avoid having to bump the version number.
GENL is supposed to be extensible like this.

> +static void _remove(struct allowedips_node *node, struct mutex *lock)

This file doesn't really do the _ prefix thing anywhere. Can you call
this something more descriptive, like "remove_node"?

> -		if (free_parent)
> -			child = rcu_dereference_protected(
> -					parent->bit[!(node->parent_bit_packed & 1)],
> -					lockdep_is_held(lock));
[...]
> +	if (free_parent)
> +		child = rcu_dereference_protected(parent->bit[!(node->parent_bit_packed & 1)],
> +						  lockdep_is_held(lock));

Thanks for fixing up the ugly extra \n in the original code you copy and
pasted from. I remember the horror of that when I added line breaks in
the original code.

> +	call_rcu(&node->rcu, node_free_rcu);
> +	if (!free_parent)
> +		return;
> +	if (child)
> +		child->parent_bit_packed = parent->parent_bit_packed;
> +	*(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child;
> +	call_rcu(&parent->rcu, node_free_rcu);
> +}
> +
> +static int remove(struct allowedips_node __rcu **trie, u8 bits, const u8 *key,
> +		  u8 cidr, struct wg_peer *peer, struct mutex *lock)
> +{
> +	struct allowedips_node *node;
> +
> +	if (unlikely(cidr > bits || !peer))
> +		return -EINVAL;

Reasoning for this is that it copies the logic in add()?

> +	if (!rcu_access_pointer(*trie) ||
> +	    !node_placement(*trie, key, cidr, bits, &node, lock) ||
> +	    peer != rcu_access_pointer(node->peer))
> +		return 0;

What's the reasoning behind returning success when it can't find the
node? Because in that case it's already removed so the function is
idempotent? And you checked that nothing really cares about the return
value there anyway? Or is this a mistake and you meant to return
something else? I can imagine good reasoning in either direction; I'd
just like to learn that your choice is deliberate.

> +
> +	_remove(node, lock);
> +
> +	return 0;
> +}
> +
>  	family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]);
>  	cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]);
> +	if (attrs[WGALLOWEDIP_A_FLAGS])
> +		flags = nla_get_u32(attrs[WGALLOWEDIP_A_FLAGS]);

As I mentioned above, you need to do the dance of:

        ret = -EOPNOTSUPP;
        if (flags & ~__WGALLOWEDIP_F_ALL)
                goto out;

So that we can safely extend this later.


>  
>  	if (family == AF_INET && cidr <= 32 &&
> -	    nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr))
> -		ret = wg_allowedips_insert_v4(
> -			&peer->device->peer_allowedips,
> -			nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
> -			&peer->device->device_update_lock);
> -	else if (family == AF_INET6 && cidr <= 128 &&
> -		 nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr))
> -		ret = wg_allowedips_insert_v6(
> -			&peer->device->peer_allowedips,
> -			nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
> -			&peer->device->device_update_lock);
> +	    nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) {
> +		if (flags & WGALLOWEDIP_F_REMOVE_ME)
> +			ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips,
> +						      nla_data(attrs[WGALLOWEDIP_A_IPADDR]),
> +						      cidr,
> +						      peer,
> +						      &peer->device->device_update_lock);

We get 100 chars now, so you can rewrite this as:

                        ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips,
                                                      nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr,
                                                      peer, &peer->device->device_update_lock);

> + *                    WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if
> + *                                         the specified IP should be removed,

That comma should be a semicolon because what comes after is a complete
sentence, and there's no conjunction.

> + *                                         otherwise this IP will be added if
> + *                                         it is not already present.


> +remove-ip:
> +	gcc -I/usr/include/libnl3 \
> +	    -I../../../../usr/include \
> +	    remove-ip.c \
> +	    -o remove-ip \
> +	    -lnl-genl-3 \
> +	    -lnl-3
>
> +	sock = nl_socket_alloc();
> +	genl_connect(sock);
> +	family = genl_ctrl_resolve(sock, WG_GENL_NAME);
> +	msg = nlmsg_alloc();
> +	genlmsg_put(msg, NL_AUTO_PID, NL_AUTO_SEQ, family, 0, NLM_F_ECHO,
> +		    WG_CMD_SET_DEVICE, WG_GENL_VERSION);
> +	nla_put_string(msg, WGDEVICE_A_IFNAME, argv[1]);
> +
> +	struct nlattr *peers = nla_nest_start(msg, WGDEVICE_A_PEERS);
> +	struct nlattr *peer0 = nla_nest_start(msg, 0);
> +
> +	nla_put(msg, WGPEER_A_PUBLIC_KEY, CURVE25519_KEY_SIZE, pub_key);
> +
> +	struct nlattr *allowed_ips = nla_nest_start(msg, WGPEER_A_ALLOWEDIPS);
> +	struct nlattr *allowed_ip0 = nla_nest_start(msg, 0);
> +
> +	nla_put_u16(msg, WGALLOWEDIP_A_FAMILY, af);
> +	nla_put(msg, WGALLOWEDIP_A_IPADDR, addr_len, &addr);
> +	nla_put_u8(msg, WGALLOWEDIP_A_CIDR_MASK, cidr);
> +	nla_put_u32(msg, WGALLOWEDIP_A_FLAGS, WGALLOWEDIP_F_REMOVE_ME);
> +	nla_nest_end(msg, allowed_ip0);
> +	nla_nest_end(msg, allowed_ips);
> +	nla_nest_end(msg, peer0);
> +	nla_nest_end(msg, peers);
> +
> +	int err = nl_send_sync(sock, msg);
> +
> +	if (err < 0) {
> +		char message[256];
> +
> +		nl_perror(err, message);
> +		printf("An error occurred: %d - %s\n", err, message);
> +	}
> +
 
I'm not so keen on this, simply because we've been able to do everything
else in that script and keeping with the "make sure the userspace
tooling" paradigm. There are two options:

1. Rewrite netns.sh all in C, only depending on libnl or whatever (which
   I would actually really *love* to see happen). This would change the
   testing paradigm, but I'd be okay with that if it's done well and all
   at once.

2. Add support for this new flag to wg(8) (which I think is necessary
   anyway for this to land; kernel features and userspace support oughta
   be posted at once).


Thanks for the patch. I like the feature and I'm happy you posted this.

Jason
Jason A. Donenfeld Nov. 17, 2024, 9:46 p.m. UTC | #3
On Thu, Sep 05, 2024 at 03:05:41PM -0500, Jordan Rife wrote:
> With the current API the only way to remove an allowed IP is to
> completely rebuild the allowed IPs set for a peer using
> WGPEER_F_REPLACE_ALLOWEDIPS. 

Just for posterity, there actually is another way: create a new peer
with a random key, and give the allowed IP you want to remove to that
peer. Moves are atomic. Then destroy that peer.

Not that this is clean or nice or something, and I like your patch. But
in case somebody gets into trouble before this lands, I thought I should
note it on the list.

Jason
Jordan Rife Nov. 18, 2024, 8:44 p.m. UTC | #4
Hi Jason,

Thanks a lot for taking a look!

> I'm actually less enthusiastic about this part, but mainly because I
> haven't looked closely at what the convention for this is. I was
> wondering though - this adds WGALLOWEDIP_A_FLAGS, which didn't exist
> before. Shouldn't some upper layer return a relevant value in that case?
> And even within the existing flags, for WGPEER_A_FLAGS, for example, old
> kernels check to see if there are new flags, for this purpose, e.g.:
>
>         if (attrs[WGPEER_A_FLAGS])
>                 flags = nla_get_u32(attrs[WGPEER_A_FLAGS]);
>         ret = -EOPNOTSUPP;
>         if (flags & ~__WGPEER_F_ALL)
>                 goto out;
>
> So I think we might be able to avoid having to bump the version number.
> GENL is supposed to be extensible like this.

I think the challenge with WGALLOWEDIP_A_FLAGS in particular is that
because it didn't exist since the beginning like WGPEER_A_FLAGS, there
are kernels out there that have no knowledge of it and wouldn't have
this check in place. While I think it's a good idea to replicate this
check for WGALLOWEDIP_A_FLAGS as well for future compatibility, we
still need some way for clients to probe whether or not this feature
is supported in case they're running on an older kernel. If we want to
keep the version number as-is, I see a few alternatives:

1. Include WGALLOWEDIP_A_FLAGS in the response of WG_CMD_GET_DEVICE.
This would be a new field inside each entry of WGPEER_A_ALLOWEDIPS. If
the result of WG_CMD_GET_DEVICE contains WGALLOWEDIP_A_FLAGS then a
client knows that it can use features WGALLOWEDIP_F_REMOVE_ME. It
could potentially be used later to contain persistent flags for an IP,
but for now would just be zeroed out. This fails if there are no
allowed IPs configured for the device you're trying to configure, but
in the case where you're trying to use WGALLOWEDIP_F_REMOVE_ME you
probably would.
2. Keep everything the same. Don't bump the version number. Clients
could in theory try to use WGALLOWEDIP_A_FLAGS with WG_CMD_SET_DEVICE
then check if their request had the intended effect (e.g. checking if
the IP was removed).

I slightly prefer approach 1 myself, as I feel it's a bit cleaner, but
I'm happy to discuss some other approaches here. I'm trying to think
about how these probes could be used inside the WireGuard Go library
and wg itself. Assuming approach one,

* For libraries that manage WireGuard like
golang.zx2c4.com/wireguard/wgctrl the first time a client uses
.Device() (WG_CMD_GET_DEVICE under the hood) there would need to be
some logic that looks at WGPEER_A_ALLOWEDIPS for one of the peers and
sets some internal flag like c.supportsAllowedIPFlags. When a client
later calls c.ConfigureDevice() the library would disallow direct IP
removal if the feature is not supported (c.supportsAllowedIPFlags !=
nil && !*c.supportsAllowedIPFlags). Alternatively, you could do all
this up front while initializing the client by creating some dummy
device + peer, adding some IP, then using WG_CMD_GET_DEVICE to see if
WGALLOWEDIP_A_FLAGS is present in the result.
* Similarly for wg, if the user is trying to remove an allowed IP
you'd first query the allowed IP for a peer and check for
WGALLOWEDIP_A_FLAGS if it doesn't exist in the response then the
command would fail and print something like "not supported".

Bumping WG_GENL_VERSION is cleaner still, since clients in userspace
just need to check an integer value and avoid any probing logic.
However, like you I am unsure what the convention is here and whether
or not this has any unintended effects.

> This file doesn't really do the _ prefix thing anywhere. Can you call
> this something more descriptive, like "remove_node"?

Sure. I'll update this in v3.

> Reasoning for this is that it copies the logic in add()?

As for the cidr > bits check, the intent was simply to fail if the
user passes an invalid value for WGALLOWEDIP_A_CIDR_MASK. Although,
unlike the add() case, I suppose we could just remove this check.
Worst case if a user passes something high like 255 remove() would
just be a no-op. It looks safe to remove !peer check as well. I can
update this in v3.

> What's the reasoning behind returning success when it can't find the
> node? Because in that case it's already removed so the function is
> idempotent? And you checked that nothing really cares about the return
> value there anyway? Or is this a mistake and you meant to return
> something else? I can imagine good reasoning in either direction; I'd
> just like to learn that your choice is deliberate.

Yes, I opted for idempotence here intentionally. At least for the use
cases I have in mind the intent is basically "remove all these allowed
IPs from this peer if they exist". If an allowed IP already got
removed or is mapped to another peer somehow then that's fine. I'd
imagine it complicates the code in userspace to have to deal with
corner cases involving possible error codes returned when removing a
batch of allowed IPs. You'd need to query the device again, reform
your request, etc. I /think/ add() is idempotent as well in cases
where you re-add an IP to a peer, so there's some symmetry here.
Perhaps if there are use cases that need more stricter checks in the
future we could introduce a new flag to return an error code in this
case.

> As I mentioned above, you need to do the dance of:
>
>         ret = -EOPNOTSUPP;
>         if (flags & ~__WGALLOWEDIP_F_ALL)
>                 goto out;
>
> So that we can safely extend this later.

Good idea. I will add this in v3 as well.

> We get 100 chars now, so you can rewrite this as:
>
>                         ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips,
>                                                       nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr,
>                                                       peer, &peer->device->device_update_lock);

Nice, will do.

> That comma should be a semicolon because what comes after is a complete
> sentence, and there's no conjunction.

Good point. I think we might actually need a comma /after/ otherwise:
"...; otherwise, ...": "WGALLOWEDIP_F_REMOVE_ME if the specified IP
should be removed; otherwise, this IP will be added if it is not
already present".

> I'm not so keen on this, simply because we've been able to do everything
> else in that script and keeping with the "make sure the userspace
> tooling" paradigm. There are two options:
>
> 1. Rewrite netns.sh all in C, only depending on libnl or whatever (which
>    I would actually really *love* to see happen). This would change the
>    testing paradigm, but I'd be okay with that if it's done well and all
>    at once.
>
> 2. Add support for this new flag to wg(8) (which I think is necessary
>    anyway for this to land; kernel features and userspace support oughta
>    be posted at once).
>
>
> Thanks for the patch. I like the feature and I'm happy you posted this.

Option 1 seems like a heavy lift for this patch. I think option 2
makes more sense, as long as there is an understanding that netns.sh
needs to be run with the latest and greatest version of wg in order
for all tests to pass. Maybe we can add a version check to selectively
disable the remove IP tests if wg is on an older version. I agree
though that long term option 1 would be nice, as it provides a finer
level of control and tests could be run without as many external
dependencies. I can get rid of the remove-ip cruft and send two
patches to the wireguard mailing list if that works, one for the
kernel and one for wireguard-tools.

However, this raises some questions about the wg interface itself
which would be used both by netns.sh and end users. Looking at the
current interface for wg, the way to configure allowed IPs currently
is "wg set". For example,

wg set peer ABCD... allowed-ips 192.168.0.24/32,192.168.0.44/32,192.168.0.88/32

The intended effect is to completely replace the allowed IPs for that
peer rather than to make an incremental change. I think we'll need to
extend the interface a bit. There are a few directions we can take
here:

1. Add a new flag to "wg set" like --incremental in which case it
won't use WGPEER_F_REPLACE_ALLOWEDIPS under the hood. Support addition
or removal of allowed IPs with a "-" suffix on CIDRs you want to
remove (192.168.0.24/32-,192.168.0.44/32-,192.168.0.88/32).
2. Same as 1 but with a new argument name, allowed-ips-diff instead of
the --incremental flag. This appears more in line with the current
style of wg's arguments.

There are a lot more variations here, so I wanted to get your input
here before just picking a direction. Thanks again for the thorough
feedback!

-Jordan
diff mbox series

Patch

diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
index 4b8528206cc8a..ff52259dd8d81 100644
--- a/drivers/net/wireguard/allowedips.c
+++ b/drivers/net/wireguard/allowedips.c
@@ -249,6 +249,56 @@  static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key,
 	return 0;
 }
 
+static void _remove(struct allowedips_node *node, struct mutex *lock)
+{
+	struct allowedips_node *child, **parent_bit, *parent;
+	bool free_parent;
+
+	list_del_init(&node->peer_list);
+	RCU_INIT_POINTER(node->peer, NULL);
+	if (node->bit[0] && node->bit[1])
+		return;
+	child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])],
+					  lockdep_is_held(lock));
+	if (child)
+		child->parent_bit_packed = node->parent_bit_packed;
+	parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL);
+	*parent_bit = child;
+	parent = (void *)parent_bit -
+			offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]);
+	free_parent = !rcu_access_pointer(node->bit[0]) &&
+			!rcu_access_pointer(node->bit[1]) &&
+			(node->parent_bit_packed & 3) <= 1 &&
+			!rcu_access_pointer(parent->peer);
+	if (free_parent)
+		child = rcu_dereference_protected(parent->bit[!(node->parent_bit_packed & 1)],
+						  lockdep_is_held(lock));
+	call_rcu(&node->rcu, node_free_rcu);
+	if (!free_parent)
+		return;
+	if (child)
+		child->parent_bit_packed = parent->parent_bit_packed;
+	*(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child;
+	call_rcu(&parent->rcu, node_free_rcu);
+}
+
+static int remove(struct allowedips_node __rcu **trie, u8 bits, const u8 *key,
+		  u8 cidr, struct wg_peer *peer, struct mutex *lock)
+{
+	struct allowedips_node *node;
+
+	if (unlikely(cidr > bits || !peer))
+		return -EINVAL;
+	if (!rcu_access_pointer(*trie) ||
+	    !node_placement(*trie, key, cidr, bits, &node, lock) ||
+	    peer != rcu_access_pointer(node->peer))
+		return 0;
+
+	_remove(node, lock);
+
+	return 0;
+}
+
 void wg_allowedips_init(struct allowedips *table)
 {
 	table->root4 = table->root6 = NULL;
@@ -300,43 +350,38 @@  int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip,
 	return add(&table->root6, 128, key, cidr, peer, lock);
 }
 
+int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip,
+			    u8 cidr, struct wg_peer *peer, struct mutex *lock)
+{
+	/* Aligned so it can be passed to fls */
+	u8 key[4] __aligned(__alignof(u32));
+
+	++table->seq;
+	swap_endian(key, (const u8 *)ip, 32);
+	return remove(&table->root4, 32, key, cidr, peer, lock);
+}
+
+int wg_allowedips_remove_v6(struct allowedips *table, const struct in6_addr *ip,
+			    u8 cidr, struct wg_peer *peer, struct mutex *lock)
+{
+	/* Aligned so it can be passed to fls64 */
+	u8 key[16] __aligned(__alignof(u64));
+
+	++table->seq;
+	swap_endian(key, (const u8 *)ip, 128);
+	return remove(&table->root6, 128, key, cidr, peer, lock);
+}
+
 void wg_allowedips_remove_by_peer(struct allowedips *table,
 				  struct wg_peer *peer, struct mutex *lock)
 {
-	struct allowedips_node *node, *child, **parent_bit, *parent, *tmp;
-	bool free_parent;
+	struct allowedips_node *node, *tmp;
 
 	if (list_empty(&peer->allowedips_list))
 		return;
 	++table->seq;
 	list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) {
-		list_del_init(&node->peer_list);
-		RCU_INIT_POINTER(node->peer, NULL);
-		if (node->bit[0] && node->bit[1])
-			continue;
-		child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])],
-						  lockdep_is_held(lock));
-		if (child)
-			child->parent_bit_packed = node->parent_bit_packed;
-		parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL);
-		*parent_bit = child;
-		parent = (void *)parent_bit -
-			 offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]);
-		free_parent = !rcu_access_pointer(node->bit[0]) &&
-			      !rcu_access_pointer(node->bit[1]) &&
-			      (node->parent_bit_packed & 3) <= 1 &&
-			      !rcu_access_pointer(parent->peer);
-		if (free_parent)
-			child = rcu_dereference_protected(
-					parent->bit[!(node->parent_bit_packed & 1)],
-					lockdep_is_held(lock));
-		call_rcu(&node->rcu, node_free_rcu);
-		if (!free_parent)
-			continue;
-		if (child)
-			child->parent_bit_packed = parent->parent_bit_packed;
-		*(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child;
-		call_rcu(&parent->rcu, node_free_rcu);
+		_remove(node, lock);
 	}
 }
 
diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h
index 2346c797eb4d8..931958cb6e100 100644
--- a/drivers/net/wireguard/allowedips.h
+++ b/drivers/net/wireguard/allowedips.h
@@ -38,6 +38,10 @@  int wg_allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip,
 			    u8 cidr, struct wg_peer *peer, struct mutex *lock);
 int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip,
 			    u8 cidr, struct wg_peer *peer, struct mutex *lock);
+int wg_allowedips_remove_v4(struct allowedips *table, const struct in_addr *ip,
+			    u8 cidr, struct wg_peer *peer, struct mutex *lock);
+int wg_allowedips_remove_v6(struct allowedips *table, const struct in6_addr *ip,
+			    u8 cidr, struct wg_peer *peer, struct mutex *lock);
 void wg_allowedips_remove_by_peer(struct allowedips *table,
 				  struct wg_peer *peer, struct mutex *lock);
 /* The ip input pointer should be __aligned(__alignof(u64))) */
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index f7055180ba4aa..5f2a8553ab43d 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -46,7 +46,8 @@  static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = {
 static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = {
 	[WGALLOWEDIP_A_FAMILY]		= { .type = NLA_U16 },
 	[WGALLOWEDIP_A_IPADDR]		= NLA_POLICY_MIN_LEN(sizeof(struct in_addr)),
-	[WGALLOWEDIP_A_CIDR_MASK]	= { .type = NLA_U8 }
+	[WGALLOWEDIP_A_CIDR_MASK]	= { .type = NLA_U8 },
+	[WGALLOWEDIP_A_FLAGS]		= { .type = NLA_U32 }
 };
 
 static struct wg_device *lookup_interface(struct nlattr **attrs,
@@ -329,6 +330,7 @@  static int set_port(struct wg_device *wg, u16 port)
 static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs)
 {
 	int ret = -EINVAL;
+	u32 flags = 0;
 	u16 family;
 	u8 cidr;
 
@@ -337,19 +339,38 @@  static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs)
 		return ret;
 	family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]);
 	cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]);
+	if (attrs[WGALLOWEDIP_A_FLAGS])
+		flags = nla_get_u32(attrs[WGALLOWEDIP_A_FLAGS]);
 
 	if (family == AF_INET && cidr <= 32 &&
-	    nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr))
-		ret = wg_allowedips_insert_v4(
-			&peer->device->peer_allowedips,
-			nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
-			&peer->device->device_update_lock);
-	else if (family == AF_INET6 && cidr <= 128 &&
-		 nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr))
-		ret = wg_allowedips_insert_v6(
-			&peer->device->peer_allowedips,
-			nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
-			&peer->device->device_update_lock);
+	    nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) {
+		if (flags & WGALLOWEDIP_F_REMOVE_ME)
+			ret = wg_allowedips_remove_v4(&peer->device->peer_allowedips,
+						      nla_data(attrs[WGALLOWEDIP_A_IPADDR]),
+						      cidr,
+						      peer,
+						      &peer->device->device_update_lock);
+		else
+			ret = wg_allowedips_insert_v4(&peer->device->peer_allowedips,
+						      nla_data(attrs[WGALLOWEDIP_A_IPADDR]),
+						      cidr,
+						      peer,
+						      &peer->device->device_update_lock);
+	} else if (family == AF_INET6 && cidr <= 128 &&
+		   nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) {
+		if (flags & WGALLOWEDIP_F_REMOVE_ME)
+			ret = wg_allowedips_remove_v6(&peer->device->peer_allowedips,
+						      nla_data(attrs[WGALLOWEDIP_A_IPADDR]),
+						      cidr,
+						      peer,
+						      &peer->device->device_update_lock);
+		else
+			ret = wg_allowedips_insert_v6(&peer->device->peer_allowedips,
+						      nla_data(attrs[WGALLOWEDIP_A_IPADDR]),
+						      cidr,
+						      peer,
+						      &peer->device->device_update_lock);
+	}
 
 	return ret;
 }
diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c
index 3d1f64ff2e122..9f6458a889e96 100644
--- a/drivers/net/wireguard/selftest/allowedips.c
+++ b/drivers/net/wireguard/selftest/allowedips.c
@@ -461,6 +461,10 @@  static __init struct wg_peer *init_peer(void)
 	wg_allowedips_insert_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \
 					cidr, mem, &mutex)
 
+#define remove(version, mem, ipa, ipb, ipc, ipd, cidr)                      \
+	wg_allowedips_remove_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \
+					cidr, mem, &mutex)
+
 #define maybe_fail() do {                                               \
 		++i;                                                    \
 		if (!_s) {                                              \
@@ -586,6 +590,32 @@  bool __init wg_allowedips_selftest(void)
 	test_negative(4, a, 192, 0, 0, 0);
 	test_negative(4, a, 255, 0, 0, 0);
 
+	insert(4, a, 1, 0, 0, 0, 32);
+	insert(4, a, 192, 0, 0, 0, 24);
+	insert(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128);
+	insert(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98);
+	test(4, a, 1, 0, 0, 0);
+	test(4, a, 192, 0, 0, 1);
+	test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef);
+	test(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010);
+	/* Must be an exact match to remove */
+	remove(4, a, 192, 0, 0, 0, 32);
+	test(4, a, 192, 0, 0, 1);
+	remove(4, a, 192, 0, 0, 0, 24);
+	test_negative(4, a, 192, 0, 0, 1);
+	remove(4, a, 1, 0, 0, 0, 32);
+	test_negative(4, a, 1, 0, 0, 0);
+	/* Must be an exact match to remove */
+	remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 96);
+	test(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef);
+	remove(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128);
+	test_negative(6, a, 0x24446801, 0x40e40800, 0xdeaebeef, 0xdefbeef);
+	/* Must match the peer to remove */
+	remove(6, b, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98);
+	test(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010);
+	remove(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98);
+	test_negative(6, a, 0x24446800, 0xf0e40800, 0xeeaebeef, 0x10101010);
+
 	wg_allowedips_free(&t, &mutex);
 	wg_allowedips_init(&t);
 	insert(4, a, 192, 168, 0, 0, 16);
diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h
index ae88be14c9478..e219194cb9f5a 100644
--- a/include/uapi/linux/wireguard.h
+++ b/include/uapi/linux/wireguard.h
@@ -101,6 +101,10 @@ 
  *                    WGALLOWEDIP_A_FAMILY: NLA_U16
  *                    WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr
  *                    WGALLOWEDIP_A_CIDR_MASK: NLA_U8
+ *                    WGALLOWEDIP_A_FLAGS: NLA_U32, WGALLOWEDIP_F_REMOVE_ME if
+ *                                         the specified IP should be removed,
+ *                                         otherwise this IP will be added if
+ *                                         it is not already present.
  *                0: NLA_NESTED
  *                    ...
  *                0: NLA_NESTED
@@ -132,7 +136,7 @@ 
 #define _WG_UAPI_WIREGUARD_H
 
 #define WG_GENL_NAME "wireguard"
-#define WG_GENL_VERSION 1
+#define WG_GENL_VERSION 2
 
 #define WG_KEY_LEN 32
 
@@ -184,11 +188,16 @@  enum wgpeer_attribute {
 };
 #define WGPEER_A_MAX (__WGPEER_A_LAST - 1)
 
+enum wgallowedip_flag {
+	WGALLOWEDIP_F_REMOVE_ME = 1U << 0,
+	__WGALLOWEDIP_F_ALL = WGALLOWEDIP_F_REMOVE_ME
+};
 enum wgallowedip_attribute {
 	WGALLOWEDIP_A_UNSPEC,
 	WGALLOWEDIP_A_FAMILY,
 	WGALLOWEDIP_A_IPADDR,
 	WGALLOWEDIP_A_CIDR_MASK,
+	WGALLOWEDIP_A_FLAGS,
 	__WGALLOWEDIP_A_LAST
 };
 #define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1)
diff --git a/tools/testing/selftests/wireguard/Makefile b/tools/testing/selftests/wireguard/Makefile
new file mode 100644
index 0000000000000..4f4db54f89cb3
--- /dev/null
+++ b/tools/testing/selftests/wireguard/Makefile
@@ -0,0 +1,18 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Note: To build this you must install libnl-3 and libnl-genl-3 development
+# packages.
+remove-ip:
+	gcc -I/usr/include/libnl3 \
+	    -I../../../../usr/include \
+	    remove-ip.c \
+	    -o remove-ip \
+	    -lnl-genl-3 \
+	    -lnl-3
+
+.PHONY: all
+all: remove-ip
+
+.PHONY: clean
+clean:
+	rm remove-ip
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
index 405ff262ca93d..70058d6ebbe85 100755
--- a/tools/testing/selftests/wireguard/netns.sh
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -28,6 +28,7 @@  exec 3>&1
 export LANG=C
 export WG_HIDE_KEYS=never
 NPROC=( /sys/devices/system/cpu/cpu+([0-9]) ); NPROC=${#NPROC[@]}
+SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
 netns0="wg-test-$$-0"
 netns1="wg-test-$$-1"
 netns2="wg-test-$$-2"
@@ -610,6 +611,43 @@  n0 wg set wg0 peer "$pub2" allowed-ips "$allowedips"
 } < <(n0 wg show wg0 allowed-ips)
 ip0 link del wg0
 
+# Test IP removal
+allowedips=( )
+for i in {1..197}; do
+        allowedips+=( 192.168.0.$i )
+        allowedips+=( abcd::$i )
+done
+saved_ifs="$IFS"
+IFS=,
+allowedips="${allowedips[*]}"
+IFS="$saved_ifs"
+ip0 link add wg0 type wireguard
+n0 wg set wg0 peer "$pub1" allowed-ips "$allowedips"
+pub1_hex=$(echo "$pub1" | base64 -d | xxd -p -c 50)
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 4 192.168.0.1
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 4 192.168.0.20
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 4 192.168.0.100
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 6 abcd::1
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 6 abcd::20
+n0 $SCRIPT_DIR/remove-ip wg0 "$pub1_hex" 6 abcd::100
+n0 wg show wg0 allowed-ips
+{
+	read -r pub allowedips
+	[[ $pub == "$pub1" ]]
+	i=0
+	for ip in $allowedips; do
+		[[ "$ip" != "192.168.0.1" ]]
+		[[ "$ip" != "192.168.0.20" ]]
+		[[ "$ip" != "192.168.0.100" ]]
+		[[ "$ip" != "abcd::1" ]]
+		[[ "$ip" != "abcd::20" ]]
+		[[ "$ip" != "abcd::100" ]]
+		((++i))
+	done
+	((i == 388))
+} < <(n0 wg show wg0 allowed-ips)
+ip0 link del wg0
+
 ! n0 wg show doesnotexist || false
 
 ip0 link add wg0 type wireguard
diff --git a/tools/testing/selftests/wireguard/remove-ip.c b/tools/testing/selftests/wireguard/remove-ip.c
new file mode 100644
index 0000000000000..242f922d99b56
--- /dev/null
+++ b/tools/testing/selftests/wireguard/remove-ip.c
@@ -0,0 +1,126 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/wireguard.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <netlink/socket.h>
+#include <netlink/netlink.h>
+#include <netlink/genl/ctrl.h>
+#include <netlink/genl/genl.h>
+#include <netlink/genl/family.h>
+
+#define CURVE25519_KEY_SIZE 32
+
+const char *usage = "Usage: remove-ip INTERFACE_NAME PEER_PUBLIC_KEY_HEX IP_VERSION IP";
+
+char h2b(char c)
+{
+	if ('0' <= c && c <= '9')
+		return c - '0';
+	else if ('a' <= c && c <= 'f')
+		return 10 + (c - 'a');
+
+	return -1;
+}
+
+int parse_key(const char *raw, unsigned char key[CURVE25519_KEY_SIZE])
+{
+	int ret = 0;
+	int i;
+
+	for (i = 0; i < CURVE25519_KEY_SIZE; i++) {
+		char h, l;
+
+		h = h2b(raw[0]);
+		if (h < 0)
+			return -1;
+
+		l = h2b(raw[1]);
+		if (l < 0)
+			return -1;
+
+		key[i] = (h << 4) | l;
+		raw += 2;
+	}
+
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	unsigned char addr[sizeof(struct in6_addr)];
+	unsigned char pub_key[CURVE25519_KEY_SIZE];
+	struct nl_sock *sock;
+	struct nl_msg *msg;
+	int addr_len;
+	int family;
+	int cidr;
+	int af;
+
+	if (argc < 5) {
+		printf("Not enough arguments.\n\n%s\n", usage);
+		return -1;
+	}
+
+	if (parse_key(argv[2], pub_key)) {
+		printf("Could not parse public key\n");
+		return -1;
+	}
+
+	switch (argv[3][0]) {
+	case '4':
+		af = AF_INET;
+		addr_len = sizeof(struct in_addr);
+		cidr = 32;
+		break;
+	case '6':
+		af = AF_INET6;
+		addr_len = sizeof(struct in6_addr);
+		cidr = 128;
+		break;
+	default:
+		printf("Invalid IP version\n");
+		return -1;
+	}
+
+	if (inet_pton(af, argv[4], &addr) <= 0) {
+		printf("Could not parse IP address\n");
+		return -1;
+	}
+
+	sock = nl_socket_alloc();
+	genl_connect(sock);
+	family = genl_ctrl_resolve(sock, WG_GENL_NAME);
+	msg = nlmsg_alloc();
+	genlmsg_put(msg, NL_AUTO_PID, NL_AUTO_SEQ, family, 0, NLM_F_ECHO,
+		    WG_CMD_SET_DEVICE, WG_GENL_VERSION);
+	nla_put_string(msg, WGDEVICE_A_IFNAME, argv[1]);
+
+	struct nlattr *peers = nla_nest_start(msg, WGDEVICE_A_PEERS);
+	struct nlattr *peer0 = nla_nest_start(msg, 0);
+
+	nla_put(msg, WGPEER_A_PUBLIC_KEY, CURVE25519_KEY_SIZE, pub_key);
+
+	struct nlattr *allowed_ips = nla_nest_start(msg, WGPEER_A_ALLOWEDIPS);
+	struct nlattr *allowed_ip0 = nla_nest_start(msg, 0);
+
+	nla_put_u16(msg, WGALLOWEDIP_A_FAMILY, af);
+	nla_put(msg, WGALLOWEDIP_A_IPADDR, addr_len, &addr);
+	nla_put_u8(msg, WGALLOWEDIP_A_CIDR_MASK, cidr);
+	nla_put_u32(msg, WGALLOWEDIP_A_FLAGS, WGALLOWEDIP_F_REMOVE_ME);
+	nla_nest_end(msg, allowed_ip0);
+	nla_nest_end(msg, allowed_ips);
+	nla_nest_end(msg, peer0);
+	nla_nest_end(msg, peers);
+
+	int err = nl_send_sync(sock, msg);
+
+	if (err < 0) {
+		char message[256];
+
+		nl_perror(err, message);
+		printf("An error occurred: %d - %s\n", err, message);
+	}
+
+	return err;
+}