diff mbox series

[RFCv2,2/4] nl80211: Support >4096 byte NEW_WIPHY event nlmsg

Message ID 20190816192703.12445-2-denkenz@gmail.com (mailing list archive)
State RFC
Delegated to: Johannes Berg
Headers show
Series [RFCv2,1/4] nl80211: Fix broken non-split wiphy dumps | expand

Commit Message

Denis Kenzior Aug. 16, 2019, 7:27 p.m. UTC
For historical reasons, NEW_WIPHY messages generated by dumps or
GET_WIPHY commands were limited to 4096 bytes due to userspace tools
using limited buffers.  Once the sizes NEW_WIPHY messages exceeded these
sizes, split dumps were introduced.  All any non-legacy data was added
only to messages using split-dumps (including filtered dumps).

However, split-dumping has quite a significant overhead.  On cards
tested, split dumps generated message sizes 1.7-1.8x compared to
non-split dumps, while still comfortably fitting into an 8k buffer.  The
kernel now expects userspace to provide 16k buffers by default, and 32k
buffers are possible.

Introduce a concept of a large message, so that if the kernel detects
that userspace has provided a buffer of sufficient size, a non-split
message could be generated.

Signed-off-by: Denis Kenzior <denkenz@gmail.com>
---
 net/wireless/nl80211.c | 51 ++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 14 deletions(-)

Comments

Johannes Berg Aug. 30, 2019, 9:36 a.m. UTC | #1
On Fri, 2019-08-16 at 14:27 -0500, Denis Kenzior wrote:
> For historical reasons, NEW_WIPHY messages generated by dumps or
> GET_WIPHY commands were limited to 4096 bytes due to userspace tools
> using limited buffers.

I think now that I've figured out why, it'd be good to note that it
wasn't due to userspace tools, but rather due to the default netlink
dump skb allocation at the time, prior to commit  9063e21fb026
("netlink: autosize skb lengthes").

> Once the sizes NEW_WIPHY messages exceeded these
> sizes, split dumps were introduced.  All any non-legacy data was added
> only to messages using split-dumps (including filtered dumps).
> 
> However, split-dumping has quite a significant overhead.  On cards
> tested, split dumps generated message sizes 1.7-1.8x compared to
> non-split dumps, while still comfortably fitting into an 8k buffer.  The
> kernel now expects userspace to provide 16k buffers by default, and 32k
> buffers are possible.
> 
> Introduce a concept of a large message, so that if the kernel detects
> that userspace has provided a buffer of sufficient size, a non-split
> message could be generated.

So, there's still a wrinkle with this. Larger SKB allocation can fail,
and instead of returning an error to userspace, the kernel will allocate
a smaller SKB instead.

With genetlink, we currently don't even have a way of controlling the
minimum allocation that's always required.

Since we already have basically all of the mechanics, I'd say perhaps a
better concept would be to "split when necessary", aborting if split
isn't supported.

IOW, do something like

... nl80211_send_wiphy(...)
{
[...]

switch (state->split_start) {
[...]
case <N>:
	[...] // put stuff
	state->split_start++;
	state->skb_end = nlmsg_get_pos(skb);
	/* fall through */
case <N+1>:
[...]
}

finish:
	genlmsg_end(msg, hdr);
	return 0;
nla_put_failure:
	if (state->split_start < 9) {
		genlmsg_cancel(msg, hdr);
		return -EMSGSIZE;
	}
	nlmsg_trim(msg, state->skb_end);
	goto finish;
}


That way, we fill each SKB as much as possible, up to 32k if userspace
provided big enough buffers *and* we could allocate the SKB.


Your userspace would still set the split flag, and thus be compatible
with all kinds of options:
 * really old kernel not supporting split
 * older kernel sending many messages
 * kernel after this change packing more into one message
 * even if allocating big SKBs failed

johannes
Denis Kenzior Aug. 30, 2019, 7:56 p.m. UTC | #2
Hi Johannes,

On 8/30/19 4:36 AM, Johannes Berg wrote:
> On Fri, 2019-08-16 at 14:27 -0500, Denis Kenzior wrote:
>> For historical reasons, NEW_WIPHY messages generated by dumps or
>> GET_WIPHY commands were limited to 4096 bytes due to userspace tools
>> using limited buffers.
> 
> I think now that I've figured out why, it'd be good to note that it
> wasn't due to userspace tools, but rather due to the default netlink
> dump skb allocation at the time, prior to commit  9063e21fb026
> ("netlink: autosize skb lengthes").
> 

Sure, will take care of it.

>> Once the sizes NEW_WIPHY messages exceeded these
>> sizes, split dumps were introduced.  All any non-legacy data was added
>> only to messages using split-dumps (including filtered dumps).
>>
>> However, split-dumping has quite a significant overhead.  On cards
>> tested, split dumps generated message sizes 1.7-1.8x compared to
>> non-split dumps, while still comfortably fitting into an 8k buffer.  The
>> kernel now expects userspace to provide 16k buffers by default, and 32k
>> buffers are possible.
>>
>> Introduce a concept of a large message, so that if the kernel detects
>> that userspace has provided a buffer of sufficient size, a non-split
>> message could be generated.
> 
> So, there's still a wrinkle with this. Larger SKB allocation can fail,
> and instead of returning an error to userspace, the kernel will allocate
> a smaller SKB instead.
> 
> With genetlink, we currently don't even have a way of controlling the
> minimum allocation that's always required.
> 
> Since we already have basically all of the mechanics, I'd say perhaps a
> better concept would be to "split when necessary", aborting if split
> isn't supported.
> 
> IOW, do something like
> 
> ... nl80211_send_wiphy(...)
> {
> [...]
> 
> switch (state->split_start) {
> [...]
> case <N>:
> 	[...] // put stuff
> 	state->split_start++;
> 	state->skb_end = nlmsg_get_pos(skb);
> 	/* fall through */
> case <N+1>:
> [...]
> }
> 
> finish:
> 	genlmsg_end(msg, hdr);
> 	return 0;
> nla_put_failure:
> 	if (state->split_start < 9) {
> 		genlmsg_cancel(msg, hdr);
> 		return -EMSGSIZE;
> 	}
> 	nlmsg_trim(msg, state->skb_end);
> 	goto finish;
> }
> 
> 
> That way, we fill each SKB as much as possible, up to 32k if userspace
> provided big enough buffers *and* we could allocate the SKB.
> 
> 
> Your userspace would still set the split flag, and thus be compatible
> with all kinds of options:
>   * really old kernel not supporting split
>   * older kernel sending many messages
>   * kernel after this change packing more into one message
>   * even if allocating big SKBs failed
> 

What I was thinking was to attempt to build a large message first and if 
that fails to fail over to the old logic.  But I think what you propose 
is even better.  I'll incorporate this feedback into the next version.

Regards,
-Denis
diff mbox series

Patch

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index b9b0199b5ec6..682a095415eb 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -1839,6 +1839,7 @@  struct nl80211_dump_wiphy_state {
 	long start;
 	long split_start, band_start, chan_start, capa_start;
 	bool split;
+	bool large_message;
 };
 
 static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
@@ -2027,7 +2028,8 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 
 					if (nl80211_msg_put_channel(
 							msg, &rdev->wiphy, chan,
-							state->split))
+							state->split ||
+							state->large_message))
 						goto nla_put_failure;
 
 					nla_nest_end(msg, nl_freq);
@@ -2072,7 +2074,7 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		i = nl80211_add_commands_unsplit(rdev, msg);
 		if (i < 0)
 			goto nla_put_failure;
-		if (state->split) {
+		if (state->split || state->large_message) {
 			CMD(crit_proto_start, CRIT_PROTOCOL_START);
 			CMD(crit_proto_stop, CRIT_PROTOCOL_STOP);
 			if (rdev->wiphy.flags & WIPHY_FLAG_HAS_CHANNEL_SWITCH)
@@ -2111,7 +2113,8 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		/* fall through */
 	case 6:
 #ifdef CONFIG_PM
-		if (nl80211_send_wowlan(msg, rdev, state->split))
+		if (nl80211_send_wowlan(msg, rdev,
+					state->split || state->large_message))
 			goto nla_put_failure;
 		state->split_start++;
 		if (state->split)
@@ -2126,7 +2129,8 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 			goto nla_put_failure;
 
 		if (nl80211_put_iface_combinations(&rdev->wiphy, msg,
-						   state->split))
+						   state->split ||
+						   state->large_message))
 			goto nla_put_failure;
 
 		state->split_start++;
@@ -2145,7 +2149,7 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		 * dump is split, otherwise it makes it too big. Therefore
 		 * only advertise it in that case.
 		 */
-		if (state->split)
+		if (state->split || state->large_message)
 			features |= NL80211_FEATURE_ADVERTISE_CHAN_LIMITS;
 		if (nla_put_u32(msg, NL80211_ATTR_FEATURE_FLAGS, features))
 			goto nla_put_failure;
@@ -2170,13 +2174,20 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		 *
 		 * We still increment split_start so that in the split
 		 * case we'll continue with more data in the next round,
-		 * but break unconditionally so unsplit data stops here.
+		 * but break unless large_messages are requested, so
+		 * legacy unsplit data stops here.
 		 */
 		state->split_start++;
 
-		if (!state->split)
+		if (state->split)
+			break;
+
+		if (!state->large_message) {
 			state->split_start = 0;
-		break;
+			break;
+		}
+
+		/* Fall through */
 	case 9:
 		if (rdev->wiphy.extended_capabilities &&
 		    (nla_put(msg, NL80211_ATTR_EXT_CAPA,
@@ -2218,7 +2229,9 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		}
 
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 10:
 		if (nl80211_send_coalesce(msg, rdev))
 			goto nla_put_failure;
@@ -2234,7 +2247,9 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 			goto nla_put_failure;
 
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 11:
 		if (rdev->wiphy.n_vendor_commands) {
 			const struct nl80211_vendor_cmd_info *info;
@@ -2270,7 +2285,9 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 			nla_nest_end(msg, nested);
 		}
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 12:
 		if (rdev->wiphy.flags & WIPHY_FLAG_HAS_CHANNEL_SWITCH &&
 		    nla_put_u8(msg, NL80211_ATTR_MAX_CSA_COUNTERS,
@@ -2312,7 +2329,9 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		}
 
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 13:
 		if (rdev->wiphy.num_iftype_ext_capab &&
 		    rdev->wiphy.iftype_ext_capab) {
@@ -2380,13 +2399,17 @@  static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev,
 		}
 
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 14:
 		if (nl80211_send_pmsr_capa(rdev, msg))
 			goto nla_put_failure;
 
 		state->split_start++;
-		break;
+		if (state->split)
+			break;
+		/* Fall through */
 	case 15:
 		if (rdev->wiphy.akm_suites &&
 		    nla_put(msg, NL80211_ATTR_AKM_SUITES,