Message ID | 20190816192703.12445-2-denkenz@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Johannes Berg |
Headers | show |
Series | [RFCv2,1/4] nl80211: Fix broken non-split wiphy dumps | expand |
On Fri, 2019-08-16 at 14:27 -0500, Denis Kenzior wrote: > For historical reasons, NEW_WIPHY messages generated by dumps or > GET_WIPHY commands were limited to 4096 bytes due to userspace tools > using limited buffers. I think now that I've figured out why, it'd be good to note that it wasn't due to userspace tools, but rather due to the default netlink dump skb allocation at the time, prior to commit 9063e21fb026 ("netlink: autosize skb lengthes"). > Once the sizes NEW_WIPHY messages exceeded these > sizes, split dumps were introduced. All any non-legacy data was added > only to messages using split-dumps (including filtered dumps). > > However, split-dumping has quite a significant overhead. On cards > tested, split dumps generated message sizes 1.7-1.8x compared to > non-split dumps, while still comfortably fitting into an 8k buffer. The > kernel now expects userspace to provide 16k buffers by default, and 32k > buffers are possible. > > Introduce a concept of a large message, so that if the kernel detects > that userspace has provided a buffer of sufficient size, a non-split > message could be generated. So, there's still a wrinkle with this. Larger SKB allocation can fail, and instead of returning an error to userspace, the kernel will allocate a smaller SKB instead. With genetlink, we currently don't even have a way of controlling the minimum allocation that's always required. Since we already have basically all of the mechanics, I'd say perhaps a better concept would be to "split when necessary", aborting if split isn't supported. IOW, do something like ... nl80211_send_wiphy(...) { [...] switch (state->split_start) { [...] case <N>: [...] // put stuff state->split_start++; state->skb_end = nlmsg_get_pos(skb); /* fall through */ case <N+1>: [...] } finish: genlmsg_end(msg, hdr); return 0; nla_put_failure: if (state->split_start < 9) { genlmsg_cancel(msg, hdr); return -EMSGSIZE; } nlmsg_trim(msg, state->skb_end); goto finish; } That way, we fill each SKB as much as possible, up to 32k if userspace provided big enough buffers *and* we could allocate the SKB. Your userspace would still set the split flag, and thus be compatible with all kinds of options: * really old kernel not supporting split * older kernel sending many messages * kernel after this change packing more into one message * even if allocating big SKBs failed johannes
Hi Johannes, On 8/30/19 4:36 AM, Johannes Berg wrote: > On Fri, 2019-08-16 at 14:27 -0500, Denis Kenzior wrote: >> For historical reasons, NEW_WIPHY messages generated by dumps or >> GET_WIPHY commands were limited to 4096 bytes due to userspace tools >> using limited buffers. > > I think now that I've figured out why, it'd be good to note that it > wasn't due to userspace tools, but rather due to the default netlink > dump skb allocation at the time, prior to commit 9063e21fb026 > ("netlink: autosize skb lengthes"). > Sure, will take care of it. >> Once the sizes NEW_WIPHY messages exceeded these >> sizes, split dumps were introduced. All any non-legacy data was added >> only to messages using split-dumps (including filtered dumps). >> >> However, split-dumping has quite a significant overhead. On cards >> tested, split dumps generated message sizes 1.7-1.8x compared to >> non-split dumps, while still comfortably fitting into an 8k buffer. The >> kernel now expects userspace to provide 16k buffers by default, and 32k >> buffers are possible. >> >> Introduce a concept of a large message, so that if the kernel detects >> that userspace has provided a buffer of sufficient size, a non-split >> message could be generated. > > So, there's still a wrinkle with this. Larger SKB allocation can fail, > and instead of returning an error to userspace, the kernel will allocate > a smaller SKB instead. > > With genetlink, we currently don't even have a way of controlling the > minimum allocation that's always required. > > Since we already have basically all of the mechanics, I'd say perhaps a > better concept would be to "split when necessary", aborting if split > isn't supported. > > IOW, do something like > > ... nl80211_send_wiphy(...) > { > [...] > > switch (state->split_start) { > [...] > case <N>: > [...] // put stuff > state->split_start++; > state->skb_end = nlmsg_get_pos(skb); > /* fall through */ > case <N+1>: > [...] > } > > finish: > genlmsg_end(msg, hdr); > return 0; > nla_put_failure: > if (state->split_start < 9) { > genlmsg_cancel(msg, hdr); > return -EMSGSIZE; > } > nlmsg_trim(msg, state->skb_end); > goto finish; > } > > > That way, we fill each SKB as much as possible, up to 32k if userspace > provided big enough buffers *and* we could allocate the SKB. > > > Your userspace would still set the split flag, and thus be compatible > with all kinds of options: > * really old kernel not supporting split > * older kernel sending many messages > * kernel after this change packing more into one message > * even if allocating big SKBs failed > What I was thinking was to attempt to build a large message first and if that fails to fail over to the old logic. But I think what you propose is even better. I'll incorporate this feedback into the next version. Regards, -Denis
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index b9b0199b5ec6..682a095415eb 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -1839,6 +1839,7 @@ struct nl80211_dump_wiphy_state { long start; long split_start, band_start, chan_start, capa_start; bool split; + bool large_message; }; static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, @@ -2027,7 +2028,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, if (nl80211_msg_put_channel( msg, &rdev->wiphy, chan, - state->split)) + state->split || + state->large_message)) goto nla_put_failure; nla_nest_end(msg, nl_freq); @@ -2072,7 +2074,7 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, i = nl80211_add_commands_unsplit(rdev, msg); if (i < 0) goto nla_put_failure; - if (state->split) { + if (state->split || state->large_message) { CMD(crit_proto_start, CRIT_PROTOCOL_START); CMD(crit_proto_stop, CRIT_PROTOCOL_STOP); if (rdev->wiphy.flags & WIPHY_FLAG_HAS_CHANNEL_SWITCH) @@ -2111,7 +2113,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, /* fall through */ case 6: #ifdef CONFIG_PM - if (nl80211_send_wowlan(msg, rdev, state->split)) + if (nl80211_send_wowlan(msg, rdev, + state->split || state->large_message)) goto nla_put_failure; state->split_start++; if (state->split) @@ -2126,7 +2129,8 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, goto nla_put_failure; if (nl80211_put_iface_combinations(&rdev->wiphy, msg, - state->split)) + state->split || + state->large_message)) goto nla_put_failure; state->split_start++; @@ -2145,7 +2149,7 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, * dump is split, otherwise it makes it too big. Therefore * only advertise it in that case. */ - if (state->split) + if (state->split || state->large_message) features |= NL80211_FEATURE_ADVERTISE_CHAN_LIMITS; if (nla_put_u32(msg, NL80211_ATTR_FEATURE_FLAGS, features)) goto nla_put_failure; @@ -2170,13 +2174,20 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, * * We still increment split_start so that in the split * case we'll continue with more data in the next round, - * but break unconditionally so unsplit data stops here. + * but break unless large_messages are requested, so + * legacy unsplit data stops here. */ state->split_start++; - if (!state->split) + if (state->split) + break; + + if (!state->large_message) { state->split_start = 0; - break; + break; + } + + /* Fall through */ case 9: if (rdev->wiphy.extended_capabilities && (nla_put(msg, NL80211_ATTR_EXT_CAPA, @@ -2218,7 +2229,9 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, } state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 10: if (nl80211_send_coalesce(msg, rdev)) goto nla_put_failure; @@ -2234,7 +2247,9 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, goto nla_put_failure; state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 11: if (rdev->wiphy.n_vendor_commands) { const struct nl80211_vendor_cmd_info *info; @@ -2270,7 +2285,9 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, nla_nest_end(msg, nested); } state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 12: if (rdev->wiphy.flags & WIPHY_FLAG_HAS_CHANNEL_SWITCH && nla_put_u8(msg, NL80211_ATTR_MAX_CSA_COUNTERS, @@ -2312,7 +2329,9 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, } state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 13: if (rdev->wiphy.num_iftype_ext_capab && rdev->wiphy.iftype_ext_capab) { @@ -2380,13 +2399,17 @@ static int nl80211_send_wiphy(struct cfg80211_registered_device *rdev, } state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 14: if (nl80211_send_pmsr_capa(rdev, msg)) goto nla_put_failure; state->split_start++; - break; + if (state->split) + break; + /* Fall through */ case 15: if (rdev->wiphy.akm_suites && nla_put(msg, NL80211_ATTR_AKM_SUITES,
For historical reasons, NEW_WIPHY messages generated by dumps or GET_WIPHY commands were limited to 4096 bytes due to userspace tools using limited buffers. Once the sizes NEW_WIPHY messages exceeded these sizes, split dumps were introduced. All any non-legacy data was added only to messages using split-dumps (including filtered dumps). However, split-dumping has quite a significant overhead. On cards tested, split dumps generated message sizes 1.7-1.8x compared to non-split dumps, while still comfortably fitting into an 8k buffer. The kernel now expects userspace to provide 16k buffers by default, and 32k buffers are possible. Introduce a concept of a large message, so that if the kernel detects that userspace has provided a buffer of sufficient size, a non-split message could be generated. Signed-off-by: Denis Kenzior <denkenz@gmail.com> --- net/wireless/nl80211.c | 51 ++++++++++++++++++++++++++++++------------ 1 file changed, 37 insertions(+), 14 deletions(-)