Message ID | 20241009005525.13651-5-jdamato@fastly.com (mailing list archive)
---|---
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | Add support for per-NAPI config via netlink
On Wed, 9 Oct 2024 00:54:58 +0000 Joe Damato wrote:
> +        name: gro-flush-timeout
> +        doc: The timeout, in nanoseconds, of when to trigger the NAPI
> +          watchdog timer and schedule NAPI processing.

You gotta respin because we reformatted the cacheline info.

So while at it perhaps throw in a sentence here about the GRO effects?
The initial use of GRO flush timeout was to hold incomplete GRO
super-frames in the GRO engine across NAPI cycles.
On Wed, Oct 09, 2024 at 08:14:40PM -0700, Jakub Kicinski wrote:
> On Wed, 9 Oct 2024 00:54:58 +0000 Joe Damato wrote:
> > +        name: gro-flush-timeout
> > +        doc: The timeout, in nanoseconds, of when to trigger the NAPI
> > +          watchdog timer and schedule NAPI processing.
>
> You gotta respin because we reformatted the cacheline info.

Yea, I figured I'd be racing with that change and would need a
respin.

I'm not sure how the queue works exactly, but it looks like I might
also be racing with another change [1], I think.

I think I'm just over 24hr and could respin and resend now, but
should I wait longer in case [1] is merged before you see my
respin?

Just trying to figure out how to get the fewest number of respins
possible ;)

> So while at it perhaps throw in a sentence here about the GRO effects?
> The initial use of GRO flush timeout was to hold incomplete GRO
> super-frames in the GRO engine across NAPI cycles.

From my reading of the code, if the timeout is non-zero, then
napi_gro_flush will flush only "old" super-frames in
napi_complete_done.

If that's accurate (and maybe I missed something?), then how about:

  doc: The timeout, in nanoseconds, of when to trigger the NAPI
    watchdog timer which schedules NAPI processing. Additionally, a
    non-zero value will also prevent GRO from flushing recent
    super-frames at the end of a NAPI cycle. This may add receive
    latency in exchange for reducing the number of frames processed
    by the network stack.

LMK if that's accurate and sounds OK or if it's wrong / too verbose?

[1]: https://lore.kernel.org/netdev/20241009232728.107604-1-edumazet@google.com/T/#m3f11aae53b3244037ac641ef36985c5e85e2ed5e
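For reference, a condensed sketch of the behavior Joe describes, based on a reading of napi_complete_done() and __napi_gro_flush_chain() in net/core/dev.c. This is a simplified illustration, not verbatim kernel code: the *_sketch names are invented, the per-NAPI gro_flush_timeout field assumes the earlier patches in this series, and locking, busy-poll, and the defer-hard-irqs handling are omitted.

static void gro_flush_chain_sketch(struct napi_struct *n,
				   struct list_head *head, bool flush_old)
{
	struct sk_buff *skb, *p;

	/* Walk one GRO hash chain from oldest to newest super-frame. */
	list_for_each_entry_safe_reverse(skb, p, head, list) {
		/* A super-frame touched during the current jiffy is
		 * "recent": with flush_old set, it stays in the GRO
		 * engine so later segments can still be merged into it.
		 */
		if (flush_old && NAPI_GRO_CB(skb)->age == jiffies)
			return;
		skb_list_del_init(skb);
		napi_gro_complete(n, skb); /* hand it to the stack */
	}
}

static void napi_complete_done_sketch(struct napi_struct *n)
{
	unsigned long timeout = READ_ONCE(n->gro_flush_timeout);
	int i;

	/* A non-zero timeout arms the NAPI watchdog hrtimer, which
	 * re-schedules NAPI processing when it fires ...
	 */
	if (timeout)
		hrtimer_start(&n->timer, ns_to_ktime(timeout),
			      HRTIMER_MODE_REL_PINNED);

	/* ... and turns the end-of-cycle GRO flush into an "old only"
	 * flush: recent super-frames are held back, bounding the time
	 * packets sit in the GRO layer to roughly one timeout.
	 */
	for (i = 0; i < GRO_HASH_BUCKETS; i++)
		gro_flush_chain_sketch(n, &n->gro_hash[i].list, !!timeout);
}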
On Thu, Oct 10, 2024 at 6:34 AM Joe Damato <jdamato@fastly.com> wrote:
>
> On Wed, Oct 09, 2024 at 08:14:40PM -0700, Jakub Kicinski wrote:
> > On Wed, 9 Oct 2024 00:54:58 +0000 Joe Damato wrote:
> > > +        name: gro-flush-timeout
> > > +        doc: The timeout, in nanoseconds, of when to trigger the NAPI
> > > +          watchdog timer and schedule NAPI processing.
> >
> > You gotta respin because we reformatted the cacheline info.
>
> Yea, I figured I'd be racing with that change and would need a
> respin.
>
> I'm not sure how the queue works exactly, but it looks like I might
> also be racing with another change [1], I think.
>
> I think I'm just over 24hr and could respin and resend now, but
> should I wait longer in case [1] is merged before you see my
> respin?

I would avoid the rtnl_lock() addition in "netdev-genl: Support
setting per-NAPI config values" before re-sending ?

> Just trying to figure out how to get the fewest number of respins
> possible ;)
>
> > So while at it perhaps throw in a sentence here about the GRO effects?
> > The initial use of GRO flush timeout was to hold incomplete GRO
> > super-frames in the GRO engine across NAPI cycles.
>
> From my reading of the code, if the timeout is non-zero, then
> napi_gro_flush will flush only "old" super-frames in
> napi_complete_done.
>
> If that's accurate (and maybe I missed something?), then how about:
>
>   doc: The timeout, in nanoseconds, of when to trigger the NAPI
>     watchdog timer which schedules NAPI processing. Additionally, a
>     non-zero value will also prevent GRO from flushing recent
>     super-frames at the end of a NAPI cycle. This may add receive
>     latency in exchange for reducing the number of frames processed
>     by the network stack.

Note that linux TCP always has a PSH flag at the end of each TSO packet,
so the latency increase is only possible in presence of tail drop,
if the last MSS (with the PSH) was dropped.

> LMK if that's accurate and sounds OK or if it's wrong / too verbose?

I do not think it is too verbose.

> [1]: https://lore.kernel.org/netdev/20241009232728.107604-1-edumazet@google.com/T/#m3f11aae53b3244037ac641ef36985c5e85e2ed5e
On Thu, Oct 10, 2024 at 06:45:11AM +0200, Eric Dumazet wrote:
> On Thu, Oct 10, 2024 at 6:34 AM Joe Damato <jdamato@fastly.com> wrote:
> >
> > On Wed, Oct 09, 2024 at 08:14:40PM -0700, Jakub Kicinski wrote:
> > > On Wed, 9 Oct 2024 00:54:58 +0000 Joe Damato wrote:
> > > > +        name: gro-flush-timeout
> > > > +        doc: The timeout, in nanoseconds, of when to trigger the NAPI
> > > > +          watchdog timer and schedule NAPI processing.
> > >
> > > You gotta respin because we reformatted the cacheline info.
> >
> > Yea, I figured I'd be racing with that change and would need a
> > respin.
> >
> > I'm not sure how the queue works exactly, but it looks like I might
> > also be racing with another change [1], I think.
> >
> > I think I'm just over 24hr and could respin and resend now, but
> > should I wait longer in case [1] is merged before you see my
> > respin?
>
> I would avoid the rtnl_lock() addition in "netdev-genl: Support
> setting per-NAPI config values" before re-sending ?

OK.

> > Just trying to figure out how to get the fewest number of respins
> > possible ;)
> >
> > > So while at it perhaps throw in a sentence here about the GRO effects?
> > > The initial use of GRO flush timeout was to hold incomplete GRO
> > > super-frames in the GRO engine across NAPI cycles.
> >
> > From my reading of the code, if the timeout is non-zero, then
> > napi_gro_flush will flush only "old" super-frames in
> > napi_complete_done.
> >
> > If that's accurate (and maybe I missed something?), then how about:
> >
> >   doc: The timeout, in nanoseconds, of when to trigger the NAPI
> >     watchdog timer which schedules NAPI processing. Additionally, a
> >     non-zero value will also prevent GRO from flushing recent
> >     super-frames at the end of a NAPI cycle. This may add receive
> >     latency in exchange for reducing the number of frames processed
> >     by the network stack.
>
> Note that linux TCP always has a PSH flag at the end of each TSO packet,
> so the latency increase is only possible in presence of tail drop,
> if the last MSS (with the PSH) was dropped.

Would you like me to note that in the doc, as well?
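The PSH interaction Eric points out is visible at the tail of tcp_gro_receive() in net/ipv4/tcp_offload.c: a segment carrying PSH (as the final segment of a Linux TSO burst does) marks the super-frame complete, so it is flushed to the stack immediately rather than held across NAPI cycles. A lightly trimmed excerpt, from my reading of current mainline (the preceding merge logic is omitted):

out_check_final:
	/* A short (sub-MSS) payload or a segment carrying
	 * URG/PSH/RST/SYN/FIN ends aggregation for this flow.
	 */
	flush = len < mss;
	flush |= (__force int)(flags & (TCP_FLAG_URG | TCP_FLAG_PSH |
					TCP_FLAG_RST | TCP_FLAG_SYN |
					TCP_FLAG_FIN));

	if (p && (!NAPI_GRO_CB(skb)->same_flow || flush))
		pp = p; /* complete super-frame: flush it right away */

So the "recent super-frame" holdback only costs latency when that final PSH-marked segment is lost, which matches Eric's tail-drop caveat.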
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index 585e87ec3c16..bf13613eaa0d 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -255,6 +255,11 @@ attribute-sets:
         type: u32
         checks:
           max: s32-max
+      -
+        name: gro-flush-timeout
+        doc: The timeout, in nanoseconds, of when to trigger the NAPI
+          watchdog timer and schedule NAPI processing.
+        type: uint
   -
     name: queue
     attributes:
@@ -644,6 +649,7 @@ operations:
             - irq
             - pid
             - defer-hard-irqs
+            - gro-flush-timeout
       dump:
         request:
           attributes:
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 13dc0b027e86..cacd33359c76 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -123,6 +123,7 @@ enum {
 	NETDEV_A_NAPI_IRQ,
 	NETDEV_A_NAPI_PID,
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
+	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index de9bd76f43f8..64e5e4cee60d 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -161,6 +161,7 @@ static int
 netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
 			const struct genl_info *info)
 {
+	unsigned long gro_flush_timeout;
 	u32 napi_defer_hard_irqs;
 	void *hdr;
 	pid_t pid;
@@ -195,6 +196,11 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
 			napi_defer_hard_irqs))
 		goto nla_put_failure;
 
+	gro_flush_timeout = napi_get_gro_flush_timeout(napi);
+	if (nla_put_uint(rsp, NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
+			 gro_flush_timeout))
+		goto nla_put_failure;
+
 	genlmsg_end(rsp, hdr);
 
 	return 0;
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 13dc0b027e86..cacd33359c76 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -123,6 +123,7 @@ enum {
 	NETDEV_A_NAPI_IRQ,
 	NETDEV_A_NAPI_PID,
 	NETDEV_A_NAPI_DEFER_HARD_IRQS,
+	NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
 
 	__NETDEV_A_NAPI_MAX,
 	NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
Support dumping gro_flush_timeout for a NAPI ID.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 Documentation/netlink/specs/netdev.yaml | 6 ++++++
 include/uapi/linux/netdev.h             | 1 +
 net/core/netdev-genl.c                  | 6 ++++++
 tools/include/uapi/linux/netdev.h       | 1 +
 4 files changed, 14 insertions(+)
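For context, the napi_get_gro_flush_timeout() helper called in netdev_nl_napi_fill_one() above is introduced earlier in this series, when gro_flush_timeout moves from struct net_device to per-NAPI storage. Assuming that layout, the accessor amounts to a READ_ONCE() wrapper along these lines (a sketch of the idea, not the actual patch):

static inline unsigned long
napi_get_gro_flush_timeout(const struct napi_struct *n)
{
	/* READ_ONCE() pairs with WRITE_ONCE() on the update side
	 * (sysfs and the netlink set path added by this series),
	 * since readers may run without any pertinent lock held.
	 */
	return READ_ONCE(n->gro_flush_timeout);
}

With this patch applied, the value is reported in napi-get responses, e.g. when querying a NAPI ID with tools/net/ynl/cli.py against the Documentation/netlink/specs/netdev.yaml spec.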