mbox series

[v4,net-next,00/12] net: introduce TX H/W shaping API

Message ID cover.1724165948.git.pabeni@redhat.com (mailing list archive)
Headers show
Series net: introduce TX H/W shaping API | expand

Message

Paolo Abeni Aug. 20, 2024, 3:12 p.m. UTC
We have a plurality of shaping-related drivers API, but none flexible
enough to meet existing demand from vendors[1].

This series introduces new device APIs to configure in a flexible way
TX H/W shaping. The new functionalities are exposed via a newly
defined generic netlink interface and include introspection
capabilities. Some self-tests are included, on top of a dummy
netdevsim implementation, and a basic implementation for the iavf
driver.

Some usage examples:

* Configure shaping on a given queue:

./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \
	--do set --json '{"ifindex":'$IFINDEX',
			"shaper": {"handle":
				{"scope": "queue", "id":'$QUEUEID' },
			"bw-max": 2000000 }}'

* Container B/W sharing

The orchestration infrastructure wants to group the 
container-related queues under a RR scheduling and limit the aggregate
bandwidth:

./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \
	--do group --json '{"ifindex":'$IFINDEX', 
			"leaves": [ 
			  {"handle": {"scope": "queue", "id":'$QID1' },
			   "weight": '$W1'}, 
			  {"handle": {"scope": "queue", "id":'$QID2' },
			   "weight": '$W2'}], 
			  {"handle": {"scope": "queue", "id":'$QID3' },
			   "weight": '$W3'}], 
			"root": { "handle": {"scope":"node"},
			 	  "bw-max": 10000000}}'
{'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 0}}

Q1 \
    \
Q2 -- node 0 -------  netdev
    / (bw-max: 10M)
Q3 / 


* Delegation

A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
queues it owns - the starting configuration is the one from the
previous point:

SPEC=Documentation/netlink/specs/net_shaper.yaml
./tools/net/ynl/cli.py --spec $SPEC \
	--do group --json '{"ifindex":'$IFINDEX', 
			"leaves": [ 
			  {"handle": {"scope": "queue", "id":'$QID1' },
			   "weight": '$W1'}, 
			  {"handle": {"scope": "queue", "id":'$QID2' },
			   "weight": '$W2'}], 
			"root": { "handle": {"scope": "node"},
				  "parent": {"scope": "node", "id": 0},
				  "bw-max": 5000000 }}'
{'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 1}}

Q1 -- node 1 --------\
    / (bw-max: 5M)    \
Q2 /                   node 0 -------  netdev
                      /  (bw-max: 10M)
Q3 ------------------

* Cleanup:

To delete a single queue shaper:

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex":'$IFINDEX', 
	  "handle": {"scope": "queue", "id":'$QID3' }}'

Q1 -- node 1 --------\
    / (bw-max: 5M)    \
Q2 /                   node 0 -------  netdev
                       (bw-max: 10M)

Deleting a node shaper relinks all its leaves to the node's parent:

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex":'$IFINDEX', 
	  "handle": {"scope": "node", "id":1 }}'

Q1 ------\
          \
          node 0 -------  netdev
         /  (bw-max: 10M)
Q2 -----

Deleting the last shaper under a node shaper deletes the node, too:

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex":'$IFINDEX', 
	  "handle": {"scope": "queue", "id":'$QID1' }}'
./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex":'$IFINDEX', 
	  "handle": {"scope": "queue", "id":'$QID2' }}'
./tools/net/ynl/cli.py --spec $SPEC --do get --json '{"ifindex":'$IF', 
			  "handle": { "scope": "node", "id": 0}}'
Netlink error: No such file or directory
nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -2
	extack: {'bad-attr': '.handle'}
---
Changes from V3:
 - rename
 - locking
 - delete operates on node, too

v3: https://lore.kernel.org/netdev/cover.1722357745.git.pabeni@redhat.com/

Changes from RFC v2:
 - added patch 1
 - fixed deprecated API usage

RFC v2: https://lore.kernel.org/netdev/cover.1721851988.git.pabeni@redhat.com/

Changes from RFC v1:
 - set() and delete() ops operate on a single shaper
 - added group() op to allow grouping and nesting
 - split the NL implementation into multiple patches to help reviewing

RFC v1: https://lore.kernel.org/netdev/cover.1719518113.git.pabeni@redhat.com/

[1] https://lore.kernel.org/netdev/20240405102313.GA310894@kernel.org/

Paolo Abeni (9):
  tools: ynl: lift an assumption about spec file name
  netlink: spec: add shaper YAML spec
  net-shapers: implement NL get operation
  net-shapers: implement NL set and delete operations
  net-shapers: implement NL group operation
  net-shapers: implement delete support for NODE scope shaper
  netlink: spec: add shaper introspection support
  net: shaper: implement introspection support
  testing: net-drv: add basic shaper test

Sudheer Mogilappagari (1):
  iavf: Add net_shaper_ops support

Wenjun Wu (2):
  virtchnl: support queue rate limit and quanta size configuration
  ice: Support VF queue rate limit and quanta size configuration

 Documentation/netlink/specs/net_shaper.yaml   |  373 +++++
 Documentation/networking/kapi.rst             |    3 +
 MAINTAINERS                                   |    1 +
 drivers/net/Kconfig                           |    1 +
 drivers/net/ethernet/intel/Kconfig            |    1 +
 drivers/net/ethernet/intel/iavf/iavf.h        |    3 +
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  150 ++
 drivers/net/ethernet/intel/iavf/iavf_txrx.h   |    2 +
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |   65 +
 drivers/net/ethernet/intel/ice/ice.h          |    2 +
 drivers/net/ethernet/intel/ice/ice_base.c     |    2 +
 drivers/net/ethernet/intel/ice/ice_common.c   |   21 +
 .../net/ethernet/intel/ice/ice_hw_autogen.h   |    8 +
 drivers/net/ethernet/intel/ice/ice_txrx.h     |    1 +
 drivers/net/ethernet/intel/ice/ice_type.h     |    1 +
 drivers/net/ethernet/intel/ice/ice_vf_lib.h   |    8 +
 drivers/net/ethernet/intel/ice/ice_virtchnl.c |  333 +++++
 drivers/net/ethernet/intel/ice/ice_virtchnl.h |   11 +
 .../intel/ice/ice_virtchnl_allowlist.c        |    6 +
 drivers/net/netdevsim/netdev.c                |   41 +
 include/linux/avf/virtchnl.h                  |  119 ++
 include/linux/netdevice.h                     |   17 +
 include/net/net_shaper.h                      |  116 ++
 include/uapi/linux/net_shaper.h               |   90 ++
 net/Kconfig                                   |    3 +
 net/Makefile                                  |    1 +
 net/core/dev.c                                |    2 +
 net/core/dev.h                                |    6 +
 net/shaper/Makefile                           |    9 +
 net/shaper/shaper.c                           | 1202 +++++++++++++++++
 net/shaper/shaper_nl_gen.c                    |  152 +++
 net/shaper/shaper_nl_gen.h                    |   41 +
 tools/net/ynl/ynl-gen-c.py                    |    6 +-
 tools/testing/selftests/drivers/net/Makefile  |    1 +
 tools/testing/selftests/drivers/net/shaper.py |  236 ++++
 .../testing/selftests/net/lib/py/__init__.py  |    1 +
 tools/testing/selftests/net/lib/py/ynl.py     |    5 +
 37 files changed, 3038 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/netlink/specs/net_shaper.yaml
 create mode 100644 include/net/net_shaper.h
 create mode 100644 include/uapi/linux/net_shaper.h
 create mode 100644 net/shaper/Makefile
 create mode 100644 net/shaper/shaper.c
 create mode 100644 net/shaper/shaper_nl_gen.c
 create mode 100644 net/shaper/shaper_nl_gen.h
 create mode 100755 tools/testing/selftests/drivers/net/shaper.py

Comments

Jakub Kicinski Aug. 22, 2024, 12:58 a.m. UTC | #1
On Tue, 20 Aug 2024 17:12:21 +0200 Paolo Abeni wrote:
>   tools: ynl: lift an assumption about spec file name

I'll pop this one in, no point carrying it around.

Double check other patches for trailing whitespace. There were some
warnings when I was pulling the series in.
patchwork-bot+netdevbpf@kernel.org Aug. 22, 2024, 1:10 a.m. UTC | #2
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 20 Aug 2024 17:12:21 +0200 you wrote:
> We have a plurality of shaping-related drivers API, but none flexible
> enough to meet existing demand from vendors[1].
> 
> This series introduces new device APIs to configure in a flexible way
> TX H/W shaping. The new functionalities are exposed via a newly
> defined generic netlink interface and include introspection
> capabilities. Some self-tests are included, on top of a dummy
> netdevsim implementation, and a basic implementation for the iavf
> driver.
> 
> [...]

Here is the summary with links:
  - [v4,net-next,01/12] tools: ynl: lift an assumption about spec file name
    https://git.kernel.org/netdev/net-next/c/f32c821ae019
  - [v4,net-next,02/12] netlink: spec: add shaper YAML spec
    (no matching commit)
  - [v4,net-next,03/12] net-shapers: implement NL get operation
    (no matching commit)
  - [v4,net-next,04/12] net-shapers: implement NL set and delete operations
    (no matching commit)
  - [v4,net-next,05/12] net-shapers: implement NL group operation
    (no matching commit)
  - [v4,net-next,06/12] net-shapers: implement delete support for NODE scope shaper
    (no matching commit)
  - [v4,net-next,07/12] netlink: spec: add shaper introspection support
    (no matching commit)
  - [v4,net-next,08/12] net: shaper: implement introspection support
    (no matching commit)
  - [v4,net-next,09/12] testing: net-drv: add basic shaper test
    (no matching commit)
  - [v4,net-next,10/12] virtchnl: support queue rate limit and quanta size configuration
    (no matching commit)
  - [v4,net-next,11/12] ice: Support VF queue rate limit and quanta size configuration
    (no matching commit)
  - [v4,net-next,12/12] iavf: Add net_shaper_ops support
    (no matching commit)

You are awesome, thank you!
Jakub Kicinski Aug. 23, 2024, 12:43 a.m. UTC | #3
On Tue, 20 Aug 2024 17:12:21 +0200 Paolo Abeni wrote:
> * Delegation
> 
> A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
> queues it owns - the starting configuration is the one from the
> previous point:
> 
> SPEC=Documentation/netlink/specs/net_shaper.yaml
> ./tools/net/ynl/cli.py --spec $SPEC \
> 	--do group --json '{"ifindex":'$IFINDEX', 
> 			"leaves": [ 
> 			  {"handle": {"scope": "queue", "id":'$QID1' },
> 			   "weight": '$W1'}, 
> 			  {"handle": {"scope": "queue", "id":'$QID2' },
> 			   "weight": '$W2'}], 
> 			"root": { "handle": {"scope": "node"},
> 				  "parent": {"scope": "node", "id": 0},

In the delegation use case I was hoping "parent" would be automatic.
Paolo Abeni Aug. 23, 2024, 7:51 a.m. UTC | #4
On 8/23/24 02:43, Jakub Kicinski wrote:
> On Tue, 20 Aug 2024 17:12:21 +0200 Paolo Abeni wrote:
>> * Delegation
>>
>> A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
>> queues it owns - the starting configuration is the one from the
>> previous point:
>>
>> SPEC=Documentation/netlink/specs/net_shaper.yaml
>> ./tools/net/ynl/cli.py --spec $SPEC \
>> 	--do group --json '{"ifindex":'$IFINDEX',
>> 			"leaves": [
>> 			  {"handle": {"scope": "queue", "id":'$QID1' },
>> 			   "weight": '$W1'},
>> 			  {"handle": {"scope": "queue", "id":'$QID2' },
>> 			   "weight": '$W2'}],
>> 			"root": { "handle": {"scope": "node"},
>> 				  "parent": {"scope": "node", "id": 0},
> 
> In the delegation use case I was hoping "parent" would be automatic.

Currently the parent is automatic/implicit when creating a node directly 
nested to the the netdev shaper.

I now see we can use as default parent the current leaves' parent, when 
that is the same for all the to-be-grouped leaves.

Actually, if we restrict the group operation to operate only on set of 
leaves respecting the above, I *guess* we will not lose generality and 
we could simplify a bit the spec. WDYT?

Thanks,

Paolo
Jakub Kicinski Aug. 27, 2024, 2:14 a.m. UTC | #5
On Fri, 23 Aug 2024 09:51:24 +0200 Paolo Abeni wrote:
> On 8/23/24 02:43, Jakub Kicinski wrote:
> > On Tue, 20 Aug 2024 17:12:21 +0200 Paolo Abeni wrote:  
> >> * Delegation
> >>
> >> A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
> >> queues it owns - the starting configuration is the one from the
> >> previous point:
> >>
> >> SPEC=Documentation/netlink/specs/net_shaper.yaml
> >> ./tools/net/ynl/cli.py --spec $SPEC \
> >> 	--do group --json '{"ifindex":'$IFINDEX',
> >> 			"leaves": [
> >> 			  {"handle": {"scope": "queue", "id":'$QID1' },
> >> 			   "weight": '$W1'},
> >> 			  {"handle": {"scope": "queue", "id":'$QID2' },
> >> 			   "weight": '$W2'}],
> >> 			"root": { "handle": {"scope": "node"},
> >> 				  "parent": {"scope": "node", "id": 0},  
> > 
> > In the delegation use case I was hoping "parent" would be automatic.  
> 
> Currently the parent is automatic/implicit when creating a node directly 
> nested to the the netdev shaper.
> 
> I now see we can use as default parent the current leaves' parent, when 
> that is the same for all the to-be-grouped leaves.
> 
> Actually, if we restrict the group operation to operate only on set of 
> leaves respecting the above, I *guess* we will not lose generality and 
> we could simplify a bit the spec. WDYT?

I remember having a use case in mind where specifying parent would be
very useful. I think it may have been related to atomic changes.
I'm not sure if what I describe below is exactly that case...

Imagine:

Qx -{hierarchy}---\
                   \{hierarchy}-- netdev
Q0-------P0\ SP----/   
Q1--\ RR-P1/
Q2--/

Let's say we own queues 0,1,2 and want to remove the SP layer.
It's convenient to do:

	$node = get($SP-node)
	group(leaves: [Q0, Q1, Q2], parent=$node.parent)

And have the kernel "garbage collect" the old RR node and the old SP
node (since they will now have no children). We want to avoid the
situations where user space has to do complex transitions thru
states which device may not support (make sure Q1, Q2 have right prios,
delete old RR, now we have SP w/ 3 inputs, delete the SP, create a new
group).

For the case above we could technically identify the correct parent by
skipping the nodes which will be garbage collected later. But imagine
that instead of deleting the hierarchy we wanted to move Q1 from P1 
to P0:

	group(leaves: [Q0, Q1], parent=SP, prio=P0)

does the job.

I admit this are somewhat contrived, and I agree that we won't lose
generality, but I think it will narrow the range of hierarchies we
can transition between atomically.
Paolo Abeni Aug. 27, 2024, 7:54 a.m. UTC | #6
On 8/27/24 04:14, Jakub Kicinski wrote:
> On Fri, 23 Aug 2024 09:51:24 +0200 Paolo Abeni wrote:
>> On 8/23/24 02:43, Jakub Kicinski wrote:
>>> On Tue, 20 Aug 2024 17:12:21 +0200 Paolo Abeni wrote:
>>>> * Delegation
>>>>
>>>> A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
>>>> queues it owns - the starting configuration is the one from the
>>>> previous point:
>>>>
>>>> SPEC=Documentation/netlink/specs/net_shaper.yaml
>>>> ./tools/net/ynl/cli.py --spec $SPEC \
>>>> 	--do group --json '{"ifindex":'$IFINDEX',
>>>> 			"leaves": [
>>>> 			  {"handle": {"scope": "queue", "id":'$QID1' },
>>>> 			   "weight": '$W1'},
>>>> 			  {"handle": {"scope": "queue", "id":'$QID2' },
>>>> 			   "weight": '$W2'}],
>>>> 			"root": { "handle": {"scope": "node"},
>>>> 				  "parent": {"scope": "node", "id": 0},
>>>
>>> In the delegation use case I was hoping "parent" would be automatic.
>>
>> Currently the parent is automatic/implicit when creating a node directly
>> nested to the the netdev shaper.
>>
>> I now see we can use as default parent the current leaves' parent, when
>> that is the same for all the to-be-grouped leaves.
>>
>> Actually, if we restrict the group operation to operate only on set of
>> leaves respecting the above, I *guess* we will not lose generality and
>> we could simplify a bit the spec. WDYT?
> 
> I remember having a use case in mind where specifying parent would be
> very useful. I think it may have been related to atomic changes.
> I'm not sure if what I describe below is exactly that case...
> 
> Imagine:
> 
> Qx -{hierarchy}---\
>                     \{hierarchy}-- netdev
> Q0-------P0\ SP----/
> Q1--\ RR-P1/
> Q2--/
> 
> Let's say we own queues 0,1,2 and want to remove the SP layer.
> It's convenient to do:
> 
> 	$node = get($SP-node)
> 	group(leaves: [Q0, Q1, Q2], parent=$node.parent)
> 
> And have the kernel "garbage collect" the old RR node and the old SP
> node (since they will now have no children). We want to avoid the
> situations where user space has to do complex transitions thru
> states which device may not support (make sure Q1, Q2 have right prios,
> delete old RR, now we have SP w/ 3 inputs, delete the SP, create a new
> group).

FTR, while updating the group() implementation to infer the root's 
parent handle in most cases, I stumbled upon a similar scenario.

> For the case above we could technically identify the correct parent by
> skipping the nodes which will be garbage collected later. 

I think that implementation would be quite non trivial/error prone, and 
I think making the new root's parent explicit would be more clear from 
user-space perspective.

What I have now in my local tree is a group() implementation the 
inherits the newly created root's parent handle from the leaves, if all 
of them have the same parent prior to the group() invocation. Otherwise 
it requires the user to specify the root's parent handle. In any case, 
the user-specified root's parent handle value overrides the 
'inherited'/guessed one.

It will cover the above and will not require an explicit parent in most 
case. Would that be good enough?

Thanks,

Paolo
Jakub Kicinski Aug. 27, 2024, 1:53 p.m. UTC | #7
On Tue, 27 Aug 2024 09:54:52 +0200 Paolo Abeni wrote:
> > For the case above we could technically identify the correct parent by
> > skipping the nodes which will be garbage collected later.   
> 
> I think that implementation would be quite non trivial/error prone, and 
> I think making the new root's parent explicit would be more clear from 
> user-space perspective.
> 
> What I have now in my local tree is a group() implementation the 
> inherits the newly created root's parent handle from the leaves, if all 
> of them have the same parent prior to the group() invocation. Otherwise 
> it requires the user to specify the root's parent handle. In any case, 
> the user-specified root's parent handle value overrides the 
> 'inherited'/guessed one.
> 
> It will cover the above and will not require an explicit parent in most 
> case. Would that be good enough?

Yes, that's great.