Message ID | 75cb77aa91040829e55c5cae73e79349f3988e06.1719518113.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: introduce TX shaping H/W offload API | expand |
On Thu, 27 Jun 2024 22:17:18 +0200 Paolo Abeni wrote: > + - > + name: port > + doc: The root shaper for the whole H/W. > + - > + name: netdev > + doc: The main shaper for the given network device. > + - > + name: queue > + doc: The shaper is attached to the given device queue. > + - > + name: detached > + doc: | > + The shaper can be attached to port, netdev or other > + detached shapers, allowing nesting and grouping of > + netdev or queues. > + render-max: true nit: move other properties before the list > +attribute-sets: > + - > + name: net_shaper s/_/-/ on all names > + attributes: > + - > + name: ifindex > + type: u32 > + - > + name: parent > + type: nest > + nested-attributes: handle > + - > + name: handle > + type: nest > + nested-attributes: handle > + - > + name: metric > + type: u32 > + enum: metric > + - > + name: bw-min > + type: u64 s/u64/uint/ > + - > + name: handles > + type: nest > + multi-attr: true > + nested-attributes: handle oh both handle and handles... it'd be good to have more doc: for the attributes, all these things get auto-rendered on docs.kernel.org > + - > + name: shapers > + type: nest > + multi-attr: true > + nested-attributes: ns-info How do shapers differ from shaping attrs in this scope? :S > + - > + name: modified > + type: u32 > + - > + name: pad > + type: pad after using uint this can go > +operations: > + list: > + - > + name: get > + doc: | > + Get / Dump information about a/all the shaper for a given device > + attribute-set: net_shaper > + flags: [ admin-perm ] Any reason why get is admin-perm ?
On Fri, 2024-06-28 at 19:12 -0700, Jakub Kicinski wrote: > On Thu, 27 Jun 2024 22:17:18 +0200 Paolo Abeni wrote: > > > + - > > + name: shapers > > + type: nest > > + multi-attr: true > > + nested-attributes: ns-info > > How do shapers differ from shaping attrs in this scope? :S the set() operation must configure multiple shapers with a single command - to allow the 'atomic configuration changes' need for Andrew's use-case. Out-of-sheer ignorance on my side, the above was the most straight- forward way to provide set() with an array of shapers. Do you mean there are better way to achieve the goal, or "just" that the documentation here is missing and _necessary_? > > +operations: > > + list: > > + - > > + name: get > > + doc: | > > + Get / Dump information about a/all the shaper for a given device > > + attribute-set: net_shaper > > + flags: [ admin-perm ] > > Any reason why get is admin-perm ? Mostly a "better safe then sorry" approach and cargo-cult form other recent yaml changes the hard reasons. Fine to drop it, if there is agreement. Side note: ack to everything else noted in your reply. Thanks, Paolo
On Fri, 2024-06-28 at 19:12 -0700, Jakub Kicinski wrote: > On Thu, 27 Jun 2024 22:17:18 +0200 Paolo Abeni wrote: > > > +attribute-sets: > > + - > > + name: net_shaper > > s/_/-/ on all names It looks like the initial name: net_shaper would require some patching to ynl-gen-c.py to handle the s/_/-/ conversion in the generated file names, too. Cheers, Paolo
On Mon, 2024-07-01 at 16:50 +0200, Paolo Abeni wrote: > On Fri, 2024-06-28 at 19:12 -0700, Jakub Kicinski wrote: > > On Thu, 27 Jun 2024 22:17:18 +0200 Paolo Abeni wrote: > > > > > +attribute-sets: > > > + - > > > + name: net_shaper > > > > s/_/-/ on all names > > It looks like the initial > > name: net_shaper > > would require some patching to ynl-gen-c.py to handle the s/_/-/ > conversion in the generated file names, too. I mean something alike the following, to be explicit: --- diff --git a/tools/net/ynl/ynl-gen-c.py b/tools/net/ynl/ynl-gen-c.py index 374ca5e86e24..51529fabd517 100755 --- a/tools/net/ynl/ynl-gen-c.py +++ b/tools/net/ynl/ynl-gen-c.py @@ -59,9 +59,9 @@ class Type(SpecAttr): if 'nested-attributes' in attr: self.nested_attrs = attr['nested-attributes'] if self.nested_attrs == family.name: - self.nested_render_name = c_lower(f"{family.name}") + self.nested_render_name = c_lower(f"{family.ident_name}") else: - self.nested_render_name = c_lower(f"{family.name}_{self.nested_attrs}") + self.nested_render_name = c_lower(f"{family.ident_name}_{self.nested_attrs}") if self.nested_attrs in self.family.consts: self.nested_struct_type = 'struct ' + self.nested_render_name + '_' @@ -693,9 +693,9 @@ class Struct: self.nested = type_list is None if family.name == c_lower(space_name): - self.render_name = c_lower(family.name) + self.render_name = c_lower(family.ident_name) else: - self.render_name = c_lower(family.name + '-' + space_name) + self.render_name = c_lower(family.ident_name + '-' + space_name) self.struct_name = 'struct ' + self.render_name if self.nested and space_name in family.consts: self.struct_name += '_' @@ -761,7 +761,7 @@ class EnumEntry(SpecEnumEntry): class EnumSet(SpecEnumSet): def __init__(self, family, yaml): - self.render_name = c_lower(family.name + '-' + yaml['name']) + self.render_name = c_lower(family.ident_name + '-' + yaml['name']) if 'enum-name' in yaml: if yaml['enum-name']: @@ -777,7 +777,7 @@ class EnumSet(SpecEnumSet): else: self.user_type = 'int' - self.value_pfx = yaml.get('name-prefix', f"{family.name}-{yaml['name']}-") + self.value_pfx = yaml.get('name-prefix', f"{family.ident_name}-{yaml['name']}-") super().__init__(family, yaml) @@ -802,9 +802,9 @@ class AttrSet(SpecAttrSet): if 'name-prefix' in yaml: pfx = yaml['name-prefix'] elif self.name == family.name: - pfx = family.name + '-a-' + pfx = family.ident_name + '-a-' else: - pfx = f"{family.name}-a-{self.name}-" + pfx = f"{family.ident_name}-a-{self.name}-" self.name_prefix = c_upper(pfx) self.max_name = c_upper(self.yaml.get('attr-max-name', f"{self.name_prefix}max")) self.cnt_name = c_upper(self.yaml.get('attr-cnt-name', f"__{self.name_prefix}max")) @@ -861,7 +861,7 @@ class Operation(SpecOperation): def __init__(self, family, yaml, req_value, rsp_value): super().__init__(family, yaml, req_value, rsp_value) - self.render_name = c_lower(family.name + '_' + self.name) + self.render_name = c_lower(family.ident_name + '_' + self.name) self.dual_policy = ('do' in yaml and 'request' in yaml['do']) and \ ('dump' in yaml and 'request' in yaml['dump']) @@ -911,11 +911,11 @@ class Family(SpecFamily): if 'uapi-header' in self.yaml: self.uapi_header = self.yaml['uapi-header'] else: - self.uapi_header = f"linux/{self.name}.h" + self.uapi_header = f"linux/{self.ident_name}.h" if self.uapi_header.startswith("linux/") and self.uapi_header.endswith('.h'): self.uapi_header_name = self.uapi_header[6:-2] else: - self.uapi_header_name = self.name + self.uapi_header_name = self.ident_name def resolve(self): self.resolve_up(super()) @@ -923,7 +923,7 @@ class Family(SpecFamily): if self.yaml.get('protocol', 'genetlink') not in {'genetlink', 'genetlink-c', 'genetlink-legacy'}: raise Exception("Codegen only supported for genetlink") - self.c_name = c_lower(self.name) + self.c_name = c_lower(self.ident_name) if 'name-prefix' in self.yaml['operations']: self.op_prefix = c_upper(self.yaml['operations']['name-prefix']) else: @@ -2173,7 +2173,7 @@ def print_kernel_op_table_fwd(family, cw, terminate): exported = not kernel_can_gen_family_struct(family) if not terminate or exported: - cw.p(f"/* Ops table for {family.name} */") + cw.p(f"/* Ops table for {family.ident_name} */") pol_to_struct = {'global': 'genl_small_ops', 'per-op': 'genl_ops', @@ -2225,12 +2225,12 @@ def print_kernel_op_table_fwd(family, cw, terminate): continue if 'do' in op: - name = c_lower(f"{family.name}-nl-{op_name}-doit") + name = c_lower(f"{family.ident_name}-nl-{op_name}-doit") cw.write_func_prot('int', name, ['struct sk_buff *skb', 'struct genl_info *info'], suffix=';') if 'dump' in op: - name = c_lower(f"{family.name}-nl-{op_name}-dumpit") + name = c_lower(f"{family.ident_name}-nl-{op_name}-dumpit") cw.write_func_prot('int', name, ['struct sk_buff *skb', 'struct netlink_callback *cb'], suffix=';') cw.nl() @@ -2256,13 +2256,13 @@ def print_kernel_op_table(family, cw): for x in op['dont-validate']])), ) for op_mode in ['do', 'dump']: if op_mode in op: - name = c_lower(f"{family.name}-nl-{op_name}-{op_mode}it") + name = c_lower(f"{family.ident_name}-nl-{op_name}-{op_mode}it") members.append((op_mode + 'it', name)) if family.kernel_policy == 'per-op': struct = Struct(family, op['attribute-set'], type_list=op['do']['request']['attributes']) - name = c_lower(f"{family.name}-{op_name}-nl-policy") + name = c_lower(f"{family.ident_name}-{op_name}-nl-policy") members.append(('policy', name)) members.append(('maxattr', struct.attr_max_val.enum_name)) if 'flags' in op: @@ -2294,7 +2294,7 @@ def print_kernel_op_table(family, cw): members.append(('validate', ' | '.join([c_upper('genl-dont-validate-' + x) for x in dont_validate])), ) - name = c_lower(f"{family.name}-nl-{op_name}-{op_mode}it") + name = c_lower(f"{family.ident_name}-nl-{op_name}-{op_mode}it") if 'pre' in op[op_mode]: members.append((cb_names[op_mode]['pre'], c_lower(op[op_mode]['pre']))) members.append((op_mode + 'it', name)) @@ -2305,9 +2305,9 @@ def print_kernel_op_table(family, cw): type_list=op[op_mode]['request']['attributes']) if op.dual_policy: - name = c_lower(f"{family.name}-{op_name}-{op_mode}-nl-policy") + name = c_lower(f"{family.ident_name}-{op_name}-{op_mode}-nl-policy") else: - name = c_lower(f"{family.name}-{op_name}-nl-policy") + name = c_lower(f"{family.ident_name}-{op_name}-nl-policy") members.append(('policy', name)) members.append(('maxattr', struct.attr_max_val.enum_name)) flags = (op['flags'] if 'flags' in op else []) + ['cmd-cap-' + op_mode] @@ -2326,7 +2326,7 @@ def print_kernel_mcgrp_hdr(family, cw): cw.block_start('enum') for grp in family.mcgrps['list']: - grp_id = c_upper(f"{family.name}-nlgrp-{grp['name']},") + grp_id = c_upper(f"{family.ident_name}-nlgrp-{grp['name']},") cw.p(grp_id) cw.block_end(';') cw.nl() @@ -2339,7 +2339,7 @@ def print_kernel_mcgrp_src(family, cw): cw.block_start('static const struct genl_multicast_group ' + family.c_name + '_nl_mcgrps[] =') for grp in family.mcgrps['list']: name = grp['name'] - grp_id = c_upper(f"{family.name}-nlgrp-{name}") + grp_id = c_upper(f"{family.ident_name}-nlgrp-{name}") cw.p('[' + grp_id + '] = { "' + name + '", },') cw.block_end(';') cw.nl() @@ -2361,7 +2361,7 @@ def print_kernel_family_struct_src(family, cw): if not kernel_can_gen_family_struct(family): return - cw.block_start(f"struct genl_family {family.name}_nl_family __ro_after_init =") + cw.block_start(f"struct genl_family {family.ident_name}_nl_family __ro_after_init =") cw.p('.name\t\t= ' + family.fam_key + ',') cw.p('.version\t= ' + family.ver_key + ',') cw.p('.netnsok\t= true,') @@ -2429,7 +2429,7 @@ def render_uapi(family, cw): cw.p(' */') uapi_enum_start(family, cw, const, 'name') - name_pfx = const.get('name-prefix', f"{family.name}-{const['name']}-") + name_pfx = const.get('name-prefix', f"{family.ident_name}-{const['name']}-") for entry in enum.entries.values(): suffix = ',' if entry.value_change: @@ -2451,7 +2451,7 @@ def render_uapi(family, cw): cw.nl() elif const['type'] == 'const': defines.append([c_upper(family.get('c-define-name', - f"{family.name}-{const['name']}")), + f"{family.ident_name}-{const['name']}")), const['value']]) if defines: @@ -2529,7 +2529,7 @@ def render_uapi(family, cw): defines = [] for grp in family.mcgrps['list']: name = grp['name'] - defines.append([c_upper(grp.get('c-define-name', f"{family.name}-mcgrp-{name}")), + defines.append([c_upper(grp.get('c-define-name', f"{family.ident_name}-mcgrp-{name}")), f'{name}']) cw.nl() if defines:
On Mon, 01 Jul 2024 17:50:17 +0200 Paolo Abeni wrote: > > would require some patching to ynl-gen-c.py to handle the s/_/-/ > > conversion in the generated file names, too. > > I mean something alike the following, to be explicit: Great, thanks! I wasn't sure we're actually ready to make the family name use dashes, since we don't have any precedent. But since you have the patch - please submit!
On Mon, 01 Jul 2024 12:14:32 +0200 Paolo Abeni wrote: > > > + - > > > + name: shapers > > > + type: nest > > > + multi-attr: true > > > + nested-attributes: ns-info > > > > How do shapers differ from shaping attrs in this scope? :S > > the set() operation must configure multiple shapers with a single > command - to allow the 'atomic configuration changes' need for Andrew's > use-case. > > Out-of-sheer ignorance on my side, the above was the most straight- > forward way to provide set() with an array of shapers. > > Do you mean there are better way to achieve the goal, or "just" that > the documentation here is missing and _necessary_? I see, I had a look at patch 2 now. But that's really "Andrew's use-case" it doesn't cover deletion, right? Sorry that I don't have a perfect suggestion either but it seems like a half-measure. It's a partial support for transactions. If we want transactions we should group ops like nftables. Have normal ops (add, delete, modify) and control ops (start, commit) which clone the entire tree, then ops change it, and commit presents new tree to the device. Alternative would be to, instead of supporting transactions have some form of a "complex instruction set". Most transformations will take a set of inputs (+weights / prios), shaping params, and where to attach. > > > +operations: > > > + list: > > > + - > > > + name: get > > > + doc: | > > > + Get / Dump information about a/all the shaper for a given device > > > + attribute-set: net_shaper > > > + flags: [ admin-perm ] > > > > Any reason why get is admin-perm ? > > Mostly a "better safe then sorry" approach and cargo-cult form other > recent yaml changes the hard reasons. Fine to drop it, if there is > agreement. I thought we default to GET being non-privileged. I think that's better, monitoring shouldn't require admin perm and presumably those shapers may grow stats at some stage. But no strong feelings.
On Mon, 2024-07-01 at 19:54 -0700, Jakub Kicinski wrote: > On Mon, 01 Jul 2024 12:14:32 +0200 Paolo Abeni wrote: > > > > + - > > > > + name: shapers > > > > + type: nest > > > > + multi-attr: true > > > > + nested-attributes: ns-info > > > > > > How do shapers differ from shaping attrs in this scope? :S > > > > the set() operation must configure multiple shapers with a single > > command - to allow the 'atomic configuration changes' need for Andrew's > > use-case. > > > > Out-of-sheer ignorance on my side, the above was the most straight- > > forward way to provide set() with an array of shapers. > > > > Do you mean there are better way to achieve the goal, or "just" that > > the documentation here is missing and _necessary_? > > I see, I had a look at patch 2 now. > But that's really "Andrew's use-case" it doesn't cover deletion, right? > Sorry that I don't have a perfect suggestion either but it seems like > a half-measure. It's a partial support for transactions. If we want > transactions we should group ops like nftables. Have normal ops (add, > delete, modify) and control ops (start, commit) which clone the entire > tree, then ops change it, and commit presents new tree to the device. Yes, it does not cover deletion _and_ update/add/move within the same atomic operation. Still any configuration could be reached from default/initial state with set(<possibly many shapers>). Additionally, given any arbitrary configuration, the default/initial state could be restored with a single delete(<possibly many handlers>). The above covers any possible limitation enforced by the H/W, not just the DSA use-case. Do you have a strong feeling for atomic transactions from any arbitrary state towards any other? If so, I’d like to understand why? Dealing with transactions allowing arbitrary any state <> any state atomic changes will involve some complex logic that seems better assigned to user-space. Thanks, Paolo
On Tue, 02 Jul 2024 16:21:38 +0200 Paolo Abeni wrote: > > I see, I had a look at patch 2 now. > > But that's really "Andrew's use-case" it doesn't cover deletion, right? > > Sorry that I don't have a perfect suggestion either but it seems like > > a half-measure. It's a partial support for transactions. If we want > > transactions we should group ops like nftables. Have normal ops (add, > > delete, modify) and control ops (start, commit) which clone the entire > > tree, then ops change it, and commit presents new tree to the device. > > Yes, it does not cover deletion _and_ update/add/move within the same > atomic operation. > > Still any configuration could be reached from default/initial state > with set(<possibly many shapers>). Additionally, given any arbitrary > configuration, the default/initial state could be restored with a > single delete(<possibly many handlers>). From: q0 -. RR \ q1 / > SP q2 -. RR / q3 / To: q0 ------\ q1 -------> SP q2 -. RR / q3 / You have to both delete an RR node, and set SP params on Q0 and Q1. > The above covers any possible limitation enforced by the H/W, not just > the DSA use-case. > > Do you have a strong feeling for atomic transactions from any arbitrary > state towards any other? If so, I’d like to understand why? I don't believe this is covers all cases. > Dealing with transactions allowing arbitrary any state <> any state > atomic changes will involve some complex logic that seems better > assigned to user-space. Complex logic in which part of the code? It's just a full clone of the xarray, then do whatever ops user is asking to do, then tree walk to render diff as a set of ops. If you mean the tree walk to convert tree diff into ops, I think we need that anyway, otherwise we may get into a situation where there's a dependency between the user space implementation and driver expectations.
Oops, I unintentionally dropped most recipients, re-adding them, sorry for the duplicate. On Tue, Jul 2, 2024 at 5:05 PM Jakub Kicinski <kuba@kernel.org> wrote: > On Tue, 02 Jul 2024 16:21:38 +0200 Paolo Abeni wrote: > > > I see, I had a look at patch 2 now. > > > But that's really "Andrew's use-case" it doesn't cover deletion, right? > > > Sorry that I don't have a perfect suggestion either but it seems like > > > a half-measure. It's a partial support for transactions. If we want > > > transactions we should group ops like nftables. Have normal ops (add, > > > delete, modify) and control ops (start, commit) which clone the entire > > > tree, then ops change it, and commit presents new tree to the device. > > > > Yes, it does not cover deletion _and_ update/add/move within the same > > atomic operation. > > > > Still any configuration could be reached from default/initial state > > with set(<possibly many shapers>). Additionally, given any arbitrary > > configuration, the default/initial state could be restored with a > > single delete(<possibly many handlers>). > > From: > > q0 -. RR \ > q1 / > SP > q2 -. RR / > q3 / Call this C1 > To: > > q0 ------\ > q1 -------> SP > q2 -. RR / > q3 / Call this C2 > You have to both delete an RR node, and set SP params on Q0 and Q1. default -> C1: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ --do set --json '{ "ifindex":2, "shapers": [ \ { "parent": { "scope": "netdev"}, "handle": { "scope": "detached", "id": 1 }, "priority": 1 }, { "parent": { "scope": "netdev"}, "handle": { "scope": "detached", "id": 2 }, "priority": 2 }, { "parent": { "scope": "detached", "id":1}, "handle": { "scope": "queue", "id": 1 }, "weight": 1 }, { "parent": { "scope": "detached", "id":1}, "handle": { "scope": "queue", "id": 2 }, "weight": 2 }, { "parent": { "scope": "detached" "id":2}, "handle": { "scope": "queue", "id": 3 }, "weight": 1 }, { "parent": { "scope": "detached" "id":2}, "handle": { "scope": "queue", "id": 4 }, "weight": 2 }]} C1 -> C2: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ --do delete --json '{ "ifindex":2, "handles": [ \ { "scope": "queue", "id": 1 }, { "scope": "queue", "id": 2 }, { "scope": "queue", "id": 3 }, { "scope": "queue", "id": 4 }, { "scope": "detached", "id": 1 }, { "scope": "detached", "id": 2 }]} ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ --do set --json '{ "ifindex":2, "shapers": [ \ { "parent": { "scope": "netdev"}, "handle": { "scope": "detached", "id": 1 }, "priority": 1 }, { "parent": { "scope": "netdev"}, "handle": { "scope": "queue", "id": 1 }, "priority": 2 }, { "parent": { "scope": "netdev"}, "handle": { "scope": "queue", "id": 2 }, "priorirty": 3 }, { "parent": { "scope": "detached" "id":1}, "handle": { "scope": "queue", "id": 3 }, "weight": 1 }, { "parent": { "scope": "detached" "id":1}, "handle": { "scope": "queue", "id": 4 }, "weight": 2 }, The goal is to allow the system to reach the final configuration, even with the assumption the H/W does not support any configuration except the starting one and the final one. But the infra requires that the system _must_ support at least a 3rd configuration, the default one. > > The above covers any possible limitation enforced by the H/W, not just > > the DSA use-case. > > > > Do you have a strong feeling for atomic transactions from any arbitrary > > state towards any other? If so, I’d like to understand why? > > I don't believe this is covers all cases. Given any pair of configurations C1, C2 we can reach C2 via C1 -> default, default -> C2. respecting any H/W constraint. > > Dealing with transactions allowing arbitrary any state <> any state > > atomic changes will involve some complex logic that seems better > > assigned to user-space. > > Complex logic in which part of the code? IIRC in a past iteration you pointed out that the complexity of computing the delta between 2 arbitrary configurations is significantly higher than going through the empty/default configuration. In any case I think that the larger complexity to implement a full transactional model. nft had proven that to be very hard and bug prone. I really would avoid that option, if possible. Thanks, Paolo
On Tue, 2024-07-02 at 14:08 -0700, Jakub Kicinski wrote: > Not sure if dropping CCs was on purpose ? Nope, just PEBKAC here. Re-adding them and include your reply verbatim, to avoid losing track on the ML > On Tue, 2 Jul 2024 21:49:09 +0200 Paolo Abeni wrote: > > On Tue, Jul 2, 2024 at 5:05 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > On Tue, 02 Jul 2024 16:21:38 +0200 Paolo Abeni wrote: > > > > Yes, it does not cover deletion _and_ update/add/move within the same > > > > atomic operation. > > > > > > > > Still any configuration could be reached from default/initial state > > > > with set(<possibly many shapers>). Additionally, given any arbitrary > > > > configuration, the default/initial state could be restored with a > > > > single delete(<possibly many handlers>). > > > > > > From: > > > > > > q0 -. RR \ > > > q1 / > SP > > > q2 -. RR / > > > q3 / > > > > Call this C1 > > > > > To: > > > > > > q0 ------\ > > > q1 -------> SP > > > q2 -. RR / > > > q3 / > > > > Call this C2 > > > > > You have to both delete an RR node, and set SP params on Q0 and Q1. > > > > default -> C1: > > > > ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ > > --do set --json '{ "ifindex":2, "shapers": [ \ > > { "parent": { "scope": "netdev"}, "handle": { > > "scope": "detached", "id": 1 }, "priority": 1 }, > > { "parent": { "scope": "netdev"}, "handle": { > > "scope": "detached", "id": 2 }, "priority": 2 }, > > { "parent": { "scope": "detached", "id":1}, > > "handle": { "scope": "queue", "id": 1 }, "weight": 1 }, > > { "parent": { "scope": "detached", "id":1}, > > "handle": { "scope": "queue", "id": 2 }, "weight": 2 }, > > { "parent": { "scope": "detached" "id":2}, > > "handle": { "scope": "queue", "id": 3 }, "weight": 1 }, > > { "parent": { "scope": "detached" "id":2}, > > "handle": { "scope": "queue", "id": 4 }, "weight": 2 }]} > > C1 -> C2: > > > > ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ > > --do delete --json '{ "ifindex":2, "handles": [ \ > > { "scope": "queue", "id": 1 }, > > { "scope": "queue", "id": 2 }, > > { "scope": "queue", "id": 3 }, > > { "scope": "queue", "id": 4 }, > > { "scope": "detached", "id": 1 }, > > { "scope": "detached", "id": 2 }]} > > > > ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \ > > --do set --json '{ "ifindex":2, "shapers": [ \ > > { "parent": { "scope": "netdev"}, "handle": { > > "scope": "detached", "id": 1 }, "priority": 1 }, > > { "parent": { "scope": "netdev"}, "handle": { > > "scope": "queue", "id": 1 }, "priority": 2 }, > > { "parent": { "scope": "netdev"}, "handle": { > > "scope": "queue", "id": 2 }, "priorirty": 3 }, > > { "parent": { "scope": "detached" "id":1}, > > "handle": { "scope": "queue", "id": 3 }, "weight": 1 }, > > { "parent": { "scope": "detached" "id":1}, > > "handle": { "scope": "queue", "id": 4 }, "weight": 2 }, > > > > The goal is to allow the system to reach the final configuration, even > > with the assumption the H/W does not support any configuration except > > the starting one and the final one. > > But the infra requires that the system _must_ support at least a 3rd > > configuration, the default one. > > This assumes there is a single daemon responsible for control of all of > the shaping, which I think the endless BPF multi-program back and forth > proves to be incorrect. > > We'd also lose stats each time, and make reconfiguration disruptive to > running workloads. Note there is no stats support for the shapers, nor is planned, nor the H/W I know of have any support for it. The destructive operations will be needed only when the configuration change is inherently destructive. Nothing prevents the user-space to push a direct configuration change when possible - which should be the most frequent case, in practice. Regarding the entity responsible for control, I had in mind a single one, yes. I read the above as you are looking forward to e.g. different applications configuring their own shaping accessing directly the NL interface, am I correct? Why can’t such applications talk to that daemon, instead? Anyway different applications must touch disjoint resources (e.g. disjoint queues sets) right? In such a case even multiple destructive configuration changes (on disjoint resources set) will not be problematic. Still if we want to allow somewhat consistent, concurrent, destructive configuration changes on shared resource (why? It sounds a bit of overdesign at this point), we could extend the set() operation to additional support shapers deletion, e.g. adding an additional ‘delete’ flag attribute to the ‘handle’ sub-set. > > > > The above covers any possible limitation enforced by the H/W, not just > > > > the DSA use-case. > > > > > > > > Do you have a strong feeling for atomic transactions from any arbitrary > > > > state towards any other? If so, I’d like to understand why? > > > > > > I don't believe this is covers all cases. > > > > Given any pair of configurations C1, C2 we can reach C2 via C1 -> > > default, default -> C2. respecting any H/W constraint. > > > > > > Dealing with transactions allowing arbitrary any state <> any state > > > > atomic changes will involve some complex logic that seems better > > > > assigned to user-space. > > > > > > Complex logic in which part of the code? > > > > IIRC in a past iteration you pointed out that the complexity of > > computing the delta between 2 arbitrary configurations is > > significantly higher than going through the empty/default > > configuration. > > For a fixed-layout scheduler where HW blocks have a fixed mapping > with user's hierarchy - it's easier to program the shapers from > the hierarchy directly. Since node maps to the same set of registers > reprogramming will be writing to a register a value it already has > - a noop. That's different than doing a full reset and reprogram > each time, as two separate calls from user space. > > For Intel's cases OTOH, when each command is a separate FW call > we can't actually enforce the atomic transitions, right? > Your code seems to handle returns smaller the number of commands, > from which I infer that we may execute half of the modification? Yes, the code and the NL API allows the NIC to do the update incrementally, and AFAICS Intel ICE has no support for complex transactions. Somewhat enforcing atomic transitions will be quite complex at best and is not need to accomplish the stated goal - allow reconfiguration even when the H/W does not support intermediate states. Do we need to enforce atomicity? Why? NFT has proven that a transational model implementation is hard, and should be avoided if possible. > IOW for Andrew's HW - he'd probably prefer to look at the resulting > tree, no matter what previous state we were in. For Intel we _can't_ > support atomic commands, if they span multiple cycles of FW exchanges? My understanding is that we can’t have atomic updates on Intel, from firmare perspective. As said, I don’t think it’s necessary to support them. WRT the DSA H/W, do you mean the core should always set() the whole known tree to the driver, regardless of the specific changes asked by the user-space? If so, what about letting the driver expose some capability (or private flag) asking the core for such behavior? So that the driver will do the largish set() only with the H/W requiring that. Anyway I'm not sure the mentioned DSA H/W would benefit from always receiving the whole configuration. e.g. changing the weight for a single queue shaper would not need the whole data-set. > > In any case I think that the larger complexity to implement a full > > transactional model. nft had proven that to be very hard and bug > > prone. I really would avoid that option, if possible. > > Maybe instead of discussing the user space API it'd be more beneficial > to figure out a solid way of translating the existing APIs into the new > model? Could you please rephrase? I think all the arguments discussed here are related to the model - at some point that impact the user space API, too. Thanks, Paolo
On Wed, 03 Jul 2024 16:53:38 +0200 Paolo Abeni wrote: > Note there is no stats support for the shapers, nor is planned, nor the > H/W I know of have any support for it. > > The destructive operations will be needed only when the configuration > change is inherently destructive. Nothing prevents the user-space to > push a direct configuration change when possible - which should be the > most frequent case, in practice. > > Regarding the entity responsible for control, I had in mind a single > one, yes. I read the above as you are looking forward to e.g. different > applications configuring their own shaping accessing directly the NL > interface, am I correct? Why can’t such applications talk to that > daemon, instead? We know such daemon did not materialize in other cases. We can assume there is a daemon if we think that's a good design. We shouldn't use a mirage of a third-party daemon to wave away problems we didn't manage to solve. > Anyway different applications must touch disjoint resources (e.g. > disjoint queues sets) right? In such a case even multiple destructive > configuration changes (on disjoint resources set) will not be > problematic. > > Still if we want to allow somewhat consistent, concurrent, destructive > configuration changes on shared resource (why? It sounds a bit of > overdesign at this point), we could extend the set() operation to > additional support shapers deletion, e.g. adding an additional ‘delete’ > flag attribute to the ‘handle’ sub-set. To judge whether it's an over-design we'd need to know what the user scenarios are, those are not included in the series AFAICS. > > For a fixed-layout scheduler where HW blocks have a fixed mapping > > with user's hierarchy - it's easier to program the shapers from > > the hierarchy directly. Since node maps to the same set of registers > > reprogramming will be writing to a register a value it already has > > - a noop. That's different than doing a full reset and reprogram > > each time, as two separate calls from user space. > > > > For Intel's cases OTOH, when each command is a separate FW call > > we can't actually enforce the atomic transitions, right? > > Your code seems to handle returns smaller the number of commands, > > from which I infer that we may execute half of the modification? > > Yes, the code and the NL API allows the NIC to do the update > incrementally, and AFAICS Intel ICE has no support for complex > transactions. > Somewhat enforcing atomic transitions will be quite complex at best and > is not need to accomplish the stated goal - allow reconfiguration even > when the H/W does not support intermediate states. > > Do we need to enforce atomicity? Why? NFT has proven that a > transational model implementation is hard, and should be avoided if > possible. > > > IOW for Andrew's HW - he'd probably prefer to look at the resulting > > tree, no matter what previous state we were in. For Intel we _can't_ > > support atomic commands, if they span multiple cycles of FW exchanges? > > My understanding is that we can’t have atomic updates on Intel, from > firmare perspective. As said, I don’t think it’s necessary to support > them. > > WRT the DSA H/W, do you mean the core should always set() the whole > known tree to the driver, regardless of the specific changes asked by > the user-space? If so, what about letting the driver expose some > capability (or private flag) asking the core for such behavior? So that > the driver will do the largish set() only with the H/W requiring that. > > Anyway I'm not sure the mentioned DSA H/W would benefit from always > receiving the whole configuration. e.g. changing the weight for a > single queue shaper would not need the whole data-set. To be blunt - what I'm getting at is that the API mirrors Intel's FW API with an extra kludge to support the DSA H/W - in the end matching neither what the DSA wants nor what Intel can do. > > > In any case I think that the larger complexity to implement a full > > > transactional model. nft had proven that to be very hard and bug > > > prone. I really would avoid that option, if possible. > > > > Maybe instead of discussing the user space API it'd be more beneficial > > to figure out a solid way of translating the existing APIs into the new > > model? > > Could you please rephrase? I think all the arguments discussed here are > related to the model - at some point that impact the user space API, > too. I was hoping that implementing the code that mirrors the existing APIs into tree operations would teach us something about the primitives that operate on the tree. The proposed primitives are really low level, which is why we need to "fuse" them into multi-change operations. More specifically what I described a few emails up was a group+schedule+limit paradigm. Instead of describing single moves and parameter setting - transformation would describe inputs, scheduling across them, and rate limit to apply. But one transformation would always operate on at most one MUX node. IOW I'm trying to explore whether we can find a language of transformations which will be more complex than single micro-operations on the tree, but sufficiently expressive to provide atomic transformations without transactions of micro-ops.
On Wed, 2024-07-03 at 14:20 -0700, Jakub Kicinski wrote: > On Wed, 03 Jul 2024 16:53:38 +0200 Paolo Abeni wrote: > > > Anyway different applications must touch disjoint resources (e.g. > > disjoint queues sets) right? In such a case even multiple destructive > > configuration changes (on disjoint resources set) will not be > > problematic. > > > > Still if we want to allow somewhat consistent, concurrent, destructive > > configuration changes on shared resource (why? It sounds a bit of > > overdesign at this point), we could extend the set() operation to > > additional support shapers deletion, e.g. adding an additional ‘delete’ > > flag attribute to the ‘handle’ sub-set. > > To judge whether it's an over-design we'd need to know what the user > scenarios are, those are not included in the series AFAICS. My bad, in the cover I referred to the prior discussions without explicitly quoting the contents. The initial goal here was to allow the user-space to configure per- queue, H/W offloaded, TX shaping. That later evolved in introducing an in-kernel H/W offload TX shaping API capable of replacing the many existing similar in-kernel APIs and supporting the foreseeable H/W capabilities. > To be blunt - what I'm getting at is that the API mirrors Intel's FW > API with an extra kludge to support the DSA H/W - in the end matching > neither what the DSA wants nor what Intel can do. The API is similar to Intel’s FW API because to my understanding the underlying design - an arbitrary tree - is the most complete representation possible for shaping H/W. AFAICT is also similar to what other NIC vendors’ offer. I don’t see why the APIs don’t match what Intel can do, could you please elaborate on that? > IOW I'm trying to explore whether we can find a language of > transformations which will be more complex than single micro-operations > on the tree, but sufficiently expressive to provide atomic > transformations without transactions of micro-ops. I personally find it straight-forward describing the scenario you proposed in terms of the simple operations as allowed by this series. I also think it’s easier to build arbitrarily complex scenarios in terms of simple operations instead of trying to put enough complexity in the language to describe everything. It will easily lose flexibility or increase complexity for unclear gain. Why do you think that would be a better approach? Also the simple building blocks approach is IMHO closer to the original use-case. Are there any other reasons for atomic operations, beyond addressing low-end H/W? Note that WRT the specific limitations we are aware of, the APIs allows atomic conf changes without resorting to transition into the default configuration. My biggest concern is that from my perspective we are incrementally adding requisites, with increasing complexity, further and further away from the initial use-case. Thanks, Paolo
On Mon, 08 Jul 2024 21:42:00 +0200 Paolo Abeni wrote: > > To judge whether it's an over-design we'd need to know what the user > > scenarios are, those are not included in the series AFAICS. > > My bad, in the cover I referred to the prior discussions without > explicitly quoting the contents. > > The initial goal here was to allow the user-space to configure per- > queue, H/W offloaded, TX shaping. Per-queue Tx shaping is already supported. I mean "user scenarios" in the agile programming sense as in something that gets closer to production use cases. "Set rate limit on a queue" is a suggested solution not a statement of a problem. > That later evolved in introducing an in-kernel H/W offload TX shaping > API capable of replacing the many existing similar in-kernel APIs and > supporting the foreseeable H/W capabilities. > > > To be blunt - what I'm getting at is that the API mirrors Intel's FW > > API with an extra kludge to support the DSA H/W - in the end matching > > neither what the DSA wants nor what Intel can do. > > The API is similar to Intel’s FW API because to my understanding the > underlying design - an arbitrary tree - is the most complete > representation possible for shaping H/W. AFAICT is also similar to what > other NIC vendors’ offer. > > I don’t see why the APIs don’t match what Intel can do, could you > please elaborate on that? That's not the main point I was making, I was complaining about how the "extension" to support DSA HW was bolted onto this API. Undeniably the implementation must be stored as a tree (with some max depth). That doesn't imply that, for example, arbitrary re-parenting of non-leaf nodes is an operation that makes sense for all devices. From memory devlink rate doesn't allow mixing some node types after one parent, too. > > IOW I'm trying to explore whether we can find a language of > > transformations which will be more complex than single micro-operations > > on the tree, but sufficiently expressive to provide atomic > > transformations without transactions of micro-ops. > > I personally find it straight-forward describing the scenario you > proposed in terms of the simple operations as allowed by this series. I > also think it’s easier to build arbitrarily complex scenarios in terms > of simple operations instead of trying to put enough complexity in the > language to describe everything. It will easily lose flexibility or > increase complexity for unclear gain. Why do you think that would be a > better approach? I strongly dislike the operation grouping/batching as it stands in v1. Do you think that part of the design is clean? I also disagree with the assertion that having a language of transformations more advanced that "add/move/delete" increases complexity. Language gives you properties you can reason about. That's why rbtree, B-tree and other algos define a language of transformations. Naive ops make arbitrary trees easy but I put it to you that allowing arbitrary trees without any enforced invariants will breed far more complexity and bugs than a properly designed language :) The primary operation of interest is, in fact: given a set of resources (queues or netdevs) which currently feed one mux node - build a sub-hierarchy. Use case 1 - container b/w sharing. Container manager will want to group a set of queues, and feed a higher layer RR node (so that number of queues doesn't impact load sharing). Say we have two containers (c1, c2 represent queues assigned to them); before container 3 starts: c1 - \ c1 - >RR c1 - / > RR(root) c2 - \ RR c2 - / allocate 2 queues: c1 - \ c1 - ->RR c1 - / > RR(root) c2 - \ RR c2 - / ' qX / qY / hierarchize ("group([qX, qY], type="rr")): c1 - \ c1 - >RR c1 - / \ c2 - \ RR -> RR(root) c2 - / / c3 - \ RR c3 - / The container manager just wants so say "take the new queues (X, Y), put them under an RR node". If the language is build around creating mux nodes - that's a single call. Note that the RR(root) node is implicit (in your API it's not visible but it is implied). Users case 2 - delegation - the neat thing about using such construct is that as you can see we never referred to output, i.e. RR(root). The output can be implied by whatever node the queues already output to. So say container 1 (c1) wants to set a b/w limit on two of it's queues (let's call its queues A B C): rr_id = group([B C], type="rr") rate_limit(rr_id, 1Gbps) and end up with: cA ----- \ cB - \RR*/ RR cC - / \ c2 - \ RR -> RR(root) c2 - / / c3 - \ RR c3 - / * new node, also has rate limit set You don't have to worry about parentage permissions. Container can only add nodes (which is always safe) or delete nodes (and we can trivially enforce it only deletes node it has created itself). > Also the simple building blocks approach is IMHO closer to the original > use-case. > > Are there any other reasons for atomic operations, beyond addressing > low-end H/W? I think atomic changes are convenient to match what the user wants to do. And the second use case is that I do believe there's a real need to allow uncoordinated agents to modify sections on the hierarchy. Neither of those are hard requirements, but I think any application driven requirement should come before "that's the FW API for vendor X" I hope this we can agree on. Thinking about it (for longer than I care to admit), one concern I have about the "mux creation" API I described above is that it forces existence of leaf and non-leaf nodes at the same parent, at least transiently. Can we go back to an API with an explicit create/modify/delete? All we need for Andrew's use case, I believe, is to be able to "somewhat atomically" move leaf nodes.
diff --git a/Documentation/netlink/specs/shaper.yaml b/Documentation/netlink/specs/shaper.yaml new file mode 100644 index 000000000000..8563c85de68d --- /dev/null +++ b/Documentation/netlink/specs/shaper.yaml @@ -0,0 +1,202 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) + +name: net_shaper + +doc: Network HW Rate Limiting offload + +definitions: + - + type: enum + name: scope + doc: the different scopes where a shaper can be attached + entries: + - name: unspec + doc: The scope is not specified + - + name: port + doc: The root shaper for the whole H/W. + - + name: netdev + doc: The main shaper for the given network device. + - + name: queue + doc: The shaper is attached to the given device queue. + - + name: detached + doc: | + The shaper can be attached to port, netdev or other + detached shapers, allowing nesting and grouping of + netdev or queues. + render-max: true + - + type: enum + name: metric + doc: different metric each shaper can support + entries: + - + name: bps + doc: Shaper operates on a bits per second basis + - + name: pps + doc: Shaper operates on a packets per second basis + +attribute-sets: + - + name: net_shaper + attributes: + - + name: ifindex + type: u32 + - + name: parent + type: nest + nested-attributes: handle + - + name: handle + type: nest + nested-attributes: handle + - + name: metric + type: u32 + enum: metric + - + name: bw-min + type: u64 + - + name: bw-max + type: u64 + - + name: burst + type: u64 + - + name: priority + type: u32 + - + name: weight + type: u32 + - + name: scope + type: u32 + enum: scope + - + name: id + type: u32 + - + name: handles + type: nest + multi-attr: true + nested-attributes: handle + - + name: shapers + type: nest + multi-attr: true + nested-attributes: ns-info + - + name: modified + type: u32 + - + name: pad + type: pad + - + name: handle + subset-of: net_shaper + attributes: + - + name: scope + - + name: id + - + name: ns-info + subset-of: net_shaper + attributes: + - + name: parent + - + name: handle + - + name: metric + - + name: bw-min + - + name: bw-max + - + name: burst + - + name: priority + - + name: weight + +operations: + list: + - + name: get + doc: | + Get / Dump information about a/all the shaper for a given device + attribute-set: net_shaper + flags: [ admin-perm ] + + do: + request: + attributes: + - ifindex + - handle + reply: + attributes: &ns-attrs + - parent + - handle + - metric + - bw-min + - bw-max + - burst + - priority + - weight + + dump: + request: + attributes: + - ifindex + reply: + attributes: *ns-attrs + - + name: set + doc: | + Create or configures the specified shapers. + The update is atomic with respect to all shaper + affected by a single command, and is allowed to + affect a subset of the specified shapers, e.g. + due to H/W resources exhaustion. In such case + the update stops at the first failure, the extack + is set accordingly. + attribute-set: net_shaper + flags: [ admin-perm ] + + do: + request: + attributes: + - ifindex + - shapers + reply: + attributes: + - modified + + - + name: delete + doc: | + Clear (remove) the specified shaper. + The update is atomic with respect to all shaper + affected by a single command, and is allowed to + affect a subset of the specified shapers, e.g. + due to H/W resources exhaustion. In such case + the update stops at the first failure, the extack + is set accordingly. + attribute-set: net_shaper + flags: [ admin-perm ] + + do: + request: + attributes: + - ifindex + - handles + reply: + attributes: + - modified diff --git a/include/uapi/linux/net_shaper.h b/include/uapi/linux/net_shaper.h new file mode 100644 index 000000000000..7e6b655e6c6d --- /dev/null +++ b/include/uapi/linux/net_shaper.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ +/* Do not edit directly, auto-generated from: */ +/* Documentation/netlink/specs/shaper.yaml */ +/* YNL-GEN uapi header */ + +#ifndef _UAPI_LINUX_NET_SHAPER_H +#define _UAPI_LINUX_NET_SHAPER_H + +#define NET_SHAPER_FAMILY_NAME "net_shaper" +#define NET_SHAPER_FAMILY_VERSION 1 + +/** + * enum net_shaper_scope - the different scopes where a shaper can be attached + * @NET_SHAPER_SCOPE_UNSPEC: The scope is not specified + * @NET_SHAPER_SCOPE_PORT: The root shaper for the whole H/W. + * @NET_SHAPER_SCOPE_NETDEV: The main shaper for the given network device. + * @NET_SHAPER_SCOPE_QUEUE: The shaper is attached to the given device queue. + * @NET_SHAPER_SCOPE_DETACHED: The shaper can be attached to port, netdev or + * other detached shapers, allowing nesting and grouping of netdev or queues. + */ +enum net_shaper_scope { + NET_SHAPER_SCOPE_UNSPEC, + NET_SHAPER_SCOPE_PORT, + NET_SHAPER_SCOPE_NETDEV, + NET_SHAPER_SCOPE_QUEUE, + NET_SHAPER_SCOPE_DETACHED, + + /* private: */ + __NET_SHAPER_SCOPE_MAX, + NET_SHAPER_SCOPE_MAX = (__NET_SHAPER_SCOPE_MAX - 1) +}; + +/** + * enum net_shaper_metric - different metric each shaper can support + * @NET_SHAPER_METRIC_BPS: Shaper operates on a bits per second basis + * @NET_SHAPER_METRIC_PPS: Shaper operates on a packets per second basis + */ +enum net_shaper_metric { + NET_SHAPER_METRIC_BPS, + NET_SHAPER_METRIC_PPS, +}; + +enum { + NET_SHAPER_A_IFINDEX = 1, + NET_SHAPER_A_PARENT, + NET_SHAPER_A_HANDLE, + NET_SHAPER_A_METRIC, + NET_SHAPER_A_BW_MIN, + NET_SHAPER_A_BW_MAX, + NET_SHAPER_A_BURST, + NET_SHAPER_A_PRIORITY, + NET_SHAPER_A_WEIGHT, + NET_SHAPER_A_SCOPE, + NET_SHAPER_A_ID, + NET_SHAPER_A_HANDLES, + NET_SHAPER_A_SHAPERS, + NET_SHAPER_A_MODIFIED, + NET_SHAPER_A_PAD, + + __NET_SHAPER_A_MAX, + NET_SHAPER_A_MAX = (__NET_SHAPER_A_MAX - 1) +}; + +enum { + NET_SHAPER_CMD_GET = 1, + NET_SHAPER_CMD_SET, + NET_SHAPER_CMD_DELETE, + + __NET_SHAPER_CMD_MAX, + NET_SHAPER_CMD_MAX = (__NET_SHAPER_CMD_MAX - 1) +}; + +#endif /* _UAPI_LINUX_NET_SHAPER_H */ diff --git a/net/Kconfig b/net/Kconfig index d27d0deac0bf..31fccfed04f7 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -66,6 +66,9 @@ config SKB_DECRYPTED config SKB_EXTENSIONS bool +config NET_SHAPER + bool + menu "Networking options" source "net/packet/Kconfig" diff --git a/net/Makefile b/net/Makefile index 65bb8c72a35e..60ed5190eda8 100644 --- a/net/Makefile +++ b/net/Makefile @@ -79,3 +79,4 @@ obj-$(CONFIG_XDP_SOCKETS) += xdp/ obj-$(CONFIG_MPTCP) += mptcp/ obj-$(CONFIG_MCTP) += mctp/ obj-$(CONFIG_NET_HANDSHAKE) += handshake/ +obj-$(CONFIG_NET_SHAPER) += shaper/ diff --git a/net/shaper/Makefile b/net/shaper/Makefile new file mode 100644 index 000000000000..13375884d60e --- /dev/null +++ b/net/shaper/Makefile @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for the Generic HANDSHAKE service +# +# Copyright (c) 2024, Red Hat, Inc. +# + +obj-y += shaper.o shaper_nl_gen.o + diff --git a/net/shaper/shaper.c b/net/shaper/shaper.c new file mode 100644 index 000000000000..49de88c68e2f --- /dev/null +++ b/net/shaper/shaper.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include <linux/kernel.h> +#include <linux/skbuff.h> + +#include "shaper_nl_gen.h" + +int net_shaper_nl_get_doit(struct sk_buff *skb, struct genl_info *info) +{ + return -EOPNOTSUPP; +} + +int net_shaper_nl_get_dumpit(struct sk_buff *skb, + struct netlink_callback *cb) +{ + return -EOPNOTSUPP; +} + +int net_shaper_nl_set_doit(struct sk_buff *skb, struct genl_info *info) +{ + return -EOPNOTSUPP; +} + +int net_shaper_nl_delete_doit(struct sk_buff *skb, struct genl_info *info) +{ + return -EOPNOTSUPP; +} + +static int __init shaper_init(void) +{ + return genl_register_family(&net_shaper_nl_family); +} + +subsys_initcall(shaper_init); diff --git a/net/shaper/shaper_nl_gen.c b/net/shaper/shaper_nl_gen.c new file mode 100644 index 000000000000..159b4cb6d2b8 --- /dev/null +++ b/net/shaper/shaper_nl_gen.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) +/* Do not edit directly, auto-generated from: */ +/* Documentation/netlink/specs/shaper.yaml */ +/* YNL-GEN kernel source */ + +#include <net/netlink.h> +#include <net/genetlink.h> + +#include "shaper_nl_gen.h" + +#include <uapi/linux/net_shaper.h> + +/* Common nested types */ +const struct nla_policy net_shaper_handle_nl_policy[NET_SHAPER_A_ID + 1] = { + [NET_SHAPER_A_SCOPE] = NLA_POLICY_MAX(NLA_U32, 4), + [NET_SHAPER_A_ID] = { .type = NLA_U32, }, +}; + +const struct nla_policy net_shaper_ns_info_nl_policy[NET_SHAPER_A_WEIGHT + 1] = { + [NET_SHAPER_A_PARENT] = NLA_POLICY_NESTED(net_shaper_handle_nl_policy), + [NET_SHAPER_A_HANDLE] = NLA_POLICY_NESTED(net_shaper_handle_nl_policy), + [NET_SHAPER_A_METRIC] = NLA_POLICY_MAX(NLA_U32, 1), + [NET_SHAPER_A_BW_MIN] = { .type = NLA_U64, }, + [NET_SHAPER_A_BW_MAX] = { .type = NLA_U64, }, + [NET_SHAPER_A_BURST] = { .type = NLA_U64, }, + [NET_SHAPER_A_PRIORITY] = { .type = NLA_U32, }, + [NET_SHAPER_A_WEIGHT] = { .type = NLA_U32, }, +}; + +/* NET_SHAPER_CMD_GET - do */ +static const struct nla_policy net_shaper_get_do_nl_policy[NET_SHAPER_A_HANDLE + 1] = { + [NET_SHAPER_A_IFINDEX] = { .type = NLA_U32, }, + [NET_SHAPER_A_HANDLE] = NLA_POLICY_NESTED(net_shaper_handle_nl_policy), +}; + +/* NET_SHAPER_CMD_GET - dump */ +static const struct nla_policy net_shaper_get_dump_nl_policy[NET_SHAPER_A_IFINDEX + 1] = { + [NET_SHAPER_A_IFINDEX] = { .type = NLA_U32, }, +}; + +/* NET_SHAPER_CMD_SET - do */ +static const struct nla_policy net_shaper_set_nl_policy[NET_SHAPER_A_SHAPERS + 1] = { + [NET_SHAPER_A_IFINDEX] = { .type = NLA_U32, }, + [NET_SHAPER_A_SHAPERS] = NLA_POLICY_NESTED(net_shaper_ns_info_nl_policy), +}; + +/* NET_SHAPER_CMD_DELETE - do */ +static const struct nla_policy net_shaper_delete_nl_policy[NET_SHAPER_A_HANDLES + 1] = { + [NET_SHAPER_A_IFINDEX] = { .type = NLA_U32, }, + [NET_SHAPER_A_HANDLES] = NLA_POLICY_NESTED(net_shaper_handle_nl_policy), +}; + +/* Ops table for net_shaper */ +static const struct genl_split_ops net_shaper_nl_ops[] = { + { + .cmd = NET_SHAPER_CMD_GET, + .doit = net_shaper_nl_get_doit, + .policy = net_shaper_get_do_nl_policy, + .maxattr = NET_SHAPER_A_HANDLE, + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, + }, + { + .cmd = NET_SHAPER_CMD_GET, + .dumpit = net_shaper_nl_get_dumpit, + .policy = net_shaper_get_dump_nl_policy, + .maxattr = NET_SHAPER_A_IFINDEX, + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DUMP, + }, + { + .cmd = NET_SHAPER_CMD_SET, + .doit = net_shaper_nl_set_doit, + .policy = net_shaper_set_nl_policy, + .maxattr = NET_SHAPER_A_SHAPERS, + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, + }, + { + .cmd = NET_SHAPER_CMD_DELETE, + .doit = net_shaper_nl_delete_doit, + .policy = net_shaper_delete_nl_policy, + .maxattr = NET_SHAPER_A_HANDLES, + .flags = GENL_ADMIN_PERM | GENL_CMD_CAP_DO, + }, +}; + +struct genl_family net_shaper_nl_family __ro_after_init = { + .name = NET_SHAPER_FAMILY_NAME, + .version = NET_SHAPER_FAMILY_VERSION, + .netnsok = true, + .parallel_ops = true, + .module = THIS_MODULE, + .split_ops = net_shaper_nl_ops, + .n_split_ops = ARRAY_SIZE(net_shaper_nl_ops), +}; diff --git a/net/shaper/shaper_nl_gen.h b/net/shaper/shaper_nl_gen.h new file mode 100644 index 000000000000..663406224d62 --- /dev/null +++ b/net/shaper/shaper_nl_gen.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ +/* Do not edit directly, auto-generated from: */ +/* Documentation/netlink/specs/shaper.yaml */ +/* YNL-GEN kernel header */ + +#ifndef _LINUX_NET_SHAPER_GEN_H +#define _LINUX_NET_SHAPER_GEN_H + +#include <net/netlink.h> +#include <net/genetlink.h> + +#include <uapi/linux/net_shaper.h> + +/* Common nested types */ +extern const struct nla_policy net_shaper_handle_nl_policy[NET_SHAPER_A_ID + 1]; +extern const struct nla_policy net_shaper_ns_info_nl_policy[NET_SHAPER_A_WEIGHT + 1]; + +int net_shaper_nl_get_doit(struct sk_buff *skb, struct genl_info *info); +int net_shaper_nl_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb); +int net_shaper_nl_set_doit(struct sk_buff *skb, struct genl_info *info); +int net_shaper_nl_delete_doit(struct sk_buff *skb, struct genl_info *info); + +extern struct genl_family net_shaper_nl_family; + +#endif /* _LINUX_NET_SHAPER_GEN_H */
Define the user-space visible interface to query, configure and delete network shapers via yaml definition. Add dummy implementations for the relevant NL callbacks. set() and delete() operations allows touching multiple shapers with a single operation, atomically. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- v4 -> v5: - fixed a few typos - set() and get() ops reply with the number of affected handles - re-ordered bps and pps - added 'unspec' scope v3 -> v4: - dropped 'major' - renamed 'minor' as 'id' - rename 'bw_{max,min} as 'bw-{max,min}' v2 -> v3: - dropped 'move' op, use parent in 'set' instead - expand 'handle' in 'scope', 'major', 'minor' - rename 'queue_group' scope to 'detached' - rename 'info' attr to 'shapers' - added pad attribute (for 64 bits' sake) v1 -> v2: - reset -> delete - added 'parent' and 'burst' --- Documentation/netlink/specs/shaper.yaml | 202 ++++++++++++++++++++++++ include/uapi/linux/net_shaper.h | 73 +++++++++ net/Kconfig | 3 + net/Makefile | 1 + net/shaper/Makefile | 9 ++ net/shaper/shaper.c | 34 ++++ net/shaper/shaper_nl_gen.c | 93 +++++++++++ net/shaper/shaper_nl_gen.h | 25 +++ 8 files changed, 440 insertions(+) create mode 100644 Documentation/netlink/specs/shaper.yaml create mode 100644 include/uapi/linux/net_shaper.h create mode 100644 net/shaper/Makefile create mode 100644 net/shaper/shaper.c create mode 100644 net/shaper/shaper_nl_gen.c create mode 100644 net/shaper/shaper_nl_gen.h