mbox series

[net-next,v2,0/3] net/sched: Introduce tc block ports tracking and use

Message ID 20230819163515.2266246-1-victor@mojatatu.com (mailing list archive)
Headers show
Series net/sched: Introduce tc block ports tracking and use | expand

Message

Victor Nogueira Aug. 19, 2023, 4:35 p.m. UTC
__context__
The "tc block" is a collection of netdevs/ports which allow qdiscs to share
match-action block instances (as opposed to the traditional tc filter per
netdev/port)[1].

Example setup:
$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22

Once the block is created we can add a filter using the block index:
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action drop

A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
either ens7 or ens8 is dropped.

__this patchset__
Up to this point in the implementation, the block is unaware of its ports.
This patch fixes that and makes the tc block ports available to the
datapath as well as the offload control path (by virtue of the ports being
in the tc block structure).

For the datapath we provide a use case of the tc block in an action
we call "blockcast" in patch 3. This action can be used in an example as
such:

$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22
$ tc qdisc add dev ens9 ingress block 22
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action blockcast

When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
of ens7, ens8 or ens9 it will be copied to all ports other than itself.
For example, if it arrives on ens8 then a copy of the packet will be
"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens7.

For an offload path, one use case is to "group" all ports belonging to a
PCI device into the same tc block.

Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
tc datapath and patch 3 implements datapath usage via a new tc action
"blockcast".

__Acknowledgements__
Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
better. The idea of integrating the ports into the tc block was suggested
by Jiri Pirko.

[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")

Changes in v2:
  - Remove RFC tag
  - Add more details in patch 0(Jiri)
  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
  - Fix bad dev dereference in printk of blockcast action (Simon)

Victor Nogueira (3):
  net/sched: Introduce tc block netdev tracking infra
  net/sched: cls_api: Expose tc block ports to the datapath
  Introduce blockcast tc action

 include/net/sch_generic.h |   8 +
 include/net/tc_wrapper.h  |   5 +
 net/sched/Kconfig         |  13 ++
 net/sched/Makefile        |   1 +
 net/sched/act_blockcast.c | 299 ++++++++++++++++++++++++++++++++++++++
 net/sched/cls_api.c       |  11 +-
 net/sched/sch_api.c       |  79 +++++++++-
 net/sched/sch_generic.c   |  40 ++++-
 8 files changed, 449 insertions(+), 7 deletions(-)
 create mode 100644 net/sched/act_blockcast.c

Comments

Vlad Buslov Aug. 21, 2023, 7:07 p.m. UTC | #1
On Sat 19 Aug 2023 at 13:35, Victor Nogueira <victor@mojatatu.com> wrote:
> __context__
> The "tc block" is a collection of netdevs/ports which allow qdiscs to share
> match-action block instances (as opposed to the traditional tc filter per
> netdev/port)[1].
>
> Example setup:
> $ tc qdisc add dev ens7 ingress block 22
> $ tc qdisc add dev ens8 ingress block 22
>
> Once the block is created we can add a filter using the block index:
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action drop
>
> A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
> either ens7 or ens8 is dropped.
>
> __this patchset__
> Up to this point in the implementation, the block is unaware of its ports.
> This patch fixes that and makes the tc block ports available to the
> datapath as well as the offload control path (by virtue of the ports being
> in the tc block structure).

Could you elaborate on offload control path? I guess I'm missing
something here because struct flow_cls_offload doesn't seem to include
pointer to the parent tcf_block instance.

>
> For the datapath we provide a use case of the tc block in an action
> we call "blockcast" in patch 3. This action can be used in an example as
> such:
>
> $ tc qdisc add dev ens7 ingress block 22
> $ tc qdisc add dev ens8 ingress block 22
> $ tc qdisc add dev ens9 ingress block 22
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action blockcast
>
> When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
> of ens7, ens8 or ens9 it will be copied to all ports other than itself.
> For example, if it arrives on ens8 then a copy of the packet will be
> "blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens7.
>
> For an offload path, one use case is to "group" all ports belonging to a
> PCI device into the same tc block.
>
> Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
> tc datapath and patch 3 implements datapath usage via a new tc action
> "blockcast".
>
> __Acknowledgements__
> Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
> better. The idea of integrating the ports into the tc block was suggested
> by Jiri Pirko.
>
> [1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
>
> Changes in v2:
>   - Remove RFC tag
>   - Add more details in patch 0(Jiri)
>   - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
>     Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
>   - Fix bad dev dereference in printk of blockcast action (Simon)
>
> Victor Nogueira (3):
>   net/sched: Introduce tc block netdev tracking infra
>   net/sched: cls_api: Expose tc block ports to the datapath
>   Introduce blockcast tc action
>
>  include/net/sch_generic.h |   8 +
>  include/net/tc_wrapper.h  |   5 +
>  net/sched/Kconfig         |  13 ++
>  net/sched/Makefile        |   1 +
>  net/sched/act_blockcast.c | 299 ++++++++++++++++++++++++++++++++++++++
>  net/sched/cls_api.c       |  11 +-
>  net/sched/sch_api.c       |  79 +++++++++-
>  net/sched/sch_generic.c   |  40 ++++-
>  8 files changed, 449 insertions(+), 7 deletions(-)
>  create mode 100644 net/sched/act_blockcast.c
Jamal Hadi Salim Aug. 24, 2023, 1:47 p.m. UTC | #2
On Mon, Aug 21, 2023 at 3:12 PM Vlad Buslov <vladbu@nvidia.com> wrote:
>
> On Sat 19 Aug 2023 at 13:35, Victor Nogueira <victor@mojatatu.com> wrote:
> > __context__
> > The "tc block" is a collection of netdevs/ports which allow qdiscs to share
> > match-action block instances (as opposed to the traditional tc filter per
> > netdev/port)[1].
> >
> > Example setup:
> > $ tc qdisc add dev ens7 ingress block 22
> > $ tc qdisc add dev ens8 ingress block 22
> >
> > Once the block is created we can add a filter using the block index:
> > $ tc filter add block 22 protocol ip pref 25 \
> >   flower dst_ip 192.168.0.0/16 action drop
> >
> > A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
> > either ens7 or ens8 is dropped.
> >
> > __this patchset__
> > Up to this point in the implementation, the block is unaware of its ports.
> > This patch fixes that and makes the tc block ports available to the
> > datapath as well as the offload control path (by virtue of the ports being
> > in the tc block structure).
>
> Could you elaborate on offload control path? I guess I'm missing
> something here because struct flow_cls_offload doesn't seem to include
> pointer to the parent tcf_block instance.
>

Sorry - that statement was subconsciously over-reaching as far as this
patch is concerned, but talking from P4TC pov, (even though the
current submission for P4TC is s/w only):
A single PCI device is mapped to at least one PF and possibly many VFs
- this gets mapped to a tc block...
Then the tc filter adds the P4 program to a block. The goal then is to
send a table entry towards the driver, once instead of replicating it
many times.
This can be achieved either at a) the tc layer by keeping the entries
per block and only invoke the driver once or b) let the driver
maintain the state (with or without the tc block).
For P4TC either is achievable because the tables are "global". The
challenge is how to get the driver to be aware of the tc block.
To answer your question, the idea is to be able to pass this list of
ports per block to the driver (which as you point out doesnt exist
today, but should be easy to add).

Thoughts?

cheers,
jamal


> >
> > For the datapath we provide a use case of the tc block in an action
> > we call "blockcast" in patch 3. This action can be used in an example as
> > such:
> >
> > $ tc qdisc add dev ens7 ingress block 22
> > $ tc qdisc add dev ens8 ingress block 22
> > $ tc qdisc add dev ens9 ingress block 22
> > $ tc filter add block 22 protocol ip pref 25 \
> >   flower dst_ip 192.168.0.0/16 action blockcast
> >
> > When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
> > of ens7, ens8 or ens9 it will be copied to all ports other than itself.
> > For example, if it arrives on ens8 then a copy of the packet will be
> > "blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens7.
> >
> > For an offload path, one use case is to "group" all ports belonging to a
> > PCI device into the same tc block.
> >
> > Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
> > tc datapath and patch 3 implements datapath usage via a new tc action
> > "blockcast".
> >
> > __Acknowledgements__
> > Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
> > better. The idea of integrating the ports into the tc block was suggested
> > by Jiri Pirko.
> >
> > [1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
> >
> > Changes in v2:
> >   - Remove RFC tag
> >   - Add more details in patch 0(Jiri)
> >   - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
> >     Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
> >   - Fix bad dev dereference in printk of blockcast action (Simon)
> >
> > Victor Nogueira (3):
> >   net/sched: Introduce tc block netdev tracking infra
> >   net/sched: cls_api: Expose tc block ports to the datapath
> >   Introduce blockcast tc action
> >
> >  include/net/sch_generic.h |   8 +
> >  include/net/tc_wrapper.h  |   5 +
> >  net/sched/Kconfig         |  13 ++
> >  net/sched/Makefile        |   1 +
> >  net/sched/act_blockcast.c | 299 ++++++++++++++++++++++++++++++++++++++
> >  net/sched/cls_api.c       |  11 +-
> >  net/sched/sch_api.c       |  79 +++++++++-
> >  net/sched/sch_generic.c   |  40 ++++-
> >  8 files changed, 449 insertions(+), 7 deletions(-)
> >  create mode 100644 net/sched/act_blockcast.c
>