mbox series

[net-next,v4,0/3] net/sched: Introduce tc block ports tracking and use

Message ID 20231005184228.467845-1-victor@mojatatu.com (mailing list archive)
Headers show
Series net/sched: Introduce tc block ports tracking and use | expand

Message

Victor Nogueira Oct. 5, 2023, 6:42 p.m. UTC
__Context__
The "tc block" is a collection of netdevs/ports which allow qdiscs to share
match-action block instances (as opposed to the traditional tc filter per
netdev/port)[1].

Example setup:
$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22

Once the block is created we can add a filter using the block index:
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action drop

A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
either ens7 or ens8 is dropped.

__This patchset__
Up to this point in the implementation, the block is unaware of its ports.
This patch fixes that and makes the tc block ports available to the
datapath.

For the datapath we provide a use case of the tc block in an action
we call "blockcast" in patch 3. This action can be used in an example as
such:

$ tc qdisc add dev ens7 ingress block 22
$ tc qdisc add dev ens8 ingress block 22
$ tc qdisc add dev ens9 ingress block 22
$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action blockcast

When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
of ens7, ens8 or ens9 it will be copied to all ports other than itself.
For example, if it arrives on ens8 then a copy of the packet will be
"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.

Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
tc datapath and patch 3 implements datapath usage via a new tc action
"blockcast".

__Acknowledgements__
Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
better. The idea of integrating the ports into the tc block was suggested
by Jiri Pirko.

[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")

Changes in v2:
  - Remove RFC tag
  - Add more details in patch 0(Jiri)
  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
  - Fix bad dev dereference in printk of blockcast action (Simon)

Changes in v3:
  - Add missing xa_destroy (pointed out by Vlad)
  - Remove bugfix pointed by Vlad (will send in separate patch)
  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
  - Remove net_notice_ratelimited debug messages in error
    cases (suggested by Marcelo)
  - Minor changes to appease sparse's lock context warning

Changes in v4:
  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
  - Fix typo in cover letter (pointed out by Paolo)
  - Create a module description for act_blockcast
    (reported by Paolo and CI)

Victor Nogueira (3):
  net/sched: Introduce tc block netdev tracking infra
  net/sched: cls_api: Expose tc block to the datapath
  net/sched: act_blockcast: Introduce blockcast tc action

 include/net/sch_generic.h    |   8 +
 include/net/tc_wrapper.h     |   5 +
 include/uapi/linux/pkt_cls.h |   1 +
 net/sched/Kconfig            |  13 ++
 net/sched/Makefile           |   1 +
 net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
 net/sched/cls_api.c          |  12 +-
 net/sched/sch_api.c          |  58 +++++++
 net/sched/sch_generic.c      |  34 +++-
 9 files changed, 426 insertions(+), 3 deletions(-)
 create mode 100644 net/sched/act_blockcast.c

Comments

Jiri Pirko Oct. 6, 2023, 12:59 p.m. UTC | #1
Thu, Oct 05, 2023 at 08:42:25PM CEST, victor@mojatatu.com wrote:
>__Context__
>The "tc block" is a collection of netdevs/ports which allow qdiscs to share
>match-action block instances (as opposed to the traditional tc filter per
>netdev/port)[1].
>
>Example setup:
>$ tc qdisc add dev ens7 ingress block 22
>$ tc qdisc add dev ens8 ingress block 22
>
>Once the block is created we can add a filter using the block index:
>$ tc filter add block 22 protocol ip pref 25 \
>  flower dst_ip 192.168.0.0/16 action drop
>
>A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
>either ens7 or ens8 is dropped.
>
>__This patchset__
>Up to this point in the implementation, the block is unaware of its ports.
>This patch fixes that and makes the tc block ports available to the

Odd. You fix a bug. Is there a bug? If yes, you need to describe it. If
no, don't use "fix".


>datapath.
>
>For the datapath we provide a use case of the tc block in an action
>we call "blockcast" in patch 3. This action can be used in an example as
>such:
>
>$ tc qdisc add dev ens7 ingress block 22
>$ tc qdisc add dev ens8 ingress block 22
>$ tc qdisc add dev ens9 ingress block 22
>$ tc filter add block 22 protocol ip pref 25 \
>  flower dst_ip 192.168.0.0/16 action blockcast

Seems to me a bit odd that the action works with the entity (block) is
is connected to. I would expect rather to give the action configuration:

$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action blockcast block 22
                                                ^^^^^^^^

Then this is more flexible and allows user to use this action for any
packet, no matter from where it was received.

Looks like this is functionality-wise similar to mirred redirect. Why
can't we have that action extended to accept block number instead of
netdev and have something like:

$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22

This would be very much alike we do either "tc filter add dev X" or "tc
filter add block Y".

Regarding the filtering, that could be a simple flag config of mirred
action:

$ tc filter add block 22 protocol ip pref 25 \
  flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
  srcfilter

Or something like that.

Makes sense?



>
>When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
>of ens7, ens8 or ens9 it will be copied to all ports other than itself.
>For example, if it arrives on ens8 then a copy of the packet will be
>"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.
>
>Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
>tc datapath and patch 3 implements datapath usage via a new tc action
>"blockcast".
>
>__Acknowledgements__
>Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
>better. The idea of integrating the ports into the tc block was suggested
>by Jiri Pirko.
>
>[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
>
>Changes in v2:
>  - Remove RFC tag
>  - Add more details in patch 0(Jiri)
>  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
>    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
>  - Fix bad dev dereference in printk of blockcast action (Simon)
>
>Changes in v3:
>  - Add missing xa_destroy (pointed out by Vlad)
>  - Remove bugfix pointed by Vlad (will send in separate patch)
>  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
>  - Remove net_notice_ratelimited debug messages in error
>    cases (suggested by Marcelo)
>  - Minor changes to appease sparse's lock context warning
>
>Changes in v4:
>  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
>  - Fix typo in cover letter (pointed out by Paolo)
>  - Create a module description for act_blockcast
>    (reported by Paolo and CI)
>
>Victor Nogueira (3):
>  net/sched: Introduce tc block netdev tracking infra
>  net/sched: cls_api: Expose tc block to the datapath
>  net/sched: act_blockcast: Introduce blockcast tc action
>
> include/net/sch_generic.h    |   8 +
> include/net/tc_wrapper.h     |   5 +
> include/uapi/linux/pkt_cls.h |   1 +
> net/sched/Kconfig            |  13 ++
> net/sched/Makefile           |   1 +
> net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
> net/sched/cls_api.c          |  12 +-
> net/sched/sch_api.c          |  58 +++++++
> net/sched/sch_generic.c      |  34 +++-
> 9 files changed, 426 insertions(+), 3 deletions(-)
> create mode 100644 net/sched/act_blockcast.c
>
>-- 
>2.25.1
>
Jamal Hadi Salim Oct. 6, 2023, 3:37 p.m. UTC | #2
On Fri, Oct 6, 2023 at 8:59 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Thu, Oct 05, 2023 at 08:42:25PM CEST, victor@mojatatu.com wrote:
> >__Context__
> >The "tc block" is a collection of netdevs/ports which allow qdiscs to share
> >match-action block instances (as opposed to the traditional tc filter per
> >netdev/port)[1].
> >
> >Example setup:
> >$ tc qdisc add dev ens7 ingress block 22
> >$ tc qdisc add dev ens8 ingress block 22
> >
> >Once the block is created we can add a filter using the block index:
> >$ tc filter add block 22 protocol ip pref 25 \
> >  flower dst_ip 192.168.0.0/16 action drop
> >
> >A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
> >either ens7 or ens8 is dropped.
> >
> >__This patchset__
> >Up to this point in the implementation, the block is unaware of its ports.
> >This patch fixes that and makes the tc block ports available to the
>
> Odd. You fix a bug. Is there a bug? If yes, you need to describe it. If
> no, don't use "fix".

Ok, Jiri;->  we will change the language.

>
> >datapath.
> >
> >For the datapath we provide a use case of the tc block in an action
> >we call "blockcast" in patch 3. This action can be used in an example as
> >such:
> >
> >$ tc qdisc add dev ens7 ingress block 22
> >$ tc qdisc add dev ens8 ingress block 22
> >$ tc qdisc add dev ens9 ingress block 22
> >$ tc filter add block 22 protocol ip pref 25 \
> >  flower dst_ip 192.168.0.0/16 action blockcast
>
> Seems to me a bit odd that the action works with the entity (block) is
> is connected to. I would expect rather to give the action configuration:
>
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action blockcast block 22
>                                                 ^^^^^^^^

We are currently passing the blockid in the skb cb field so it is
configuration-less. I suppose we could add this as an optional field
and use it when specified.

> Then this is more flexible and allows user to use this action for any
> packet, no matter from where it was received.
>
> Looks like this is functionality-wise similar to mirred redirect. Why
> can't we have that action extended to accept block number instead of
> netdev and have something like:
>
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>
> This would be very much alike we do either "tc filter add dev X" or "tc
> filter add block Y".
>

We did consider it but concluded it is a lot of work to get it done on
mirred - just take a look at mirred and you'll see what i mean;->
Based on that review we came to the conclusion that at some point it
would be safer to separate mirred's mirror from redirect; there are
too many checks to avoid one or the other based on whether you are
coming from egress vs ingress etc. This one is simple, it is just a
broadcast.


> Regarding the filtering, that could be a simple flag config of mirred
> action:
>
> $ tc filter add block 22 protocol ip pref 25 \
>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>   srcfilter
>
> Or something like that.
>

See my comment above.

cheers,
jamal
> Makes sense?
>
>
>
> >
> >When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
> >of ens7, ens8 or ens9 it will be copied to all ports other than itself.
> >For example, if it arrives on ens8 then a copy of the packet will be
> >"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.
> >
> >Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
> >tc datapath and patch 3 implements datapath usage via a new tc action
> >"blockcast".
> >
> >__Acknowledgements__
> >Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
> >better. The idea of integrating the ports into the tc block was suggested
> >by Jiri Pirko.
> >
> >[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
> >
> >Changes in v2:
> >  - Remove RFC tag
> >  - Add more details in patch 0(Jiri)
> >  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
> >    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
> >  - Fix bad dev dereference in printk of blockcast action (Simon)
> >
> >Changes in v3:
> >  - Add missing xa_destroy (pointed out by Vlad)
> >  - Remove bugfix pointed by Vlad (will send in separate patch)
> >  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
> >  - Remove net_notice_ratelimited debug messages in error
> >    cases (suggested by Marcelo)
> >  - Minor changes to appease sparse's lock context warning
> >
> >Changes in v4:
> >  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
> >  - Fix typo in cover letter (pointed out by Paolo)
> >  - Create a module description for act_blockcast
> >    (reported by Paolo and CI)
> >
> >Victor Nogueira (3):
> >  net/sched: Introduce tc block netdev tracking infra
> >  net/sched: cls_api: Expose tc block to the datapath
> >  net/sched: act_blockcast: Introduce blockcast tc action
> >
> > include/net/sch_generic.h    |   8 +
> > include/net/tc_wrapper.h     |   5 +
> > include/uapi/linux/pkt_cls.h |   1 +
> > net/sched/Kconfig            |  13 ++
> > net/sched/Makefile           |   1 +
> > net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
> > net/sched/cls_api.c          |  12 +-
> > net/sched/sch_api.c          |  58 +++++++
> > net/sched/sch_generic.c      |  34 +++-
> > 9 files changed, 426 insertions(+), 3 deletions(-)
> > create mode 100644 net/sched/act_blockcast.c
> >
> >--
> >2.25.1
> >
Jiri Pirko Oct. 6, 2023, 4:50 p.m. UTC | #3
Fri, Oct 06, 2023 at 05:37:41PM CEST, jhs@mojatatu.com wrote:
>On Fri, Oct 6, 2023 at 8:59 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Thu, Oct 05, 2023 at 08:42:25PM CEST, victor@mojatatu.com wrote:
>> >__Context__
>> >The "tc block" is a collection of netdevs/ports which allow qdiscs to share
>> >match-action block instances (as opposed to the traditional tc filter per
>> >netdev/port)[1].
>> >
>> >Example setup:
>> >$ tc qdisc add dev ens7 ingress block 22
>> >$ tc qdisc add dev ens8 ingress block 22
>> >
>> >Once the block is created we can add a filter using the block index:
>> >$ tc filter add block 22 protocol ip pref 25 \
>> >  flower dst_ip 192.168.0.0/16 action drop
>> >
>> >A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
>> >either ens7 or ens8 is dropped.
>> >
>> >__This patchset__
>> >Up to this point in the implementation, the block is unaware of its ports.
>> >This patch fixes that and makes the tc block ports available to the
>>
>> Odd. You fix a bug. Is there a bug? If yes, you need to describe it. If
>> no, don't use "fix".
>
>Ok, Jiri;->  we will change the language.
>
>>
>> >datapath.
>> >
>> >For the datapath we provide a use case of the tc block in an action
>> >we call "blockcast" in patch 3. This action can be used in an example as
>> >such:
>> >
>> >$ tc qdisc add dev ens7 ingress block 22
>> >$ tc qdisc add dev ens8 ingress block 22
>> >$ tc qdisc add dev ens9 ingress block 22
>> >$ tc filter add block 22 protocol ip pref 25 \
>> >  flower dst_ip 192.168.0.0/16 action blockcast
>>
>> Seems to me a bit odd that the action works with the entity (block) is
>> is connected to. I would expect rather to give the action configuration:
>>
>> $ tc filter add block 22 protocol ip pref 25 \
>>   flower dst_ip 192.168.0.0/16 action blockcast block 22
>>                                                 ^^^^^^^^
>
>We are currently passing the blockid in the skb cb field so it is
>configuration-less. I suppose we could add this as an optional field
>and use it when specified.

I don't understand the need for configuration less here. You don't have
it for the rest of the actions. Why this is special?


>
>> Then this is more flexible and allows user to use this action for any
>> packet, no matter from where it was received.
>>
>> Looks like this is functionality-wise similar to mirred redirect. Why
>> can't we have that action extended to accept block number instead of
>> netdev and have something like:
>>
>> $ tc filter add block 22 protocol ip pref 25 \
>>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>>
>> This would be very much alike we do either "tc filter add dev X" or "tc
>> filter add block Y".
>>
>
>We did consider it but concluded it is a lot of work to get it done on
>mirred - just take a look at mirred and you'll see what i mean;->
>Based on that review we came to the conclusion that at some point it
>would be safer to separate mirred's mirror from redirect; there are
>too many checks to avoid one or the other based on whether you are
>coming from egress vs ingress etc. This one is simple, it is just a
>broadcast.

Perhaps it is a nice opportunity to do such mirred cleanup, prepare the
code and implement block send afterwards?

If I omit the code for now, from user perspective, this functionality
belongs into mirred, don't you think? Just replace "dev" by "block" and
you got what you need.


>
>
>> Regarding the filtering, that could be a simple flag config of mirred
>> action:
>>
>> $ tc filter add block 22 protocol ip pref 25 \
>>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>>   srcfilter
>>
>> Or something like that.
>>
>
>See my comment above.
>
>cheers,
>jamal
>> Makes sense?
>>
>>
>>
>> >
>> >When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
>> >of ens7, ens8 or ens9 it will be copied to all ports other than itself.
>> >For example, if it arrives on ens8 then a copy of the packet will be
>> >"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.
>> >
>> >Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
>> >tc datapath and patch 3 implements datapath usage via a new tc action
>> >"blockcast".
>> >
>> >__Acknowledgements__
>> >Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
>> >better. The idea of integrating the ports into the tc block was suggested
>> >by Jiri Pirko.
>> >
>> >[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
>> >
>> >Changes in v2:
>> >  - Remove RFC tag
>> >  - Add more details in patch 0(Jiri)
>> >  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
>> >    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
>> >  - Fix bad dev dereference in printk of blockcast action (Simon)
>> >
>> >Changes in v3:
>> >  - Add missing xa_destroy (pointed out by Vlad)
>> >  - Remove bugfix pointed by Vlad (will send in separate patch)
>> >  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
>> >  - Remove net_notice_ratelimited debug messages in error
>> >    cases (suggested by Marcelo)
>> >  - Minor changes to appease sparse's lock context warning
>> >
>> >Changes in v4:
>> >  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
>> >  - Fix typo in cover letter (pointed out by Paolo)
>> >  - Create a module description for act_blockcast
>> >    (reported by Paolo and CI)
>> >
>> >Victor Nogueira (3):
>> >  net/sched: Introduce tc block netdev tracking infra
>> >  net/sched: cls_api: Expose tc block to the datapath
>> >  net/sched: act_blockcast: Introduce blockcast tc action
>> >
>> > include/net/sch_generic.h    |   8 +
>> > include/net/tc_wrapper.h     |   5 +
>> > include/uapi/linux/pkt_cls.h |   1 +
>> > net/sched/Kconfig            |  13 ++
>> > net/sched/Makefile           |   1 +
>> > net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
>> > net/sched/cls_api.c          |  12 +-
>> > net/sched/sch_api.c          |  58 +++++++
>> > net/sched/sch_generic.c      |  34 +++-
>> > 9 files changed, 426 insertions(+), 3 deletions(-)
>> > create mode 100644 net/sched/act_blockcast.c
>> >
>> >--
>> >2.25.1
>> >
Jamal Hadi Salim Oct. 6, 2023, 7:06 p.m. UTC | #4
On Fri, Oct 6, 2023 at 12:50 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Fri, Oct 06, 2023 at 05:37:41PM CEST, jhs@mojatatu.com wrote:
> >On Fri, Oct 6, 2023 at 8:59 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Thu, Oct 05, 2023 at 08:42:25PM CEST, victor@mojatatu.com wrote:
> >> >__Context__
> >> >The "tc block" is a collection of netdevs/ports which allow qdiscs to share
> >> >match-action block instances (as opposed to the traditional tc filter per
> >> >netdev/port)[1].
> >> >
> >> >Example setup:
> >> >$ tc qdisc add dev ens7 ingress block 22
> >> >$ tc qdisc add dev ens8 ingress block 22
> >> >
> >> >Once the block is created we can add a filter using the block index:
> >> >$ tc filter add block 22 protocol ip pref 25 \
> >> >  flower dst_ip 192.168.0.0/16 action drop
> >> >
> >> >A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
> >> >either ens7 or ens8 is dropped.
> >> >
> >> >__This patchset__
> >> >Up to this point in the implementation, the block is unaware of its ports.
> >> >This patch fixes that and makes the tc block ports available to the
> >>
> >> Odd. You fix a bug. Is there a bug? If yes, you need to describe it. If
> >> no, don't use "fix".
> >
> >Ok, Jiri;->  we will change the language.
> >
> >>
> >> >datapath.
> >> >
> >> >For the datapath we provide a use case of the tc block in an action
> >> >we call "blockcast" in patch 3. This action can be used in an example as
> >> >such:
> >> >
> >> >$ tc qdisc add dev ens7 ingress block 22
> >> >$ tc qdisc add dev ens8 ingress block 22
> >> >$ tc qdisc add dev ens9 ingress block 22
> >> >$ tc filter add block 22 protocol ip pref 25 \
> >> >  flower dst_ip 192.168.0.0/16 action blockcast
> >>
> >> Seems to me a bit odd that the action works with the entity (block) is
> >> is connected to. I would expect rather to give the action configuration:
> >>
> >> $ tc filter add block 22 protocol ip pref 25 \
> >>   flower dst_ip 192.168.0.0/16 action blockcast block 22
> >>                                                 ^^^^^^^^
> >
> >We are currently passing the blockid in the skb cb field so it is
> >configuration-less. I suppose we could add this as an optional field
> >and use it when specified.
>
> I don't understand the need for configuration less here. You don't have
> it for the rest of the actions. Why this is special?

It is not needed really. Think of an L2 switch - the broadcast action
is to send to all ports but self.

>
> >
> >> Then this is more flexible and allows user to use this action for any
> >> packet, no matter from where it was received.
> >>
> >> Looks like this is functionality-wise similar to mirred redirect. Why
> >> can't we have that action extended to accept block number instead of
> >> netdev and have something like:
> >>
> >> $ tc filter add block 22 protocol ip pref 25 \
> >>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
> >>
> >> This would be very much alike we do either "tc filter add dev X" or "tc
> >> filter add block Y".
> >>
> >
> >We did consider it but concluded it is a lot of work to get it done on
> >mirred - just take a look at mirred and you'll see what i mean;->
> >Based on that review we came to the conclusion that at some point it
> >would be safer to separate mirred's mirror from redirect; there are
> >too many checks to avoid one or the other based on whether you are
> >coming from egress vs ingress etc. This one is simple, it is just a
> >broadcast.
>
> Perhaps it is a nice opportunity to do such mirred cleanup, prepare the
> code and implement block send afterwards?

I was worried about breaking some existing use cases - the code has
got too clever.
But probably it is time to show it some love, one of us will invest
time into it.

> If I omit the code for now, from user perspective, this functionality
> belongs into mirred, don't you think? Just replace "dev" by "block" and
> you got what you need.

If we can adequately cleanup mirred,  then we can put it there but
certainly now we are adding more buttons to click on mirred. It may
make sense to refactor the mirred code then reuse the refactored code
in a new action.

cheers,
jamal

>
> >
> >
> >> Regarding the filtering, that could be a simple flag config of mirred
> >> action:
> >>
> >> $ tc filter add block 22 protocol ip pref 25 \
> >>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
> >>   srcfilter
> >>
> >> Or something like that.
> >>
> >
> >See my comment above.
> >
> >cheers,
> >jamal
> >> Makes sense?
> >>
> >>
> >>
> >> >
> >> >When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
> >> >of ens7, ens8 or ens9 it will be copied to all ports other than itself.
> >> >For example, if it arrives on ens8 then a copy of the packet will be
> >> >"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.
> >> >
> >> >Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
> >> >tc datapath and patch 3 implements datapath usage via a new tc action
> >> >"blockcast".
> >> >
> >> >__Acknowledgements__
> >> >Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
> >> >better. The idea of integrating the ports into the tc block was suggested
> >> >by Jiri Pirko.
> >> >
> >> >[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
> >> >
> >> >Changes in v2:
> >> >  - Remove RFC tag
> >> >  - Add more details in patch 0(Jiri)
> >> >  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
> >> >    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
> >> >  - Fix bad dev dereference in printk of blockcast action (Simon)
> >> >
> >> >Changes in v3:
> >> >  - Add missing xa_destroy (pointed out by Vlad)
> >> >  - Remove bugfix pointed by Vlad (will send in separate patch)
> >> >  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
> >> >  - Remove net_notice_ratelimited debug messages in error
> >> >    cases (suggested by Marcelo)
> >> >  - Minor changes to appease sparse's lock context warning
> >> >
> >> >Changes in v4:
> >> >  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
> >> >  - Fix typo in cover letter (pointed out by Paolo)
> >> >  - Create a module description for act_blockcast
> >> >    (reported by Paolo and CI)
> >> >
> >> >Victor Nogueira (3):
> >> >  net/sched: Introduce tc block netdev tracking infra
> >> >  net/sched: cls_api: Expose tc block to the datapath
> >> >  net/sched: act_blockcast: Introduce blockcast tc action
> >> >
> >> > include/net/sch_generic.h    |   8 +
> >> > include/net/tc_wrapper.h     |   5 +
> >> > include/uapi/linux/pkt_cls.h |   1 +
> >> > net/sched/Kconfig            |  13 ++
> >> > net/sched/Makefile           |   1 +
> >> > net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
> >> > net/sched/cls_api.c          |  12 +-
> >> > net/sched/sch_api.c          |  58 +++++++
> >> > net/sched/sch_generic.c      |  34 +++-
> >> > 9 files changed, 426 insertions(+), 3 deletions(-)
> >> > create mode 100644 net/sched/act_blockcast.c
> >> >
> >> >--
> >> >2.25.1
> >> >
Jakub Kicinski Oct. 6, 2023, 10:25 p.m. UTC | #5
On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
> > I don't understand the need for configuration less here. You don't have
> > it for the rest of the actions. Why this is special?  

+1, FWIW

> It is not needed really. Think of an L2 switch - the broadcast action
> is to send to all ports but self.

We do have an implementation of an L2 switch already, what's the use
case which necessitates this new action / functionality?
Jamal Hadi Salim Oct. 6, 2023, 11 p.m. UTC | #6
On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
> > > I don't understand the need for configuration less here. You don't have
> > > it for the rest of the actions. Why this is special?
>
> +1, FWIW

We dont have any rule that says all actions MUST have parameters ;->
There is nothing speacial about any action that doesnt have a
parameter.

> > It is not needed really. Think of an L2 switch - the broadcast action
> > is to send to all ports but self.
>
> We do have an implementation of an L2 switch already, what's the use
> case which necessitates this new action / functionality?

It is not an L2 switch - the L2 example switch was what came to mind
of something that does a broadcast (it doesnt depend on MAC addresses
for example). Could have called it a hub. Ex: you could match on many
different header fields or skb metadata, then first modify the packet
using NAT, etc and then "blockcast" it.

cheers,
jamal
Jiri Pirko Oct. 7, 2023, 10:20 a.m. UTC | #7
Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
>On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
>> > > I don't understand the need for configuration less here. You don't have
>> > > it for the rest of the actions. Why this is special?
>>
>> +1, FWIW
>
>We dont have any rule that says all actions MUST have parameters ;->
>There is nothing speacial about any action that doesnt have a
>parameter.

You are getting the configuration from the block/device the action is
attached to. Can you point me to another action doing that?
Jiri Pirko Oct. 7, 2023, 10:22 a.m. UTC | #8
Fri, Oct 06, 2023 at 09:06:45PM CEST, jhs@mojatatu.com wrote:
>On Fri, Oct 6, 2023 at 12:50 PM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Fri, Oct 06, 2023 at 05:37:41PM CEST, jhs@mojatatu.com wrote:
>> >On Fri, Oct 6, 2023 at 8:59 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> >>
>> >> Thu, Oct 05, 2023 at 08:42:25PM CEST, victor@mojatatu.com wrote:
>> >> >__Context__
>> >> >The "tc block" is a collection of netdevs/ports which allow qdiscs to share
>> >> >match-action block instances (as opposed to the traditional tc filter per
>> >> >netdev/port)[1].
>> >> >
>> >> >Example setup:
>> >> >$ tc qdisc add dev ens7 ingress block 22
>> >> >$ tc qdisc add dev ens8 ingress block 22
>> >> >
>> >> >Once the block is created we can add a filter using the block index:
>> >> >$ tc filter add block 22 protocol ip pref 25 \
>> >> >  flower dst_ip 192.168.0.0/16 action drop
>> >> >
>> >> >A packet with dst IP matching 192.168.0.0/16 arriving on the ingress of
>> >> >either ens7 or ens8 is dropped.
>> >> >
>> >> >__This patchset__
>> >> >Up to this point in the implementation, the block is unaware of its ports.
>> >> >This patch fixes that and makes the tc block ports available to the
>> >>
>> >> Odd. You fix a bug. Is there a bug? If yes, you need to describe it. If
>> >> no, don't use "fix".
>> >
>> >Ok, Jiri;->  we will change the language.
>> >
>> >>
>> >> >datapath.
>> >> >
>> >> >For the datapath we provide a use case of the tc block in an action
>> >> >we call "blockcast" in patch 3. This action can be used in an example as
>> >> >such:
>> >> >
>> >> >$ tc qdisc add dev ens7 ingress block 22
>> >> >$ tc qdisc add dev ens8 ingress block 22
>> >> >$ tc qdisc add dev ens9 ingress block 22
>> >> >$ tc filter add block 22 protocol ip pref 25 \
>> >> >  flower dst_ip 192.168.0.0/16 action blockcast
>> >>
>> >> Seems to me a bit odd that the action works with the entity (block) is
>> >> is connected to. I would expect rather to give the action configuration:
>> >>
>> >> $ tc filter add block 22 protocol ip pref 25 \
>> >>   flower dst_ip 192.168.0.0/16 action blockcast block 22
>> >>                                                 ^^^^^^^^
>> >
>> >We are currently passing the blockid in the skb cb field so it is
>> >configuration-less. I suppose we could add this as an optional field
>> >and use it when specified.
>>
>> I don't understand the need for configuration less here. You don't have
>> it for the rest of the actions. Why this is special?
>
>It is not needed really. Think of an L2 switch - the broadcast action
>is to send to all ports but self.
>
>>
>> >
>> >> Then this is more flexible and allows user to use this action for any
>> >> packet, no matter from where it was received.
>> >>
>> >> Looks like this is functionality-wise similar to mirred redirect. Why
>> >> can't we have that action extended to accept block number instead of
>> >> netdev and have something like:
>> >>
>> >> $ tc filter add block 22 protocol ip pref 25 \
>> >>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>> >>
>> >> This would be very much alike we do either "tc filter add dev X" or "tc
>> >> filter add block Y".
>> >>
>> >
>> >We did consider it but concluded it is a lot of work to get it done on
>> >mirred - just take a look at mirred and you'll see what i mean;->
>> >Based on that review we came to the conclusion that at some point it
>> >would be safer to separate mirred's mirror from redirect; there are
>> >too many checks to avoid one or the other based on whether you are
>> >coming from egress vs ingress etc. This one is simple, it is just a
>> >broadcast.
>>
>> Perhaps it is a nice opportunity to do such mirred cleanup, prepare the
>> code and implement block send afterwards?
>
>I was worried about breaking some existing use cases - the code has
>got too clever.
>But probably it is time to show it some love, one of us will invest
>time into it.

Awesome.


>
>> If I omit the code for now, from user perspective, this functionality
>> belongs into mirred, don't you think? Just replace "dev" by "block" and
>> you got what you need.
>
>If we can adequately cleanup mirred,  then we can put it there but
>certainly now we are adding more buttons to click on mirred. It may
>make sense to refactor the mirred code then reuse the refactored code
>in a new action.

I don't understand why you need any new action. mirred redirect to block
instead of dev is exactly what you need. Isn't it?


>
>cheers,
>jamal
>
>>
>> >
>> >
>> >> Regarding the filtering, that could be a simple flag config of mirred
>> >> action:
>> >>
>> >> $ tc filter add block 22 protocol ip pref 25 \
>> >>   flower dst_ip 192.168.0.0/16 action mirred egress redirect block 22
>> >>   srcfilter
>> >>
>> >> Or something like that.
>> >>
>> >
>> >See my comment above.
>> >
>> >cheers,
>> >jamal
>> >> Makes sense?
>> >>
>> >>
>> >>
>> >> >
>> >> >When a packet(matching dst IP 192.168.0.0/16) arrives on the ingress of any
>> >> >of ens7, ens8 or ens9 it will be copied to all ports other than itself.
>> >> >For example, if it arrives on ens8 then a copy of the packet will be
>> >> >"blockcasted";-> to both ens7 and ens9 (unmodified), but not to ens8.
>> >> >
>> >> >Patch 1 introduces the required infra. Patch 2 exposes the tc block to the
>> >> >tc datapath and patch 3 implements datapath usage via a new tc action
>> >> >"blockcast".
>> >> >
>> >> >__Acknowledgements__
>> >> >Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
>> >> >better. The idea of integrating the ports into the tc block was suggested
>> >> >by Jiri Pirko.
>> >> >
>> >> >[1] See commit ca46abd6f89f ("Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'")
>> >> >
>> >> >Changes in v2:
>> >> >  - Remove RFC tag
>> >> >  - Add more details in patch 0(Jiri)
>> >> >  - When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
>> >> >    Reported-by: kernel test robot <lkp@intel.com> (and horms@kernel.org)
>> >> >  - Fix bad dev dereference in printk of blockcast action (Simon)
>> >> >
>> >> >Changes in v3:
>> >> >  - Add missing xa_destroy (pointed out by Vlad)
>> >> >  - Remove bugfix pointed by Vlad (will send in separate patch)
>> >> >  - Removed ports from subject in patch #2 and typos (suggested by Marcelo)
>> >> >  - Remove net_notice_ratelimited debug messages in error
>> >> >    cases (suggested by Marcelo)
>> >> >  - Minor changes to appease sparse's lock context warning
>> >> >
>> >> >Changes in v4:
>> >> >  - Avoid code repetition using gotos in cast_one (suggested by Paolo)
>> >> >  - Fix typo in cover letter (pointed out by Paolo)
>> >> >  - Create a module description for act_blockcast
>> >> >    (reported by Paolo and CI)
>> >> >
>> >> >Victor Nogueira (3):
>> >> >  net/sched: Introduce tc block netdev tracking infra
>> >> >  net/sched: cls_api: Expose tc block to the datapath
>> >> >  net/sched: act_blockcast: Introduce blockcast tc action
>> >> >
>> >> > include/net/sch_generic.h    |   8 +
>> >> > include/net/tc_wrapper.h     |   5 +
>> >> > include/uapi/linux/pkt_cls.h |   1 +
>> >> > net/sched/Kconfig            |  13 ++
>> >> > net/sched/Makefile           |   1 +
>> >> > net/sched/act_blockcast.c    | 297 +++++++++++++++++++++++++++++++++++
>> >> > net/sched/cls_api.c          |  12 +-
>> >> > net/sched/sch_api.c          |  58 +++++++
>> >> > net/sched/sch_generic.c      |  34 +++-
>> >> > 9 files changed, 426 insertions(+), 3 deletions(-)
>> >> > create mode 100644 net/sched/act_blockcast.c
>> >> >
>> >> >--
>> >> >2.25.1
>> >> >
Jamal Hadi Salim Oct. 7, 2023, 11:06 a.m. UTC | #9
On Sat, Oct 7, 2023 at 6:20 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
> >On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >>
> >> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
> >> > > I don't understand the need for configuration less here. You don't have
> >> > > it for the rest of the actions. Why this is special?
> >>
> >> +1, FWIW
> >
> >We dont have any rule that says all actions MUST have parameters ;->
> >There is nothing speacial about any action that doesnt have a
> >parameter.
>
> You are getting the configuration from the block/device the action is
> attached to. Can you point me to another action doing that?

We are entering a pedantic road i am afraid. If there is no existing
action that has zero config then consider this one the first one. We
use skb->metadata all the time as a source of information for actions,
classifiers, qdiscs. If i dont need config i dont need to invent one
just because, well, all other actions are using one or more config;->
Your suggestion to specify an extra config to select a block - which
may be different than the one the one packet on - is a useful
feature(it just adds more code) but really should be optional. i.e if
you dont specify a block id configuration then we pick the metadata
one.

> >If we can adequately cleanup mirred,  then we can put it there but
> >certainly now we are adding more buttons to click on mirred. It may
> >make sense to refactor the mirred code then reuse the refactored code
> >in a new action.
>
> I don't understand why you need any new action. mirred redirect to block
> instead of dev is exactly what you need. Isn't it?

The actions have different meanings and lumping them together then
selecting via a knob is not the right approach.
There's shared code for sure. Infact the sending code was ripped from
mirred so as not to touch the rest because like i said mirred has
since grown a couple of horns and tails. In retrospect mirred should
have been two actions with shared code - but it is too late to change
now because it is very widely used. If someone like me was afraid of
touching it is because there's a maintainance challenge. I consider it
in the same zone as trying to restructure something in the skb.
I agree mirroring to a group of ports with a simple config is a useful
feature. Mirroring to a group via a tc block is simpler but adding a
list of ports instead is more powerful. So this feature is useful to
have in mirred - just the adding of yet one more button to say "skip
this port" is my concern.
Lets see how the refactoring goes first then it will be clearer - it
is a lot of delicate work - but you are right lets give it some love
now.

cheers,
jamal


>
Jiri Pirko Oct. 7, 2023, 12:43 p.m. UTC | #10
Sat, Oct 07, 2023 at 01:06:43PM CEST, jhs@mojatatu.com wrote:
>On Sat, Oct 7, 2023 at 6:20 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
>> >On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>> >>
>> >> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
>> >> > > I don't understand the need for configuration less here. You don't have
>> >> > > it for the rest of the actions. Why this is special?
>> >>
>> >> +1, FWIW
>> >
>> >We dont have any rule that says all actions MUST have parameters ;->
>> >There is nothing speacial about any action that doesnt have a
>> >parameter.
>>
>> You are getting the configuration from the block/device the action is
>> attached to. Can you point me to another action doing that?
>
>We are entering a pedantic road i am afraid. If there is no existing
>action that has zero config then consider this one the first one. We

Nope, nothing pedantic about it. I was just curious if there's anything
out there I missed.


>use skb->metadata all the time as a source of information for actions,
>classifiers, qdiscs. If i dont need config i dont need to invent one

skb->metadata is something that is specific to a packet. That has
nothing to do with the actual configuration.


>just because, well, all other actions are using one or more config;->
>Your suggestion to specify an extra config to select a block - which
>may be different than the one the one packet on - is a useful
>feature(it just adds more code) but really should be optional. i.e if
>you dont specify a block id configuration then we pick the metadata
>one.

My primary point is, this should be mirred redirect to block instead of
what we currently have only for dev. That's it.



>
>> >If we can adequately cleanup mirred,  then we can put it there but
>> >certainly now we are adding more buttons to click on mirred. It may
>> >make sense to refactor the mirred code then reuse the refactored code
>> >in a new action.
>>
>> I don't understand why you need any new action. mirred redirect to block
>> instead of dev is exactly what you need. Isn't it?
>
>The actions have different meanings and lumping them together then
>selecting via a knob is not the right approach.
>There's shared code for sure. Infact the sending code was ripped from
>mirred so as not to touch the rest because like i said mirred has
>since grown a couple of horns and tails. In retrospect mirred should
>have been two actions with shared code - but it is too late to change
>now because it is very widely used. If someone like me was afraid of
>touching it is because there's a maintainance challenge. I consider it
>in the same zone as trying to restructure something in the skb.
>I agree mirroring to a group of ports with a simple config is a useful
>feature. Mirroring to a group via a tc block is simpler but adding a
>list of ports instead is more powerful. So this feature is useful to
>have in mirred - just the adding of yet one more button to say "skip
>this port" is my concern.

Why? Perhaps skb->iif could be used for check in the tx iteration.


>Lets see how the refactoring goes first then it will be clearer - it
>is a lot of delicate work - but you are right lets give it some love
>now.
>
>cheers,
>jamal
>
>
>>
Jamal Hadi Salim Oct. 7, 2023, 2:09 p.m. UTC | #11
On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Oct 07, 2023 at 01:06:43PM CEST, jhs@mojatatu.com wrote:
> >On Sat, Oct 7, 2023 at 6:20 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
> >> >On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >> >>
> >> >> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
> >> >> > > I don't understand the need for configuration less here. You don't have
> >> >> > > it for the rest of the actions. Why this is special?
> >> >>
> >> >> +1, FWIW
> >> >
> >> >We dont have any rule that says all actions MUST have parameters ;->
> >> >There is nothing speacial about any action that doesnt have a
> >> >parameter.
> >>
> >> You are getting the configuration from the block/device the action is
> >> attached to. Can you point me to another action doing that?
> >
> >We are entering a pedantic road i am afraid. If there is no existing
> >action that has zero config then consider this one the first one. We
>
> Nope, nothing pedantic about it. I was just curious if there's anything
> out there I missed.
>

Not sure if you noticed in the patch: the blockid on which the skb
arrived on is now available in the tc_cb[] so when it shows up at the
action we can just use it.

>
> >use skb->metadata all the time as a source of information for actions,
> >classifiers, qdiscs. If i dont need config i dont need to invent one
>
> skb->metadata is something that is specific to a packet. That has
> nothing to do with the actual configuration.

Essentially we turned blockid into skb metadata. A user specifying
configuration of a different blockid is certainly useful. My point is
we can have both worlds: when such a user config is missing we'll
assume a default which happens to be in the skb.

>
> >just because, well, all other actions are using one or more config;->
> >Your suggestion to specify an extra config to select a block - which
> >may be different than the one the one packet on - is a useful
> >feature(it just adds more code) but really should be optional. i.e if
> >you dont specify a block id configuration then we pick the metadata
> >one.
>
> My primary point is, this should be mirred redirect to block instead of
> what we currently have only for dev. That's it.

Agreed (and such a feature should be added regardless of this action).
The tc block provides a simple abstraction, but do you think it is
enough? Alternative is to use a list of ports given to mirred: it
allows us to group ports from different tc blocks or even just a
subset of what is in a tc block - but it will require a lot more code
to express such functionality.

>
>
> >
> >> >If we can adequately cleanup mirred,  then we can put it there but
> >> >certainly now we are adding more buttons to click on mirred. It may
> >> >make sense to refactor the mirred code then reuse the refactored code
> >> >in a new action.
> >>
> >> I don't understand why you need any new action. mirred redirect to block
> >> instead of dev is exactly what you need. Isn't it?
> >
> >The actions have different meanings and lumping them together then
> >selecting via a knob is not the right approach.
> >There's shared code for sure. Infact the sending code was ripped from
> >mirred so as not to touch the rest because like i said mirred has
> >since grown a couple of horns and tails. In retrospect mirred should
> >have been two actions with shared code - but it is too late to change
> >now because it is very widely used. If someone like me was afraid of
> >touching it is because there's a maintainance challenge. I consider it
> >in the same zone as trying to restructure something in the skb.
> >I agree mirroring to a group of ports with a simple config is a useful
> >feature. Mirroring to a group via a tc block is simpler but adding a
> >list of ports instead is more powerful. So this feature is useful to
> >have in mirred - just the adding of yet one more button to say "skip
> >this port" is my concern.
>
> Why? Perhaps skb->iif could be used for check in the tx iteration.
>

We use skb->dev->ifindex to find the exception. Is iif better?.
Jiri - but why does this have to be part of mirred::mirror? I am
asking the same question of why mirror and redirect have to be part
mirred instead of separate actions.

cheers,
jamal
Jiri Pirko Oct. 7, 2023, 5:20 p.m. UTC | #12
Sat, Oct 07, 2023 at 04:09:15PM CEST, jhs@mojatatu.com wrote:
>On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Sat, Oct 07, 2023 at 01:06:43PM CEST, jhs@mojatatu.com wrote:
>> >On Sat, Oct 7, 2023 at 6:20 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> >>
>> >> Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
>> >> >On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
>> >> >>
>> >> >> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
>> >> >> > > I don't understand the need for configuration less here. You don't have
>> >> >> > > it for the rest of the actions. Why this is special?
>> >> >>
>> >> >> +1, FWIW
>> >> >
>> >> >We dont have any rule that says all actions MUST have parameters ;->
>> >> >There is nothing speacial about any action that doesnt have a
>> >> >parameter.
>> >>
>> >> You are getting the configuration from the block/device the action is
>> >> attached to. Can you point me to another action doing that?
>> >
>> >We are entering a pedantic road i am afraid. If there is no existing
>> >action that has zero config then consider this one the first one. We
>>
>> Nope, nothing pedantic about it. I was just curious if there's anything
>> out there I missed.
>>
>
>Not sure if you noticed in the patch: the blockid on which the skb
>arrived on is now available in the tc_cb[] so when it shows up at the
>action we can just use it.

I see, but does it has to be? I don't think so with the solution I'm
proposing.


>
>>
>> >use skb->metadata all the time as a source of information for actions,
>> >classifiers, qdiscs. If i dont need config i dont need to invent one
>>
>> skb->metadata is something that is specific to a packet. That has
>> nothing to do with the actual configuration.
>
>Essentially we turned blockid into skb metadata. A user specifying
>configuration of a different blockid is certainly useful. My point is
>we can have both worlds: when such a user config is missing we'll
>assume a default which happens to be in the skb.
>
>>
>> >just because, well, all other actions are using one or more config;->
>> >Your suggestion to specify an extra config to select a block - which
>> >may be different than the one the one packet on - is a useful
>> >feature(it just adds more code) but really should be optional. i.e if
>> >you dont specify a block id configuration then we pick the metadata
>> >one.
>>
>> My primary point is, this should be mirred redirect to block instead of
>> what we currently have only for dev. That's it.
>
>Agreed (and such a feature should be added regardless of this action).
>The tc block provides a simple abstraction, but do you think it is
>enough? Alternative is to use a list of ports given to mirred: it
>allows us to group ports from different tc blocks or even just a
>subset of what is in a tc block - but it will require a lot more code
>to express such functionality.

Again, you attach filter to either dev or block. If you extend mirred
redirect to accept the same 2 types of target, I think it would be best.


>
>>
>>
>> >
>> >> >If we can adequately cleanup mirred,  then we can put it there but
>> >> >certainly now we are adding more buttons to click on mirred. It may
>> >> >make sense to refactor the mirred code then reuse the refactored code
>> >> >in a new action.
>> >>
>> >> I don't understand why you need any new action. mirred redirect to block
>> >> instead of dev is exactly what you need. Isn't it?
>> >
>> >The actions have different meanings and lumping them together then
>> >selecting via a knob is not the right approach.
>> >There's shared code for sure. Infact the sending code was ripped from
>> >mirred so as not to touch the rest because like i said mirred has
>> >since grown a couple of horns and tails. In retrospect mirred should
>> >have been two actions with shared code - but it is too late to change
>> >now because it is very widely used. If someone like me was afraid of
>> >touching it is because there's a maintainance challenge. I consider it
>> >in the same zone as trying to restructure something in the skb.
>> >I agree mirroring to a group of ports with a simple config is a useful
>> >feature. Mirroring to a group via a tc block is simpler but adding a
>> >list of ports instead is more powerful. So this feature is useful to
>> >have in mirred - just the adding of yet one more button to say "skip
>> >this port" is my concern.
>>
>> Why? Perhaps skb->iif could be used for check in the tx iteration.
>>
>
>We use skb->dev->ifindex to find the exception. Is iif better?.

iif contains ifindex of the actual ingress device. So if the netdev is
part of bond for example, this still contains the original ifindex.
So I buess that this depends on what you need. Looks to me that
skb->dev->ifindex would be probaly better. It contains the netdev that
the filter is attached on, right?


>Jiri - but why does this have to be part of mirred::mirror? I am
>asking the same question of why mirror and redirect have to be part
>mirred instead of separate actions.

You have to maintain the backwards compatibility. Currently mirred is
one action right? Does not matter how you do it in kernel, user should
not tell any difference.


>
>cheers,
>jamal
Jamal Hadi Salim Oct. 8, 2023, 12:38 p.m. UTC | #13
On Sat, Oct 7, 2023 at 1:20 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Sat, Oct 07, 2023 at 04:09:15PM CEST, jhs@mojatatu.com wrote:
> >On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Sat, Oct 07, 2023 at 01:06:43PM CEST, jhs@mojatatu.com wrote:
> >> >On Sat, Oct 7, 2023 at 6:20 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >> >>
> >> >> Sat, Oct 07, 2023 at 01:00:00AM CEST, jhs@mojatatu.com wrote:
> >> >> >On Fri, Oct 6, 2023 at 6:25 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >> >> >>
> >> >> >> On Fri, 6 Oct 2023 15:06:45 -0400 Jamal Hadi Salim wrote:
> >> >> >> > > I don't understand the need for configuration less here. You don't have
> >> >> >> > > it for the rest of the actions. Why this is special?
> >> >> >>
> >> >> >> +1, FWIW
> >> >> >
> >> >> >We dont have any rule that says all actions MUST have parameters ;->
> >> >> >There is nothing speacial about any action that doesnt have a
> >> >> >parameter.
> >> >>
> >> >> You are getting the configuration from the block/device the action is
> >> >> attached to. Can you point me to another action doing that?
> >> >
> >> >We are entering a pedantic road i am afraid. If there is no existing
> >> >action that has zero config then consider this one the first one. We
> >>
> >> Nope, nothing pedantic about it. I was just curious if there's anything
> >> out there I missed.
> >>
> >
> >Not sure if you noticed in the patch: the blockid on which the skb
> >arrived on is now available in the tc_cb[] so when it shows up at the
> >action we can just use it.
>
> I see, but does it has to be? I don't think so with the solution I'm
> proposing.

It's a simplistic use for a broadcast. We should support the one you
suggested as well.

> >
> >>
> >> >use skb->metadata all the time as a source of information for actions,
> >> >classifiers, qdiscs. If i dont need config i dont need to invent one
> >>
> >> skb->metadata is something that is specific to a packet. That has
> >> nothing to do with the actual configuration.
> >
> >Essentially we turned blockid into skb metadata. A user specifying
> >configuration of a different blockid is certainly useful. My point is
> >we can have both worlds: when such a user config is missing we'll
> >assume a default which happens to be in the skb.
> >
> >>
> >> >just because, well, all other actions are using one or more config;->
> >> >Your suggestion to specify an extra config to select a block - which
> >> >may be different than the one the one packet on - is a useful
> >> >feature(it just adds more code) but really should be optional. i.e if
> >> >you dont specify a block id configuration then we pick the metadata
> >> >one.
> >>
> >> My primary point is, this should be mirred redirect to block instead of
> >> what we currently have only for dev. That's it.
> >
> >Agreed (and such a feature should be added regardless of this action).
> >The tc block provides a simple abstraction, but do you think it is
> >enough? Alternative is to use a list of ports given to mirred: it
> >allows us to group ports from different tc blocks or even just a
> >subset of what is in a tc block - but it will require a lot more code
> >to express such functionality.
>
> Again, you attach filter to either dev or block. If you extend mirred
> redirect to accept the same 2 types of target, I think it would be best.
>

We are going to make block work with mirror, it makes sense. I am not
sure about the redirect, what is the semantic? mirror to everyone but
redirect to the last one?

>
> >
> >>
> >>
> >> >
> >> >> >If we can adequately cleanup mirred,  then we can put it there but
> >> >> >certainly now we are adding more buttons to click on mirred. It may
> >> >> >make sense to refactor the mirred code then reuse the refactored code
> >> >> >in a new action.
> >> >>
> >> >> I don't understand why you need any new action. mirred redirect to block
> >> >> instead of dev is exactly what you need. Isn't it?
> >> >
> >> >The actions have different meanings and lumping them together then
> >> >selecting via a knob is not the right approach.
> >> >There's shared code for sure. Infact the sending code was ripped from
> >> >mirred so as not to touch the rest because like i said mirred has
> >> >since grown a couple of horns and tails. In retrospect mirred should
> >> >have been two actions with shared code - but it is too late to change
> >> >now because it is very widely used. If someone like me was afraid of
> >> >touching it is because there's a maintainance challenge. I consider it
> >> >in the same zone as trying to restructure something in the skb.
> >> >I agree mirroring to a group of ports with a simple config is a useful
> >> >feature. Mirroring to a group via a tc block is simpler but adding a
> >> >list of ports instead is more powerful. So this feature is useful to
> >> >have in mirred - just the adding of yet one more button to say "skip
> >> >this port" is my concern.
> >>
> >> Why? Perhaps skb->iif could be used for check in the tx iteration.
> >>
> >
> >We use skb->dev->ifindex to find the exception. Is iif better?.
>
> iif contains ifindex of the actual ingress device. So if the netdev is
> part of bond for example, this still contains the original ifindex.
> So I buess that this depends on what you need. Looks to me that
> skb->dev->ifindex would be probaly better. It contains the netdev that
> the filter is attached on, right?
>

Note: you can use mirred to redirect to either ingress or egress of
other ports - I believe one of these ifindices changes to reflect the
new ifindex. We'll take a closer look.

>
> >Jiri - but why does this have to be part of mirred::mirror? I am
> >asking the same question of why mirror and redirect have to be part
> >mirred instead of separate actions.
>
> You have to maintain the backwards compatibility. Currently mirred is
> one action right? Does not matter how you do it in kernel, user should
> not tell any difference.

I dont mean to break existing mirred. What i meant was in retrospect i
wish i had the insight to separate mirred into two actions(and share
the code instead), it would have simplified the code and its
maintainance. It is for the same reason i am not in favor of is adding
the "skip this port" in mirror. This is in the spirit of unix
philosophy, which we have been mostly adhering to: write small
features/actions which do one thing well and stitch them together to
compose.

cheers,
jamal
>
> >
> >cheers,
> >jamal
Marcelo Ricardo Leitner Oct. 9, 2023, 8:54 p.m. UTC | #14
On Sat, Oct 07, 2023 at 07:20:52PM +0200, Jiri Pirko wrote:
> Sat, Oct 07, 2023 at 04:09:15PM CEST, jhs@mojatatu.com wrote:
> >On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
...
> >> My primary point is, this should be mirred redirect to block instead of
> >> what we currently have only for dev. That's it.
> >
> >Agreed (and such a feature should be added regardless of this action).
> >The tc block provides a simple abstraction, but do you think it is
> >enough? Alternative is to use a list of ports given to mirred: it
> >allows us to group ports from different tc blocks or even just a
> >subset of what is in a tc block - but it will require a lot more code
> >to express such functionality.
>
> Again, you attach filter to either dev or block. If you extend mirred
> redirect to accept the same 2 types of target, I think it would be best.

The difference here between filter and action here is that you don't
really have an option for filters: you either attach it to either dev
or block, or you create an entire new class of objects, say,
"blockfilter", all while retaining the same filters, parameters, etc.
I'm not aware of a single filter that behaves differently over a block
than a netdev.

But for actions, there is the option, and despite the fact that both
"output packets", the semantics are not that close. It actually
helps with parameter parsing, documentation (man pages), testing (as
use and test cases can be more easily tracked) and perhaps more
importantly: if I don't want this feature, I can disable the new
module.

Later someone will say "hey why not have a hash_dst_selector", so it
can implement a load balancer through the block output? And mirred,
once a simple use case (with an already complex implementation),
becomes a partial implementation of bonding then. :)

In short, I'm not sure if having the user to fiddle through a maze of
options that only work in mode A or B or work differently is better
than having more specialized actions (which can and should reuse code).

  Marcelo
Jiri Pirko Oct. 10, 2023, 7:41 a.m. UTC | #15
Mon, Oct 09, 2023 at 10:54:07PM CEST, mleitner@redhat.com wrote:
>On Sat, Oct 07, 2023 at 07:20:52PM +0200, Jiri Pirko wrote:
>> Sat, Oct 07, 2023 at 04:09:15PM CEST, jhs@mojatatu.com wrote:
>> >On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
>...
>> >> My primary point is, this should be mirred redirect to block instead of
>> >> what we currently have only for dev. That's it.
>> >
>> >Agreed (and such a feature should be added regardless of this action).
>> >The tc block provides a simple abstraction, but do you think it is
>> >enough? Alternative is to use a list of ports given to mirred: it
>> >allows us to group ports from different tc blocks or even just a
>> >subset of what is in a tc block - but it will require a lot more code
>> >to express such functionality.
>>
>> Again, you attach filter to either dev or block. If you extend mirred
>> redirect to accept the same 2 types of target, I think it would be best.
>
>The difference here between filter and action here is that you don't
>really have an option for filters: you either attach it to either dev
>or block, or you create an entire new class of objects, say,
>"blockfilter", all while retaining the same filters, parameters, etc.
>I'm not aware of a single filter that behaves differently over a block
>than a netdev.

Why do you talk about different behaviour? Nobody suggested that. I have
no idea what you mean by "blockfilter".



>
>But for actions, there is the option, and despite the fact that both

Which option?


>"output packets", the semantics are not that close. It actually
>helps with parameter parsing, documentation (man pages), testing (as
>use and test cases can be more easily tracked) and perhaps more
>importantly: if I don't want this feature, I can disable the new
>module.
>
>Later someone will say "hey why not have a hash_dst_selector", so it
>can implement a load balancer through the block output? And mirred,
>once a simple use case (with an already complex implementation),
>becomes a partial implementation of bonding then. :)
>
>In short, I'm not sure if having the user to fiddle through a maze of
>options that only work in mode A or B or work differently is better
>than having more specialized actions (which can and should reuse code).

Sure, you can have "blockmirredredirect" that would only to the
redirection for block target. No problem. I don't see why it can't be
just a case of mirred redirect, but if people want that separate, why
not.

My problem with the action "blockcast" is that it somehow works with
configuration of an entity the filter is attached to:
blockX->filterY->blockcastZ

Z goes all the way down to blockX to figure out where to redirect the
packet. If that is not odd, then I don't know what else is.

Has other consequences, like what happens if the filter is not attached
to block, but dev directly? What happens is blockcast action is shared
among filter? Etc.

Configuration should be action property. That is my point.



>
>  Marcelo
>
Marcelo Ricardo Leitner Oct. 10, 2023, 11:54 a.m. UTC | #16
On Tue, Oct 10, 2023 at 09:41:11AM +0200, Jiri Pirko wrote:
> Mon, Oct 09, 2023 at 10:54:07PM CEST, mleitner@redhat.com wrote:
> >On Sat, Oct 07, 2023 at 07:20:52PM +0200, Jiri Pirko wrote:
> >> Sat, Oct 07, 2023 at 04:09:15PM CEST, jhs@mojatatu.com wrote:
> >> >On Sat, Oct 7, 2023 at 8:43 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >...
> >> >> My primary point is, this should be mirred redirect to block instead of
> >> >> what we currently have only for dev. That's it.
> >> >
> >> >Agreed (and such a feature should be added regardless of this action).
> >> >The tc block provides a simple abstraction, but do you think it is
> >> >enough? Alternative is to use a list of ports given to mirred: it
> >> >allows us to group ports from different tc blocks or even just a
> >> >subset of what is in a tc block - but it will require a lot more code
> >> >to express such functionality.
> >>
> >> Again, you attach filter to either dev or block. If you extend mirred
> >> redirect to accept the same 2 types of target, I think it would be best.
> >
> >The difference here between filter and action here is that you don't
> >really have an option for filters: you either attach it to either dev
> >or block, or you create an entire new class of objects, say,
> >"blockfilter", all while retaining the same filters, parameters, etc.
> >I'm not aware of a single filter that behaves differently over a block
> >than a netdev.
>
> Why do you talk about different behaviour? Nobody suggested that. I have

Because if mirred gets updated to support blocks, that's what will
happen. It will have options that only make sense for netdev or for
blocks.

> no idea what you mean by "blockfilter".

Well, it's described above. It was just an example.

>
>
>
> >
> >But for actions, there is the option, and despite the fact that both
>
> Which option?

To create a new action.

>
>
> >"output packets", the semantics are not that close. It actually
> >helps with parameter parsing, documentation (man pages), testing (as
> >use and test cases can be more easily tracked) and perhaps more
> >importantly: if I don't want this feature, I can disable the new
> >module.
> >
> >Later someone will say "hey why not have a hash_dst_selector", so it
> >can implement a load balancer through the block output? And mirred,
> >once a simple use case (with an already complex implementation),
> >becomes a partial implementation of bonding then. :)
> >
> >In short, I'm not sure if having the user to fiddle through a maze of
> >options that only work in mode A or B or work differently is better
> >than having more specialized actions (which can and should reuse code).
>
> Sure, you can have "blockmirredredirect" that would only to the
> redirection for block target. No problem. I don't see why it can't be
> just a case of mirred redirect, but if people want that separate, why
> not.
>
> My problem with the action "blockcast" is that it somehow works with
> configuration of an entity the filter is attached to:
> blockX->filterY->blockcastZ
>
> Z goes all the way down to blockX to figure out where to redirect the
> packet. If that is not odd, then I don't know what else is.

Maybe fried eggs with chocolate. :-D
Just joking..

>
> Has other consequences, like what happens if the filter is not attached
> to block, but dev directly? What happens is blockcast action is shared
> among filter? Etc.

Good points!

>
> Configuration should be action property. That is my point.

Makes sense. Whoever is adding such filter, already knows the block,
and can add the parameter to the action as well. Making it automatic
is not helping out that much in the end.

  Marcelo