diff mbox series

[net-next,v2,6/9] net: openvswitch: store sampling probability in cb.

Message ID 20240603185647.2310748-7-amorenoz@redhat.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: openvswitch: Add sample multicasting. | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; GEN HAS DIFF 2 files changed, 276 insertions(+), 164 deletions(-);
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 901 this patch: 901
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 905 this patch: 905
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 905 this patch: 905
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 88 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 68 this patch: 68
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-04--00-00 (tests: 1040)

Commit Message

Adrian Moreno June 3, 2024, 6:56 p.m. UTC
The behavior of actions might not be the exact same if they are being
executed inside a nested sample action. Store the probability of the
parent sample action in the skb's cb area.

Use the probability in emit_sample to pass it down to psample.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 include/uapi/linux/openvswitch.h |  3 ++-
 net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
 net/openvswitch/datapath.h       |  3 +++
 net/openvswitch/vport.c          |  1 +
 4 files changed, 28 insertions(+), 4 deletions(-)

Comments

kernel test robot June 4, 2024, 6:09 a.m. UTC | #1
Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20240604/202406041339.ytdRh41V-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041339.ytdRh41V-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406041339.ytdRh41V-lkp@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-zydacron.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-viewsonic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-waltop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-winwing.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/fbtft/fbtft.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-bootrom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-spilib.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-hid.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-light.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-log.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-loopback.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-power-supply.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-raw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-vibrator.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-audio-manager.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-gbphy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-i2c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-pwm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-sdio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-spi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-uart.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-usb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/goldfish/goldfish_pipe.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/chrome/cros_kunit_proto_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mailbox/mtk-cmdq-mailbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_simpleondemand.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_performance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_powersave.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_userspace.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwtracing/intel_th/intel_th_msu_sink.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem-apple-efuses.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem_brcm_nvram.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem_u-boot-env.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mm-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mq-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mn-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mp-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hte/hte-tegra194-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vdpa/vdpa.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/parport/parport.o
WARNING: modpost: drivers/parport/parport_amiga: section mismatch in reference: amiga_parallel_driver+0x8 (section: .data) -> amiga_parallel_remove (section: .exit.text)
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/parsers/brcm_u-boot.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/parsers/tplink_safeloader.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/chips/cfi_util.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/chips/cfi_cmdset_0020.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/maps/map_funcs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/hisi-spmi-controller.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/spmi-pmic-arb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/uio/uio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pcmcia/pcmcia_rsrc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pcmcia/i82365.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwmon/corsair-cpro.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwmon/mr75203.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vhost/vringh.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/greybus/greybus.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/greybus/gb-es2.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/rpmsg/rpmsg_char.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/adc/ingenic-adc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/adc/xilinx-ams.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/buffer/kfifo_buf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-hub.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-aspeed.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-ast-cf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-scom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/siox/siox-bus-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/counter/ftm-quaddec.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/oss/dmasound/dmasound_core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/codecs/snd-soc-wm-adsp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/fsl/imx-pcm-dma.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/mxs/snd-soc-mxs-pcm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/qcom/snd-soc-qcom-sdw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-intel-atom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-acpi-intel-byt.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-acpi-intel-bdw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8m.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8ulp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/imx-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mtk-adsp-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mt8195/snd-sof-mt8195.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mt8186/snd-sof-mt8186.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-utils.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-acpi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-of.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mtty.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mdpy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mdpy-fb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mbochs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/configfs/configfs_sample.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/bytestream-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/dma-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/inttype-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/record-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kobject/kobject-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kobject/kset-example.o
>> ERROR: modpost: "__udivdi3" [net/openvswitch/openvswitch.ko] undefined!
kernel test robot June 4, 2024, 8:49 a.m. UTC | #2
Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406041623.ycwsuP85-lkp@intel.com/

All errors (new ones prefixed by >>):

   m68k-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
>> actions.c:(.text+0x214e): undefined reference to `__udivdi3'
Adrian Moreno June 5, 2024, 7:34 p.m. UTC | #3
On Tue, Jun 04, 2024 at 04:49:39PM GMT, kernel test robot wrote:
> Hi Adrian,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on net-next/main]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
> base:   net-next/main
> patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
> patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
> config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/config)
> compiler: m68k-linux-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202406041623.ycwsuP85-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    m68k-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
> >> actions.c:(.text+0x214e): undefined reference to `__udivdi3'
>

I forgot about architectures that don't have native u64 division. I will
use "do_div" in the next version of the series.

> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
Aaron Conole June 14, 2024, 4:55 p.m. UTC | #4
Adrian Moreno <amorenoz@redhat.com> writes:

> The behavior of actions might not be the exact same if they are being
> executed inside a nested sample action. Store the probability of the
> parent sample action in the skb's cb area.

What does that mean?

> Use the probability in emit_sample to pass it down to psample.
>
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  include/uapi/linux/openvswitch.h |  3 ++-
>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
>  net/openvswitch/datapath.h       |  3 +++
>  net/openvswitch/vport.c          |  1 +
>  4 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index a0e9dde0584a..9d675725fa2b 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
>   * Actions are passed as nested attributes.
>   *
>   * Executes the specified actions with the given probability on a per-packet
> - * basis.
> + * basis. Nested actions will be able to access the probability value of the
> + * parent @OVS_ACTION_ATTR_SAMPLE.
>   */
>  enum ovs_sample_attr {
>  	OVS_SAMPLE_ATTR_UNSPEC,
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 3b4dba0ded59..33f6d93ba5e4 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  	struct nlattr *sample_arg;
>  	int rem = nla_len(attr);
>  	const struct sample_arg *arg;
> +	u32 init_probability;
>  	bool clone_flow_key;
> +	int err;
>  
>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
>  	sample_arg = nla_data(attr);
>  	arg = nla_data(sample_arg);
>  	actions = nla_next(sample_arg, &rem);
> +	init_probability = OVS_CB(skb)->probability;
>  
>  	if ((arg->probability != U32_MAX) &&
>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  		return 0;
>  	}
>  
> +	if (init_probability) {
> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> +					    arg->probability / U32_MAX);
> +	} else {
> +		OVS_CB(skb)->probability = arg->probability;
> +	}
> +

I'm confused by this.  Eventually, integer arithmetic will practically
guarantee that nested sample() calls will go to 0.  So eventually, the
test above will be impossible to meet mathematically.

OTOH, you could argue that a 1% of 50% is low anyway, but it still would
have a positive probability count, and still be possible for
get_random_u32() call to match.

I'm not sure about this particular change.  Why do we need it?

>  	clone_flow_key = !arg->exec;
> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> -			     clone_flow_key);
> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> +			    clone_flow_key);
> +
> +	if (!last)

Is this right?  Don't we only want to set the probability on the last
action?  Should the test be 'if (last)'?

> +		OVS_CB(skb)->probability = init_probability;
> +
> +	return err;
>  }
>  
>  /* When 'last' is true, clone() should always consume the 'skb'.
> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>  	struct psample_metadata md = {};
>  	struct vport *input_vport;
>  	const struct nlattr *a;
> +	u32 rate;
>  	int rem;
>  
>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>  
>  	md.in_ifindex = input_vport->dev->ifindex;
>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> +	md.rate_as_probability = 1;
> +
> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
>  
> -	psample_sample_packet(&psample_group, skb, 0, &md);
> +	psample_sample_packet(&psample_group, skb, rate, &md);
>  #endif
>  
>  	return 0;
> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> index 0cd29971a907..9ca6231ea647 100644
> --- a/net/openvswitch/datapath.h
> +++ b/net/openvswitch/datapath.h
> @@ -115,12 +115,15 @@ struct datapath {
>   * fragmented.
>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
>   * @cutlen: The number of bytes from the packet end to be removed.
> + * @probability: The sampling probability that was applied to this skb; 0 means
> + * no sampling has occurred; U32_MAX means 100% probability.
>   */
>  struct ovs_skb_cb {
>  	struct vport		*input_vport;
>  	u16			mru;
>  	u16			acts_origlen;
>  	u32			cutlen;
> +	u32			probability;
>  };
>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
>  
> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> index 972ae01a70f7..8732f6e51ae5 100644
> --- a/net/openvswitch/vport.c
> +++ b/net/openvswitch/vport.c
> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
>  	OVS_CB(skb)->input_vport = vport;
>  	OVS_CB(skb)->mru = 0;
>  	OVS_CB(skb)->cutlen = 0;
> +	OVS_CB(skb)->probability = 0;
>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
>  		u32 mark;
Adrian Moreno June 17, 2024, 7:08 a.m. UTC | #5
On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
> Adrian Moreno <amorenoz@redhat.com> writes:
>
> > The behavior of actions might not be the exact same if they are being
> > executed inside a nested sample action. Store the probability of the
> > parent sample action in the skb's cb area.
>
> What does that mean?
>

Emit action, for instance, needs the probability so that psample
consumers know what was the sampling rate applied. Also, the way we
should inform about packet drops (via kfree_skb_reason) changes (see
patch 7/9).

> > Use the probability in emit_sample to pass it down to psample.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
> >  include/uapi/linux/openvswitch.h |  3 ++-
> >  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
> >  net/openvswitch/datapath.h       |  3 +++
> >  net/openvswitch/vport.c          |  1 +
> >  4 files changed, 28 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> > index a0e9dde0584a..9d675725fa2b 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -649,7 +649,8 @@ enum ovs_flow_attr {
> >   * Actions are passed as nested attributes.
> >   *
> >   * Executes the specified actions with the given probability on a per-packet
> > - * basis.
> > + * basis. Nested actions will be able to access the probability value of the
> > + * parent @OVS_ACTION_ATTR_SAMPLE.
> >   */
> >  enum ovs_sample_attr {
> >  	OVS_SAMPLE_ATTR_UNSPEC,
> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 3b4dba0ded59..33f6d93ba5e4 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >  	struct nlattr *sample_arg;
> >  	int rem = nla_len(attr);
> >  	const struct sample_arg *arg;
> > +	u32 init_probability;
> >  	bool clone_flow_key;
> > +	int err;
> >
> >  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
> >  	sample_arg = nla_data(attr);
> >  	arg = nla_data(sample_arg);
> >  	actions = nla_next(sample_arg, &rem);
> > +	init_probability = OVS_CB(skb)->probability;
> >
> >  	if ((arg->probability != U32_MAX) &&
> >  	    (!arg->probability || get_random_u32() > arg->probability)) {
> > @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >  		return 0;
> >  	}
> >
> > +	if (init_probability) {
> > +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> > +					    arg->probability / U32_MAX);
> > +	} else {
> > +		OVS_CB(skb)->probability = arg->probability;
> > +	}
> > +
>
> I'm confused by this.  Eventually, integer arithmetic will practically
> guarantee that nested sample() calls will go to 0.  So eventually, the
> test above will be impossible to meet mathematically.
>
> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
> have a positive probability count, and still be possible for
> get_random_u32() call to match.
>

Using OVS's probability semantics, we can express probabilities as low
as (100/U32_MAX)% which is pretty low indeed. However, just because the
probability of executing the action is low I don't think we should not
report it.

Rethinking the integer arithmetics, it's true that we should avoid
hitting zero on the division, eg: nesting 6x 1% sampling rates will make
the result be zero which will make probability restoration fail on the
way back. Threrefore, the new probability should be at least 1.


> I'm not sure about this particular change.  Why do we need it?
>

Why do we need to propagate the probability down to nested "sample"
actions? or why do we need to store the probability in the cb area in
the first place?

The former: Just for correctness as only storing the last one would be
incorrect. Although I don't know of any use for nested "sample" actions.
The latter: To pass it down to psample so that sample receivers know how
the sampling rate applied (and, e.g: do throughput estimations like OVS
does with IPFIX).


> >  	clone_flow_key = !arg->exec;
> > -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> > -			     clone_flow_key);
> > +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> > +			    clone_flow_key);
> > +
> > +	if (!last)
>
> Is this right?  Don't we only want to set the probability on the last
> action?  Should the test be 'if (last)'?
>

This is restoring the parent's probability after the actions in the
current sample action have been executed.

If it was the last action there is no need to restore the probability
back to the parent's (or zero if it's there's only one level) since no
further action will require it. And more importantly, if it's the last
action, the packet gets free'ed inside that "branch" so we must not
access its memory.


> > +		OVS_CB(skb)->probability = init_probability;
> > +
> > +	return err;
> >  }
> >
> >  /* When 'last' is true, clone() should always consume the 'skb'.
> > @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >  	struct psample_metadata md = {};
> >  	struct vport *input_vport;
> >  	const struct nlattr *a;
> > +	u32 rate;
> >  	int rem;
> >
> >  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> > @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >
> >  	md.in_ifindex = input_vport->dev->ifindex;
> >  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> > +	md.rate_as_probability = 1;
> > +
> > +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
> >
> > -	psample_sample_packet(&psample_group, skb, 0, &md);
> > +	psample_sample_packet(&psample_group, skb, rate, &md);
> >  #endif
> >
> >  	return 0;
> > diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> > index 0cd29971a907..9ca6231ea647 100644
> > --- a/net/openvswitch/datapath.h
> > +++ b/net/openvswitch/datapath.h
> > @@ -115,12 +115,15 @@ struct datapath {
> >   * fragmented.
> >   * @acts_origlen: The netlink size of the flow actions applied to this skb.
> >   * @cutlen: The number of bytes from the packet end to be removed.
> > + * @probability: The sampling probability that was applied to this skb; 0 means
> > + * no sampling has occurred; U32_MAX means 100% probability.
> >   */
> >  struct ovs_skb_cb {
> >  	struct vport		*input_vport;
> >  	u16			mru;
> >  	u16			acts_origlen;
> >  	u32			cutlen;
> > +	u32			probability;
> >  };
> >  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
> >
> > diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> > index 972ae01a70f7..8732f6e51ae5 100644
> > --- a/net/openvswitch/vport.c
> > +++ b/net/openvswitch/vport.c
> > @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
> >  	OVS_CB(skb)->input_vport = vport;
> >  	OVS_CB(skb)->mru = 0;
> >  	OVS_CB(skb)->cutlen = 0;
> > +	OVS_CB(skb)->probability = 0;
> >  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
> >  		u32 mark;
>
Ilya Maximets June 17, 2024, 11:26 a.m. UTC | #6
On 6/17/24 09:08, Adrián Moreno wrote:
> On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
>> Adrian Moreno <amorenoz@redhat.com> writes:
>>
>>> The behavior of actions might not be the exact same if they are being
>>> executed inside a nested sample action. Store the probability of the
>>> parent sample action in the skb's cb area.
>>
>> What does that mean?
>>
> 
> Emit action, for instance, needs the probability so that psample
> consumers know what was the sampling rate applied. Also, the way we
> should inform about packet drops (via kfree_skb_reason) changes (see
> patch 7/9).
> 
>>> Use the probability in emit_sample to pass it down to psample.
>>>
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>  include/uapi/linux/openvswitch.h |  3 ++-
>>>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
>>>  net/openvswitch/datapath.h       |  3 +++
>>>  net/openvswitch/vport.c          |  1 +
>>>  4 files changed, 28 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
>>> index a0e9dde0584a..9d675725fa2b 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
>>>   * Actions are passed as nested attributes.
>>>   *
>>>   * Executes the specified actions with the given probability on a per-packet
>>> - * basis.
>>> + * basis. Nested actions will be able to access the probability value of the
>>> + * parent @OVS_ACTION_ATTR_SAMPLE.
>>>   */
>>>  enum ovs_sample_attr {
>>>  	OVS_SAMPLE_ATTR_UNSPEC,
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 3b4dba0ded59..33f6d93ba5e4 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>  	struct nlattr *sample_arg;
>>>  	int rem = nla_len(attr);
>>>  	const struct sample_arg *arg;
>>> +	u32 init_probability;
>>>  	bool clone_flow_key;
>>> +	int err;
>>>
>>>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
>>>  	sample_arg = nla_data(attr);
>>>  	arg = nla_data(sample_arg);
>>>  	actions = nla_next(sample_arg, &rem);
>>> +	init_probability = OVS_CB(skb)->probability;
>>>
>>>  	if ((arg->probability != U32_MAX) &&
>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>  		return 0;
>>>  	}
>>>
>>> +	if (init_probability) {
>>> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
>>> +					    arg->probability / U32_MAX);
>>> +	} else {
>>> +		OVS_CB(skb)->probability = arg->probability;
>>> +	}
>>> +
>>
>> I'm confused by this.  Eventually, integer arithmetic will practically
>> guarantee that nested sample() calls will go to 0.  So eventually, the
>> test above will be impossible to meet mathematically.
>>
>> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
>> have a positive probability count, and still be possible for
>> get_random_u32() call to match.
>>
> 
> Using OVS's probability semantics, we can express probabilities as low
> as (100/U32_MAX)% which is pretty low indeed. However, just because the
> probability of executing the action is low I don't think we should not
> report it.
> 
> Rethinking the integer arithmetics, it's true that we should avoid
> hitting zero on the division, eg: nesting 6x 1% sampling rates will make
> the result be zero which will make probability restoration fail on the
> way back. Threrefore, the new probability should be at least 1.
> 
> 
>> I'm not sure about this particular change.  Why do we need it?
>>
> 
> Why do we need to propagate the probability down to nested "sample"
> actions? or why do we need to store the probability in the cb area in
> the first place?
> 
> The former: Just for correctness as only storing the last one would be
> incorrect. Although I don't know of any use for nested "sample" actions.

I think, we can drop this for now.  All the user interfaces specify
the probability per action.  So, it should be fine to report the
probability of the action that emitted the sample without taking into
account the whole timeline of that packet.  Besides, packet can leave
OVS and go back loosing the metadata, so it will not actually be a
full solution anyway.  Single-action metadata is easier to define.

> The latter: To pass it down to psample so that sample receivers know how
> the sampling rate applied (and, e.g: do throughput estimations like OVS
> does with IPFIX).
> 
> 
>>>  	clone_flow_key = !arg->exec;
>>> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
>>> -			     clone_flow_key);
>>> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
>>> +			    clone_flow_key);
>>> +
>>> +	if (!last)
>>
>> Is this right?  Don't we only want to set the probability on the last
>> action?  Should the test be 'if (last)'?
>>
> 
> This is restoring the parent's probability after the actions in the
> current sample action have been executed.
> 
> If it was the last action there is no need to restore the probability
> back to the parent's (or zero if it's there's only one level) since no
> further action will require it. And more importantly, if it's the last
> action, the packet gets free'ed inside that "branch" so we must not
> access its memory.
> 
> 
>>> +		OVS_CB(skb)->probability = init_probability;
>>> +
>>> +	return err;
>>>  }
>>>
>>>  /* When 'last' is true, clone() should always consume the 'skb'.
>>> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>>  	struct psample_metadata md = {};
>>>  	struct vport *input_vport;
>>>  	const struct nlattr *a;
>>> +	u32 rate;
>>>  	int rem;
>>>
>>>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>>
>>>  	md.in_ifindex = input_vport->dev->ifindex;
>>>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
>>> +	md.rate_as_probability = 1;
>>> +
>>> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
>>>
>>> -	psample_sample_packet(&psample_group, skb, 0, &md);
>>> +	psample_sample_packet(&psample_group, skb, rate, &md);
>>>  #endif
>>>
>>>  	return 0;
>>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
>>> index 0cd29971a907..9ca6231ea647 100644
>>> --- a/net/openvswitch/datapath.h
>>> +++ b/net/openvswitch/datapath.h
>>> @@ -115,12 +115,15 @@ struct datapath {
>>>   * fragmented.
>>>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
>>>   * @cutlen: The number of bytes from the packet end to be removed.
>>> + * @probability: The sampling probability that was applied to this skb; 0 means
>>> + * no sampling has occurred; U32_MAX means 100% probability.
>>>   */
>>>  struct ovs_skb_cb {
>>>  	struct vport		*input_vport;
>>>  	u16			mru;
>>>  	u16			acts_origlen;
>>>  	u32			cutlen;
>>> +	u32			probability;
>>>  };
>>>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
>>>
>>> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
>>> index 972ae01a70f7..8732f6e51ae5 100644
>>> --- a/net/openvswitch/vport.c
>>> +++ b/net/openvswitch/vport.c
>>> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
>>>  	OVS_CB(skb)->input_vport = vport;
>>>  	OVS_CB(skb)->mru = 0;
>>>  	OVS_CB(skb)->cutlen = 0;
>>> +	OVS_CB(skb)->probability = 0;
>>>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
>>>  		u32 mark;
>>
>
Adrian Moreno June 18, 2024, 7:36 a.m. UTC | #7
On Mon, Jun 17, 2024 at 01:26:39PM GMT, Ilya Maximets wrote:
> On 6/17/24 09:08, Adrián Moreno wrote:
> > On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
> >> Adrian Moreno <amorenoz@redhat.com> writes:
> >>
> >>> The behavior of actions might not be the exact same if they are being
> >>> executed inside a nested sample action. Store the probability of the
> >>> parent sample action in the skb's cb area.
> >>
> >> What does that mean?
> >>
> >
> > Emit action, for instance, needs the probability so that psample
> > consumers know what was the sampling rate applied. Also, the way we
> > should inform about packet drops (via kfree_skb_reason) changes (see
> > patch 7/9).
> >
> >>> Use the probability in emit_sample to pass it down to psample.
> >>>
> >>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>> ---
> >>>  include/uapi/linux/openvswitch.h |  3 ++-
> >>>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
> >>>  net/openvswitch/datapath.h       |  3 +++
> >>>  net/openvswitch/vport.c          |  1 +
> >>>  4 files changed, 28 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> >>> index a0e9dde0584a..9d675725fa2b 100644
> >>> --- a/include/uapi/linux/openvswitch.h
> >>> +++ b/include/uapi/linux/openvswitch.h
> >>> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
> >>>   * Actions are passed as nested attributes.
> >>>   *
> >>>   * Executes the specified actions with the given probability on a per-packet
> >>> - * basis.
> >>> + * basis. Nested actions will be able to access the probability value of the
> >>> + * parent @OVS_ACTION_ATTR_SAMPLE.
> >>>   */
> >>>  enum ovs_sample_attr {
> >>>  	OVS_SAMPLE_ATTR_UNSPEC,
> >>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>> index 3b4dba0ded59..33f6d93ba5e4 100644
> >>> --- a/net/openvswitch/actions.c
> >>> +++ b/net/openvswitch/actions.c
> >>> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>  	struct nlattr *sample_arg;
> >>>  	int rem = nla_len(attr);
> >>>  	const struct sample_arg *arg;
> >>> +	u32 init_probability;
> >>>  	bool clone_flow_key;
> >>> +	int err;
> >>>
> >>>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
> >>>  	sample_arg = nla_data(attr);
> >>>  	arg = nla_data(sample_arg);
> >>>  	actions = nla_next(sample_arg, &rem);
> >>> +	init_probability = OVS_CB(skb)->probability;
> >>>
> >>>  	if ((arg->probability != U32_MAX) &&
> >>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>  		return 0;
> >>>  	}
> >>>
> >>> +	if (init_probability) {
> >>> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> >>> +					    arg->probability / U32_MAX);
> >>> +	} else {
> >>> +		OVS_CB(skb)->probability = arg->probability;
> >>> +	}
> >>> +
> >>
> >> I'm confused by this.  Eventually, integer arithmetic will practically
> >> guarantee that nested sample() calls will go to 0.  So eventually, the
> >> test above will be impossible to meet mathematically.
> >>
> >> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
> >> have a positive probability count, and still be possible for
> >> get_random_u32() call to match.
> >>
> >
> > Using OVS's probability semantics, we can express probabilities as low
> > as (100/U32_MAX)% which is pretty low indeed. However, just because the
> > probability of executing the action is low I don't think we should not
> > report it.
> >
> > Rethinking the integer arithmetics, it's true that we should avoid
> > hitting zero on the division, eg: nesting 6x 1% sampling rates will make
> > the result be zero which will make probability restoration fail on the
> > way back. Threrefore, the new probability should be at least 1.
> >
> >
> >> I'm not sure about this particular change.  Why do we need it?
> >>
> >
> > Why do we need to propagate the probability down to nested "sample"
> > actions? or why do we need to store the probability in the cb area in
> > the first place?
> >
> > The former: Just for correctness as only storing the last one would be
> > incorrect. Although I don't know of any use for nested "sample" actions.
>
> I think, we can drop this for now.  All the user interfaces specify
> the probability per action.  So, it should be fine to report the
> probability of the action that emitted the sample without taking into
> account the whole timeline of that packet.  Besides, packet can leave
> OVS and go back loosing the metadata, so it will not actually be a
> full solution anyway.  Single-action metadata is easier to define.
>

Sure, I guess we can drop it, I don't think there is a use case for nested
samples anyway.

> > The latter: To pass it down to psample so that sample receivers know how
> > the sampling rate applied (and, e.g: do throughput estimations like OVS
> > does with IPFIX).
> >
> >
> >>>  	clone_flow_key = !arg->exec;
> >>> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> >>> -			     clone_flow_key);
> >>> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> >>> +			    clone_flow_key);
> >>> +
> >>> +	if (!last)
> >>
> >> Is this right?  Don't we only want to set the probability on the last
> >> action?  Should the test be 'if (last)'?
> >>
> >
> > This is restoring the parent's probability after the actions in the
> > current sample action have been executed.
> >
> > If it was the last action there is no need to restore the probability
> > back to the parent's (or zero if it's there's only one level) since no
> > further action will require it. And more importantly, if it's the last
> > action, the packet gets free'ed inside that "branch" so we must not
> > access its memory.
> >
> >
> >>> +		OVS_CB(skb)->probability = init_probability;
> >>> +
> >>> +	return err;
> >>>  }
> >>>
> >>>  /* When 'last' is true, clone() should always consume the 'skb'.
> >>> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >>>  	struct psample_metadata md = {};
> >>>  	struct vport *input_vport;
> >>>  	const struct nlattr *a;
> >>> +	u32 rate;
> >>>  	int rem;
> >>>
> >>>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> >>> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >>>
> >>>  	md.in_ifindex = input_vport->dev->ifindex;
> >>>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> >>> +	md.rate_as_probability = 1;
> >>> +
> >>> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
> >>>
> >>> -	psample_sample_packet(&psample_group, skb, 0, &md);
> >>> +	psample_sample_packet(&psample_group, skb, rate, &md);
> >>>  #endif
> >>>
> >>>  	return 0;
> >>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> >>> index 0cd29971a907..9ca6231ea647 100644
> >>> --- a/net/openvswitch/datapath.h
> >>> +++ b/net/openvswitch/datapath.h
> >>> @@ -115,12 +115,15 @@ struct datapath {
> >>>   * fragmented.
> >>>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
> >>>   * @cutlen: The number of bytes from the packet end to be removed.
> >>> + * @probability: The sampling probability that was applied to this skb; 0 means
> >>> + * no sampling has occurred; U32_MAX means 100% probability.
> >>>   */
> >>>  struct ovs_skb_cb {
> >>>  	struct vport		*input_vport;
> >>>  	u16			mru;
> >>>  	u16			acts_origlen;
> >>>  	u32			cutlen;
> >>> +	u32			probability;
> >>>  };
> >>>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
> >>>
> >>> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> >>> index 972ae01a70f7..8732f6e51ae5 100644
> >>> --- a/net/openvswitch/vport.c
> >>> +++ b/net/openvswitch/vport.c
> >>> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
> >>>  	OVS_CB(skb)->input_vport = vport;
> >>>  	OVS_CB(skb)->mru = 0;
> >>>  	OVS_CB(skb)->cutlen = 0;
> >>> +	OVS_CB(skb)->probability = 0;
> >>>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
> >>>  		u32 mark;
> >>
> >
>
diff mbox series

Patch

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a0e9dde0584a..9d675725fa2b 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -649,7 +649,8 @@  enum ovs_flow_attr {
  * Actions are passed as nested attributes.
  *
  * Executes the specified actions with the given probability on a per-packet
- * basis.
+ * basis. Nested actions will be able to access the probability value of the
+ * parent @OVS_ACTION_ATTR_SAMPLE.
  */
 enum ovs_sample_attr {
 	OVS_SAMPLE_ATTR_UNSPEC,
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 3b4dba0ded59..33f6d93ba5e4 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1048,12 +1048,15 @@  static int sample(struct datapath *dp, struct sk_buff *skb,
 	struct nlattr *sample_arg;
 	int rem = nla_len(attr);
 	const struct sample_arg *arg;
+	u32 init_probability;
 	bool clone_flow_key;
+	int err;
 
 	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
 	sample_arg = nla_data(attr);
 	arg = nla_data(sample_arg);
 	actions = nla_next(sample_arg, &rem);
+	init_probability = OVS_CB(skb)->probability;
 
 	if ((arg->probability != U32_MAX) &&
 	    (!arg->probability || get_random_u32() > arg->probability)) {
@@ -1062,9 +1065,21 @@  static int sample(struct datapath *dp, struct sk_buff *skb,
 		return 0;
 	}
 
+	if (init_probability) {
+		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
+					    arg->probability / U32_MAX);
+	} else {
+		OVS_CB(skb)->probability = arg->probability;
+	}
+
 	clone_flow_key = !arg->exec;
-	return clone_execute(dp, skb, key, 0, actions, rem, last,
-			     clone_flow_key);
+	err = clone_execute(dp, skb, key, 0, actions, rem, last,
+			    clone_flow_key);
+
+	if (!last)
+		OVS_CB(skb)->probability = init_probability;
+
+	return err;
 }
 
 /* When 'last' is true, clone() should always consume the 'skb'.
@@ -1313,6 +1328,7 @@  static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
 	struct psample_metadata md = {};
 	struct vport *input_vport;
 	const struct nlattr *a;
+	u32 rate;
 	int rem;
 
 	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
@@ -1337,8 +1353,11 @@  static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
 
 	md.in_ifindex = input_vport->dev->ifindex;
 	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
+	md.rate_as_probability = 1;
+
+	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
 
-	psample_sample_packet(&psample_group, skb, 0, &md);
+	psample_sample_packet(&psample_group, skb, rate, &md);
 #endif
 
 	return 0;
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 0cd29971a907..9ca6231ea647 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -115,12 +115,15 @@  struct datapath {
  * fragmented.
  * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
+ * @probability: The sampling probability that was applied to this skb; 0 means
+ * no sampling has occurred; U32_MAX means 100% probability.
  */
 struct ovs_skb_cb {
 	struct vport		*input_vport;
 	u16			mru;
 	u16			acts_origlen;
 	u32			cutlen;
+	u32			probability;
 };
 #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
 
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 972ae01a70f7..8732f6e51ae5 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -500,6 +500,7 @@  int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
 	OVS_CB(skb)->input_vport = vport;
 	OVS_CB(skb)->mru = 0;
 	OVS_CB(skb)->cutlen = 0;
+	OVS_CB(skb)->probability = 0;
 	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
 		u32 mark;