
[net-next,v2,4/4] selftests: drv-net: rss_ctx: add tests for RSS configuration and contexts

Message ID 20240625010210.2002310-5-kuba@kernel.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series selftests: drv-net: rss_ctx: add tests for RSS contexts

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 842 this patch: 842
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 3 maintainers not CCed: shuah@kernel.org linux-kselftest@vger.kernel.org petrm@nvidia.com
netdev/build_clang success Errors and warnings before: 849 this patch: 849
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 854 this patch: 854
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-25--09-00 (tests: 662)

Commit Message

Jakub Kicinski June 25, 2024, 1:02 a.m. UTC
Add tests focusing on indirection table configuration and
creating extra RSS contexts in drivers which support it.

  $ export NETIF=eth0 REMOTE_...
  $ ./drivers/net/hw/rss_ctx.py
  KTAP version 1
  1..8
  ok 1 rss_ctx.test_rss_key_indir
  ok 2 rss_ctx.test_rss_context
  ok 3 rss_ctx.test_rss_context4
  # Increasing queue count 44 -> 66
  # Failed to create context 32, trying to test what we got
  ok 4 rss_ctx.test_rss_context32 # SKIP Tested only 31 contexts, wanted 32
  ok 5 rss_ctx.test_rss_context_overlap
  ok 6 rss_ctx.test_rss_context_overlap2
  # .. sprays traffic like a headless chicken ..
  not ok 7 rss_ctx.test_rss_context_out_of_order
  ok 8 rss_ctx.test_rss_context4_create_with_cfg
  # Totals: pass:6 fail:1 xfail:0 xpass:0 skip:1 error:0

Note that rss_ctx.test_rss_context_out_of_order fails with the device
I tested with, but it seems to be a device / driver bug.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - ensure right queue count for each test (David)
 - check driver has ntuple filters before starting (Willem)
 - add test for creating contexts with indir table specified (Ed)
 - fix ksft_lt (Ed)
 - query and validate indirection tables of non-first context (Ed)
 - test traffic steering vs OOO context removal (Ed)
 - make sure overlap test deletes all rules, 0 is a valid ntuple ID,
   so we can't do "if ntuple: remove()"
---
 .../testing/selftests/drivers/net/hw/Makefile |   1 +
 .../selftests/drivers/net/hw/rss_ctx.py       | 383 ++++++++++++++++++
 .../selftests/drivers/net/lib/py/load.py      |   7 +-
 tools/testing/selftests/net/lib/py/ksft.py    |   5 +
 tools/testing/selftests/net/lib/py/utils.py   |   8 +-
 5 files changed, 399 insertions(+), 5 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/hw/rss_ctx.py

Comments

Petr Machata June 25, 2024, 10:42 a.m. UTC | #1
Jakub Kicinski <kuba@kernel.org> writes:

> Add tests focusing on indirection table configuration and
> creating extra RSS contexts in drivers which support it.
>
>   $ export NETIF=eth0 REMOTE_...
>   $ ./drivers/net/hw/rss_ctx.py
>   KTAP version 1
>   1..8
>   ok 1 rss_ctx.test_rss_key_indir
>   ok 2 rss_ctx.test_rss_context
>   ok 3 rss_ctx.test_rss_context4
>   # Increasing queue count 44 -> 66
>   # Failed to create context 32, trying to test what we got
>   ok 4 rss_ctx.test_rss_context32 # SKIP Tested only 31 contexts, wanted 32
>   ok 5 rss_ctx.test_rss_context_overlap
>   ok 6 rss_ctx.test_rss_context_overlap2
>   # .. sprays traffic like a headless chicken ..
>   not ok 7 rss_ctx.test_rss_context_out_of_order
>   ok 8 rss_ctx.test_rss_context4_create_with_cfg
>   # Totals: pass:6 fail:1 xfail:0 xpass:0 skip:1 error:0
>
> Note that rss_ctx.test_rss_context_out_of_order fails with the device
> I tested with, but it seems to be a device / driver bug.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> v2:
>  - ensure right queue count for each test (David)
>  - check driver has ntuple filters before starting (Willem)
>  - add test for creating contexts with indir table specified (Ed)
>  - fix ksft_lt (Ed)
>  - query and validate indirection tables of non-first context (Ed)
>  - test traffic steering vs OOO context removal (Ed)
>  - make sure overlap test deletes all rules, 0 is a valid ntuple ID,
>    so we can't do "if ntuple: remove()"
> ---
>  .../testing/selftests/drivers/net/hw/Makefile |   1 +
>  .../selftests/drivers/net/hw/rss_ctx.py       | 383 ++++++++++++++++++
>  .../selftests/drivers/net/lib/py/load.py      |   7 +-
>  tools/testing/selftests/net/lib/py/ksft.py    |   5 +
>  tools/testing/selftests/net/lib/py/utils.py   |   8 +-
>  5 files changed, 399 insertions(+), 5 deletions(-)
>  create mode 100755 tools/testing/selftests/drivers/net/hw/rss_ctx.py
>
> diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> index 4933d045ab66..c9f2f48fc30f 100644
> --- a/tools/testing/selftests/drivers/net/hw/Makefile
> +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> @@ -11,6 +11,7 @@ TEST_PROGS = \
>  	hw_stats_l3_gre.sh \
>  	loopback.sh \
>  	pp_alloc_fail.py \
> +	rss_ctx.py \
>  	#
>  
>  TEST_FILES := \
> diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py
> new file mode 100755
> index 000000000000..c9c864d5f7d1
> --- /dev/null
> +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py
> @@ -0,0 +1,383 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +
> +import datetime
> +import random
> +from lib.py import ksft_run, ksft_pr, ksft_exit, ksft_eq, ksft_ge, ksft_lt
> +from lib.py import NetDrvEpEnv
> +from lib.py import NetdevFamily
> +from lib.py import KsftSkipEx, KsftFailEx
> +from lib.py import rand_port
> +from lib.py import ethtool, ip, GenerateTraffic, CmdExitFailure
> +
> +
> +def _rss_key_str(key):
> +    return ":".join(["{:02x}".format(x) for x in key])
> +
> +
> +def _rss_key_rand(length):
> +    return [random.randint(0, 255) for _ in range(length)]
> +
> +
> +def get_rss(cfg, context=0):
> +    return ethtool(f"-x {cfg.ifname} context {context}", json=True)[0]
> +
> +
> +def get_drop_err_sum(cfg):
> +    stats = ip("-s -s link show dev " + cfg.ifname, json=True)[0]
> +    cnt = 0
> +    for key in ['errors', 'dropped', 'over_errors', 'fifo_errors',
> +                'length_errors', 'crc_errors', 'missed_errors',
> +                'frame_errors']:
> +        cnt += stats["stats64"]["rx"][key]
> +    return cnt, stats["stats64"]["tx"]["carrier_changes"]
> +
> +
> +def ethtool_create(cfg, act, opts):
> +    output = ethtool(f"{act} {cfg.ifname} {opts}").stdout
> +    # Output will be something like: "New RSS context is 1" or
> +    # "Added rule with ID 7", we want the integer from the end
> +    return int(output.split()[-1])
> +
> +
> +def require_ntuple(cfg):
> +    features = ethtool(f"-k {cfg.ifname}", json=True)[0]
> +    if not features["ntuple-filters"]["active"]:
> +        # ntuple is more of a capability than a config knob, don't bother
> +        # trying to enable it (until some driver actually needs it).
> +        raise KsftSkipEx("Ntuple filters not enabled on the device: " + str(features["ntuple-filters"]))
> +
> +
> +# Get Rx packet counts for all queues, as a simple list of integers
> +# if @prev is specified the prev counts will be subtracted
> +def _get_rx_cnts(cfg, prev=None):
> +    cfg.wait_hw_stats_settle()
> +    data = cfg.netdevnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]}, dump=True)
> +    data = [x for x in data if x['queue-type'] == "rx"]
> +    max_q = max([x["queue-id"] for x in data])
> +    queue_stats = [0] * (max_q + 1)
> +    for q in data:
> +        queue_stats[q["queue-id"]] = q["rx-packets"]
> +        if prev and q["queue-id"] < len(prev):
> +            queue_stats[q["queue-id"]] -= prev[q["queue-id"]]
> +    return queue_stats
> +
> +
> +def test_rss_key_indir(cfg):
> +    """
> +    Test basics like updating the main RSS key and indirection table.
> +    """
> +    if len(_get_rx_cnts(cfg)) < 2:
> +        raise KsftSkipEx("Device has only one queue (or doesn't support queue stats)")

I'm not sure, is this admin-correctible configuration issue? It looks
like this and some others should rather be XFAIL.

> +    data = get_rss(cfg)
> +    want_keys = ['rss-hash-key', 'rss-hash-function', 'rss-indirection-table']
> +    for k in want_keys:
> +        if k not in data:
> +            raise KsftFailEx("ethtool results missing key: " + k)
> +        if not data[k]:
> +            raise KsftFailEx(f"ethtool results empty for '{k}': {data[k]}")
> +
> +    key_len = len(data['rss-hash-key'])
> +
> +    # Set the key
> +    key = _rss_key_rand(key_len)
> +    ethtool(f"-X {cfg.ifname} hkey " + _rss_key_str(key))
> +
> +    data = get_rss(cfg)
> +    ksft_eq(key, data['rss-hash-key'])
> +
> +    # Set the indirection table
> +    ethtool(f"-X {cfg.ifname} equal 2")
> +    data = get_rss(cfg)
> +    ksft_eq(0, min(data['rss-indirection-table']))
> +    ksft_eq(1, max(data['rss-indirection-table']))
> +
> +    # Check we only get traffic on the first 2 queues
> +    cnts = _get_rx_cnts(cfg)
> +    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
> +    cnts = _get_rx_cnts(cfg, prev=cnts)
> +    # 2 queues, 20k packets, must be at least 5k per queue
> +    ksft_ge(cnts[0], 5000, "traffic on main context (1/2): " + str(cnts))
> +    ksft_ge(cnts[1], 5000, "traffic on main context (2/2): " + str(cnts))
> +    # The other queues should be unused
> +    ksft_eq(sum(cnts[2:]), 0, "traffic on unused queues: " + str(cnts))
> +
> +    # Restore, and check traffic gets spread again
> +    ethtool(f"-X {cfg.ifname} default")
> +
> +    cnts = _get_rx_cnts(cfg)
> +    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
> +    cnts = _get_rx_cnts(cfg, prev=cnts)
> +    # First two queues get less traffic than all the rest
> +    ksft_ge(sum(cnts[2:]), sum(cnts[:2]), "traffic distributed: " + str(cnts))

Now that you added ksft_lt, this can be made to actually match the comment:

    ksft_lt(sum(cnts[:2]), sum(cnts[2:]), "traffic distributed: " + str(cnts))

> +
> +
> +def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None):
> +    """
> +    Test separating traffic into RSS contexts.
> +    The queues will be allocated 2 for each context:
> +     ctx0  ctx1  ctx2  ctx3
> +    [0 1] [2 3] [4 5] [6 7] ...
> +    """
> +
> +    require_ntuple(cfg)
> +
> +    requested_ctx_cnt = ctx_cnt
> +
> +    # Try to allocate more queues when necessary
> +    qcnt = len(_get_rx_cnts(cfg))
> +    if qcnt >= 2 + 2 * ctx_cnt:
> +        qcnt = None
> +    else:
> +        try:
> +            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
> +            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
> +        except:
> +            raise KsftSkipEx("Not enough queues for the test")
> +
> +    ntuple = []
> +    ctx_id = []
> +    ports = []
> +    try:
> +        # Use queues 0 and 1 for normal traffic
> +        ethtool(f"-X {cfg.ifname} equal 2")
> +
> +        for i in range(ctx_cnt):
> +            want_cfg = f"start {2 + i * 2} equal 2"
> +            create_cfg = want_cfg if create_with_cfg else ""
> +
> +            try:
> +                ctx_id.append(ethtool_create(cfg, "-X", f"context new {create_cfg}"))
> +            except CmdExitFailure:
> +                # try to carry on and skip at the end
> +                if i == 0:
> +                    raise
> +                ksft_pr(f"Failed to create context {i + 1}, trying to test what we got")
> +                ctx_cnt = i
> +                break
> +
> +            if not create_with_cfg:
> +                ethtool(f"-X {cfg.ifname} context {ctx_id[i]} {want_cfg}")
> +
> +            # Sanity check the context we just created
> +            data = get_rss(cfg, ctx_id[i])
> +            ksft_eq(min(data['rss-indirection-table']), 2 + i * 2, "Unexpected context cfg: " + str(data))
> +            ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data))
> +
> +            ports.append(rand_port())
> +            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id[i]}"
> +            ntuple.append(ethtool_create(cfg, "-N", flow))
> +
> +        for i in range(ctx_cnt):
> +            cnts = _get_rx_cnts(cfg)
> +            GenerateTraffic(cfg, port=ports[i]).wait_pkts_and_stop(20000)
> +            cnts = _get_rx_cnts(cfg, prev=cnts)
> +
> +            ksft_lt(sum(cnts[ :2]), 10000, "traffic on main context:" + str(cnts))
> +            ksft_ge(sum(cnts[2+i*2:4+i*2]), 20000, f"traffic on context {i}: " + str(cnts))
> +            ksft_eq(sum(cnts[2:2+i*2] + cnts[4+i*2:]), 0, "traffic on other contexts: " + str(cnts))
> +    finally:
> +        for nid in ntuple:
> +            ethtool(f"-N {cfg.ifname} delete {nid}")
> +        for cid in ctx_id:
> +            ethtool(f"-X {cfg.ifname} context {cid} delete")
> +        ethtool(f"-X {cfg.ifname} default")
> +        if qcnt:
> +            ethtool(f"-L {cfg.ifname} combined {qcnt}")
> +
> +    if requested_ctx_cnt != ctx_cnt:
> +        raise KsftSkipEx(f"Tested only {ctx_cnt} contexts, wanted {requested_ctx_cnt}")
> +
> +
> +def test_rss_context4(cfg):
> +    test_rss_context(cfg, 4)
> +
> +
> +def test_rss_context32(cfg):
> +    test_rss_context(cfg, 32)
> +
> +
> +def test_rss_context4_create_with_cfg(cfg):
> +    test_rss_context(cfg, 4, create_with_cfg=True)
> +
> +
> +def test_rss_context_out_of_order(cfg, ctx_cnt=4):
> +    """
> +    Test separating traffic into RSS contexts.
> +    Contexts are removed in semi-random order, and steering re-tested
> +    to make sure removal doesn't break steering to surviving contexts.
> +    Test requires 3 contexts to work.
> +    """
> +
> +    require_ntuple(cfg)
> +
> +    requested_ctx_cnt = ctx_cnt
> +
> +    # Try to allocate more queues when necessary
> +    qcnt = len(_get_rx_cnts(cfg))
> +    if qcnt >= 2 + 2 * ctx_cnt:
> +        qcnt = None
> +    else:
> +        try:
> +            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
> +            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
> +        except:
> +            raise KsftSkipEx("Not enough queues for the test")

There are variations on this in each of the three tests. It would make
sense to extract to a helper, or perhaps even write as a context
manager. Untested:

class require_contexts:
    def __init__(self, cfg, count):
        self._cfg = cfg
        self._count = count
        self._qcnt = None

    def __enter__(self):
        qcnt = len(_get_rx_cnts(self._cfg))
        if qcnt >= self._count:
            return
        try:
            ksft_pr(f"Increasing queue count {qcnt} -> {self._count}")
            ethtool(f"-L {self._cfg.ifname} combined {self._count}")
            self._qcnt = qcnt
        except:
            raise KsftSkipEx("Not enough queues for the test")

    def __exit__(self, exc_type, exc_value, traceback):
        if self._qcnt is not None:
            ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")

Then:

    with require_contexts(cfg, 2 + 2 * ctx_cnt):
        ...

> +
> +    removed = [False] * ctx_cnt
> +    ntuple = []
> +    ctx_id = []
> +    ports = []
> +
> +    def remove_ctx(idx):
> +        ethtool(f"-N {cfg.ifname} delete {ntuple[idx]}")
> +        del ntuple[idx]
> +        ethtool(f"-X {cfg.ifname} context {ctx_id[idx]} delete")
> +        del ctx_id[idx]
> +        removed[idx] = True
> +
> +    def check_traffic():
> +        for i in range(ctx_cnt):
> +            cnts = _get_rx_cnts(cfg)
> +            GenerateTraffic(cfg, port=ports[i]).wait_pkts_and_stop(20000)
> +            cnts = _get_rx_cnts(cfg, prev=cnts)
> +
> +            if not removed[i]:
> +                ksft_lt(sum(cnts[ :2]), 10000, "traffic on main context:" + str(cnts))
> +                ksft_ge(sum(cnts[2+i*2:4+i*2]), 20000, f"traffic on context {i}: " + str(cnts))
> +                ksft_eq(sum(cnts[2:2+i*2] + cnts[4+i*2:]), 0, "traffic on other contexts: " + str(cnts))
> +            else:
> +                ksft_ge(sum(cnts[ :2]), 20000, "traffic on main context:" + str(cnts))
> +                ksft_eq(sum(cnts[2: ]),     0, "traffic on other contexts: " + str(cnts))
> +
> +    try:
> +        # Use queues 0 and 1 for normal traffic
> +        ethtool(f"-X {cfg.ifname} equal 2")
> +
> +        for i in range(ctx_cnt):
> +            ctx_id.append(ethtool_create(cfg, "-X", f"context new start {2 + i * 2} equal 2"))
> +
> +            ports.append(rand_port())
> +            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id[i]}"
> +            ntuple.append(ethtool_create(cfg, "-N", flow))
> +
> +        check_traffic()
> +
> +        # Remove middle context
> +        remove_ctx(ctx_cnt // 2)
> +        check_traffic()
> +
> +        # Remove first context
> +        remove_ctx(0)
> +        check_traffic()
> +
> +        # Remove last context
> +        remove_ctx(-1)
> +        check_traffic()

I feel like this works by luck, the removed[] indices are straight,
whereas the ntuple[] and ctx_id[] ones shift around as elements are
removed from the array. I don't mind terribly much, it's not very likely
that somebody adds more legs into this test, but IMHO it would be
cleaner instead of deleting, to replace the deleted element with a None.
And then deleted[] is not even needed. (But then the loops below need an
extra condition.)
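
Untested sketch of what I mean, with the indices staying put:

    def remove_ctx(idx):
        ethtool(f"-N {cfg.ifname} delete {ntuple[idx]}")
        ntuple[idx] = None
        ethtool(f"-X {cfg.ifname} context {ctx_id[idx]} delete")
        ctx_id[idx] = None

    def check_traffic():
        for i in range(ctx_cnt):
            ...
            if ctx_id[i] is not None:   # doubles as the old "not removed[i]"
                ...

    # and the final cleanup loops skip the holes:
    for nid in ntuple:
        if nid is not None:
            ethtool(f"-N {cfg.ifname} delete {nid}")
    for cid in ctx_id:
        if cid is not None:
            ethtool(f"-X {cfg.ifname} context {cid} delete")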

> +
> +    finally:
> +        for nid in ntuple:
> +            ethtool(f"-N {cfg.ifname} delete {nid}")
> +        for cid in ctx_id:
> +            ethtool(f"-X {cfg.ifname} context {cid} delete")
> +        ethtool(f"-X {cfg.ifname} default")
> +        if qcnt:
> +            ethtool(f"-L {cfg.ifname} combined {qcnt}")
> +
> +    if requested_ctx_cnt != ctx_cnt:
> +        raise KsftSkipEx(f"Tested only {ctx_cnt} contexts, wanted {requested_ctx_cnt}")
> +
> +
> +def test_rss_context_overlap(cfg, other_ctx=0):
> +    """
> +    Test contexts overlapping with each other.
> +    Use 4 queues for the main context, but only queues 2 and 3 for context 1.
> +    """
> +
> +    require_ntuple(cfg)
> +
> +    queue_cnt = len(_get_rx_cnts(cfg))
> +    if queue_cnt >= 4:
> +        queue_cnt = None
> +    else:
> +        try:
> +            ksft_pr(f"Increasing queue count {queue_cnt} -> 4")
> +            ethtool(f"-L {cfg.ifname} combined 4")
> +        except:
> +            raise KsftSkipEx("Not enough queues for the test")
> +
> +    ctx_id = None
> +    ntuple = None
> +    if other_ctx == 0:
> +        ethtool(f"-X {cfg.ifname} equal 4")
> +    else:
> +        other_ctx = ethtool_create(cfg, "-X", "context new")
> +        ethtool(f"-X {cfg.ifname} context {other_ctx} equal 4")
> +
> +    try:
> +        ctx_id = ethtool_create(cfg, "-X", "context new")
> +        ethtool(f"-X {cfg.ifname} context {ctx_id} start 2 equal 2")
> +
> +        port = rand_port()
> +        if other_ctx:
> +            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {other_ctx}"
> +            ntuple = ethtool_create(cfg, "-N", flow)
> +
> +        # Test the main context
> +        cnts = _get_rx_cnts(cfg)
> +        GenerateTraffic(cfg, port=port).wait_pkts_and_stop(20000)
> +        cnts = _get_rx_cnts(cfg, prev=cnts)
> +
> +        ksft_ge(sum(cnts[ :4]), 20000, "traffic on main context: " + str(cnts))
> +        ksft_ge(sum(cnts[ :2]),  7000, "traffic on main context (1/2): " + str(cnts))
> +        ksft_ge(sum(cnts[2:4]),  7000, "traffic on main context (2/2): " + str(cnts))
> +        if other_ctx == 0:
> +            ksft_eq(sum(cnts[4: ]),     0, "traffic on other queues: " + str(cnts))
> +
> +        # Now create a rule for context 1 and make sure traffic goes to a subset
> +        if other_ctx:
> +            ethtool(f"-N {cfg.ifname} delete {ntuple}")
> +            ntuple = None
> +        flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}"
> +        ntuple = ethtool_create(cfg, "-N", flow)
> +
> +        cnts = _get_rx_cnts(cfg)
> +        GenerateTraffic(cfg, port=port).wait_pkts_and_stop(20000)
> +        cnts = _get_rx_cnts(cfg, prev=cnts)
> +
> +        ksft_lt(sum(cnts[ :2]),  7000, "traffic on main context: " + str(cnts))
> +        ksft_ge(sum(cnts[2:4]), 20000, "traffic on extra context: " + str(cnts))
> +        if other_ctx == 0:
> +            ksft_eq(sum(cnts[4: ]),     0, "traffic on other queues: " + str(cnts))
> +    finally:
> +        if ntuple is not None:
> +            ethtool(f"-N {cfg.ifname} delete {ntuple}")
> +        if ctx_id:
> +            ethtool(f"-X {cfg.ifname} context {ctx_id} delete")
> +        if other_ctx == 0:
> +            ethtool(f"-X {cfg.ifname} default")
> +        else:
> +            ethtool(f"-X {cfg.ifname} context {other_ctx} delete")
> +        if queue_cnt:
> +            ethtool(f"-L {cfg.ifname} combined {queue_cnt}")
> +
> +
> +def test_rss_context_overlap2(cfg):
> +    test_rss_context_overlap(cfg, True)
> +
> +
> +def main() -> None:
> +    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
> +        cfg.netdevnl = NetdevFamily()
> +
> +        ksft_run([test_rss_key_indir,
> +                  test_rss_context, test_rss_context4, test_rss_context32,
> +                  test_rss_context_overlap, test_rss_context_overlap2,
> +                  test_rss_context_out_of_order, test_rss_context4_create_with_cfg],
> +                 args=(cfg, ))
> +    ksft_exit()
> +
> +
> +if __name__ == "__main__":
> +    main()
> diff --git a/tools/testing/selftests/drivers/net/lib/py/load.py b/tools/testing/selftests/drivers/net/lib/py/load.py
> index 31f82f1e32c1..0f40a13926d0 100644
> --- a/tools/testing/selftests/drivers/net/lib/py/load.py
> +++ b/tools/testing/selftests/drivers/net/lib/py/load.py
> @@ -5,13 +5,14 @@ import time
>  from lib.py import ksft_pr, cmd, ip, rand_port, wait_port_listen
>  
>  class GenerateTraffic:
> -    def __init__(self, env):
> +    def __init__(self, env, port=None):
>          env.require_cmd("iperf3", remote=True)
>  
>          self.env = env
>  
> -        port = rand_port()
> -        self._iperf_server = cmd(f"iperf3 -s -p {port}", background=True)
> +        if port is None:
> +            port = rand_port()
> +        self._iperf_server = cmd(f"iperf3 -s -1 -p {port}", background=True)
>          wait_port_listen(port)
>          time.sleep(0.1)
>          self._iperf_client = cmd(f"iperf3 -c {env.addr} -P 16 -p {port} -t 86400",
> diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py
> index 4769b4eb1ea1..45ffe277d94a 100644
> --- a/tools/testing/selftests/net/lib/py/ksft.py
> +++ b/tools/testing/selftests/net/lib/py/ksft.py
> @@ -57,6 +57,11 @@ KSFT_RESULT_ALL = True
>          _fail("Check failed", a, "<", b, comment)
>  
>  
> +def ksft_lt(a, b, comment=""):
> +    if a >= b:
> +        _fail("Check failed", a, ">=", b, comment)
> +
> +
>  class ksft_raises:
>      def __init__(self, expected_type):
>          self.exception = None
> diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py
> index 11dbdd3b7612..231e4a2f0252 100644
> --- a/tools/testing/selftests/net/lib/py/utils.py
> +++ b/tools/testing/selftests/net/lib/py/utils.py
> @@ -8,6 +8,10 @@ import subprocess
>  import time
>  
>  
> +class CmdExitFailure(Exception):
> +    pass
> +
> +
>  class cmd:
>      def __init__(self, comm, shell=True, fail=True, ns=None, background=False, host=None, timeout=5):
>          if ns:
> @@ -42,8 +46,8 @@ import time
>          if self.proc.returncode != 0 and fail:
>              if len(stderr) > 0 and stderr[-1] == "\n":
>                  stderr = stderr[:-1]
> -            raise Exception("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
> -                            (self.proc.args, stdout, stderr))
> +            raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
> +                                 (self.proc.args, stdout, stderr))
>  
>  
>  class bkg(cmd):
Jakub Kicinski June 25, 2024, 1:50 p.m. UTC | #2
On Tue, 25 Jun 2024 12:42:22 +0200 Petr Machata wrote:
> > +def test_rss_key_indir(cfg):
> > +    """
> > +    Test basics like updating the main RSS key and indirection table.
> > +    """
> > +    if len(_get_rx_cnts(cfg)) < 2:
> > +        raise KsftSkipEx("Device has only one queue (or doesn't support queue stats)")
> 
> I'm not sure, is this admin-correctible configuration issue? It looks
> like this and some others should rather be XFAIL.

TBH I don't have a good compass on what should be XFAIL and what should
be SKIP in HW tests. Once vendors start running these we'll get more
experience (there's only one test using Xfail in HW now).
> > +    cnts = _get_rx_cnts(cfg)
> > +    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
> > +    cnts = _get_rx_cnts(cfg, prev=cnts)
> > +    # First two queues get less traffic than all the rest
> > +    ksft_ge(sum(cnts[2:]), sum(cnts[:2]), "traffic distributed: " + str(cnts))  
> 
> Now that you added ksft_lt, this can be made to actually match the comment:
> 
>     ksft_lt(sum(cnts[:2]), sum(cnts[2:]), "traffic distributed: " + str(cnts))

ack

> > +    # Try to allocate more queues when necessary
> > +    qcnt = len(_get_rx_cnts(cfg))
> > +    if qcnt >= 2 + 2 * ctx_cnt:
> > +        qcnt = None
> > +    else:
> > +        try:
> > +            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
> > +            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
> > +        except:
> > +            raise KsftSkipEx("Not enough queues for the test")  
> 
> There are variations on this in each of the three tests. It would make
> sense to extract to a helper, or perhaps even write as a context
> manager. Untested:
> 
> class require_contexts:
>     def __init__(self, cfg, count):
>         self._cfg = cfg
>         self._count = count
>         self._qcnt = None
> 
>     def __enter__(self):
>         qcnt = len(_get_rx_cnts(self._cfg))
>         if qcnt >= self._count:
>             return
>         try:
>             ksft_pr(f"Increasing queue count {qcnt} -> {self._count}")
>             ethtool(f"-L {self._cfg.ifname} combined {self._count}")
>             self._qcnt = qcnt
>         except:
>             raise KsftSkipEx("Not enough queues for the test")
> 
>     def __exit__(self, exc_type, exc_value, traceback):
>         if self._qcnt is not None:
>             ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
> 
> Then:
> 
>     with require_contexts(cfg, 2 + 2 * ctx_cnt):
>         ...

There are 4 things to clean up; a with statement doesn't cover all of
them naturally, and it complicates the code.

Once again, I'm thinking about adding some form of deferred execution.
	
	ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
	undo(ethtool, f"-L {self._cfg.ifname} combined {old_qcnt}")

Where cleanups will be executed in reverse order by ksft_run() after
the test, with the option to delete them.

	nid = ethtool_create(cfg, "-N", flow)
	ntuple = undo(ethtool, f"-N {cfg.ifname} delete {nid}")
	# .. code using ntuple ...
	ntuple.exec()
	# .. now ntuple is gone

or/and:

	nid = ethtool_create(cfg, "-N", flow)
	with undo(ethtool, f"-N {cfg.ifname} delete {nid}"):
		# .. code using ntuple ...
	# .. now ntuple is gone

Thoughts?

> > +        # Remove last context
> > +        remove_ctx(-1)
> > +        check_traffic()  
> 
> I feel like this works by luck, the removed[] indices are straight,
> whereas the ntuple[] and ctx_id[] ones shift around as elements are
> removed from the array. I don't mind terribly much, it's not very likely
> that somebody adds more legs into this test, but IMHO it would be
> cleaner instead of deleting, to replace the deleted element with a None.
> And then deleted[] is not even needed. (But then the loops below need an
> extra condition.)

I'd argue with the "by luck" but I see your point that it's suspicious,
I'll change it :)
Petr Machata June 25, 2024, 2:43 p.m. UTC | #3
Jakub Kicinski <kuba@kernel.org> writes:

> On Tue, 25 Jun 2024 12:42:22 +0200 Petr Machata wrote:
>> > +def test_rss_key_indir(cfg):
>> > +    """
>> > +    Test basics like updating the main RSS key and indirection table.
>> > +    """
>> > +    if len(_get_rx_cnts(cfg)) < 2:
>> > +        raise KsftSkipEx("Device has only one queue (or doesn't support queue stats)")
>> 
>> I'm not sure, is this admin-correctible configuration issue? It looks
>> like this and some others should rather be XFAIL.
>
> TBH I don't have a good compass on what should be XFAIL and what should
> be SKIP in HW tests. Once vendors start running these we'll get more
> experience (there's only one test using Xfail in HW now).

Sure, me neither.

>> > +    # Try to allocate more queues when necessary
>> > +    qcnt = len(_get_rx_cnts(cfg))
>> > +    if qcnt >= 2 + 2 * ctx_cnt:
>> > +        qcnt = None
>> > +    else:
>> > +        try:
>> > +            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
>> > +            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
>> > +        except:
>> > +            raise KsftSkipEx("Not enough queues for the test")  
>> 
>> There are variations on this in each of the three tests. It would make
>> sense to extract to a helper, or perhaps even write as a context
>> manager. Untested:
>> 
>> class require_contexts:
>>     def __init__(self, cfg, count):
>>         self._cfg = cfg
>>         self._count = count
>>         self._qcnt = None
>> 
>>     def __enter__(self):
>>         qcnt = len(_get_rx_cnts(self._cfg))
>>         if qcnt >= self._count:
>>             return
>>         try:
>>             ksft_pr(f"Increasing queue count {qcnt} -> {self._count}")
>>             ethtool(f"-L {self._cfg.ifname} combined {self._count}")
>>             self._qcnt = qcnt
>>         except:
>>             raise KsftSkipEx("Not enough queues for the test")
>> 
>>     def __exit__(self, exc_type, exc_value, traceback):
>>         if self._qcnt is not None:
>>             ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
>> 
>> Then:
>> 
>>     with require_contexts(cfg, 2 + 2 * ctx_cnt):
>>         ...
>
> There are 4 things to clean up; a with statement doesn't cover all of
> them naturally, and it complicates the code.

Yeah, you can't use it everywhere, but you can use it for the ethtool
config here.

Re complexity, how about this?

import contextlib

@contextlib.contextmanager
def require_contexts(cfg, count):
    qcnt = len(_get_rx_cnts(cfg))
    if qcnt >= count:
        qcnt = None
    else:
        try:
            ksft_pr(f"Increasing queue count {qcnt} -> {count}")
            ethtool(f"-L {cfg.ifname} combined {count}")
        except:
            raise KsftSkipEx("Not enough queues for the test")

    try:
        yield
    finally:
        if qcnt is not None:
            ethtool(f"-L {cfg.ifname} combined {qcnt}")

This is mostly just business logic, hardly any boilerplate, and still
just uses standard Python. You get the setup and cleanup next to each
other, which is important for cross-comparing the two.

Anyway, if I don't persuade you for The Right Path, something like this
would at least get rid of the duplication:

    qcnt = contexts_setup(cfg, 2 + 2 * ctx_cnt)
    try:
        ...
    finally:
        if qcnt:
            contexts_teardown(cfg, qcnt)

> Once again, I'm thinking about adding some form of deferred execution.
> 	
> 	ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
> 	undo(ethtool, f"-L {self._cfg.ifname} combined {old_qcnt}")
>
> Where cleanups will be executed in reverse order by ksft_run() after
> the test, with the option to delete them.
>
> 	nid = ethtool_create(cfg, "-N", flow)
> 	ntuple = undo(ethtool, f"-N {cfg.ifname} delete {nid}")
> 	# .. code using ntuple ...
> 	ntuple.exec()
> 	# .. now ntuple is gone
>
> or/and:
>
> 	nid = ethtool_create(cfg, "-N", flow)
> 	with undo(ethtool, f"-N {cfg.ifname} delete {nid}"):
> 		# .. code using ntuple ...
> 	# .. now ntuple is gone
>
> Thoughts?

Sure, this can be done, but you are introducing a new mechanism to solve
something that the language has had support for for 15 years or so.

Like, it's not terrible. I like it better than the try/finally aprroach,
because at least the setup and cleanup are localized.

Call it defer though? It doesn't "undo" there and then, but at some
later point.
Jakub Kicinski June 25, 2024, 5:06 p.m. UTC | #4
On Tue, 25 Jun 2024 16:43:55 +0200 Petr Machata wrote:
> > There are 4 things to clean up; a with statement doesn't cover all of
> > them naturally, and it complicates the code.
> 
> Yeah, you can't use it everywhere, but you can use it for the ethtool
> config here.
> 
> Re complexity, how about this?
> 
> import contextlib
> 
> @contextlib.contextmanager
> def require_contexts(cfg, count):
>     qcnt = len(_get_rx_cnts(cfg))
>     if qcnt >= count:
>         qcnt = None
>     else:
>         try:
>             ksft_pr(f"Increasing queue count {qcnt} -> {count}")
>             ethtool(f"-L {cfg.ifname} combined {count}")
>         except:
>             raise KsftSkipEx("Not enough queues for the test")
> 
>     try:
>         yield
>     finally:
>         if qcnt is not None:
>             ethtool(f"-L {cfg.ifname} combined {qcnt}")
> 
> This is mostly just business logic, hardly any boilerplate, and still
> just uses standard Python. You get the setup and cleanup next to each
> other, which is important for cross-comparing the two.

TBH I don't really understand how the above works.

> Anyway, if I don't persuade you for The Right Path, something like this
> would at least get rid of the duplication:
> 
>     qcnt = contexts_setup(cfg, 2 + 2 * ctx_cnt)
>     try:
>         ...
>     finally:
>         if qcnt:
>             contexts_teardown(cfg, qcnt)

Are we discussing this exact test script or general guidance?

If the general guidance, my principle is to make the test look like
a list of bash commands as much as possible. Having to wrap
every single command you need to undo with a context manager
will take us pretty far from a linear script.

That's why I'd prefer if we provided a mechanism which makes
it easy to defer execution, rather than focus on particular cases.

> > Once again, I'm thinking about adding some form of deferred execution.
> > 	
> > 	ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
> > 	undo(ethtool, f"-L {self._cfg.ifname} combined {old_qcnt}")
> >
> > Where cleanups will be executed in reverse order by ksft_run() after
> > the test, with the option to delete them.
> >
> > 	nid = ethtool_create(cfg, "-N", flow)
> > 	ntuple = undo(ethtool, f"-N {cfg.ifname} delete {nid}")
> > 	# .. code using ntuple ...
> > 	ntuple.exec()
> > 	# .. now ntuple is gone
> >
> > or/and:
> >
> > 	nid = ethtool_create(cfg, "-N", flow)
> > 	with undo(ethtool, f"-N {cfg.ifname} delete {nid}"):
> > 		# .. code using ntuple ...
> > 	# .. now ntuple is gone
> >
> > Thoughts?  
> 
> Sure, this can be done, but you are introducing a new mechanism to solve
> something that the language has had support for for 15 years or so.

Well, I can't make the try: yield work for me :(

#!/bin/python3

import contextlib

@contextlib.contextmanager
def bla():
    try:
        yield
    except:
        print("deferred thing")

bla()
print("done")


Gives me:
$ ./test.py 
done


I don't know enough Python, IDK if we can assume much Python expertise
from others.

What we basically want is a form of atexit:

https://docs.python.org/3/library/atexit.html

The fact atexit module exists makes me wonder whether the problem is
really solved by the language itself. But maybe there's a deeper reason
for atexit.
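
FWIW what I have in mind is basically atexit's registration style,
just scoped to a single test and flushed by ksft_run(), i.e. roughly:

    import atexit

    def cleanup():
        print("runs when the interpreter exits")

    atexit.register(cleanup)   # also takes args: atexit.register(func, *args)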

Like, it's not terrible. I like it better than the try/finally approach,
> because at least the setup and cleanup are localized.
> 
> Call it defer though? It doesn't "undo" there and then, but at some
> later point.

I like "defer". We're enqueuing for deferred exec. So another option
would be "enqueue". But I think "defer" is indeed better.

rm = defer(cmd, "rm example.txt")
rm.exec()   # run now, remove from internal queue
rm.cancel() # remove from queue, don't run
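
Internally it could be as simple as (untested sketch, names made up,
nothing like this exists in the tree yet):

    _deferred = []

    class defer:
        def __init__(self, func, *args, **kwargs):
            self.func, self.args, self.kwargs = func, args, kwargs
            _deferred.append(self)

        def exec(self):    # run now, remove from the internal queue
            _deferred.remove(self)
            self.func(*self.args, **self.kwargs)

        def cancel(self):  # remove from the queue, don't run
            _deferred.remove(self)

    # and ksft_run() flushes the queue in reverse order after each test:
    def ksft_flush_defer():
        while _deferred:
            _deferred[-1].exec()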

?
Jakub Kicinski June 25, 2024, 5:41 p.m. UTC | #5
On Tue, 25 Jun 2024 10:06:49 -0700 Jakub Kicinski wrote:
> @contextlib.contextmanager
> def bla():
>     try:
>         yield
>     except:
>         print("deferred thing")

Ah, finally, not except. That works.
Petr Machata June 26, 2024, 8:42 a.m. UTC | #6
Jakub Kicinski <kuba@kernel.org> writes:

> On Tue, 25 Jun 2024 16:43:55 +0200 Petr Machata wrote:
>> > There are 4 things to clean up; a with statement doesn't cover all of
>> > them naturally, and it complicates the code.
>> 
>> Yeah, you can't use it everywhere, but you can use it for the ethtool
>> config here.
>> 
>> Re complexity, how about this?
>> 
>> import contextlib
>> 
>> @contextlib.contextmanager
>> def require_contexts(cfg, count):
>>     qcnt = len(_get_rx_cnts(cfg))
>>     if qcnt >= count:
>>         qcnt = None
>>     else:
>>         try:
>>             ksft_pr(f"Increasing queue count {qcnt} -> {count}")
>>             ethtool(f"-L {cfg.ifname} combined {count}")
>>         except:
>>             raise KsftSkipEx("Not enough queues for the test")
>> 
>>     try:
>>         yield
>>     finally:
>>         if qcnt is not None:
>>             ethtool(f"-L {cfg.ifname} combined {qcnt}")
>> 
>> This is mostly just business logic, hardly any boilerplate, and still
>> just uses standard Python. You get the setup and cleanup next to each
>> other, which is important for cross-comparing the two.
>
> TBH I don't really understand how the above works.

The decorator transforms the function into a context manager. yield
marks the point where the with: body runs, whatever is before is
initialization, whatever is after is cleanup. The try: finally: wrapper
is there in case the with body ends with an exception.
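
In other words, roughly (glossing over how exceptions get passed in):

    with require_contexts(cfg, n):
        ...

desugars to something like:

    cm = require_contexts(cfg, n)      # generator created, nothing runs yet
    cm.__enter__()                     # runs the code up to the yield
    try:
        ...                            # the with: body
    finally:
        cm.__exit__(None, None, None)  # resumes after the yield, i.e. the cleanup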

>> Anyway, if I don't persuade you for The Right Path, something like this
>> would at least get rid of the duplication:
>> 
>>     qcnt = contexts_setup(cfg, 2 + 2 * ctx_cnt)
>>     try:
>>         ...
>>     finally:
>>         if qcnt:
>>             contexts_teardown(cfg, qcnt)
>
> Are we discussing this exact test script or general guidance?
>
> If the general guidance, my principle is to make the test look like
> a list of bash commands as much as possible. Having to wrap
> every single command you need to undo with a context manager
> will take us pretty far from a linear script.
>
> That's why I'd prefer if we provided a mechanism which makes
> it easy to defer execution, rather than focus on particular cases.

I meant it as a general principle. Python has tools to solve the cleanup
problem very elegantly. I asked around in the meantime, people don't
generally find it hard to understand. It's a simple concept. It might be
_foreign_, I'll grant you that, but so will whatever new API you cook
up. And you can google the former easily enough.

OK, look, I believe I made my point. I don't want to split this
particular hair too much. I see you already sent something for defer,
so I'll take a look at that.

>> > Once again, I'm thinking about adding some form of deferred execution.
>> > 	
>> > 	ethtool(f"-L {self._cfg.ifname} combined {self._qcnt}")
>> > 	undo(ethtool, f"-L {self._cfg.ifname} combined {old_qcnt}")
>> >
>> > Where cleanups will be executed in reverse order by ksft_run() after
>> > the test, with the option to delete them.
>> >
>> > 	nid = ethtool_create(cfg, "-N", flow)
>> > 	ntuple = undo(ethtool, f"-N {cfg.ifname} delete {nid}")
>> > 	# .. code using ntuple ...
>> > 	ntuple.exec()
>> > 	# .. now ntuple is gone
>> >
>> > or/and:
>> >
>> > 	nid = ethtool_create(cfg, "-N", flow)
>> > 	with undo(ethtool, f"-N {cfg.ifname} delete {nid}"):
>> > 		# .. code using ntuple ...
>> > 	# .. now ntuple is gone
>> >
>> > Thoughts?  
>> 
>> Sure, this can be done, but you are introducing a new mechanism to solve
>> something that the language has had support for for 15 years or so.
>
> Well, I can't make the try: yield work for me :(
>
> #!/bin/python3
>
> import contextlib
>
> @contextlib.contextmanager
> def bla():
>     try:
>         yield
>     except:
>         print("deferred thing")
>
> bla()
> print("done")
>
>
> Gives me:
> $ ./test.py 
> done
>
> I don't know enough Python, IDK if we can assume much Python expertise
> from others.
>
> What we basically want is a form of atexit:
>
> https://docs.python.org/3/library/atexit.html
>
> The fact atexit module exists makes me wonder whether the problem is
> really solved by the language itself. But maybe there's a deeper reason
> for atexit.

I think it's just incremental. atexit was introduced in Python 2.0, with
statements in Python 2.6, some eight years later.

>> Like, it's not terrible. I like it better than the try/finally approach,
>> because at least the setup and cleanup are localized.
>> 
>> Call it defer though? It doesn't "undo" there and then, but at some
>> later point.
>
> I like "defer". We're enqueuing for deferred exec. So another option
> would be "enqueue". But I think "defer" is indeed better.
>
> rm = defer(cmd, "rm example.txt")
> rm.exec()   # run now, remove from internal queue
> rm.cancel() # remove from queue, don't run

Looks good.

Patch

diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index 4933d045ab66..c9f2f48fc30f 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -11,6 +11,7 @@  TEST_PROGS = \
 	hw_stats_l3_gre.sh \
 	loopback.sh \
 	pp_alloc_fail.py \
+	rss_ctx.py \
 	#
 
 TEST_FILES := \
diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py
new file mode 100755
index 000000000000..c9c864d5f7d1
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py
@@ -0,0 +1,383 @@ 
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import datetime
+import random
+from lib.py import ksft_run, ksft_pr, ksft_exit, ksft_eq, ksft_ge, ksft_lt
+from lib.py import NetDrvEpEnv
+from lib.py import NetdevFamily
+from lib.py import KsftSkipEx, KsftFailEx
+from lib.py import rand_port
+from lib.py import ethtool, ip, GenerateTraffic, CmdExitFailure
+
+
+def _rss_key_str(key):
+    return ":".join(["{:02x}".format(x) for x in key])
+
+
+def _rss_key_rand(length):
+    return [random.randint(0, 255) for _ in range(length)]
+
+
+def get_rss(cfg, context=0):
+    return ethtool(f"-x {cfg.ifname} context {context}", json=True)[0]
+
+
+def get_drop_err_sum(cfg):
+    stats = ip("-s -s link show dev " + cfg.ifname, json=True)[0]
+    cnt = 0
+    for key in ['errors', 'dropped', 'over_errors', 'fifo_errors',
+                'length_errors', 'crc_errors', 'missed_errors',
+                'frame_errors']:
+        cnt += stats["stats64"]["rx"][key]
+    return cnt, stats["stats64"]["tx"]["carrier_changes"]
+
+
+def ethtool_create(cfg, act, opts):
+    output = ethtool(f"{act} {cfg.ifname} {opts}").stdout
+    # Output will be something like: "New RSS context is 1" or
+    # "Added rule with ID 7", we want the integer from the end
+    return int(output.split()[-1])
+
+
+def require_ntuple(cfg):
+    features = ethtool(f"-k {cfg.ifname}", json=True)[0]
+    if not features["ntuple-filters"]["active"]:
+        # ntuple is more of a capability than a config knob, don't bother
+        # trying to enable it (until some driver actually needs it).
+        raise KsftSkipEx("Ntuple filters not enabled on the device: " + str(features["ntuple-filters"]))
+
+
+# Get Rx packet counts for all queues, as a simple list of integers
+# if @prev is specified the prev counts will be subtracted
+def _get_rx_cnts(cfg, prev=None):
+    cfg.wait_hw_stats_settle()
+    data = cfg.netdevnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]}, dump=True)
+    data = [x for x in data if x['queue-type'] == "rx"]
+    max_q = max([x["queue-id"] for x in data])
+    queue_stats = [0] * (max_q + 1)
+    for q in data:
+        queue_stats[q["queue-id"]] = q["rx-packets"]
+        if prev and q["queue-id"] < len(prev):
+            queue_stats[q["queue-id"]] -= prev[q["queue-id"]]
+    return queue_stats
+
+
+def test_rss_key_indir(cfg):
+    """
+    Test basics like updating the main RSS key and indirection table.
+    """
+    if len(_get_rx_cnts(cfg)) < 2:
+        raise KsftSkipEx("Device has only one queue (or doesn't support queue stats)")
+
+    data = get_rss(cfg)
+    want_keys = ['rss-hash-key', 'rss-hash-function', 'rss-indirection-table']
+    for k in want_keys:
+        if k not in data:
+            raise KsftFailEx("ethtool results missing key: " + k)
+        if not data[k]:
+            raise KsftFailEx(f"ethtool results empty for '{k}': {data[k]}")
+
+    key_len = len(data['rss-hash-key'])
+
+    # Set the key
+    key = _rss_key_rand(key_len)
+    ethtool(f"-X {cfg.ifname} hkey " + _rss_key_str(key))
+
+    data = get_rss(cfg)
+    ksft_eq(key, data['rss-hash-key'])
+
+    # Set the indirection table
+    ethtool(f"-X {cfg.ifname} equal 2")
+    data = get_rss(cfg)
+    ksft_eq(0, min(data['rss-indirection-table']))
+    ksft_eq(1, max(data['rss-indirection-table']))
+
+    # Check we only get traffic on the first 2 queues
+    cnts = _get_rx_cnts(cfg)
+    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
+    cnts = _get_rx_cnts(cfg, prev=cnts)
+    # 2 queues, 20k packets, must be at least 5k per queue
+    ksft_ge(cnts[0], 5000, "traffic on main context (1/2): " + str(cnts))
+    ksft_ge(cnts[1], 5000, "traffic on main context (2/2): " + str(cnts))
+    # The other queues should be unused
+    ksft_eq(sum(cnts[2:]), 0, "traffic on unused queues: " + str(cnts))
+
+    # Restore, and check traffic gets spread again
+    ethtool(f"-X {cfg.ifname} default")
+
+    cnts = _get_rx_cnts(cfg)
+    GenerateTraffic(cfg).wait_pkts_and_stop(20000)
+    cnts = _get_rx_cnts(cfg, prev=cnts)
+    # First two queues get less traffic than all the rest
+    ksft_ge(sum(cnts[2:]), sum(cnts[:2]), "traffic distributed: " + str(cnts))
+
+
+def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None):
+    """
+    Test separating traffic into RSS contexts.
+    The queues will be allocated 2 for each context:
+     ctx0  ctx1  ctx2  ctx3
+    [0 1] [2 3] [4 5] [6 7] ...
+    """
+
+    require_ntuple(cfg)
+
+    requested_ctx_cnt = ctx_cnt
+
+    # Try to allocate more queues when necessary
+    qcnt = len(_get_rx_cnts(cfg))
+    if qcnt >= 2 + 2 * ctx_cnt:
+        qcnt = None
+    else:
+        try:
+            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
+            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
+        except:
+            raise KsftSkipEx("Not enough queues for the test")
+
+    ntuple = []
+    ctx_id = []
+    ports = []
+    try:
+        # Use queues 0 and 1 for normal traffic
+        ethtool(f"-X {cfg.ifname} equal 2")
+
+        for i in range(ctx_cnt):
+            want_cfg = f"start {2 + i * 2} equal 2"
+            create_cfg = want_cfg if create_with_cfg else ""
+
+            try:
+                ctx_id.append(ethtool_create(cfg, "-X", f"context new {create_cfg}"))
+            except CmdExitFailure:
+                # try to carry on and skip at the end
+                if i == 0:
+                    raise
+                ksft_pr(f"Failed to create context {i + 1}, trying to test what we got")
+                ctx_cnt = i
+                break
+
+            if not create_with_cfg:
+                ethtool(f"-X {cfg.ifname} context {ctx_id[i]} {want_cfg}")
+
+            # Sanity check the context we just created
+            data = get_rss(cfg, ctx_id[i])
+            ksft_eq(min(data['rss-indirection-table']), 2 + i * 2, "Unexpected context cfg: " + str(data))
+            ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data))
+
+            ports.append(rand_port())
+            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id[i]}"
+            ntuple.append(ethtool_create(cfg, "-N", flow))
+
+        for i in range(ctx_cnt):
+            cnts = _get_rx_cnts(cfg)
+            GenerateTraffic(cfg, port=ports[i]).wait_pkts_and_stop(20000)
+            cnts = _get_rx_cnts(cfg, prev=cnts)
+
+            ksft_lt(sum(cnts[ :2]), 10000, "traffic on main context:" + str(cnts))
+            ksft_ge(sum(cnts[2+i*2:4+i*2]), 20000, f"traffic on context {i}: " + str(cnts))
+            ksft_eq(sum(cnts[2:2+i*2] + cnts[4+i*2:]), 0, "traffic on other contexts: " + str(cnts))
+    finally:
+        for nid in ntuple:
+            ethtool(f"-N {cfg.ifname} delete {nid}")
+        for cid in ctx_id:
+            ethtool(f"-X {cfg.ifname} context {cid} delete")
+        ethtool(f"-X {cfg.ifname} default")
+        if qcnt:
+            ethtool(f"-L {cfg.ifname} combined {qcnt}")
+
+    if requested_ctx_cnt != ctx_cnt:
+        raise KsftSkipEx(f"Tested only {ctx_cnt} contexts, wanted {requested_ctx_cnt}")
+
+
+def test_rss_context4(cfg):
+    test_rss_context(cfg, 4)
+
+
+def test_rss_context32(cfg):
+    test_rss_context(cfg, 32)
+
+
+def test_rss_context4_create_with_cfg(cfg):
+    test_rss_context(cfg, 4, create_with_cfg=True)
+
+
+def test_rss_context_out_of_order(cfg, ctx_cnt=4):
+    """
+    Test separating traffic into RSS contexts.
+    Contexts are removed in semi-random order, and steering re-tested
+    to make sure removal doesn't break steering to surviving contexts.
+    Test requires 3 contexts to work.
+    """
+
+    require_ntuple(cfg)
+
+    requested_ctx_cnt = ctx_cnt
+
+    # Try to allocate more queues when necessary
+    qcnt = len(_get_rx_cnts(cfg))
+    if qcnt >= 2 + 2 * ctx_cnt:
+        qcnt = None
+    else:
+        try:
+            ksft_pr(f"Increasing queue count {qcnt} -> {2 + 2 * ctx_cnt}")
+            ethtool(f"-L {cfg.ifname} combined {2 + 2 * ctx_cnt}")
+        except:
+            raise KsftSkipEx("Not enough queues for the test")
+
+    removed = [False] * ctx_cnt
+    ntuple = []
+    ctx_id = []
+    ports = []
+
+    def remove_ctx(idx):
+        ethtool(f"-N {cfg.ifname} delete {ntuple[idx]}")
+        del ntuple[idx]
+        ethtool(f"-X {cfg.ifname} context {ctx_id[idx]} delete")
+        del ctx_id[idx]
+        removed[idx] = True
+
+    def check_traffic():
+        for i in range(ctx_cnt):
+            cnts = _get_rx_cnts(cfg)
+            GenerateTraffic(cfg, port=ports[i]).wait_pkts_and_stop(20000)
+            cnts = _get_rx_cnts(cfg, prev=cnts)
+
+            if not removed[i]:
+                ksft_lt(sum(cnts[ :2]), 10000, "traffic on main context:" + str(cnts))
+                ksft_ge(sum(cnts[2+i*2:4+i*2]), 20000, f"traffic on context {i}: " + str(cnts))
+                ksft_eq(sum(cnts[2:2+i*2] + cnts[4+i*2:]), 0, "traffic on other contexts: " + str(cnts))
+            else:
+                ksft_ge(sum(cnts[ :2]), 20000, "traffic on main context:" + str(cnts))
+                ksft_eq(sum(cnts[2: ]),     0, "traffic on other contexts: " + str(cnts))
+
+    try:
+        # Use queues 0 and 1 for normal traffic
+        ethtool(f"-X {cfg.ifname} equal 2")
+
+        for i in range(ctx_cnt):
+            ctx_id.append(ethtool_create(cfg, "-X", f"context new start {2 + i * 2} equal 2"))
+
+            ports.append(rand_port())
+            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id[i]}"
+            ntuple.append(ethtool_create(cfg, "-N", flow))
+
+        check_traffic()
+
+        # Remove middle context
+        remove_ctx(ctx_cnt // 2)
+        check_traffic()
+
+        # Remove first context
+        remove_ctx(0)
+        check_traffic()
+
+        # Remove last context
+        remove_ctx(-1)
+        check_traffic()
+
+    finally:
+        for nid in ntuple:
+            ethtool(f"-N {cfg.ifname} delete {nid}")
+        for cid in ctx_id:
+            ethtool(f"-X {cfg.ifname} context {cid} delete")
+        ethtool(f"-X {cfg.ifname} default")
+        if qcnt:
+            ethtool(f"-L {cfg.ifname} combined {qcnt}")
+
+    if requested_ctx_cnt != ctx_cnt:
+        raise KsftSkipEx(f"Tested only {ctx_cnt} contexts, wanted {requested_ctx_cnt}")
+
+
+def test_rss_context_overlap(cfg, other_ctx=0):
+    """
+    Test contexts overlapping with each other.
+    Use 4 queues for the main context, but only queues 2 and 3 for context 1.
+    """
+
+    require_ntuple(cfg)
+
+    queue_cnt = len(_get_rx_cnts(cfg))
+    if queue_cnt >= 4:
+        queue_cnt = None
+    else:
+        try:
+            ksft_pr(f"Increasing queue count {queue_cnt} -> 4")
+            ethtool(f"-L {cfg.ifname} combined 4")
+        except:
+            raise KsftSkipEx("Not enough queues for the test")
+
+    ctx_id = None
+    ntuple = None
+    if other_ctx == 0:
+        ethtool(f"-X {cfg.ifname} equal 4")
+    else:
+        other_ctx = ethtool_create(cfg, "-X", "context new")
+        ethtool(f"-X {cfg.ifname} context {other_ctx} equal 4")
+
+    try:
+        ctx_id = ethtool_create(cfg, "-X", "context new")
+        ethtool(f"-X {cfg.ifname} context {ctx_id} start 2 equal 2")
+
+        port = rand_port()
+        if other_ctx:
+            flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {other_ctx}"
+            ntuple = ethtool_create(cfg, "-N", flow)
+
+        # Test the main context
+        cnts = _get_rx_cnts(cfg)
+        GenerateTraffic(cfg, port=port).wait_pkts_and_stop(20000)
+        cnts = _get_rx_cnts(cfg, prev=cnts)
+
+        ksft_ge(sum(cnts[ :4]), 20000, "traffic on main context: " + str(cnts))
+        ksft_ge(sum(cnts[ :2]),  7000, "traffic on main context (1/2): " + str(cnts))
+        ksft_ge(sum(cnts[2:4]),  7000, "traffic on main context (2/2): " + str(cnts))
+        if other_ctx == 0:
+            ksft_eq(sum(cnts[4: ]),     0, "traffic on other queues: " + str(cnts))
+
+        # Now create a rule for context 1 and make sure traffic goes to a subset
+        if other_ctx:
+            ethtool(f"-N {cfg.ifname} delete {ntuple}")
+            ntuple = None
+        flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}"
+        ntuple = ethtool_create(cfg, "-N", flow)
+
+        cnts = _get_rx_cnts(cfg)
+        GenerateTraffic(cfg, port=port).wait_pkts_and_stop(20000)
+        cnts = _get_rx_cnts(cfg, prev=cnts)
+
+        ksft_lt(sum(cnts[ :2]),  7000, "traffic on main context: " + str(cnts))
+        ksft_ge(sum(cnts[2:4]), 20000, "traffic on extra context: " + str(cnts))
+        if other_ctx == 0:
+            ksft_eq(sum(cnts[4: ]),     0, "traffic on other queues: " + str(cnts))
+    finally:
+        if ntuple is not None:
+            ethtool(f"-N {cfg.ifname} delete {ntuple}")
+        if ctx_id:
+            ethtool(f"-X {cfg.ifname} context {ctx_id} delete")
+        if other_ctx == 0:
+            ethtool(f"-X {cfg.ifname} default")
+        else:
+            ethtool(f"-X {cfg.ifname} context {other_ctx} delete")
+        if queue_cnt:
+            ethtool(f"-L {cfg.ifname} combined {queue_cnt}")
+
+
+def test_rss_context_overlap2(cfg):
+    test_rss_context_overlap(cfg, True)
+
+
+def main() -> None:
+    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
+        cfg.netdevnl = NetdevFamily()
+
+        ksft_run([test_rss_key_indir,
+                  test_rss_context, test_rss_context4, test_rss_context32,
+                  test_rss_context_overlap, test_rss_context_overlap2,
+                  test_rss_context_out_of_order, test_rss_context4_create_with_cfg],
+                 args=(cfg, ))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/drivers/net/lib/py/load.py b/tools/testing/selftests/drivers/net/lib/py/load.py
index 31f82f1e32c1..0f40a13926d0 100644
--- a/tools/testing/selftests/drivers/net/lib/py/load.py
+++ b/tools/testing/selftests/drivers/net/lib/py/load.py
@@ -5,13 +5,14 @@  import time
 from lib.py import ksft_pr, cmd, ip, rand_port, wait_port_listen
 
 class GenerateTraffic:
-    def __init__(self, env):
+    def __init__(self, env, port=None):
         env.require_cmd("iperf3", remote=True)
 
         self.env = env
 
-        port = rand_port()
-        self._iperf_server = cmd(f"iperf3 -s -p {port}", background=True)
+        if port is None:
+            port = rand_port()
+        self._iperf_server = cmd(f"iperf3 -s -1 -p {port}", background=True)
         wait_port_listen(port)
         time.sleep(0.1)
         self._iperf_client = cmd(f"iperf3 -c {env.addr} -P 16 -p {port} -t 86400",
diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py
index 4769b4eb1ea1..45ffe277d94a 100644
--- a/tools/testing/selftests/net/lib/py/ksft.py
+++ b/tools/testing/selftests/net/lib/py/ksft.py
@@ -57,6 +57,11 @@  KSFT_RESULT_ALL = True
         _fail("Check failed", a, "<", b, comment)
 
 
+def ksft_lt(a, b, comment=""):
+    if a >= b:
+        _fail("Check failed", a, ">=", b, comment)
+
+
 class ksft_raises:
     def __init__(self, expected_type):
         self.exception = None
diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py
index 11dbdd3b7612..231e4a2f0252 100644
--- a/tools/testing/selftests/net/lib/py/utils.py
+++ b/tools/testing/selftests/net/lib/py/utils.py
@@ -8,6 +8,10 @@  import subprocess
 import time
 
 
+class CmdExitFailure(Exception):
+    pass
+
+
 class cmd:
     def __init__(self, comm, shell=True, fail=True, ns=None, background=False, host=None, timeout=5):
         if ns:
@@ -42,8 +46,8 @@  import time
         if self.proc.returncode != 0 and fail:
             if len(stderr) > 0 and stderr[-1] == "\n":
                 stderr = stderr[:-1]
-            raise Exception("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
-                            (self.proc.args, stdout, stderr))
+            raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" %
+                                 (self.proc.args, stdout, stderr))
 
 
 class bkg(cmd):