Message ID | 20231009160520.20831-1-larysa.zaremba@intel.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | [bpf-next] selftests/bpf: add options and frags to xdp_hw_metadata |

On 10/09, Larysa Zaremba wrote:
> This is a follow-up to the commit 9b2b86332a9b ("bpf: Allow to use kfunc
> XDP hints and frags together").
>
> There are some possible implementation problems that may arise when
> providing metadata specifically for multi-buffer packets, therefore there
> must be a possibility to test such option separately.
>
> Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
> program as capable to use frags.
>
> As for now, xdp_hw_metadata accepts no options, so add simple option
> parsing logic and a help message.
>
> For quick reference, also add an ingress packet generation command to the
> help message. The command comes from [0].
>
> Example of output for multi-buffer packet:
>
> xsk_ring_cons__peek: 1
> 0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
> rx_hash: 0x5789FCBB with RSS type:0x29
> rx_timestamp: 1696856851535324697 (sec:1696856851.5353)
> XDP RX-time: 1696856843158256391 (sec:1696856843.1583)
> delta sec:-8.3771 (-8377068.306 usec)
> AF_XDP time: 1696856843158413078 (sec:1696856843.1584)
> delta sec:0.0002 (156.687 usec)
> 0xead018: complete idx=23 addr=f000
> xsk_ring_cons__peek: 1
> 0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
> 0xead018: complete idx=24 addr=8000
> xsk_ring_cons__peek: 1
> 0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
> 0xead018: complete idx=25 addr=9000
>
> Metadata is printed for the first packet only.
>
> [0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/
>
> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
>  .../selftests/bpf/progs/xdp_hw_metadata.c     |  2 +-
>  tools/testing/selftests/bpf/xdp_hw_metadata.c | 92 ++++++++++++++++---
>  2 files changed, 79 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> index 63d7de6c6bbb..8767d919c881 100644
> --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> @@ -21,7 +21,7 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
>  extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
>  				    enum xdp_rss_hash_type *rss_type) __ksym;
>
> -SEC("xdp")
> +SEC("xdp.frags")
>  int rx(struct xdp_md *ctx)
>  {
>  	void *data, *data_meta, *data_end;
> diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> index 17c980138796..25225720346b 100644
> --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> @@ -26,6 +26,7 @@
>  #include <linux/sockios.h>
>  #include <sys/mman.h>
>  #include <net/if.h>
> +#include <ctype.h>
>  #include <poll.h>
>  #include <time.h>
>
> @@ -49,19 +50,29 @@ struct xsk {
>  struct xdp_hw_metadata *bpf_obj;
>  struct xsk *rx_xsk;
>  const char *ifname;
> +bool use_frags;
>  int ifindex;
>  int rxq;
>
>  void test__fail(void) { /* for network_helpers.c */ }
>
> -static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> +static struct xsk_socket_config gen_socket_config(void)
>  {
> -	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> -	const struct xsk_socket_config socket_config = {
> +	struct xsk_socket_config socket_config = {
>  		.rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
>  		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
>  		.bind_flags = XDP_COPY,
>  	};
> +
> +	if (use_frags)
> +		socket_config.bind_flags |= XDP_USE_SG;
> +	return socket_config;
> +}

nit: why not drop const from socket_config and add this 'if (use_frags)'
directly to open_xsk? Not sure separate function really buys us anything?

> +static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> +{
> +	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> +	struct xsk_socket_config socket_config = gen_socket_config();
>  	const struct xsk_umem_config umem_config = {
>  		.fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
>  		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
> @@ -263,11 +274,14 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
>  			verify_skb_metadata(server_fd);
>
>  		for (i = 0; i < rxq; i++) {
> +			bool first_seg = true;
> +			bool is_eop = true;
> +
>  			if (fds[i].revents == 0)
>  				continue;
>
>  			struct xsk *xsk = &rx_xsk[i];
> -
> +peek:
>  			ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx);
>  			printf("xsk_ring_cons__peek: %d\n", ret);
>  			if (ret != 1)
> @@ -276,12 +290,19 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
>  			rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx);
>  			comp_addr = xsk_umem__extract_addr(rx_desc->addr);
>  			addr = xsk_umem__add_offset_to_addr(rx_desc->addr);
> -			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n",
> -			       xsk, idx, rx_desc->addr, addr, comp_addr);
> -			verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
> -					    clock_id);
> +			is_eop = !(rx_desc->options & XDP_PKT_CONTD);
> +			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx%s\n",
> +			       xsk, idx, rx_desc->addr, addr, comp_addr, is_eop ? " EoP" : "");
> +			if (first_seg) {
> +				verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
> +						    clock_id);
> +				first_seg = false;
> +			}
> +
>  			xsk_ring_cons__release(&xsk->rx, 1);
>  			refill_rx(xsk, comp_addr);
> +			if (!is_eop)
> +				goto peek;
>  		}
>  	}
>
> @@ -404,6 +425,54 @@ static void timestamping_enable(int fd, int val)
>  		error(1, errno, "setsockopt(SO_TIMESTAMPING)");
>  }
>
> +static void print_usage(void)
> +{
> +	const char *usage =
> +		"  Usage: xdp_hw_metadata [OPTIONS] [IFNAME]\n"
> +		"  Options:\n"
> +		"  -m Enable multi-buffer XDP for larger MTU\n"
> +		"  -h Display this help and exit\n\n"
> +		"  Generate test packets on the other machine with:\n"
> +		"  echo -n xdp | nc -u -q1 <dst_ip> 9091\n";

nit: any reason we have two spaces in the help description? I don't
think it's a standard practice, so maybe drop them?

On Mon, Oct 09, 2023 at 09:49:54AM -0700, Stanislav Fomichev wrote:
> On 10/09, Larysa Zaremba wrote:
> > This is a follow-up to the commit 9b2b86332a9b ("bpf: Allow to use kfunc
> > XDP hints and frags together").
> >
> > There are some possible implementation problems that may arise when
> > providing metadata specifically for multi-buffer packets, therefore there
> > must be a possibility to test such option separately.
> >
> > Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
> > program as capable to use frags.
> >
> > As for now, xdp_hw_metadata accepts no options, so add simple option
> > parsing logic and a help message.
> >
> > For quick reference, also add an ingress packet generation command to the
> > help message. The command comes from [0].
> >
> > Example of output for multi-buffer packet:
> >
> > xsk_ring_cons__peek: 1
> > 0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
> > rx_hash: 0x5789FCBB with RSS type:0x29
> > rx_timestamp: 1696856851535324697 (sec:1696856851.5353)
> > XDP RX-time: 1696856843158256391 (sec:1696856843.1583)
> > delta sec:-8.3771 (-8377068.306 usec)
> > AF_XDP time: 1696856843158413078 (sec:1696856843.1584)
> > delta sec:0.0002 (156.687 usec)
> > 0xead018: complete idx=23 addr=f000
> > xsk_ring_cons__peek: 1
> > 0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
> > 0xead018: complete idx=24 addr=8000
> > xsk_ring_cons__peek: 1
> > 0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
> > 0xead018: complete idx=25 addr=9000
> >
> > Metadata is printed for the first packet only.
> >
> > [0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/
> >
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  .../selftests/bpf/progs/xdp_hw_metadata.c     |  2 +-
> >  tools/testing/selftests/bpf/xdp_hw_metadata.c | 92 ++++++++++++++++---
> >  2 files changed, 79 insertions(+), 15 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > index 63d7de6c6bbb..8767d919c881 100644
> > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > @@ -21,7 +21,7 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> >  extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
> >  				    enum xdp_rss_hash_type *rss_type) __ksym;
> >
> > -SEC("xdp")
> > +SEC("xdp.frags")
> >  int rx(struct xdp_md *ctx)
> >  {
> >  	void *data, *data_meta, *data_end;
> > diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > index 17c980138796..25225720346b 100644
> > --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > @@ -26,6 +26,7 @@
> >  #include <linux/sockios.h>
> >  #include <sys/mman.h>
> >  #include <net/if.h>
> > +#include <ctype.h>
> >  #include <poll.h>
> >  #include <time.h>
> >
> > @@ -49,19 +50,29 @@ struct xsk {
> >  struct xdp_hw_metadata *bpf_obj;
> >  struct xsk *rx_xsk;
> >  const char *ifname;
> > +bool use_frags;
> >  int ifindex;
> >  int rxq;
> >
> >  void test__fail(void) { /* for network_helpers.c */ }
> >
> > -static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> > +static struct xsk_socket_config gen_socket_config(void)
> >  {
> > -	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> > -	const struct xsk_socket_config socket_config = {
> > +	struct xsk_socket_config socket_config = {
> >  		.rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> >  		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> >  		.bind_flags = XDP_COPY,
> >  	};
> > +
> > +	if (use_frags)
> > +		socket_config.bind_flags |= XDP_USE_SG;
> > +	return socket_config;
> > +}
>
> nit: why not drop const from socket_config and add this 'if (use_frags)'
> directly to open_xsk? Not sure separate function really buys us anything?
>

Considering there will also be ZC/copy option, I thought it would be good to
separate socket config creation. After giving this a second thought though,
for now options would control bind_flags only. What do you think about removing
gen_socket_config(), but introducing get_bind_flags()?

> > +static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> > +{
> > +	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> > +	struct xsk_socket_config socket_config = gen_socket_config();
> >  	const struct xsk_umem_config umem_config = {
> >  		.fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> >  		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
> > @@ -263,11 +274,14 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
> >  			verify_skb_metadata(server_fd);
> >
> >  		for (i = 0; i < rxq; i++) {
> > +			bool first_seg = true;
> > +			bool is_eop = true;
> > +
> >  			if (fds[i].revents == 0)
> >  				continue;
> >
> >  			struct xsk *xsk = &rx_xsk[i];
> > -
> > +peek:
> >  			ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx);
> >  			printf("xsk_ring_cons__peek: %d\n", ret);
> >  			if (ret != 1)
> > @@ -276,12 +290,19 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
> >  			rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx);
> >  			comp_addr = xsk_umem__extract_addr(rx_desc->addr);
> >  			addr = xsk_umem__add_offset_to_addr(rx_desc->addr);
> > -			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n",
> > -			       xsk, idx, rx_desc->addr, addr, comp_addr);
> > -			verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
> > -					    clock_id);
> > +			is_eop = !(rx_desc->options & XDP_PKT_CONTD);
> > +			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx%s\n",
> > +			       xsk, idx, rx_desc->addr, addr, comp_addr, is_eop ? " EoP" : "");
> > +			if (first_seg) {
> > +				verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
> > +						    clock_id);
> > +				first_seg = false;
> > +			}
> > +
> >  			xsk_ring_cons__release(&xsk->rx, 1);
> >  			refill_rx(xsk, comp_addr);
> > +			if (!is_eop)
> > +				goto peek;
> >  		}
> >  	}
> >
> > @@ -404,6 +425,54 @@ static void timestamping_enable(int fd, int val)
> >  		error(1, errno, "setsockopt(SO_TIMESTAMPING)");
> >  }
> >
> > +static void print_usage(void)
> > +{
> > +	const char *usage =
> > +		"  Usage: xdp_hw_metadata [OPTIONS] [IFNAME]\n"
> > +		"  Options:\n"
> > +		"  -m Enable multi-buffer XDP for larger MTU\n"
> > +		"  -h Display this help and exit\n\n"
> > +		"  Generate test packets on the other machine with:\n"
> > +		"  echo -n xdp | nc -u -q1 <dst_ip> 9091\n";
>
> nit: any reason we have two spaces in the help description? I don't
> think it's a standard practice, so maybe drop them?

I have just copied usage from xskxceiver. As I see, this is not a standard
practice indeed, will fix.

On 10/10, Larysa Zaremba wrote:
> On Mon, Oct 09, 2023 at 09:49:54AM -0700, Stanislav Fomichev wrote:
> > On 10/09, Larysa Zaremba wrote:
> > > This is a follow-up to the commit 9b2b86332a9b ("bpf: Allow to use kfunc
> > > XDP hints and frags together").
> > >
> > > There are some possible implementation problems that may arise when
> > > providing metadata specifically for multi-buffer packets, therefore there
> > > must be a possibility to test such option separately.
> > >
> > > Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
> > > program as capable to use frags.
> > >
> > > As for now, xdp_hw_metadata accepts no options, so add simple option
> > > parsing logic and a help message.
> > >
> > > For quick reference, also add an ingress packet generation command to the
> > > help message. The command comes from [0].
> > >
> > > Example of output for multi-buffer packet:
> > >
> > > xsk_ring_cons__peek: 1
> > > 0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
> > > rx_hash: 0x5789FCBB with RSS type:0x29
> > > rx_timestamp: 1696856851535324697 (sec:1696856851.5353)
> > > XDP RX-time: 1696856843158256391 (sec:1696856843.1583)
> > > delta sec:-8.3771 (-8377068.306 usec)
> > > AF_XDP time: 1696856843158413078 (sec:1696856843.1584)
> > > delta sec:0.0002 (156.687 usec)
> > > 0xead018: complete idx=23 addr=f000
> > > xsk_ring_cons__peek: 1
> > > 0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
> > > 0xead018: complete idx=24 addr=8000
> > > xsk_ring_cons__peek: 1
> > > 0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
> > > 0xead018: complete idx=25 addr=9000
> > >
> > > Metadata is printed for the first packet only.
> > >
> > > [0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/
> > >
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  .../selftests/bpf/progs/xdp_hw_metadata.c     |  2 +-
> > >  tools/testing/selftests/bpf/xdp_hw_metadata.c | 92 ++++++++++++++++---
> > >  2 files changed, 79 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > index 63d7de6c6bbb..8767d919c881 100644
> > > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > @@ -21,7 +21,7 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> > >  extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
> > >  				    enum xdp_rss_hash_type *rss_type) __ksym;
> > >
> > > -SEC("xdp")
> > > +SEC("xdp.frags")
> > >  int rx(struct xdp_md *ctx)
> > >  {
> > >  	void *data, *data_meta, *data_end;
> > > diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > index 17c980138796..25225720346b 100644
> > > --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > @@ -26,6 +26,7 @@
> > >  #include <linux/sockios.h>
> > >  #include <sys/mman.h>
> > >  #include <net/if.h>
> > > +#include <ctype.h>
> > >  #include <poll.h>
> > >  #include <time.h>
> > >
> > > @@ -49,19 +50,29 @@ struct xsk {
> > >  struct xdp_hw_metadata *bpf_obj;
> > >  struct xsk *rx_xsk;
> > >  const char *ifname;
> > > +bool use_frags;
> > >  int ifindex;
> > >  int rxq;
> > >
> > >  void test__fail(void) { /* for network_helpers.c */ }
> > >
> > > -static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> > > +static struct xsk_socket_config gen_socket_config(void)
> > >  {
> > > -	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> > > -	const struct xsk_socket_config socket_config = {
> > > +	struct xsk_socket_config socket_config = {
> > >  		.rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > >  		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > >  		.bind_flags = XDP_COPY,
> > >  	};
> > > +
> > > +	if (use_frags)
> > > +		socket_config.bind_flags |= XDP_USE_SG;
> > > +	return socket_config;
> > > +}
> >
> > nit: why not drop const from socket_config and add this 'if (use_frags)'
> > directly to open_xsk? Not sure separate function really buys us anything?
> >
>
> Considering there will also be ZC/copy option, I thought it would be good to
> separate socket config creation. After giving this a second thought though,
> for now options would control bind_flags only. What do you think about removing
> gen_socket_config(), but introducing get_bind_flags()?

In my pending series [0] I ended up adding bind_flags argument
to open_xsk. Maybe do the same here? This also lets you drop
global use_frags (if you move option parsing directly into main).

Or maybe add global bind_flags if you want to keep separate parsing
routine (read_args)? Doesn't seem like we get anything by storing
separate use_frags/use_copy and then construct bind_flags via extra
get_bind_flags()?

0: https://lore.kernel.org/bpf/20231003200522.1914523-10-sdf@google.com/
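
A minimal sketch of the first suggestion above, under stated assumptions: the caller picks the bind flags and passes them down, so the config helper needs no global `use_frags`. `make_socket_config`, `struct sketch_socket_config` and the `SKETCH_*` constants are hypothetical stand-ins introduced only so the example builds without libxdp or kernel headers; they are not taken from the patch or from the pending series. The flag values and the 2048 ring size are assumed to match `linux/if_xdp.h` and the libxdp defaults.

```c
#include <stdint.h>

/* Local stand-ins (assumptions); real code should include
 * <linux/if_xdp.h> and the xsk helper header instead. */
#define SKETCH_XDP_COPY   (1u << 1)
#define SKETCH_XDP_USE_SG (1u << 4)
#define SKETCH_NUM_DESCS  2048u

/* Hypothetical, trimmed-down mirror of struct xsk_socket_config. */
struct sketch_socket_config {
	uint32_t rx_size;
	uint32_t tx_size;
	uint16_t bind_flags;
};

/* The caller decides which flags to use; this helper only copies
 * them into an otherwise default configuration. */
static struct sketch_socket_config make_socket_config(uint16_t bind_flags)
{
	struct sketch_socket_config cfg = {
		.rx_size = SKETCH_NUM_DESCS,
		.tx_size = SKETCH_NUM_DESCS,
		.bind_flags = bind_flags,
	};

	return cfg;
}
```

With this shape, a multi-buffer run would call `make_socket_config(SKETCH_XDP_COPY | SKETCH_XDP_USE_SG)` and a plain run `make_socket_config(SKETCH_XDP_COPY)`; a later zero-copy option would only change the argument.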

On Tue, Oct 10, 2023 at 09:04:47AM -0700, Stanislav Fomichev wrote:
> On 10/10, Larysa Zaremba wrote:
> > On Mon, Oct 09, 2023 at 09:49:54AM -0700, Stanislav Fomichev wrote:
> > > On 10/09, Larysa Zaremba wrote:
> > > > This is a follow-up to the commit 9b2b86332a9b ("bpf: Allow to use kfunc
> > > > XDP hints and frags together").
> > > >
> > > > There are some possible implementation problems that may arise when
> > > > providing metadata specifically for multi-buffer packets, therefore there
> > > > must be a possibility to test such option separately.
> > > >
> > > > Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
> > > > program as capable to use frags.
> > > >
> > > > As for now, xdp_hw_metadata accepts no options, so add simple option
> > > > parsing logic and a help message.
> > > >
> > > > For quick reference, also add an ingress packet generation command to the
> > > > help message. The command comes from [0].
> > > >
> > > > Example of output for multi-buffer packet:
> > > >
> > > > xsk_ring_cons__peek: 1
> > > > 0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
> > > > rx_hash: 0x5789FCBB with RSS type:0x29
> > > > rx_timestamp: 1696856851535324697 (sec:1696856851.5353)
> > > > XDP RX-time: 1696856843158256391 (sec:1696856843.1583)
> > > > delta sec:-8.3771 (-8377068.306 usec)
> > > > AF_XDP time: 1696856843158413078 (sec:1696856843.1584)
> > > > delta sec:0.0002 (156.687 usec)
> > > > 0xead018: complete idx=23 addr=f000
> > > > xsk_ring_cons__peek: 1
> > > > 0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
> > > > 0xead018: complete idx=24 addr=8000
> > > > xsk_ring_cons__peek: 1
> > > > 0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
> > > > 0xead018: complete idx=25 addr=9000
> > > >
> > > > Metadata is printed for the first packet only.
> > > >
> > > > [0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/
> > > >
> > > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > > ---
> > > >  .../selftests/bpf/progs/xdp_hw_metadata.c     |  2 +-
> > > >  tools/testing/selftests/bpf/xdp_hw_metadata.c | 92 ++++++++++++++++---
> > > >  2 files changed, 79 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > index 63d7de6c6bbb..8767d919c881 100644
> > > > --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
> > > > @@ -21,7 +21,7 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
> > > >  extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
> > > >  				    enum xdp_rss_hash_type *rss_type) __ksym;
> > > >
> > > > -SEC("xdp")
> > > > +SEC("xdp.frags")
> > > >  int rx(struct xdp_md *ctx)
> > > >  {
> > > >  	void *data, *data_meta, *data_end;
> > > > diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > index 17c980138796..25225720346b 100644
> > > > --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
> > > > @@ -26,6 +26,7 @@
> > > >  #include <linux/sockios.h>
> > > >  #include <sys/mman.h>
> > > >  #include <net/if.h>
> > > > +#include <ctype.h>
> > > >  #include <poll.h>
> > > >  #include <time.h>
> > > >
> > > > @@ -49,19 +50,29 @@ struct xsk {
> > > >  struct xdp_hw_metadata *bpf_obj;
> > > >  struct xsk *rx_xsk;
> > > >  const char *ifname;
> > > > +bool use_frags;
> > > >  int ifindex;
> > > >  int rxq;
> > > >
> > > >  void test__fail(void) { /* for network_helpers.c */ }
> > > >
> > > > -static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
> > > > +static struct xsk_socket_config gen_socket_config(void)
> > > >  {
> > > > -	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> > > > -	const struct xsk_socket_config socket_config = {
> > > > +	struct xsk_socket_config socket_config = {
> > > >  		.rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > > >  		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
> > > >  		.bind_flags = XDP_COPY,
> > > >  	};
> > > > +
> > > > +	if (use_frags)
> > > > +		socket_config.bind_flags |= XDP_USE_SG;
> > > > +	return socket_config;
> > > > +}
> > >
> > > nit: why not drop const from socket_config and add this 'if (use_frags)'
> > > directly to open_xsk? Not sure separate function really buys us anything?
> > >
> >
> > Considering there will also be ZC/copy option, I thought it would be good to
> > separate socket config creation. After giving this a second thought though,
> > for now options would control bind_flags only. What do you think about removing
> > gen_socket_config(), but introducing get_bind_flags()?
>
> In my pending series [0] I ended up adding bind_flags argument
> to open_xsk. Maybe do the same here? This also lets you drop
> global use_frags (if you move option parsing directly into main).
>
> Or maybe add global bind_flags if you want to keep separate parsing
> routine (read_args)? Doesn't seem like we get anything by storing
> separate use_frags/use_copy and then construct bind_flags via extra
> get_bind_flags()?
>

I like the option with global bind_flags.

> 0: https://lore.kernel.org/bpf/20231003200522.1914523-10-sdf@google.com/

diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 63d7de6c6bbb..8767d919c881 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -21,7 +21,7 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
 
-SEC("xdp")
+SEC("xdp.frags")
 int rx(struct xdp_md *ctx)
 {
 	void *data, *data_meta, *data_end;
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 17c980138796..25225720346b 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -26,6 +26,7 @@
 #include <linux/sockios.h>
 #include <sys/mman.h>
 #include <net/if.h>
+#include <ctype.h>
 #include <poll.h>
 #include <time.h>
 
@@ -49,19 +50,29 @@ struct xsk {
 struct xdp_hw_metadata *bpf_obj;
 struct xsk *rx_xsk;
 const char *ifname;
+bool use_frags;
 int ifindex;
 int rxq;
 
 void test__fail(void) { /* for network_helpers.c */ }
 
-static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
+static struct xsk_socket_config gen_socket_config(void)
 {
-	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
-	const struct xsk_socket_config socket_config = {
+	struct xsk_socket_config socket_config = {
 		.rx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
 		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
 		.bind_flags = XDP_COPY,
 	};
+
+	if (use_frags)
+		socket_config.bind_flags |= XDP_USE_SG;
+	return socket_config;
+}
+
+static int open_xsk(int ifindex, struct xsk *xsk, __u32 queue_id)
+{
+	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
+	struct xsk_socket_config socket_config = gen_socket_config();
 	const struct xsk_umem_config umem_config = {
 		.fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
 		.comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
@@ -263,11 +274,14 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
 			verify_skb_metadata(server_fd);
 
 		for (i = 0; i < rxq; i++) {
+			bool first_seg = true;
+			bool is_eop = true;
+
 			if (fds[i].revents == 0)
 				continue;
 
 			struct xsk *xsk = &rx_xsk[i];
-
+peek:
 			ret = xsk_ring_cons__peek(&xsk->rx, 1, &idx);
 			printf("xsk_ring_cons__peek: %d\n", ret);
 			if (ret != 1)
@@ -276,12 +290,19 @@ static int verify_metadata(struct xsk *rx_xsk, int rxq, int server_fd, clockid_t
 			rx_desc = xsk_ring_cons__rx_desc(&xsk->rx, idx);
 			comp_addr = xsk_umem__extract_addr(rx_desc->addr);
 			addr = xsk_umem__add_offset_to_addr(rx_desc->addr);
-			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx\n",
-			       xsk, idx, rx_desc->addr, addr, comp_addr);
-			verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
-					    clock_id);
+			is_eop = !(rx_desc->options & XDP_PKT_CONTD);
+			printf("%p: rx_desc[%u]->addr=%llx addr=%llx comp_addr=%llx%s\n",
+			       xsk, idx, rx_desc->addr, addr, comp_addr, is_eop ? " EoP" : "");
+			if (first_seg) {
+				verify_xdp_metadata(xsk_umem__get_data(xsk->umem_area, addr),
+						    clock_id);
+				first_seg = false;
+			}
+
 			xsk_ring_cons__release(&xsk->rx, 1);
 			refill_rx(xsk, comp_addr);
+			if (!is_eop)
+				goto peek;
 		}
 	}
 
@@ -404,6 +425,54 @@ static void timestamping_enable(int fd, int val)
 		error(1, errno, "setsockopt(SO_TIMESTAMPING)");
 }
 
+static void print_usage(void)
+{
+	const char *usage =
+		"  Usage: xdp_hw_metadata [OPTIONS] [IFNAME]\n"
+		"  Options:\n"
+		"  -m Enable multi-buffer XDP for larger MTU\n"
+		"  -h Display this help and exit\n\n"
+		"  Generate test packets on the other machine with:\n"
+		"  echo -n xdp | nc -u -q1 <dst_ip> 9091\n";
+
+	printf("%s", usage);
+}
+
+static void read_args(int argc, char *argv[])
+{
+	char opt;
+
+	while ((opt = getopt(argc, argv, "mh")) != -1) {
+		switch (opt) {
+		case 'm':
+			use_frags = true;
+			break;
+		case 'h':
+			print_usage();
+			exit(0);
+		case '?':
+			if (isprint(optopt))
+				fprintf(stderr, "Unknown option: -%c\n", optopt);
+			fallthrough;
+		default:
+			print_usage();
+			error(-1, opterr, "Command line options error");
+		}
+	}
+
+	if (optind >= argc) {
+		fprintf(stderr, "No device name provided\n");
+		print_usage();
+		exit(-1);
+	}
+
+	ifname = argv[optind];
+	ifindex = if_nametoindex(ifname);
+
+	if (!ifname)
+		error(-1, errno, "Invalid interface name");
+}
+
 int main(int argc, char *argv[])
 {
 	clockid_t clock_id = CLOCK_TAI;
@@ -413,13 +482,8 @@ int main(int argc, char *argv[])
 
 	struct bpf_program *prog;
 
-	if (argc != 2) {
-		fprintf(stderr, "pass device name\n");
-		return -1;
-	}
+	read_args(argc, argv);
 
-	ifname = argv[1];
-	ifindex = if_nametoindex(ifname);
 	rxq = rxq_num(ifname);
 
 	printf("rxq: %d\n", rxq);
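
The receive path in the diff keeps peeking descriptors until it finds one without `XDP_PKT_CONTD`, and verifies metadata only for the first fragment of the packet. The same control flow can be sketched as a plain loop over a mock descriptor array. Everything here is a stand-in for illustration: `struct sketch_desc` only imitates `struct xdp_desc`, `consume_packet` is a hypothetical helper, and `SKETCH_XDP_PKT_CONTD` is assumed to equal the kernel's `XDP_PKT_CONTD` bit. No real AF_XDP ring is involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match XDP_PKT_CONTD from <linux/if_xdp.h>. */
#define SKETCH_XDP_PKT_CONTD (1u << 0)

/* Hypothetical stand-in for struct xdp_desc. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t options;
};

/*
 * Walk the fragments of one packet, starting at ring[start].
 * Stores the address of the first fragment (where the metadata
 * would be looked up) in *meta_addr and returns the number of
 * descriptors consumed. Returns 0 if start is past the end.
 */
static size_t consume_packet(const struct sketch_desc *ring, size_t n,
			     size_t start, uint64_t *meta_addr)
{
	bool first_seg = true;
	bool is_eop = false;
	size_t i = start;

	while (i < n && !is_eop) {
		const struct sketch_desc *desc = &ring[i++];

		/* A cleared CONTD bit marks the last fragment. */
		is_eop = !(desc->options & SKETCH_XDP_PKT_CONTD);

		if (first_seg) {
			*meta_addr = desc->addr;
			first_seg = false;
		}
	}

	return i - start;
}
```

For a three-fragment packet followed by a single-buffer one, the first call would consume three descriptors and report the address of the first, mirroring the `rx_desc[15]` to `rx_desc[17]` sequence in the example output, where only the last line carries `EoP`.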

This is a follow-up to the commit 9b2b86332a9b ("bpf: Allow to use kfunc
XDP hints and frags together").

There are some possible implementation problems that may arise when
providing metadata specifically for multi-buffer packets, therefore there
must be a possibility to test such option separately.

Add an option to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP
program as capable to use frags.

As for now, xdp_hw_metadata accepts no options, so add simple option
parsing logic and a help message.

For quick reference, also add an ingress packet generation command to the
help message. The command comes from [0].

Example of output for multi-buffer packet:

xsk_ring_cons__peek: 1
0xead018: rx_desc[15]->addr=10000000000f000 addr=f100 comp_addr=f000
rx_hash: 0x5789FCBB with RSS type:0x29
rx_timestamp: 1696856851535324697 (sec:1696856851.5353)
XDP RX-time: 1696856843158256391 (sec:1696856843.1583)
delta sec:-8.3771 (-8377068.306 usec)
AF_XDP time: 1696856843158413078 (sec:1696856843.1584)
delta sec:0.0002 (156.687 usec)
0xead018: complete idx=23 addr=f000
xsk_ring_cons__peek: 1
0xead018: rx_desc[16]->addr=100000000008000 addr=8100 comp_addr=8000
0xead018: complete idx=24 addr=8000
xsk_ring_cons__peek: 1
0xead018: rx_desc[17]->addr=100000000009000 addr=9100 comp_addr=9000 EoP
0xead018: complete idx=25 addr=9000

Metadata is printed for the first packet only.

[0] https://lore.kernel.org/all/20230119221536.3349901-18-sdf@google.com/

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  2 +-
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 92 ++++++++++++++++---
 2 files changed, 79 insertions(+), 15 deletions(-)