Message ID | 20230418143617.27762-1-magnus.karlsson@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2ddade322925641ee2a75f13665c51f2e74d7791 |
Delegated to: | BPF |
Series | [bpf-next] selftests/xsk: fix munmap for hugepage allocated umem |
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue, 18 Apr 2023 16:36:17 +0200 you wrote:
> From: Magnus Karlsson <magnus.karlsson@intel.com>
>
> Fix the unmapping of hugepage allocated umems so that they are
> properly unmapped. The new test referred to in the fixes label
> introduced a test that allocated a umem that is not a multiple of a 2M
> hugepage size. This is fine for mmap(), which rounds the size up to the
> nearest multiple of 2M. But munmap() requires the size to be a
> multiple of the hugepage size in order for it to unmap the region. The
> current behaviour of not properly unmapping the umem was discovered
> when further additions of tests that require hugepages (unaligned mode
> tests only) started failing as the system was running out of
> hugepages.
>
> [...]

Here is the summary with links:
  - [bpf-next] selftests/xsk: fix munmap for hugepage allocated umem
    https://git.kernel.org/bpf/bpf-next/c/2ddade322925

You are awesome, thank you!
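The rounding described in the commit message can be sketched in isolation. This is a minimal illustration, not the selftest code: the helper name round_up_u64() is hypothetical, and the 2 MiB HUGEPAGE_SIZE is assumed to match the size the test requires.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2 MiB huge pages, the size the selftest requires. */
#define HUGEPAGE_SIZE (2ULL * 1024 * 1024)

/* Hypothetical helper: round len up to the next multiple of align. */
uint64_t round_up_u64(uint64_t len, uint64_t align)
{
	assert(align != 0);
	return (len + align - 1) / align * align;
}
```

For example, a 3 MiB request (3145728 bytes) rounds up to 4 MiB (4194304 bytes). mmap() would back the mapping with two huge pages, so munmap() has to be given the 4 MiB length to release both.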
> @@ -1286,16 +1287,19 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
>  	u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size;
>  	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
>  	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
> +	off_t mmap_offset = 0;
>  	void *bufs;
>  	int ret;
>
> -	if (ifobject->umem->unaligned_mode)
> +	if (ifobject->umem->unaligned_mode) {
>  		mmap_flags |= MAP_HUGETLB;
> +		mmap_offset = MAP_HUGE_2MB;
> +	}

MAP_HUGE_2MB should be ORed into mmap_flags. The offset argument should
be zero for MAP_ANONYMOUS mappings. The tests may still fail if the
default hugepage size is not 2MB.

>
>  	if (ifobject->shared_umem)
>  		umem_sz *= 2;
>
> -	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, 0);
> +	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, mmap_offset);
>  	if (bufs == MAP_FAILED)
>  		exit_with_error(errno);
>

-Kal
On Thu, 20 Apr 2023 at 00:01, Kal Cutter Conley <kal.conley@dectris.com> wrote:
>
> > @@ -1286,16 +1287,19 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
> >  	u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size;
> >  	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
> >  	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
> > +	off_t mmap_offset = 0;
> >  	void *bufs;
> >  	int ret;
> >
> > -	if (ifobject->umem->unaligned_mode)
> > +	if (ifobject->umem->unaligned_mode) {
> >  		mmap_flags |= MAP_HUGETLB;
> > +		mmap_offset = MAP_HUGE_2MB;
> > +	}
>
> MAP_HUGE_2MB should be ORed into mmap_flags. The offset argument
> should be zero for MAP_ANONYMOUS mappings. The tests may still fail if
> the default hugepage size is not 2MB.

You are correct that it should go into the flags field. Misread the
man page so will send a fix.

It was a conscious decision to require a hugepage size of 2M. I want
it to fail if you do not have it since the rest of the code will not
work if you are using some other size. Yes, it is possible to discover
what hugepage sizes exist and act on that, but I want to keep the code
simple.

> >
> >  	if (ifobject->shared_umem)
> >  		umem_sz *= 2;
> >
> > -	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, 0);
> > +	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, mmap_offset);
> >  	if (bufs == MAP_FAILED)
> >  		exit_with_error(errno);
> >
>
> -Kal
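For reference, the correction agreed on above could look roughly like the sketch below. It is not the follow-up patch itself: the helpers umem_mmap_flags() and umem_mmap() are hypothetical names, and the fallback defines are an assumption for libc headers that do not expose MAP_HUGE_2MB (the values follow the kernel's encoding of log2 of the page size shifted by MAP_HUGE_SHIFT).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks (assumption): used only if the libc headers lack these. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)	/* 2 MiB == 1 << 21 */
#endif

/* Hypothetical helper: build the mmap() flags for the umem buffer. */
int umem_mmap_flags(bool unaligned_mode)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;

	/* The huge page size selector belongs in the flags, not the offset. */
	if (unaligned_mode)
		flags |= MAP_HUGETLB | MAP_HUGE_2MB;
	return flags;
}

/* Hypothetical helper: the offset stays 0 for anonymous mappings. */
void *umem_mmap(size_t len, bool unaligned_mode)
{
	assert(len > 0);
	return mmap(NULL, len, PROT_READ | PROT_WRITE,
		    umem_mmap_flags(unaligned_mode), -1, 0);
}
```

With unaligned_mode set, the call fails with MAP_FAILED when no 2 MiB huge pages are available, which matches the stated intent of requiring that size.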
> It was a conscious decision to require a hugepage size of 2M. I want
> it to fail if you do not have it since the rest of the code will not
> work if you are using some other size. Yes, it is possible to discover
> what hugepage sizes exist and act on that, but I want to keep the code
> simple.

Yes. I understood that and I think the solution is reasonable. Sadly,
it's not trivial to query the default hugepage size from userspace
AFAIK. Is parsing /proc/meminfo the only way?

What I meant was: "the tests may still fail _with the old mode of
failure (out of memory)_ if the default hugepage size is > 2MB".
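As an illustration of the /proc/meminfo approach mentioned above, the sketch below scans for the "Hugepagesize:" line, which reports the default huge page size in kB. Both function names are hypothetical and nothing like this is part of the patch; whether other interfaces would serve better is left open here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan a meminfo-style stream for the
 * "Hugepagesize:" line and return its value in kB, or -1 if absent.
 */
long hugepagesize_kb_from_stream(FILE *f)
{
	char line[256];
	long kb;

	assert(f != NULL);
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
			return kb;
	}
	return -1;
}

/* Hypothetical wrapper around the real /proc/meminfo; -1 on any error. */
long default_hugepagesize_kb(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	long kb;

	if (!f)
		return -1;
	kb = hugepagesize_kb_from_stream(f);
	fclose(f);
	return kb;
}
```

A test could compare default_hugepagesize_kb() against 2048 and skip the hugepage cases otherwise, at the cost of the extra code the author wanted to avoid.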
diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 5a9691e942de..a59d04118842 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -77,6 +77,7 @@
 #include <linux/if_link.h>
 #include <linux/if_ether.h>
 #include <linux/ip.h>
+#include <linux/mman.h>
 #include <linux/udp.h>
 #include <arpa/inet.h>
 #include <net/if.h>
@@ -1286,16 +1287,19 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
 	u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size;
 	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
 	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
+	off_t mmap_offset = 0;
 	void *bufs;
 	int ret;
 
-	if (ifobject->umem->unaligned_mode)
+	if (ifobject->umem->unaligned_mode) {
 		mmap_flags |= MAP_HUGETLB;
+		mmap_offset = MAP_HUGE_2MB;
+	}
 
 	if (ifobject->shared_umem)
 		umem_sz *= 2;
 
-	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, 0);
+	bufs = mmap(NULL, umem_sz, PROT_READ | PROT_WRITE, mmap_flags, -1, mmap_offset);
 	if (bufs == MAP_FAILED)
 		exit_with_error(errno);
 
@@ -1379,6 +1383,11 @@ static void *worker_testapp_validate_rx(void *arg)
 	pthread_exit(NULL);
 }
 
+static u64 ceil_u64(u64 a, u64 b)
+{
+	return (a + b - 1) / b;
+}
+
 static void testapp_clean_xsk_umem(struct ifobject *ifobj)
 {
 	u64 umem_sz = ifobj->umem->num_frames * ifobj->umem->frame_size;
@@ -1386,6 +1395,7 @@ static void testapp_clean_xsk_umem(struct ifobject *ifobj)
 	if (ifobj->shared_umem)
 		umem_sz *= 2;
 
+	umem_sz = ceil_u64(umem_sz, HUGEPAGE_SIZE) * HUGEPAGE_SIZE;
 	xsk_umem__delete(ifobj->umem->umem);
 	munmap(ifobj->umem->buffer, umem_sz);
 }
@@ -1619,14 +1629,15 @@ static void testapp_stats_fill_empty(struct test_spec *test)
 /* Simple test */
 static bool hugepages_present(struct ifobject *ifobject)
 {
-	const size_t mmap_sz = 2 * ifobject->umem->num_frames * ifobject->umem->frame_size;
+	size_t mmap_sz = 2 * ifobject->umem->num_frames * ifobject->umem->frame_size;
 	void *bufs;
 
 	bufs = mmap(NULL, mmap_sz, PROT_READ | PROT_WRITE,
-		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
+		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, MAP_HUGE_2MB);
 	if (bufs == MAP_FAILED)
 		return false;
 
+	mmap_sz = ceil_u64(mmap_sz, HUGEPAGE_SIZE) * HUGEPAGE_SIZE;
 	munmap(bufs, mmap_sz);
 	return true;
 }
diff --git a/tools/testing/selftests/bpf/xskxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h
index 919327807a4e..c535aeab2ca3 100644
--- a/tools/testing/selftests/bpf/xskxceiver.h
+++ b/tools/testing/selftests/bpf/xskxceiver.h
@@ -56,6 +56,7 @@
 #define RX_FULL_RXQSIZE 32
 #define UMEM_HEADROOM_TEST_SIZE 128
 #define XSK_UMEM__INVALID_FRAME_SIZE (XSK_UMEM__DEFAULT_FRAME_SIZE + 1)
+#define HUGEPAGE_SIZE (2 * 1024 * 1024)
 
 #define print_verbose(x...) do { if (opt_verbose) ksft_print_msg(x); } while (0)