Message ID | 20211008000309.43274-9-andrii@kernel.org (mailing list archive)
---|---
State | Changes Requested
Delegated to: | BPF
Series | libbpf: support custom .rodata.*/.data.* sections
On Thu, Oct 7, 2021 at 5:04 PM <andrii.nakryiko@gmail.com> wrote:
>
> From: Andrii Nakryiko <andrii@kernel.org>
>
> Enhance existing selftests to demonstrate the use of custom
> .data/.rodata sections.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>
On 10/8/21 2:03 AM, andrii.nakryiko@gmail.com wrote:
> From: Andrii Nakryiko <andrii@kernel.org>
>
> Enhance existing selftests to demonstrate the use of custom
> .data/.rodata sections.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Just a thought, but wouldn't the actual demo / use case be better to show that we can
now have a __read_mostly attribute which implies SEC(".data.read_mostly") section?

Would be nice to add a ...

  #define __read_mostly SEC(".data.read_mostly")

... into tools/lib/bpf/bpf_helpers.h along with the series for use out of BPF programs
as I think this should be a rather common use case. Thoughts?

> ---
>  .../selftests/bpf/prog_tests/skeleton.c | 25 +++++++++++++++++++
>  .../selftests/bpf/progs/test_skeleton.c | 10 ++++++++
>  2 files changed, 35 insertions(+)
[...]
> diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c
> index 441fa1c552c8..47a7e76866c4 100644
> --- a/tools/testing/selftests/bpf/progs/test_skeleton.c
> +++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
> @@ -40,9 +40,16 @@ int kern_ver = 0;
>
>  struct s out5 = {};
>
> +const volatile int in_dynarr_sz SEC(".rodata.dyn");
> +const volatile int in_dynarr[4] SEC(".rodata.dyn") = { -1, -2, -3, -4 };
> +
> +int out_dynarr[4] SEC(".data.dyn") = { 1, 2, 3, 4 };
> +
>  SEC("raw_tp/sys_enter")
>  int handler(const void *ctx)
>  {
> +	int i;
> +
>  	out1 = in1;
>  	out2 = in2;
>  	out3 = in3;
> @@ -53,6 +60,9 @@ int handler(const void *ctx)
>  	bpf_syscall = CONFIG_BPF_SYSCALL;
>  	kern_ver = LINUX_KERNEL_VERSION;
>
> +	for (i = 0; i < in_dynarr_sz; i++)
> +		out_dynarr[i] = in_dynarr[i];
> +
>  	return 0;
>  }
On Mon, Oct 11, 2021 at 3:57 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/8/21 2:03 AM, andrii.nakryiko@gmail.com wrote:
> > From: Andrii Nakryiko <andrii@kernel.org>
> >
> > Enhance existing selftests to demonstrate the use of custom
> > .data/.rodata sections.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Just a thought, but wouldn't the actual demo / use case be better to show that we can
> now have a __read_mostly attribute which implies SEC(".data.read_mostly") section?
>
> Would be nice to add a ...
>
>   #define __read_mostly SEC(".data.read_mostly")
>
> ... into tools/lib/bpf/bpf_helpers.h along with the series for use out of BPF programs
> as I think this should be a rather common use case. Thoughts?

But what's so special about the ".data.read_mostly" ELF section for
BPF programs? It's just another data section with no extra semantics.
So unclear why we need to have a dedicated #define for that?..

> > ---
> >  .../selftests/bpf/prog_tests/skeleton.c | 25 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_skeleton.c | 10 ++++++++
> >  2 files changed, 35 insertions(+)
> [...]
On 10/12/21 5:47 AM, Andrii Nakryiko wrote:
> On Mon, Oct 11, 2021 at 3:57 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/8/21 2:03 AM, andrii.nakryiko@gmail.com wrote:
>>> From: Andrii Nakryiko <andrii@kernel.org>
>>>
>>> Enhance existing selftests to demonstrate the use of custom
>>> .data/.rodata sections.
>>>
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>>
>> Just a thought, but wouldn't the actual demo / use case be better to show that we can
>> now have a __read_mostly attribute which implies SEC(".data.read_mostly") section?
>>
>> Would be nice to add a ...
>>
>>   #define __read_mostly SEC(".data.read_mostly")
>>
>> ... into tools/lib/bpf/bpf_helpers.h along with the series for use out of BPF programs
>> as I think this should be a rather common use case. Thoughts?
>
> But what's so special about the ".data.read_mostly" ELF section for
> BPF programs? It's just another data section with no extra semantics.
> So unclear why we need to have a dedicated #define for that?..

I mean semantics are implicit that only vars would be located there which are
by far more read than written to. Placing into separate .data.read_mostly would
help to reduce cache misses due to false sharing e.g. if they are otherwise placed
near vars which are written more often (silly example would be some counter in
the prog).

>>> ---
>>>  .../selftests/bpf/prog_tests/skeleton.c | 25 +++++++++++++++++++
>>>  .../selftests/bpf/progs/test_skeleton.c | 10 ++++++++
>>>  2 files changed, 35 insertions(+)
>> [...]
On Tue, Oct 12, 2021 at 7:54 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/12/21 5:47 AM, Andrii Nakryiko wrote:
> > On Mon, Oct 11, 2021 at 3:57 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 10/8/21 2:03 AM, andrii.nakryiko@gmail.com wrote:
> >>> From: Andrii Nakryiko <andrii@kernel.org>
> >>>
> >>> Enhance existing selftests to demonstrate the use of custom
> >>> .data/.rodata sections.
> >>>
> >>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> >>
> >> Just a thought, but wouldn't the actual demo / use case be better to show that we can
> >> now have a __read_mostly attribute which implies SEC(".data.read_mostly") section?
> >>
> >> Would be nice to add a ...
> >>
> >>   #define __read_mostly SEC(".data.read_mostly")
> >>
> >> ... into tools/lib/bpf/bpf_helpers.h along with the series for use out of BPF programs
> >> as I think this should be a rather common use case. Thoughts?
> >
> > But what's so special about the ".data.read_mostly" ELF section for
> > BPF programs? It's just another data section with no extra semantics.
> > So unclear why we need to have a dedicated #define for that?..
>
> I mean semantics are implicit that only vars would be located there which are
> by far more read than written to. Placing into separate .data.read_mostly would
> help to reduce cache misses due to false sharing e.g. if they are otherwise placed
> near vars which are written more often (silly example would be some counter in
> the prog).

We've discussed this offline. We concluded that it's a bit premature to add
#define __read_mostly into bpf_helpers.h, but I'll use __read_mostly as part
of the selftests to demonstrate this concept.

> >>> ---
> >>>  .../selftests/bpf/prog_tests/skeleton.c | 25 +++++++++++++++++++
> >>>  .../selftests/bpf/progs/test_skeleton.c | 10 ++++++++
> >>>  2 files changed, 35 insertions(+)
> >> [...]
diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
index fe1e204a65c6..f2713eeac076 100644
--- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
+++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
@@ -16,10 +16,13 @@ void test_skeleton(void)
 	struct test_skeleton* skel;
 	struct test_skeleton__bss *bss;
 	struct test_skeleton__data *data;
+	struct test_skeleton__data_dyn *data_dyn;
 	struct test_skeleton__rodata *rodata;
+	struct test_skeleton__rodata_dyn *rodata_dyn;
 	struct test_skeleton__kconfig *kcfg;
 	const void *elf_bytes;
 	size_t elf_bytes_sz = 0;
+	int i;
 
 	skel = test_skeleton__open();
 	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
@@ -30,7 +33,12 @@ void test_skeleton(void)
 
 	bss = skel->bss;
 	data = skel->data;
+	data_dyn = skel->data_dyn;
 	rodata = skel->rodata;
+	rodata_dyn = skel->rodata_dyn;
+
+	ASSERT_STREQ(bpf_map__name(skel->maps.rodata_dyn), ".rodata.dyn", "rodata_dyn_name");
+	ASSERT_STREQ(bpf_map__name(skel->maps.data_dyn), ".data.dyn", "data_dyn_name");
 
 	/* validate values are pre-initialized correctly */
 	CHECK(data->in1 != -1, "in1", "got %d != exp %d\n", data->in1, -1);
@@ -46,6 +54,12 @@ void test_skeleton(void)
 	CHECK(rodata->in.in6 != 0, "in6", "got %d != exp %d\n", rodata->in.in6, 0);
 	CHECK(bss->out6 != 0, "out6", "got %d != exp %d\n", bss->out6, 0);
 
+	ASSERT_EQ(rodata_dyn->in_dynarr_sz, 0, "in_dynarr_sz");
+	for (i = 0; i < 4; i++)
+		ASSERT_EQ(rodata_dyn->in_dynarr[i], -(i + 1), "in_dynarr");
+	for (i = 0; i < 4; i++)
+		ASSERT_EQ(data_dyn->out_dynarr[i], i + 1, "out_dynarr");
+
 	/* validate we can pre-setup global variables, even in .bss */
 	data->in1 = 10;
 	data->in2 = 11;
@@ -53,6 +67,10 @@ void test_skeleton(void)
 	bss->in4 = 13;
 	rodata->in.in6 = 14;
 
+	rodata_dyn->in_dynarr_sz = 4;
+	for (i = 0; i < 4; i++)
+		rodata_dyn->in_dynarr[i] = i + 10;
+
 	err = test_skeleton__load(skel);
 	if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err))
 		goto cleanup;
@@ -64,6 +82,10 @@ void test_skeleton(void)
 	CHECK(bss->in4 != 13, "in4", "got %lld != exp %lld\n", bss->in4, 13LL);
 	CHECK(rodata->in.in6 != 14, "in6", "got %d != exp %d\n", rodata->in.in6, 14);
 
+	ASSERT_EQ(rodata_dyn->in_dynarr_sz, 4, "in_dynarr_sz");
+	for (i = 0; i < 4; i++)
+		ASSERT_EQ(rodata_dyn->in_dynarr[i], i + 10, "in_dynarr");
+
 	/* now set new values and attach to get them into outX variables */
 	data->in1 = 1;
 	data->in2 = 2;
@@ -93,6 +115,9 @@ void test_skeleton(void)
 	CHECK(bss->kern_ver != kcfg->LINUX_KERNEL_VERSION, "ext2",
 	      "got %d != exp %d\n", bss->kern_ver, kcfg->LINUX_KERNEL_VERSION);
 
+	for (i = 0; i < 4; i++)
+		ASSERT_EQ(data_dyn->out_dynarr[i], i + 10, "out_dynarr");
+
 	elf_bytes = test_skeleton__elf_bytes(&elf_bytes_sz);
 	ASSERT_OK_PTR(elf_bytes, "elf_bytes");
 	ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz");
diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c
index 441fa1c552c8..47a7e76866c4 100644
--- a/tools/testing/selftests/bpf/progs/test_skeleton.c
+++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
@@ -40,9 +40,16 @@ int kern_ver = 0;
 
 struct s out5 = {};
 
+const volatile int in_dynarr_sz SEC(".rodata.dyn");
+const volatile int in_dynarr[4] SEC(".rodata.dyn") = { -1, -2, -3, -4 };
+
+int out_dynarr[4] SEC(".data.dyn") = { 1, 2, 3, 4 };
+
 SEC("raw_tp/sys_enter")
 int handler(const void *ctx)
 {
+	int i;
+
 	out1 = in1;
 	out2 = in2;
 	out3 = in3;
@@ -53,6 +60,9 @@ int handler(const void *ctx)
 	bpf_syscall = CONFIG_BPF_SYSCALL;
 	kern_ver = LINUX_KERNEL_VERSION;
 
+	for (i = 0; i < in_dynarr_sz; i++)
+		out_dynarr[i] = in_dynarr[i];
+
 	return 0;
 }