Message ID | 20210310015455.1095207-1-revest@chromium.org (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Series | [BUG] One-liner array initialization with two pointers in BPF results in NULLs |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On 3/9/21 5:54 PM, Florent Revest wrote:
> I noticed that initializing an array of pointers using this syntax:
> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
> always results in array[0] and array[1] being NULL.
>
> Interestingly, if the array is only initialized with one pointer, ex:
> __u64 array[] = { (__u64)&var1 };
> Then array[0] will not be NULL.
>
> Or if the array is initialized field by field, ex:
> __u64 array[2];
> array[0] = (__u64)&var1;
> array[1] = (__u64)&var2;
> Then array[0] and array[1] will not be NULL either.
>
> I'm assuming that this should have something to do with relocations
> and might be a bug in clang or in libbpf but because I don't know much
> about these, I thought that reporting could be a good first step. :)

Thanks for reporting. Your guess is correct: this is due to
relocations :-(

The compiler notoriously tends to put complex initial values into the
rodata section. For example, for
    __u64 array[] = { (__u64)&var1, (__u64)&var2 };
the compiler will put
    { (__u64)&var1, (__u64)&var2 }
into the rodata section.

But &var1 and &var2 themselves need relocation, since they are
addresses of static variables that sit inside the .data section.

So in the ELF file, you will see the following relocations:

RELOCATION RECORDS FOR [.rodata]:
OFFSET           TYPE        VALUE
0000000000000018 R_BPF_64_64 .data
0000000000000020 R_BPF_64_64 .data

Currently, libbpf does not handle relocations inside the .rodata
section, so their content remains 0.

That is why you see the pointers as NULL.

With an array size of 1, the compiler does not bother to put it into
the rodata section.

I *guess* that it works in the macro due to some kind of heuristics,
e.g., nested blocks, etc., and llvm did not promote the array init value
to rodata. I will double check whether llvm can completely prevent
such a transformation.

Maybe in the future libbpf will be able to handle relocations for the
rodata section too. But for the time being, please consider using
either the macro or the explicit array assignment.

Thanks for the reproducer!

> I attached below a repro with a dummy selftest that I expect should pass
> but fails to pass with the latest clang and bpf-next. Hopefully, the
> logic should be simple: I try to print two strings from pointers in an
> array using bpf_seq_printf but depending on how the array is initialized
> the helper either receives the string pointers or NULL pointers:
>
> test_bug:FAIL:read unexpected read: actual 'str1= str2= str1=STR1
> str2=STR2 ' != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  tools/testing/selftests/bpf/prog_tests/bug.c | 41 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/test_bug.c | 43 ++++++++++++++++++++
>  2 files changed, 84 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/bug.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_bug.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/bug.c b/tools/testing/selftests/bpf/prog_tests/bug.c
> new file mode 100644
> index 000000000000..4b0fafd936b7
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/bug.c
> @@ -0,0 +1,41 @@
> +#include <test_progs.h>
> +#include "test_bug.skel.h"
> +
> +static int duration;
> +
> +void test_bug(void)
> +{
> +	struct test_bug *skel;
> +	struct bpf_link *link;
> +	char buf[64] = {};
> +	int iter_fd, len;
> +
> +	skel = test_bug__open_and_load();
> +	if (CHECK(!skel, "test_bug__open_and_load",
> +		  "skeleton open_and_load failed\n"))
> +		goto destroy;
> +
> +	link = bpf_program__attach_iter(skel->progs.bug, NULL);
> +	if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
> +		goto destroy;
> +
> +	iter_fd = bpf_iter_create(bpf_link__fd(link));
> +	if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
> +		goto free_link;
> +
> +	len = read(iter_fd, buf, sizeof(buf));
> +	CHECK(len < 0, "read", "read failed: %s\n", strerror(errno));
> +	// BUG: We expect the strings to be printed in both cases but only the
> +	// second case works.
> +	// actual 'str1= str2= str1=STR1 str2=STR2 '
> +	// != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
> +	ASSERT_STREQ(buf, "str1=STR1 str2=STR2 str1=STR1 str2=STR2 ", "read");
> +
> +	close(iter_fd);
> +
> +free_link:
> +	bpf_link__destroy(link);
> +destroy:
> +	test_bug__destroy(skel);
> +}
> +
> diff --git a/tools/testing/selftests/bpf/progs/test_bug.c b/tools/testing/selftests/bpf/progs/test_bug.c
> new file mode 100644
> index 000000000000..c41e69483785
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_bug.c
> @@ -0,0 +1,43 @@
> +#include "bpf_iter.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +SEC("iter/task")
> +int bug(struct bpf_iter__task *ctx)
> +{
> +	struct seq_file *seq = ctx->meta->seq;
> +
> +	/* We want to print two strings */
> +	static const char fmt[] = "str1=%s str2=%s ";
> +	static char str1[] = "STR1";
> +	static char str2[] = "STR2";
> +
> +	/*
> +	 * Because bpf_seq_printf takes parameters to its format specifiers in
> +	 * an array, we need to stuff pointers to str1 and str2 in a u64 array.
> +	 */
> +
> +	/* First, we try a one-liner array initialization. Note that this is
> +	 * what the BPF_SEQ_PRINTF macro does under the hood. */
> +	__u64 param_not_working[] = { (__u64)str1, (__u64)str2 };
> +	/* But we also try a field by field initialization of the array. We
> +	 * would expect the arrays and the behavior to be exactly the same. */
> +	__u64 param_working[2];
> +	param_working[0] = (__u64)str1;
> +	param_working[1] = (__u64)str2;
> +
> +	/* For convenience, only print once */
> +	if (ctx->meta->seq_num != 0)
> +		return 0;
> +
> +	/* Using the one-liner array of params, it does not print the strings */
> +	bpf_seq_printf(seq, fmt, sizeof(fmt),
> +		       param_not_working, sizeof(param_not_working));
> +	/* Using the field-by-field array of params, it prints the strings */
> +	bpf_seq_printf(seq, fmt, sizeof(fmt),
> +		       param_working, sizeof(param_working));
> +
> +	return 0;
> +}
>
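
For contrast with the BPF behavior described above, here is a minimal sketch of the baseline that the reproducer expects: on an ordinary hosted C toolchain, where the linker and loader resolve every relocation, the one-line initializer and the field-by-field assignment yield identical arrays. The function name `demo_init_forms_agree` and the variable names are made up for illustration; this is plain user-space C, not a BPF program, so it does not reproduce the bug itself.

```c
#include <assert.h>
#include <stdint.h>

static char demo_str1[] = "STR1";
static char demo_str2[] = "STR2";

/*
 * Returns 1 when both initialization styles store the same two non-NULL
 * pointer values. Hypothetical helper, for illustration only.
 */
static int demo_init_forms_agree(void)
{
	/* One-line initializer: the form that ends up NULL in the BPF report. */
	uint64_t one_liner[] = {
		(uint64_t)(uintptr_t)demo_str1,
		(uint64_t)(uintptr_t)demo_str2,
	};
	/* Field-by-field assignment: the form that works in the BPF report. */
	uint64_t by_field[2];

	by_field[0] = (uint64_t)(uintptr_t)demo_str1;
	by_field[1] = (uint64_t)(uintptr_t)demo_str2;

	assert(sizeof(one_liner) == sizeof(by_field));

	return one_liner[0] != 0 && one_liner[1] != 0 &&
	       one_liner[0] == by_field[0] && one_liner[1] == by_field[1];
}
```

In the BPF case only the second form is safe for now, because the first one may be turned into a .rodata constant whose relocations are left unapplied.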
On 3/9/21 7:43 PM, Yonghong Song wrote:
> On 3/9/21 5:54 PM, Florent Revest wrote:
>> I noticed that initializing an array of pointers using this syntax:
>> __u64 array[] = { (__u64)&var1, (__u64)&var2 };
>> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
>> always results in array[0] and array[1] being NULL.
[...]
> I *guess* that it works in the macro due to some kind of heuristics,
> e.g., nested blocks, etc, and llvm did not promote the array init value
> to rodata. I will double check whether llvm can completely prevent
> such transformation.
>
> Maybe in the future libbpf is able to handle relocations for
> rodata section too. But for the time being, please just consider to use
> either macro, or the explicit array assignment.

Digging into the compiler: it tries to move a *const* initial value
into the rodata section if the initial value size > 64, so in this case
the macro does not work either. I think this is how you discovered the
issue. LLVM does not provide target hooks to influence this
transformation.

So, there are two workarounds:
(1). __u64 param_working[2];
     param_working[0] = (__u64)str1;
     param_working[1] = (__u64)str2;
(2). BPF_SEQ_PRINTF(seq, "%s ", str1);
     BPF_SEQ_PRINTF(seq, "%s", str2);

In practice, if you have at least one non-const format argument,
you should be fine. But if all format arguments are constant, then
none of them should be strings.

Maybe we could change the macro's
    unsigned long long ___param[] = { args };
to declare an array explicitly and then have a loop to
assign each array element?

[...]
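
The macro change suggested above (declare the array, then assign each element separately) could look roughly like the following sketch. All `DEMO_*` names are hypothetical and exist only for this illustration; this is not the macro from bpf_tracing.h, it is limited to four arguments, and whether clang's BPF backend leaves such stores alone is an assumption that would need checking against the generated object file.

```c
#include <assert.h>
#include <stdint.h>

/* Count 1..4 variadic arguments (hypothetical helper macros). */
#define DEMO_NARG_(_1, _2, _3, _4, N, ...) N
#define DEMO_NARG(...) DEMO_NARG_(__VA_ARGS__, 4, 3, 2, 1, 0)
#define DEMO_CAT_(a, b) a##b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* Widen a pointer or integer argument to a 64-bit slot value. */
#define DEMO_U64(x) ((uint64_t)(uintptr_t)(x))

/* One explicit store per argument instead of a brace initializer. */
#define DEMO_FILL_1(arr, p, x) (arr)[p] = DEMO_U64(x)
#define DEMO_FILL_2(arr, p, x, ...) \
	(arr)[p] = DEMO_U64(x); DEMO_FILL_1(arr, (p) + 1, __VA_ARGS__)
#define DEMO_FILL_3(arr, p, x, ...) \
	(arr)[p] = DEMO_U64(x); DEMO_FILL_2(arr, (p) + 1, __VA_ARGS__)
#define DEMO_FILL_4(arr, p, x, ...) \
	(arr)[p] = DEMO_U64(x); DEMO_FILL_3(arr, (p) + 1, __VA_ARGS__)
#define DEMO_FILL(arr, ...) \
	do { \
		DEMO_CAT(DEMO_FILL_, DEMO_NARG(__VA_ARGS__))(arr, 0, __VA_ARGS__); \
	} while (0)

/* Declare an uninitialized array of the right size, then fill it. */
#define DEMO_DECLARE_PARAMS(name, ...) \
	uint64_t name[DEMO_NARG(__VA_ARGS__)]; \
	DEMO_FILL(name, __VA_ARGS__)

static char demo_str1[] = "STR1";
static char demo_str2[] = "STR2";

/* Returns 1 if the macro stored both pointers and the integer in order. */
static int demo_fill_matches(void)
{
	DEMO_DECLARE_PARAMS(params, demo_str1, 42, demo_str2);

	assert(sizeof(params) / sizeof(params[0]) == 3);

	return params[0] == DEMO_U64(demo_str1) &&
	       params[1] == 42 &&
	       params[2] == DEMO_U64(demo_str2);
}
```

The unrolled per-argument macros stand in for the "loop" mentioned in the email; a run-time loop would need the values in an array already, which is exactly the initializer this tries to avoid.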
On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song <yhs@fb.com> wrote:
> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > On 3/9/21 5:54 PM, Florent Revest wrote:
[...]
> > So in the elf file, you will see the following relocations:
> >
> > RELOCATION RECORDS FOR [.rodata]:
> > OFFSET           TYPE        VALUE
> > 0000000000000018 R_BPF_64_64 .data
> > 0000000000000020 R_BPF_64_64 .data

Right :) Thank you for the explanations Yonghong!

> > Currently, libbpf does not handle relocation inside .rodata
> > section, so their content remains 0.

Just for my own edification, why is .rodata relocation not yet handled
in libbpf? Is it because of a read-only mapping that makes it more
difficult?

[...]
> Digging into the compiler, the compiler tries to make *const* initial
> value into rodata section if the initial value size > 64, so in
> this case, macro does not work either. I think this is how you
> discovered the issue.

Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how
I found the bug.

> The llvm does not provide target hooks to
> influence this transformation.

Oh, that is unfortunate :) Thanks for looking into it! I feel that the
real fix would be in libbpf anyway and the rest is just workarounds.

> So, there are two workarounds,
> (1). __u64 param_working[2];
>      param_working[0] = (__u64)str1;
>      param_working[1] = (__u64)str2;
> (2). BPF_SEQ_PRINTF(seq, "%s ", str1);
>      BPF_SEQ_PRINTF(seq, "%s", str2);

(2) is a bit impractical for my actual use case. I am implementing a
bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep
the selftest short with a few BPF_SNPRINTF() calls that exercise most
format specifiers.

> In practice, if you have at least one non-const format argument,
> you should be fine. But if all format arguments are constant, then
> none of them should be strings.

Just for context, this does not only happen for strings but also for
all sorts of pointers, for example, when I try to do address lookup of
global __ksym variables, which is important for my selftest.

> Maybe we could change macro
>     unsigned long long ___param[] = { args };
> to declare an array explicitly and then have a loop to
> assign each array element?

I think this would be a good workaround for now, indeed. :) I'll look
into it today and send it as part of my bpf_snprintf series.

Thanks!
On 3/10/21 3:48 AM, Florent Revest wrote:
> On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song <yhs@fb.com> wrote:
>> On 3/9/21 7:43 PM, Yonghong Song wrote:
[...]
>>> Currently, libbpf does not handle relocation inside .rodata
>>> section, so their content remains 0.
>
> Just for my own edification, why is .rodata relocation not yet handled
> in libbpf ? Is it because of a read-only mapping that makes it more
> difficult ?

We have not had this use case before. In general, people do not put
string pointers in the initializer of a declaration. I think
bpf_seq_printf() is special in this respect, and hence it triggers
the issue.

To support relocation of the rodata section, the kernel needs to be
involved, and this is actually more complicated because
the relocation is against the .data section. Two issues the kernel
needs to deal with:
   - The .data section will be another map in the kernel, i.e., this is
     a relocation of a .rodata map value against another map.
   - The .data section may be modified; some protection might
     be needed to prevent this. We may ignore this requirement
     since user space may have a similar issue.

This is a corner case. If we can work around it in libbpf (in
this particular case, in bpf_tracing.h), I think that will be
good enough, without adding further complexity in the kernel for
such a corner case.

[...]
>> The llvm does not provide target hooks to
>> influence this transformation.
>
> Oh, that is unfortunate :) Thanks for looking into it! I feel that the
> real fix would be in libbpf anyway and the rest is just workarounds.

The real fix will need both libbpf and the kernel.

>> So, there are two workarounds,
>> (1). __u64 param_working[2];
>>      param_working[0] = (__u64)str1;
>>      param_working[1] = (__u64)str2;
>> (2). BPF_SEQ_PRINTF(seq, "%s ", str1);
>>      BPF_SEQ_PRINTF(seq, "%s", str2);
>
> (2) is a bit impractical for my actual usecase. I am implementing a
> bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep
> the selftest short with a few BPF_SNPRINTF() calls that exercise most
> format specifiers.
>
>> In practice, if you have at least one non-const format argument,
>> you should be fine. But if all format arguments are constant, then
>> none of them should be strings.
>
> Just for context, this does not only happen for strings but also for
> all sorts of pointers, for example, when I try to do address lookup of
> global __ksym variables, which is important for my selftest.

Currently, in bpf_seq_printf(), we do a memory copy for strings
and certain ipv4/ipv6 addresses. ipv4 is not an issue, as the compiler
is less likely to put it into rodata. For ipv6,
if it is a constant, we can just put it directly into the format
string. For many other sorts of pointers, we just print the pointer
values. I don't see value in printing a pointer value for something like
    static const param[] = { &str1, &str2 };
    bpf_seq_printf(seq, "%px\n", param[0]);

The global __ksym variable cannot be pointing to rodata at compile time,
so it should be fine.

>> Maybe we could change macro
>>     unsigned long long ___param[] = { args };
>> to declare an array explicitly and then have a loop to
>> assign each array element?
>
> I think this would be a good workaround for now, indeed. :) I'll look
> into it today and send it as part of my bpf_snprintf series.

If we can make it work, that will be great! Thanks for working on this.

>
> Thanks!
>
On Wed, Mar 10, 2021 at 8:59 AM Yonghong Song <yhs@fb.com> wrote:
> On 3/10/21 3:48 AM, Florent Revest wrote:
[...]
> > Just for my own edification, why is .rodata relocation not yet handled
> > in libbpf ? Is it because of a read-only mapping that makes it more
> > difficult ?
>
> We don't have this use case before. In general, people do not put
> string pointers in init code in the declaration. I think
> bpf_seq_printf() is special about this and hence triggering
> the issue.
>
> To support relocation of rodata section, kernel needs to be
> involved and this is actually more complicated as

Exactly. It would be trivial for libbpf to support it, but it needs to
resolve to the actual in-kernel address of a map (plus offset), which
libbpf has no way of knowing.

> the relocation is against .data section. Two issues the kernel
> needs to deal with:
>    - .data section will be another map in kernel, so i.e.,
>      relocation of .rodata map value against another map.
>    - .data section may be modified, some protection might
>      be needed to prevent this. We may ignore this requirement
>      since user space may have similar issue.
>
> This is a corner case, if we can workaround in the libbpf, in
> this particular case, bpf_tracing.h. I think it will be
> good enough, not adding further complexity in kernel for
> such a corner case.

Is there some way to trick the compiler into thinking that those values
are not constant? Some volatile and pointers game? Or any other magic?

[...]
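
To make the point about unresolved addresses concrete, here is a toy model of what a loader does with relocation records like the two R_BPF_64_64 entries quoted in this thread. It is only a sketch: `struct toy_reloc`, `toy_apply_relocs` and `toy_read_slot` are invented for illustration and have nothing to do with libbpf's real data structures. The model shows that a slot keeps its zero placeholder whenever the loader cannot learn the run-time address of the target section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy record: "store (target base + addend) in the 64-bit slot at offset". */
struct toy_reloc {
	size_t offset;   /* byte offset of the slot inside the section image */
	uint64_t addend; /* offset of the referenced object in the target section */
};

/*
 * Patch every slot named by a relocation record. When base_known is 0 the
 * loader has no address to write, so the slots are left untouched.
 * Returns the number of slots that were patched.
 */
static size_t toy_apply_relocs(unsigned char *image, size_t image_len,
			       const struct toy_reloc *relocs, size_t n,
			       int base_known, uint64_t target_base)
{
	size_t patched = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t value;

		assert(relocs[i].offset + sizeof(value) <= image_len);
		if (!base_known)
			continue; /* slot keeps its zero placeholder */

		value = target_base + relocs[i].addend;
		memcpy(image + relocs[i].offset, &value, sizeof(value));
		patched++;
	}
	return patched;
}

/* Read back one 64-bit slot from the section image. */
static uint64_t toy_read_slot(const unsigned char *image, size_t offset)
{
	uint64_t value;

	memcpy(&value, image + offset, sizeof(value));
	return value;
}
```

With a zero-filled image and records at offsets 0x18 and 0x20, calling `toy_apply_relocs` with `base_known == 0` leaves both slots at 0, which corresponds to the NULL pointers seen by bpf_seq_printf in the report.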
On Wed, Mar 10, 2021 at 12:12 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Mar 10, 2021 at 8:59 AM Yonghong Song <yhs@fb.com> wrote:
[...]
> > To support relocation of rodata section, kernel needs to be
> > involved and this is actually more complicated as
>
> Exactly. It would be trivial for libbpf to support it, but it needs to
> resolve to the actual in-kernel address of a map (plus offset), which
> libbpf has no way of knowing.

Having said that, libbpf should probably error out when such a
relocation is present, because there is no way an application with
such relocations is going to be correct.

> > the relocation is against .data section. Two issues the kernel
> > needs to deal with:
> >    - .data section will be another map in kernel, so i.e.,
> >      relocation of .rodata map value against another map.
> >    - .data section may be modified, some protection might
> >      be needed to prevent this. We may ignore this requirement
> >      since user space may have similar issue.
> >
> > This is a corner case, if we can workaround in the libbpf, in
> > this particular case, bpf_tracing.h. I think it will be
> > good enough, not adding further complexity in kernel for
> > such a corner case.
>
> Is there some way to trick compiler into thinking that those values
> are not constant? Some volatile and pointers game?
Or any other magic? > > > > > > > > > >>> That is why you see the issue with pointer as NULL. > > >>> > > >>> With array size of 1, compiler does not bother to put it into > > >>> rodata section. > > >>> > > >>> I *guess* that it works in the macro due to some kind of heuristics, > > >>> e.g., nested blocks, etc, and llvm did not promote the array init value > > >>> to rodata. I will double check whether llvm can complete prevent > > >>> such transformation. > > >>> > > >>> Maybe in the future libbpf is able to handle relocations for > > >>> rodata section too. But for the time being, please just consider to use > > >>> either macro, or the explicit array assignment. > > >> > > >> Digging into the compiler, the compiler tries to make *const* initial > > >> value into rodata section if the initial value size > 64, so in > > >> this case, macro does not work either. I think this is how you > > >> discovered the issue. > > > > > > Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how > > > I found the bug. > > > > > >> The llvm does not provide target hooks to > > >> influence this transformation. > > > > > > Oh, that is unfortunate :) Thanks for looking into it! I feel that the > > > real fix would be in libbpf anyway and the rest is just workarounds. > > > > The real fix will need libbpf and kernel. > > > > > > > >> So, there are two workarounds, > > >> (1). __u64 param_working[2]; > > >> param_working[0] = (__u64)str1; > > >> param_working[1] = (__u64)str2; > > >> (2). BPF_SEQ_PRINTF(seq, "%s ", str1); > > >> BPF_SEQ_PRINTF(seq, "%s", str2); > > > > > > (2) is a bit impractical for my actual usecase. I am implementing a > > > bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep > > > the selftest short with a few BPF_SNPRINTF() calls that exercise most > > > format specifiers. > > > > > >> In practice, if you have at least one non-const format argument, > > >> you should be fine. 
But if all format arguments are constant, then > > >> none of them should be strings. > > > > > > Just for context, this does not only happen for strings but also for > > > all sorts of pointers, for example, when I try to do address lookup of > > > global __ksym variables, which is important for my selftest. > > > > Currently, in bpf_seq_printf(), we do memory copy for string > > and certain ipv4/ipv6 addresses. ipv4 is not an issue as the compiler > > less likely put it into rodata. for ipv6, > > if it is a constant, we can just directly put it into the format > > string. For many other sort of pointers, we just print pointer > > values, I don't see a value to print pointer value for something like > > static const param[] = { &str1, &str2 }; > > bpf_seq_printf(seq, "%px\n", param[0]); > > > > The global __ksym variable cannot be pointing to rodata at compile time, > > so it should be fine. > > > > > > > >> Maybe we could change marco > > >> unsigned long long ___param[] = { args }; > > >> to declare an array explicitly and then have a loop to > > >> assign each array element? > > > > > > I think this would be a good workaround for now, indeed. :) I'll look > > > into it today and send it as part of my bpf_snprintf series. > > > > If we can make it work, that will be great! thanks for working on this. > > > > > > > > Thanks! > > >
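For illustration only, the workaround quoted above (declare the array explicitly, then assign each element) could be wrapped in a variadic macro roughly like the sketch below. All names here (FILL_PARAMS, FILL1..FILL3, NARG, CAT, fill_demo) are made up for this example and are not the libbpf macros. The sketch builds on a normal host compiler and only shows the macro mechanics; it does not by itself prove what clang emits for the BPF target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these are libbpf macros. */

/* Count 1 to 3 variadic arguments. */
#define NARG_(_1, _2, _3, N, ...) N
#define NARG(...) NARG_(__VA_ARGS__, 3, 2, 1, 0)

/* Paste two tokens after expanding them. */
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)

/* Emit one assignment statement per argument instead of a brace
 * initializer, so there is no constant aggregate initializer in
 * the source at all. */
#define FILL1(arr, i, x) \
	(arr)[i] = (unsigned long long)(uintptr_t)(x)
#define FILL2(arr, i, x, ...) \
	FILL1(arr, i, x); FILL1(arr, (i) + 1, __VA_ARGS__)
#define FILL3(arr, i, x, ...) \
	FILL1(arr, i, x); FILL2(arr, (i) + 1, __VA_ARGS__)

/* Dispatch to FILL1/FILL2/FILL3 based on the argument count. */
#define FILL_PARAMS(arr, ...) \
	do { CAT(FILL, NARG(__VA_ARGS__))(arr, 0, __VA_ARGS__); } while (0)

/* Host-side check: both slots hold the addresses passed in. */
static int fill_demo(void)
{
	static char str1[] = "STR1";
	static char str2[] = "STR2";
	unsigned long long params[2];

	FILL_PARAMS(params, str1, str2);
	assert(params[0] == (unsigned long long)(uintptr_t)str1);
	assert(params[1] == (unsigned long long)(uintptr_t)str2);
	return params[0] != 0 && params[1] != 0;
}
```

In a BPF program the filled array would then be handed to bpf_seq_printf() in place of the brace-initialized one, as in workaround (1) quoted above. Supporting more arguments means adding further FILLn steps and extending NARG accordingly.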
On Wed, Mar 10, 2021 at 10:51 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Mar 10, 2021 at 12:12 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Wed, Mar 10, 2021 at 8:59 AM Yonghong Song <yhs@fb.com> wrote:
> > > On 3/10/21 3:48 AM, Florent Revest wrote:
> > > > On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song <yhs@fb.com> wrote:
> > > >> On 3/9/21 7:43 PM, Yonghong Song wrote:
> > > >>> On 3/9/21 5:54 PM, Florent Revest wrote:
> > > >>>> I noticed that initializing an array of pointers using this syntax:
> > > >>>>   __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> > > >>>> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
> > > >>>> always results in array[0] and array[1] being NULL.
> > > >>>>
> > > >>>> Interestingly, if the array is only initialized with one pointer, ex:
> > > >>>>   __u64 array[] = { (__u64)&var1 };
> > > >>>> Then array[0] will not be NULL.
> > > >>>>
> > > >>>> Or if the array is initialized field by field, ex:
> > > >>>>   __u64 array[2];
> > > >>>>   array[0] = (__u64)&var1;
> > > >>>>   array[1] = (__u64)&var2;
> > > >>>> Then array[0] and array[1] will not be NULL either.
> > > >>>>
> > > >>>> I'm assuming that this should have something to do with relocations
> > > >>>> and might be a bug in clang or in libbpf but because I don't know much
> > > >>>> about these, I thought that reporting could be a good first step. :)
> > > >>>
> > > >>> Thanks for reporting. What you guess is correct, this is due to
> > > >>> relocations :-(
> > > >>>
> > > >>> The compiler notoriously tend to put complex initial values into
> > > >>> rodata section. For example, for
> > > >>>   __u64 array[] = { (__u64)&var1, (__u64)&var2 };
> > > >>> the compiler will put
> > > >>>   { (__u64)&var1, (__u64)&var2 }
> > > >>> into rodata section.
> > > >>>
> > > >>> But &var1 and &var2 themselves need relocation since they are
> > > >>> address of static variables which will sit inside .data section.
> > > >>>
> > > >>> So in the elf file, you will see the following relocations:
> > > >>>
> > > >>> RELOCATION RECORDS FOR [.rodata]:
> > > >>> OFFSET           TYPE        VALUE
> > > >>> 0000000000000018 R_BPF_64_64 .data
> > > >>> 0000000000000020 R_BPF_64_64 .data
> > > >
> > > > Right :) Thank you for the explanations Yonghong!
> > > >
> > > >>> Currently, libbpf does not handle relocation inside .rodata
> > > >>> section, so they content remains 0.
> > > >
> > > > Just for my own edification, why is .rodata relocation not yet handled
> > > > in libbpf ? Is it because of a read-only mapping that makes it more
> > > > difficult ?
> > >
> > > We don't have this use case before. In general, people do not put
> > > string pointers in init code in the declaration. I think
> > > bpf_seq_printf() is special about this and hence triggering
> > > the issue.

Fair enough, the only reasonable usecase that I can think of is a
selftest like the one I wrote for bpf_snprintf and the macro in
bpf_tracing.h will be a good enough workaround for that.

> > > To support relocation of rodata section, kernel needs to be
> > > involved and this is actually more complicated as
> >
> > Exactly. It would be trivial for libbpf to support it, but it needs to
> > resolve to the actual in-kernel address of a map (plus offset), which
> > libbpf has no way of knowing.

Ah right, I see now, thanks! Indeed this would be quite complex and
probably not very useful.

> Having said that, libbpf should probably error out when such
> relocation is present, because there is no way the application with
> such relocations is going to be correct.

Good point, it would have helped me notice the problem earlier. :)
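To make the "error out when such relocation is present" idea concrete, here is a minimal host-side sketch. It is not libbpf code: struct reloc_rec and first_rodata_reloc() are hypothetical names, and the records are filled in by hand from the objdump output quoted in this thread, whereas a real loader would read them from the ELF relocation sections.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified relocation record; a real loader would
 * read these from the ELF relocation sections instead. */
struct reloc_rec {
	const char *patched_sec;   /* section whose bytes get patched */
	unsigned long long offset; /* offset inside patched_sec */
	const char *target_sec;    /* section the address points into */
};

/* Return the index of the first relocation that patches .rodata
 * contents, or -1 if there is none. A loader that cannot resolve
 * such relocations could refuse the object when this returns >= 0,
 * rather than silently leaving the slots as zero. */
static int first_rodata_reloc(const struct reloc_rec *recs, size_t n)
{
	assert(recs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(recs[i].patched_sec, ".rodata") == 0)
			return (int)i;
	}
	return -1;
}
```

Fed with the two records from the thread (offsets 0x18 and 0x20 in .rodata, both pointing into .data), the function returns 0, so the object would be rejected at load time. The two offsets are 8 bytes apart, which matches two consecutive __u64 slots of the promoted array initializer.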
diff --git a/tools/testing/selftests/bpf/prog_tests/bug.c b/tools/testing/selftests/bpf/prog_tests/bug.c
new file mode 100644
index 000000000000..4b0fafd936b7
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bug.c
@@ -0,0 +1,41 @@
+#include <test_progs.h>
+#include "test_bug.skel.h"
+
+static int duration;
+
+void test_bug(void)
+{
+	struct test_bug *skel;
+	struct bpf_link *link;
+	char buf[64] = {};
+	int iter_fd, len;
+
+	skel = test_bug__open_and_load();
+	if (CHECK(!skel, "test_bug__open_and_load",
+		  "skeleton open_and_load failed\n"))
+		goto destroy;
+
+	link = bpf_program__attach_iter(skel->progs.bug, NULL);
+	if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n"))
+		goto destroy;
+
+	iter_fd = bpf_iter_create(bpf_link__fd(link));
+	if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n"))
+		goto free_link;
+
+	len = read(iter_fd, buf, sizeof(buf));
+	CHECK(len < 0, "read", "read failed: %s\n", strerror(errno));
+	// BUG: We expect the strings to be printed in both cases but only the
+	// second case works.
+	// actual 'str1= str2= str1=STR1 str2=STR2 '
+	// != expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '
+	ASSERT_STREQ(buf, "str1=STR1 str2=STR2 str1=STR1 str2=STR2 ", "read");
+
+	close(iter_fd);
+
+free_link:
+	bpf_link__destroy(link);
+destroy:
+	test_bug__destroy(skel);
+}
+
diff --git a/tools/testing/selftests/bpf/progs/test_bug.c b/tools/testing/selftests/bpf/progs/test_bug.c
new file mode 100644
index 000000000000..c41e69483785
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_bug.c
@@ -0,0 +1,43 @@
+#include "bpf_iter.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("iter/task")
+int bug(struct bpf_iter__task *ctx)
+{
+	struct seq_file *seq = ctx->meta->seq;
+
+	/* We want to print two strings */
+	static const char fmt[] = "str1=%s str2=%s ";
+	static char str1[] = "STR1";
+	static char str2[] = "STR2";
+
+	/*
+	 * Because bpf_seq_printf takes parameters to its format specifiers in
+	 * an array, we need to stuff pointers to str1 and str2 in a u64 array.
+	 */
+
+	/* First, we try a one-liner array initialization. Note that this is
+	 * what the BPF_SEQ_PRINTF macro does under the hood. */
+	__u64 param_not_working[] = { (__u64)str1, (__u64)str2 };
+	/* But we also try a field by field initialization of the array. We
+	 * would expect the arrays and the behavior to be exactly the same. */
+	__u64 param_working[2];
+	param_working[0] = (__u64)str1;
+	param_working[1] = (__u64)str2;
+
+	/* For convenience, only print once */
+	if (ctx->meta->seq_num != 0)
+		return 0;
+
+	/* Using the one-liner array of params, it does not print the strings */
+	bpf_seq_printf(seq, fmt, sizeof(fmt),
+		       param_not_working, sizeof(param_not_working));
+	/* Using the field-by-field array of params, it prints the strings */
+	bpf_seq_printf(seq, fmt, sizeof(fmt),
+		       param_working, sizeof(param_working));
+
+	return 0;
+}
I noticed that initializing an array of pointers using this syntax:
  __u64 array[] = { (__u64)&var1, (__u64)&var2 };
(which is a fairly common operation with macros such as BPF_SEQ_PRINTF)
always results in array[0] and array[1] being NULL.

Interestingly, if the array is only initialized with one pointer, ex:
  __u64 array[] = { (__u64)&var1 };
Then array[0] will not be NULL.

Or if the array is initialized field by field, ex:
  __u64 array[2];
  array[0] = (__u64)&var1;
  array[1] = (__u64)&var2;
Then array[0] and array[1] will not be NULL either.

I'm assuming that this should have something to do with relocations
and might be a bug in clang or in libbpf but because I don't know much
about these, I thought that reporting could be a good first step. :)

I attached below a repro with a dummy selftest that I expect should pass
but fails to pass with the latest clang and bpf-next. Hopefully, the
logic should be simple: I try to print two strings from pointers in an
array using bpf_seq_printf but depending on how the array is initialized
the helper either receives the string pointers or NULL pointers:

test_bug:FAIL:read unexpected read:
  actual 'str1= str2= str1=STR1 str2=STR2 ' !=
  expected 'str1=STR1 str2=STR2 str1=STR1 str2=STR2 '

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/testing/selftests/bpf/prog_tests/bug.c | 41 +++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_bug.c | 43 ++++++++++++++++++++
 2 files changed, 84 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bug.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_bug.c