Message ID | 20221126105351.2578782-2-hengqi.chen@gmail.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | BPF |
Headers | show |
Series | Check timer_off for map_in_map only when map value has timer | expand |
On 11/26/22 2:53 AM, Hengqi Chen wrote: > The timer_off value could be -EINVAL or -ENOENT when map value of > inner map is struct and contains no bpf_timer. The EINVAL case happens > when the map is created without BTF key/value info, map->timer_off > is set to -EINVAL in map_create(). The ENOENT case happens when > the map is created with BTF key/value info (e.g. from BPF skeleton), > map->timer_off is set to -ENOENT as what btf_find_timer() returns. > In bpf_map_meta_equal(), we expect timer_off to be equal even if > map value does not contains bpf_timer. This rejects map_in_map created > with BTF key/value info to be updated using inner map without BTF > key/value info in case inner map value is struct. This commit lifts > such restriction. > > Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > --- > kernel/bpf/map_in_map.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c > index 135205d0d560..0840872de486 100644 > --- a/kernel/bpf/map_in_map.c > +++ b/kernel/bpf/map_in_map.c > @@ -80,11 +80,18 @@ void bpf_map_meta_free(struct bpf_map *map_meta) > bool bpf_map_meta_equal(const struct bpf_map *meta0, > const struct bpf_map *meta1) > { > + bool timer_off_equal; > + > + if (!map_value_has_timer(meta0) && !map_value_has_timer(meta1)) > + timer_off_equal = true; > + else > + timer_off_equal = meta0->timer_off == meta1->timer_off; > + Is it possible we assign -1 to meta->timer_off directly instead of -EINVAL or -ENOENT, to indicate it does not exist? This will make this and possible other future timer_off comparison much easier? > /* No need to compare ops because it is covered by map_type */ > return meta0->map_type == meta1->map_type && > meta0->key_size == meta1->key_size && > meta0->value_size == meta1->value_size && > - meta0->timer_off == meta1->timer_off && > + timer_off_equal && > meta0->map_flags == meta1->map_flags && > bpf_map_equal_kptr_off_tab(meta0, meta1); > } > -- > 2.34.1
On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > The timer_off value could be -EINVAL or -ENOENT when map value of > inner map is struct and contains no bpf_timer. The EINVAL case happens > when the map is created without BTF key/value info, map->timer_off > is set to -EINVAL in map_create(). The ENOENT case happens when > the map is created with BTF key/value info (e.g. from BPF skeleton), > map->timer_off is set to -ENOENT as what btf_find_timer() returns. > In bpf_map_meta_equal(), we expect timer_off to be equal even if > map value does not contains bpf_timer. This rejects map_in_map created > with BTF key/value info to be updated using inner map without BTF > key/value info in case inner map value is struct. This commit lifts > such restriction. Sorry, but I prefer to label this issue as 'wont-fix'. Mixing BTF enabled and non-BTF inner maps is a corner case that is not worth fixing. At some point we will require all programs and maps to contain BTF. It's necessary for introspection. The maps as blobs of data should not be used. Much so adding support for mixed use as inner maps.
Hi, Yonghong: On 2022/11/27 11:21, Yonghong Song wrote: > > > On 11/26/22 2:53 AM, Hengqi Chen wrote: >> The timer_off value could be -EINVAL or -ENOENT when map value of >> inner map is struct and contains no bpf_timer. The EINVAL case happens >> when the map is created without BTF key/value info, map->timer_off >> is set to -EINVAL in map_create(). The ENOENT case happens when >> the map is created with BTF key/value info (e.g. from BPF skeleton), >> map->timer_off is set to -ENOENT as what btf_find_timer() returns. >> In bpf_map_meta_equal(), we expect timer_off to be equal even if >> map value does not contains bpf_timer. This rejects map_in_map created >> with BTF key/value info to be updated using inner map without BTF >> key/value info in case inner map value is struct. This commit lifts >> such restriction. >> >> Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") >> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> >> --- >> kernel/bpf/map_in_map.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c >> index 135205d0d560..0840872de486 100644 >> --- a/kernel/bpf/map_in_map.c >> +++ b/kernel/bpf/map_in_map.c >> @@ -80,11 +80,18 @@ void bpf_map_meta_free(struct bpf_map *map_meta) >> bool bpf_map_meta_equal(const struct bpf_map *meta0, >> const struct bpf_map *meta1) >> { >> + bool timer_off_equal; >> + >> + if (!map_value_has_timer(meta0) && !map_value_has_timer(meta1)) >> + timer_off_equal = true; >> + else >> + timer_off_equal = meta0->timer_off == meta1->timer_off; >> + > > Is it possible we assign -1 to meta->timer_off directly instead of > -EINVAL or -ENOENT, to indicate it does not exist? This will make > this and possible other future timer_off comparison much easier? > These error codes are checked in verifier, so didn't touch it. >> /* No need to compare ops because it is covered by map_type */ >> return meta0->map_type == meta1->map_type && >> meta0->key_size == meta1->key_size && >> meta0->value_size == meta1->value_size && >> - meta0->timer_off == meta1->timer_off && >> + timer_off_equal && >> meta0->map_flags == meta1->map_flags && >> bpf_map_equal_kptr_off_tab(meta0, meta1); >> } >> -- >> 2.34.1
Hi, Alexei: On 2022/11/28 08:44, Alexei Starovoitov wrote: > On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: >> >> The timer_off value could be -EINVAL or -ENOENT when map value of >> inner map is struct and contains no bpf_timer. The EINVAL case happens >> when the map is created without BTF key/value info, map->timer_off >> is set to -EINVAL in map_create(). The ENOENT case happens when >> the map is created with BTF key/value info (e.g. from BPF skeleton), >> map->timer_off is set to -ENOENT as what btf_find_timer() returns. >> In bpf_map_meta_equal(), we expect timer_off to be equal even if >> map value does not contains bpf_timer. This rejects map_in_map created >> with BTF key/value info to be updated using inner map without BTF >> key/value info in case inner map value is struct. This commit lifts >> such restriction. > > Sorry, but I prefer to label this issue as 'wont-fix'. > Mixing BTF enabled and non-BTF inner maps is a corner case We do have such usecase. The BPF progs and maps are pinned to bpffs using BPF object file. And the map_in_map is updated by some other process which don't have access to such BTF info. > that is not worth fixing. Is there a way to get this fixed for v5.x series only ? > At some point we will require all programs and maps to contain BTF. > It's necessary for introspection. We don't care much about BTF for introspection. In production, we always have a version field and some reserved fields in the map value for backward compatibility. The interpretation of such map values are left to upper layer. > The maps as blobs of data should not be used. > Much so adding support for mixed use as inner maps.
On Sun, Nov 27, 2022 at 6:42 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > Hi, Alexei: > > On 2022/11/28 08:44, Alexei Starovoitov wrote: > > On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > >> > >> The timer_off value could be -EINVAL or -ENOENT when map value of > >> inner map is struct and contains no bpf_timer. The EINVAL case happens > >> when the map is created without BTF key/value info, map->timer_off > >> is set to -EINVAL in map_create(). The ENOENT case happens when > >> the map is created with BTF key/value info (e.g. from BPF skeleton), > >> map->timer_off is set to -ENOENT as what btf_find_timer() returns. > >> In bpf_map_meta_equal(), we expect timer_off to be equal even if > >> map value does not contains bpf_timer. This rejects map_in_map created > >> with BTF key/value info to be updated using inner map without BTF > >> key/value info in case inner map value is struct. This commit lifts > >> such restriction. > > > > Sorry, but I prefer to label this issue as 'wont-fix'. > > Mixing BTF enabled and non-BTF inner maps is a corner case > > We do have such usecase. The BPF progs and maps are pinned to bpffs > using BPF object file. And the map_in_map is updated by some other > process which don't have access to such BTF info. > > > that is not worth fixing. > > Is there a way to get this fixed for v5.x series only ? > > > At some point we will require all programs and maps to contain BTF. > > It's necessary for introspection. > > We don't care much about BTF for introspection. In production, we always > have a version field and some reserved fields in the map value for backward > compatibility. The interpretation of such map values are left to upper layer. That "interpretation of such map values are left to upper layer"... is exactly the reason why we will enforce BTF in the future. Production engineers and people outside of "upper layer" sw team has to be able to debug maps and progs.
On 2022/11/28 10:49, Alexei Starovoitov wrote: > On Sun, Nov 27, 2022 at 6:42 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: >> >> Hi, Alexei: >> >> On 2022/11/28 08:44, Alexei Starovoitov wrote: >>> On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: >>>> >>>> The timer_off value could be -EINVAL or -ENOENT when map value of >>>> inner map is struct and contains no bpf_timer. The EINVAL case happens >>>> when the map is created without BTF key/value info, map->timer_off >>>> is set to -EINVAL in map_create(). The ENOENT case happens when >>>> the map is created with BTF key/value info (e.g. from BPF skeleton), >>>> map->timer_off is set to -ENOENT as what btf_find_timer() returns. >>>> In bpf_map_meta_equal(), we expect timer_off to be equal even if >>>> map value does not contains bpf_timer. This rejects map_in_map created >>>> with BTF key/value info to be updated using inner map without BTF >>>> key/value info in case inner map value is struct. This commit lifts >>>> such restriction. >>> >>> Sorry, but I prefer to label this issue as 'wont-fix'. >>> Mixing BTF enabled and non-BTF inner maps is a corner case >> >> We do have such usecase. The BPF progs and maps are pinned to bpffs >> using BPF object file. And the map_in_map is updated by some other >> process which don't have access to such BTF info. >> >>> that is not worth fixing. >> >> Is there a way to get this fixed for v5.x series only ? >> >>> At some point we will require all programs and maps to contain BTF. >>> It's necessary for introspection. >> >> We don't care much about BTF for introspection. In production, we always >> have a version field and some reserved fields in the map value for backward >> compatibility. The interpretation of such map values are left to upper layer. > > That "interpretation of such map values are left to upper layer"... > is exactly the reason why we will enforce BTF in the future. > Production engineers and people outside of "upper layer" sw team > has to be able to debug maps and progs. Fine. In libbpf, we have: if (is_inner) { pr_warn("map '%s': inner def can't be pinned.\n", map_name); return -EINVAL; } Can we lift this restriction so that we can have an easy way to access BTF info via pinned map ?
On Sun, Nov 27, 2022 at 7:07 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > On 2022/11/28 10:49, Alexei Starovoitov wrote: > > On Sun, Nov 27, 2022 at 6:42 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > >> > >> Hi, Alexei: > >> > >> On 2022/11/28 08:44, Alexei Starovoitov wrote: > >>> On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > >>>> > >>>> The timer_off value could be -EINVAL or -ENOENT when map value of > >>>> inner map is struct and contains no bpf_timer. The EINVAL case happens > >>>> when the map is created without BTF key/value info, map->timer_off > >>>> is set to -EINVAL in map_create(). The ENOENT case happens when > >>>> the map is created with BTF key/value info (e.g. from BPF skeleton), > >>>> map->timer_off is set to -ENOENT as what btf_find_timer() returns. > >>>> In bpf_map_meta_equal(), we expect timer_off to be equal even if > >>>> map value does not contains bpf_timer. This rejects map_in_map created > >>>> with BTF key/value info to be updated using inner map without BTF > >>>> key/value info in case inner map value is struct. This commit lifts > >>>> such restriction. > >>> > >>> Sorry, but I prefer to label this issue as 'wont-fix'. > >>> Mixing BTF enabled and non-BTF inner maps is a corner case > >> > >> We do have such usecase. The BPF progs and maps are pinned to bpffs > >> using BPF object file. And the map_in_map is updated by some other > >> process which don't have access to such BTF info. > >> > >>> that is not worth fixing. > >> > >> Is there a way to get this fixed for v5.x series only ? > >> > >>> At some point we will require all programs and maps to contain BTF. > >>> It's necessary for introspection. > >> > >> We don't care much about BTF for introspection. In production, we always > >> have a version field and some reserved fields in the map value for backward > >> compatibility. The interpretation of such map values are left to upper layer. > > > > That "interpretation of such map values are left to upper layer"... > > is exactly the reason why we will enforce BTF in the future. > > Production engineers and people outside of "upper layer" sw team > > has to be able to debug maps and progs. > > Fine. > > In libbpf, we have: > > if (is_inner) { > pr_warn("map '%s': inner def can't be pinned.\n", map_name); > return -EINVAL; > } > > > Can we lift this restriction so that we can have an easy way to access BTF info > via pinned map ? Probably. Note that __uint(pinning, LIBBPF_PIN_BY_NAME) is the only mode libbpf understands. It's simplistic. but why do you want to use that mode? Just pin it directly with bpf_map__pin() ? Or even more low level bpf_obj_pin() ?
On 2022/11/28 11:14, Alexei Starovoitov wrote: > On Sun, Nov 27, 2022 at 7:07 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: >> >> >> >> On 2022/11/28 10:49, Alexei Starovoitov wrote: >>> On Sun, Nov 27, 2022 at 6:42 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: >>>> >>>> Hi, Alexei: >>>> >>>> On 2022/11/28 08:44, Alexei Starovoitov wrote: >>>>> On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: >>>>>> >>>>>> The timer_off value could be -EINVAL or -ENOENT when map value of >>>>>> inner map is struct and contains no bpf_timer. The EINVAL case happens >>>>>> when the map is created without BTF key/value info, map->timer_off >>>>>> is set to -EINVAL in map_create(). The ENOENT case happens when >>>>>> the map is created with BTF key/value info (e.g. from BPF skeleton), >>>>>> map->timer_off is set to -ENOENT as what btf_find_timer() returns. >>>>>> In bpf_map_meta_equal(), we expect timer_off to be equal even if >>>>>> map value does not contains bpf_timer. This rejects map_in_map created >>>>>> with BTF key/value info to be updated using inner map without BTF >>>>>> key/value info in case inner map value is struct. This commit lifts >>>>>> such restriction. >>>>> >>>>> Sorry, but I prefer to label this issue as 'wont-fix'. >>>>> Mixing BTF enabled and non-BTF inner maps is a corner case >>>> >>>> We do have such usecase. The BPF progs and maps are pinned to bpffs >>>> using BPF object file. And the map_in_map is updated by some other >>>> process which don't have access to such BTF info. >>>> >>>>> that is not worth fixing. >>>> >>>> Is there a way to get this fixed for v5.x series only ? >>>> >>>>> At some point we will require all programs and maps to contain BTF. >>>>> It's necessary for introspection. >>>> >>>> We don't care much about BTF for introspection. In production, we always >>>> have a version field and some reserved fields in the map value for backward >>>> compatibility. The interpretation of such map values are left to upper layer. >>> >>> That "interpretation of such map values are left to upper layer"... >>> is exactly the reason why we will enforce BTF in the future. >>> Production engineers and people outside of "upper layer" sw team >>> has to be able to debug maps and progs. >> >> Fine. >> >> In libbpf, we have: >> >> if (is_inner) { >> pr_warn("map '%s': inner def can't be pinned.\n", map_name); >> return -EINVAL; >> } >> >> >> Can we lift this restriction so that we can have an easy way to access BTF info >> via pinned map ? > > Probably. Note that __uint(pinning, LIBBPF_PIN_BY_NAME) > is the only mode libbpf understands. It's simplistic. > but why do you want to use that mode? > Just pin it directly with bpf_map__pin() ? > Or even more low level bpf_obj_pin() ? Will try. Currently, we use `__uint(pinning, LIBBPF_PIN_BY_NAME)` and let libbpf and Cilium's ebpf go library handle all the pinning jobs.
On Sun, Nov 27, 2022 at 6:42 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > Hi, Alexei: > > On 2022/11/28 08:44, Alexei Starovoitov wrote: > > On Sat, Nov 26, 2022 at 2:54 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > >> > >> The timer_off value could be -EINVAL or -ENOENT when map value of > >> inner map is struct and contains no bpf_timer. The EINVAL case happens > >> when the map is created without BTF key/value info, map->timer_off > >> is set to -EINVAL in map_create(). The ENOENT case happens when > >> the map is created with BTF key/value info (e.g. from BPF skeleton), > >> map->timer_off is set to -ENOENT as what btf_find_timer() returns. > >> In bpf_map_meta_equal(), we expect timer_off to be equal even if > >> map value does not contains bpf_timer. This rejects map_in_map created > >> with BTF key/value info to be updated using inner map without BTF > >> key/value info in case inner map value is struct. This commit lifts > >> such restriction. > > > > Sorry, but I prefer to label this issue as 'wont-fix'. > > Mixing BTF enabled and non-BTF inner maps is a corner case > > We do have such usecase. The BPF progs and maps are pinned to bpffs > using BPF object file. And the map_in_map is updated by some other > process which don't have access to such BTF info. > > > that is not worth fixing. > > Is there a way to get this fixed for v5.x series only ? > > > At some point we will require all programs and maps to contain BTF. > > It's necessary for introspection. > > We don't care much about BTF for introspection. In production, we always > have a version field and some reserved fields in the map value for backward > compatibility. The interpretation of such map values are left to upper layer. All the BTF stuff aside, wouldn't this be the best and most minimal fix? It seems to define correct semantic meaning: no timer is found (because no BTF in this case). Easy to backport, solves the immediate problem. This code seems to be completely reworked in bpf-next, though, so I don't know what the situation is there. diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7b373a5e861f..9e38cc1e136c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1118,7 +1118,7 @@ static int map_create(union bpf_attr *attr) spin_lock_init(&map->owner.lock); map->spin_lock_off = -EINVAL; - map->timer_off = -EINVAL; + map->timer_off = -ENOENT; if (attr->btf_key_type_id || attr->btf_value_type_id || /* Even the map's value is a kernel's struct, * the bpf_prog.o must have BTF to begin with > > > The maps as blobs of data should not be used. > > Much so adding support for mixed use as inner maps.
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c index 135205d0d560..0840872de486 100644 --- a/kernel/bpf/map_in_map.c +++ b/kernel/bpf/map_in_map.c @@ -80,11 +80,18 @@ void bpf_map_meta_free(struct bpf_map *map_meta) bool bpf_map_meta_equal(const struct bpf_map *meta0, const struct bpf_map *meta1) { + bool timer_off_equal; + + if (!map_value_has_timer(meta0) && !map_value_has_timer(meta1)) + timer_off_equal = true; + else + timer_off_equal = meta0->timer_off == meta1->timer_off; + /* No need to compare ops because it is covered by map_type */ return meta0->map_type == meta1->map_type && meta0->key_size == meta1->key_size && meta0->value_size == meta1->value_size && - meta0->timer_off == meta1->timer_off && + timer_off_equal && meta0->map_flags == meta1->map_flags && bpf_map_equal_kptr_off_tab(meta0, meta1); }
The timer_off value could be -EINVAL or -ENOENT when map value of inner map is struct and contains no bpf_timer. The EINVAL case happens when the map is created without BTF key/value info, map->timer_off is set to -EINVAL in map_create(). The ENOENT case happens when the map is created with BTF key/value info (e.g. from BPF skeleton), map->timer_off is set to -ENOENT as what btf_find_timer() returns. In bpf_map_meta_equal(), we expect timer_off to be equal even if map value does not contains bpf_timer. This rejects map_in_map created with BTF key/value info to be updated using inner map without BTF key/value info in case inner map value is struct. This commit lifts such restriction. Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> --- kernel/bpf/map_in_map.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) -- 2.34.1