Message ID | 20210924095542.33697-5-lmb@cloudflare.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | Fix up bpf_jit_limit some more | expand |
Context | Check | Description |
---|---|---|
bpf/vmtest-bpf-next-PR | success | PR summary |
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for bpf-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 7 maintainers not CCed: kpsingh@kernel.org masahiroy@kernel.org john.fastabend@gmail.com yhs@fb.com dvyukov@google.com songliubraving@fb.com kafai@fb.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 3339 this patch: 3339 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 35 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 3451 this patch: 3451 |
netdev/header_inline | success | Link |
bpf/vmtest-bpf-next | success | VM_Test |
On 9/24/21 11:55 AM, Lorenz Bauer wrote: > Expose bpf_jit_current as a read only value via sysctl. > > Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> > --- > include/linux/filter.h | 1 + > kernel/bpf/core.c | 3 +-- > net/core/sysctl_net_core.c | 7 +++++++ > 3 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/include/linux/filter.h b/include/linux/filter.h > index ef03ff34234d..b2143ad5ce00 100644 > --- a/include/linux/filter.h > +++ b/include/linux/filter.h > @@ -1052,6 +1052,7 @@ extern int bpf_jit_harden; > extern int bpf_jit_kallsyms; > extern long bpf_jit_limit; > extern long bpf_jit_limit_max; > +extern atomic_long_t bpf_jit_current; > > typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size); > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c > index e844a2a4c99a..93f95e9ee8be 100644 > --- a/kernel/bpf/core.c > +++ b/kernel/bpf/core.c > @@ -525,6 +525,7 @@ int bpf_jit_kallsyms __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_DEFAULT_ON); > int bpf_jit_harden __read_mostly; > long bpf_jit_limit __read_mostly; > long bpf_jit_limit_max __read_mostly; > +atomic_long_t bpf_jit_current __read_mostly; > > static void > bpf_prog_ksym_set_addr(struct bpf_prog *prog) > @@ -800,8 +801,6 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog, > return slot; > } > > -static atomic_long_t bpf_jit_current; > - > /* Can be overridden by an arch's JIT compiler if it has a custom, > * dedicated BPF backend memory area, or if neither of the two > * below apply. > diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c > index 5f88526ad61c..674aac163b84 100644 > --- a/net/core/sysctl_net_core.c > +++ b/net/core/sysctl_net_core.c > @@ -421,6 +421,13 @@ static struct ctl_table net_core_table[] = { > .extra1 = &long_one, > .extra2 = &bpf_jit_limit_max, > }, > + { > + .procname = "bpf_jit_current", > + .data = &bpf_jit_current, > + .maxlen = sizeof(long), > + .mode = 0400, > + .proc_handler = proc_dolongvec_minmax_bpf_restricted, Overall series looks good to me. The only nit I would have is that the above could (in theory) be subject to atomic_long_t vs long type confusion. I would rather prefer to have a small handler which properly reads out the atomic_long_t and then passes it onwards as a temporary/plain long to user space. Thanks, Daniel > + }, > #endif > { > .procname = "netdev_tstamp_prequeue", >
On Mon, Sep 27, 2021 at 03:34 PM CEST, Daniel Borkmann wrote: > On 9/24/21 11:55 AM, Lorenz Bauer wrote: >> Expose bpf_jit_current as a read only value via sysctl. >> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> >> --- I find exposing stats via system configuration variables a bit unexpected. Not sure if there is any example today that we're following. Maybe an entry under /sys/kernel/debug would be a better fit? That way we don't have to commit to a sysctl that might go away if we start charging JIT allocs against memory cgroup quota. Although that brings up question against which cgroup iptables xt_bpf allocations should be charged? Root cgroup?
On Mon, 27 Sept 2021 at 15:01, Jakub Sitnicki <jakub@cloudflare.com> wrote: > > I find exposing stats via system configuration variables a bit > unexpected. Not sure if there is any example today that we're following. > > Maybe an entry under /sys/kernel/debug would be a better fit? > > That way we don't have to commit to a sysctl that might go away if we > start charging JIT allocs against memory cgroup quota. I had a look around, there are no other obvious places in debugfs or proc where we already have bpf info exposed. It currently all goes via sysctl. There are examples of readonly sysctls: $ sudo find /proc/sys -perm 0444 | wc -l 90 There are no examples of sysctls with mode 0400 however: $ sudo find /proc/sys -perm 0400 | wc -l 0 I find it kind of weird that the bpf sysctls are so tightly locked down (CAP_SYS_ADMIN && root) even for reading. Maybe something I can change?
Le 24/09/2021 à 11:55, Lorenz Bauer a écrit :
> Expose bpf_jit_current as a read only value via sysctl.
bpf_jit_current unit is 'pages' and bpf_jit_limit unit is bytes.
Maybe exposing those values with the same unit will ease debugging.
Regards,
Nicolas
diff --git a/include/linux/filter.h b/include/linux/filter.h index ef03ff34234d..b2143ad5ce00 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1052,6 +1052,7 @@ extern int bpf_jit_harden; extern int bpf_jit_kallsyms; extern long bpf_jit_limit; extern long bpf_jit_limit_max; +extern atomic_long_t bpf_jit_current; typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index e844a2a4c99a..93f95e9ee8be 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -525,6 +525,7 @@ int bpf_jit_kallsyms __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_DEFAULT_ON); int bpf_jit_harden __read_mostly; long bpf_jit_limit __read_mostly; long bpf_jit_limit_max __read_mostly; +atomic_long_t bpf_jit_current __read_mostly; static void bpf_prog_ksym_set_addr(struct bpf_prog *prog) @@ -800,8 +801,6 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog, return slot; } -static atomic_long_t bpf_jit_current; - /* Can be overridden by an arch's JIT compiler if it has a custom, * dedicated BPF backend memory area, or if neither of the two * below apply. diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 5f88526ad61c..674aac163b84 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -421,6 +421,13 @@ static struct ctl_table net_core_table[] = { .extra1 = &long_one, .extra2 = &bpf_jit_limit_max, }, + { + .procname = "bpf_jit_current", + .data = &bpf_jit_current, + .maxlen = sizeof(long), + .mode = 0400, + .proc_handler = proc_dolongvec_minmax_bpf_restricted, + }, #endif { .procname = "netdev_tstamp_prequeue",
Expose bpf_jit_current as a read only value via sysctl. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> --- include/linux/filter.h | 1 + kernel/bpf/core.c | 3 +-- net/core/sysctl_net_core.c | 7 +++++++ 3 files changed, 9 insertions(+), 2 deletions(-)