diff mbox series

[v3,bpf-next,7/8] bpf: Allows per-cpu maps and map-in-map in sleepable programs

Message ID 20210209194856.24269-8-alexei.starovoitov@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series bpf: Misc improvements | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 8 maintainers not CCed: yhs@fb.com kafai@fb.com netdev@vger.kernel.org ast@kernel.org songliubraving@fb.com john.fastabend@gmail.com kpsingh@kernel.org andrii@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 33 this patch: 33
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 31 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 33 this patch: 33
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Alexei Starovoitov Feb. 9, 2021, 7:48 p.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

Since sleepable programs are now executing under migrate_disable
the per-cpu maps are safe to use.
The map-in-map were ok to use in sleepable from the time sleepable
progs were introduced.

Note that non-preallocated maps are still not safe, since there is
no rcu_read_lock yet in sleepable programs and dynamically allocated
map elements are relying on rcu protection. The sleepable programs
have rcu_read_lock_trace instead. That limitation will be addresses
in the future.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/hashtab.c  | 4 ++--
 kernel/bpf/verifier.c | 7 ++++++-
 2 files changed, 8 insertions(+), 3 deletions(-)

Comments

KP Singh Feb. 9, 2021, 9:12 p.m. UTC | #1
On Tue, Feb 9, 2021 at 9:57 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Since sleepable programs are now executing under migrate_disable
> the per-cpu maps are safe to use.
> The map-in-map were ok to use in sleepable from the time sleepable
> progs were introduced.
>
> Note that non-preallocated maps are still not safe, since there is
> no rcu_read_lock yet in sleepable programs and dynamically allocated
> map elements are relying on rcu protection. The sleepable programs
> have rcu_read_lock_trace instead. That limitation will be addresses
> in the future.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: KP Singh <kpsingh@kernel.org>

Thanks! I actually tested out some of our logic which uses per-cpu maps by
switching the programs to their sleepable counterparts
Alexei Starovoitov Feb. 9, 2021, 10:31 p.m. UTC | #2
On 2/9/21 1:12 PM, KP Singh wrote:
> On Tue, Feb 9, 2021 at 9:57 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> From: Alexei Starovoitov <ast@kernel.org>
>>
>> Since sleepable programs are now executing under migrate_disable
>> the per-cpu maps are safe to use.
>> The map-in-map were ok to use in sleepable from the time sleepable
>> progs were introduced.
>>
>> Note that non-preallocated maps are still not safe, since there is
>> no rcu_read_lock yet in sleepable programs and dynamically allocated
>> map elements are relying on rcu protection. The sleepable programs
>> have rcu_read_lock_trace instead. That limitation will be addresses
>> in the future.
>>
>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
> Acked-by: KP Singh <kpsingh@kernel.org>
>
> Thanks! I actually tested out some of our logic which uses per-cpu maps by
> switching the programs to their sleepable counterparts

You mean after applying this set, right?
migrate_disable is the key.
It will be difficult to backport to your kernels though.
The bpf change to enable per-cpu is easy, but backporting
sched support is a different game.
KP Singh Feb. 9, 2021, 10:43 p.m. UTC | #3
On Tue, Feb 9, 2021 at 11:32 PM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 2/9/21 1:12 PM, KP Singh wrote:
> > On Tue, Feb 9, 2021 at 9:57 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> From: Alexei Starovoitov <ast@kernel.org>
> >>
> >> Since sleepable programs are now executing under migrate_disable
> >> the per-cpu maps are safe to use.
> >> The map-in-map were ok to use in sleepable from the time sleepable
> >> progs were introduced.
> >>
> >> Note that non-preallocated maps are still not safe, since there is
> >> no rcu_read_lock yet in sleepable programs and dynamically allocated
> >> map elements are relying on rcu protection. The sleepable programs
> >> have rcu_read_lock_trace instead. That limitation will be addresses
> >> in the future.
> >>
> >> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > Acked-by: KP Singh <kpsingh@kernel.org>
> >
> > Thanks! I actually tested out some of our logic which uses per-cpu maps by
> > switching the programs to their sleepable counterparts
>
> You mean after applying this set, right?
> migrate_disable is the key.
> It will be difficult to backport to your kernels though.
> The bpf change to enable per-cpu is easy, but backporting
> sched support is a different game.
>

Yes after applying the whole set.

Also, I think I also got it to work on 5.10 by (I am little less sure
of this one though)

-  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=12fa97c64dce2f3c2e6eed5dc618bb9046e40bf0
-  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=74d862b682f51e45d25b95b1ecf212428a4967b0
- And, backporting this set (I initially missed
https://lore.kernel.org/bpf/20210209194856.24269-3-alexei.starovoitov@gmail.com
where you add the
  calls and ran into issues).

- KP
Alexei Starovoitov Feb. 9, 2021, 11:13 p.m. UTC | #4
On 2/9/21 2:43 PM, KP Singh wrote:
> On Tue, Feb 9, 2021 at 11:32 PM Alexei Starovoitov <ast@fb.com> wrote:
>>
>> On 2/9/21 1:12 PM, KP Singh wrote:
>>> On Tue, Feb 9, 2021 at 9:57 PM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> From: Alexei Starovoitov <ast@kernel.org>
>>>>
>>>> Since sleepable programs are now executing under migrate_disable
>>>> the per-cpu maps are safe to use.
>>>> The map-in-map were ok to use in sleepable from the time sleepable
>>>> progs were introduced.
>>>>
>>>> Note that non-preallocated maps are still not safe, since there is
>>>> no rcu_read_lock yet in sleepable programs and dynamically allocated
>>>> map elements are relying on rcu protection. The sleepable programs
>>>> have rcu_read_lock_trace instead. That limitation will be addresses
>>>> in the future.
>>>>
>>>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>>>
>>> Acked-by: KP Singh <kpsingh@kernel.org>
>>>
>>> Thanks! I actually tested out some of our logic which uses per-cpu maps by
>>> switching the programs to their sleepable counterparts
>>
>> You mean after applying this set, right?
>> migrate_disable is the key.
>> It will be difficult to backport to your kernels though.
>> The bpf change to enable per-cpu is easy, but backporting
>> sched support is a different game.
>>
> 
> Yes after applying the whole set.
> 
> Also, I think I also got it to work on 5.10 by (I am little less sure
> of this one though)
> 
> -  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=12fa97c64dce2f3c2e6eed5dc618bb9046e40bf0
> -  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=74d862b682f51e45d25b95b1ecf212428a4967b0
> - And, backporting this set (I initially missed
> https://lore.kernel.org/bpf/20210209194856.24269-3-alexei.starovoitov@gmail.com
> where you add the
>    calls and ran into issues).

and the whole machinery that it depends on.
KP Singh Feb. 9, 2021, 11:22 p.m. UTC | #5
On Wed, Feb 10, 2021 at 12:14 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 2/9/21 2:43 PM, KP Singh wrote:
> > On Tue, Feb 9, 2021 at 11:32 PM Alexei Starovoitov <ast@fb.com> wrote:
> >>
> >> On 2/9/21 1:12 PM, KP Singh wrote:
> >>> On Tue, Feb 9, 2021 at 9:57 PM Alexei Starovoitov
> >>> <alexei.starovoitov@gmail.com> wrote:
> >>>>
> >>>> From: Alexei Starovoitov <ast@kernel.org>
> >>>>
> >>>> Since sleepable programs are now executing under migrate_disable
> >>>> the per-cpu maps are safe to use.
> >>>> The map-in-map were ok to use in sleepable from the time sleepable
> >>>> progs were introduced.
> >>>>
> >>>> Note that non-preallocated maps are still not safe, since there is
> >>>> no rcu_read_lock yet in sleepable programs and dynamically allocated
> >>>> map elements are relying on rcu protection. The sleepable programs
> >>>> have rcu_read_lock_trace instead. That limitation will be addresses
> >>>> in the future.
> >>>>
> >>>> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> >>>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >>>
> >>> Acked-by: KP Singh <kpsingh@kernel.org>
> >>>
> >>> Thanks! I actually tested out some of our logic which uses per-cpu maps by
> >>> switching the programs to their sleepable counterparts
> >>
> >> You mean after applying this set, right?
> >> migrate_disable is the key.
> >> It will be difficult to backport to your kernels though.
> >> The bpf change to enable per-cpu is easy, but backporting
> >> sched support is a different game.
> >>
> >
> > Yes after applying the whole set.
> >
> > Also, I think I also got it to work on 5.10 by (I am little less sure
> > of this one though)
> >
> > -  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=12fa97c64dce2f3c2e6eed5dc618bb9046e40bf0
> > -  Backporting https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=74d862b682f51e45d25b95b1ecf212428a4967b0
> > - And, backporting this set (I initially missed
> > https://lore.kernel.org/bpf/20210209194856.24269-3-alexei.starovoitov@gmail.com
> > where you add the
> >    calls and ran into issues).
>
> and the whole machinery that it depends on.

 https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=12fa97c64dce2f3c2e6eed5dc618bb9046e40bf0
and https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=74d862b682f51e45d25b95b1ecf212428a4967b0
is the machinery I could find between 5.10 and now. But the backport
is not really relevant here, I just mentioned it to clarify that I was
testing the series more than just applying this single patch :)
diff mbox series

Patch

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index c1ac7f964bc9..d63912e73ad9 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1148,7 +1148,7 @@  static int __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
 		/* unknown flags */
 		return -EINVAL;
 
-	WARN_ON_ONCE(!rcu_read_lock_held());
+	WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
 
 	key_size = map->key_size;
 
@@ -1202,7 +1202,7 @@  static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 		/* unknown flags */
 		return -EINVAL;
 
-	WARN_ON_ONCE(!rcu_read_lock_held());
+	WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held());
 
 	key_size = map->key_size;
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4189edb41b73..9561f2af7710 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10020,9 +10020,14 @@  static int check_map_prog_compatibility(struct bpf_verifier_env *env,
 		case BPF_MAP_TYPE_HASH:
 		case BPF_MAP_TYPE_LRU_HASH:
 		case BPF_MAP_TYPE_ARRAY:
+		case BPF_MAP_TYPE_PERCPU_HASH:
+		case BPF_MAP_TYPE_PERCPU_ARRAY:
+		case BPF_MAP_TYPE_LRU_PERCPU_HASH:
+		case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+		case BPF_MAP_TYPE_HASH_OF_MAPS:
 			if (!is_preallocated_map(map)) {
 				verbose(env,
-					"Sleepable programs can only use preallocated hash maps\n");
+					"Sleepable programs can only use preallocated maps\n");
 				return -EINVAL;
 			}
 			break;