[bpf] bpf: add schedule points in batch ops

Message ID 20220217181902.808742-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit 75134f16e7dd0007aa474b281935c5f42e79f2c8
Delegated to: BPF
Headers show
Series [bpf] bpf: add schedule points in batch ops

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 12 this patch: 12
netdev/cc_maintainers fail 1 blamed authors not CCed: yhs@fb.com; 6 maintainers not CCed: andrii@kernel.org kpsingh@kernel.org john.fastabend@gmail.com kafai@fb.com songliubraving@fb.com yhs@fb.com
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 17 this patch: 17
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 21 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-PR pending PR summary
bpf/vmtest-bpf pending VM_Test

Commit Message

Eric Dumazet Feb. 17, 2022, 6:19 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

syzbot reported various soft lockups caused by bpf batch operations.

 INFO: task kworker/1:1:27 blocked for more than 140 seconds.
 INFO: task hung in rcu_barrier

Nothing prevents batch ops from processing huge amounts of data,
so we need to add schedule points in them.
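
As a rough illustration, the delete-batch loop has the shape sketched
below (a simplified sketch, not verbatim kernel source), with the new
schedule point at the end of each iteration:

	for (cp = 0; cp < max_count; cp++) {
		err = -EFAULT;
		if (copy_from_user(key, keys + cp * map->key_size,
				   map->key_size))
			break;

		/* Per-element work: may take locks and wait for BPF programs. */
		err = map->ops->map_delete_elem(map, key);
		maybe_wait_bpf_programs(map);
		if (err)
			break;

		/* New: yield the CPU so a huge batch cannot soft lockup. */
		cond_resched();
	}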

Note that maybe_wait_bpf_programs(map) calls from
generic_map_delete_batch() can be factorized by moving
the call after the loop.

This will be done later in the -next tree once we get this fix merged,
unless there is a strong opinion in favor of doing this optimization sooner.
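
A rough sketch of that possible follow-up (not part of this patch) is:

	for (cp = 0; cp < max_count; cp++) {
		err = -EFAULT;
		if (copy_from_user(key, keys + cp * map->key_size,
				   map->key_size))
			break;

		err = map->ops->map_delete_elem(map, key);
		if (err)
			break;
		cond_resched();
	}
	/* Wait once for the whole batch instead of once per element. */
	maybe_wait_bpf_programs(map);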

Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops")
Fixes: cb4d03ab499d ("bpf: Add generic support for lookup batch op")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Vazquez <brianvv@google.com>
Cc: Stanislav Fomichev <sdf@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 kernel/bpf/syscall.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Stanislav Fomichev Feb. 17, 2022, 6:36 p.m. UTC | #1
On 02/17, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>

> syzbot reported various soft lockups caused by bpf batch operations.

>   INFO: task kworker/1:1:27 blocked for more than 140 seconds.
>   INFO: task hung in rcu_barrier

> Nothing prevents batch ops to process huge amount of data,
> we need to add schedule points in them.

> Note that maybe_wait_bpf_programs(map) calls from
> generic_map_delete_batch() can be factorized by moving
> the call after the loop.

> This will be done later in -next tree once we get this fix merged,
> unless there is strong opinion doing this optimization sooner.

> Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops")
> Fixes: cb4d03ab499d ("bpf: Add generic support for lookup batch op")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Brian Vazquez <brianvv@google.com>
> Cc: Stanislav Fomichev <sdf@google.com>

Looks good, thank you!

Reviewed-by: Stanislav Fomichev <sdf@google.com>

> Reported-by: syzbot <syzkaller@googlegroups.com>
> ---
>   kernel/bpf/syscall.c | 3 +++
>   1 file changed, 3 insertions(+)

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index fa4505f9b6119bcb219ab9733847a98da65d1b21..ca70fe6fba387937dfb54f10826f19ac55a8a8e7 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1355,6 +1355,7 @@ int generic_map_delete_batch(struct bpf_map *map,
>   		maybe_wait_bpf_programs(map);
>   		if (err)
>   			break;
> +		cond_resched();
>   	}
>   	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
>   		err = -EFAULT;
> @@ -1412,6 +1413,7 @@ int generic_map_update_batch(struct bpf_map *map,

>   		if (err)
>   			break;
> +		cond_resched();
>   	}

>   	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
> @@ -1509,6 +1511,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>   		swap(prev_key, key);
>   		retry = MAP_LOOKUP_RETRIES;
>   		cp++;
> +		cond_resched();
>   	}

>   	if (err == -EFAULT)
> --
> 2.35.1.265.g69c8d7142f-goog
Brian Vazquez Feb. 17, 2022, 6:37 p.m. UTC | #2
Acked-by: Brian Vazquez <brianvv@google.com>


On Thu, Feb 17, 2022 at 10:19 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> syzbot reported various soft lockups caused by bpf batch operations.
>
>  INFO: task kworker/1:1:27 blocked for more than 140 seconds.
>  INFO: task hung in rcu_barrier
>
> Nothing prevents batch ops to process huge amount of data,
> we need to add schedule points in them.
>
> Note that maybe_wait_bpf_programs(map) calls from
> generic_map_delete_batch() can be factorized by moving
> the call after the loop.
>
> This will be done later in -next tree once we get this fix merged,
> unless there is strong opinion doing this optimization sooner.
>
> Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops")
> Fixes: cb4d03ab499d ("bpf: Add generic support for lookup batch op")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Brian Vazquez <brianvv@google.com>
> Cc: Stanislav Fomichev <sdf@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> ---
>  kernel/bpf/syscall.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index fa4505f9b6119bcb219ab9733847a98da65d1b21..ca70fe6fba387937dfb54f10826f19ac55a8a8e7 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1355,6 +1355,7 @@ int generic_map_delete_batch(struct bpf_map *map,
>                 maybe_wait_bpf_programs(map);
>                 if (err)
>                         break;
> +               cond_resched();
>         }
>         if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
>                 err = -EFAULT;
> @@ -1412,6 +1413,7 @@ int generic_map_update_batch(struct bpf_map *map,
>
>                 if (err)
>                         break;
> +               cond_resched();
>         }
>
>         if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
> @@ -1509,6 +1511,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>                 swap(prev_key, key);
>                 retry = MAP_LOOKUP_RETRIES;
>                 cp++;
> +               cond_resched();
>         }
>
>         if (err == -EFAULT)
> --
> 2.35.1.265.g69c8d7142f-goog
>
patchwork-bot+netdevbpf@kernel.org Feb. 17, 2022, 7 p.m. UTC | #3
Hello:

This patch was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 17 Feb 2022 10:19:02 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> syzbot reported various soft lockups caused by bpf batch operations.
> 
>  INFO: task kworker/1:1:27 blocked for more than 140 seconds.
>  INFO: task hung in rcu_barrier
> 
> [...]

Here is the summary with links:
  - [bpf] bpf: add schedule points in batch ops
    https://git.kernel.org/bpf/bpf/c/75134f16e7dd

You are awesome, thank you!

Patch

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fa4505f9b6119bcb219ab9733847a98da65d1b21..ca70fe6fba387937dfb54f10826f19ac55a8a8e7 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1355,6 +1355,7 @@  int generic_map_delete_batch(struct bpf_map *map,
 		maybe_wait_bpf_programs(map);
 		if (err)
 			break;
+		cond_resched();
 	}
 	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
 		err = -EFAULT;
@@ -1412,6 +1413,7 @@  int generic_map_update_batch(struct bpf_map *map,
 
 		if (err)
 			break;
+		cond_resched();
 	}
 
 	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
@@ -1509,6 +1511,7 @@  int generic_map_lookup_batch(struct bpf_map *map,
 		swap(prev_key, key);
 		retry = MAP_LOOKUP_RETRIES;
 		cp++;
+		cond_resched();
 	}
 
 	if (err == -EFAULT)
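
For context on what drives these loops, userspace typically issues batch
operations through libbpf; the sketch below (assuming libbpf >= 0.7 for
bpf_map_create(), with error handling omitted) shows how a single
bpf_map_delete_batch() call can ask the kernel to walk a very large number
of elements, which is exactly the loop that now calls cond_resched():

	/* Sketch only: drive large batch operations from userspace via libbpf. */
	#include <bpf/bpf.h>
	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		__u32 max_entries = 1u << 20;	/* 1M elements */
		__u32 count = max_entries;
		int fd = bpf_map_create(BPF_MAP_TYPE_HASH, "big_map",
					sizeof(__u32), sizeof(__u64),
					max_entries, NULL);
		__u32 *keys = calloc(max_entries, sizeof(*keys));
		__u64 *vals = calloc(max_entries, sizeof(*vals));

		for (__u32 i = 0; i < max_entries; i++) {
			keys[i] = i;
			vals[i] = i;
		}

		/* One syscall inserts up to 1M elements... */
		bpf_map_update_batch(fd, keys, vals, &count, NULL);

		/* ...and one syscall deletes them again; the kernel-side loop
		 * handling this request is where the schedule points now sit. */
		count = max_entries;
		bpf_map_delete_batch(fd, keys, &count, NULL);

		printf("batch processed %u elements\n", count);
		return 0;
	}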