[bpf-next,v3] bpf: Refactor active lock management

Message ID	20241104151716.2079893-1-memxor@gmail.com (mailing list archive)
State	Superseded
Delegated to:	BPF
Headers	show Received: from mail-lf1-f65.google.com (mail-lf1-f65.google.com [209.85.167.65]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E2CD04A08 for <bpf@vger.kernel.org>; Mon, 4 Nov 2024 15:17:20 +0000 (UTC) From: Kumar Kartikeya Dwivedi <memxor@gmail.com> To: bpf@vger.kernel.org Cc: kkd@meta.com, Alexei Starovoitov <ast@kernel.org>, Andrii Nakryiko <andrii@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Martin KaFai Lau <martin.lau@kernel.org>, Eduard Zingerman <eddyz87@gmail.com>, kernel-team@fb.com Subject: [PATCH bpf-next v3] bpf: Refactor active lock management Date: Mon, 4 Nov 2024 07:17:16 -0800 Message-ID: <20241104151716.2079893-1-memxor@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[bpf-next,v3] bpf: Refactor active lock management \| expand [bpf-next,v3] bpf: Refactor active lock management

Message ID

20241104151716.2079893-1-memxor@gmail.com (mailing list archive)

State

Superseded

Delegated to:

BPF

Headers

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: kkd@meta.com,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	kernel-team@fb.com
Subject: [PATCH bpf-next v3] bpf: Refactor active lock management
Date: Mon,  4 Nov 2024 07:17:16 -0800
Message-ID: <20241104151716.2079893-1-memxor@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

[bpf-next,v3] bpf: Refactor active lock management | expand

Context	Check	Description
bpf/vmtest-bpf-next-PR	success	PR summary
bpf/vmtest-bpf-next-VM_Test-23	success	Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21	success	Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22	success	Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25	success	Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31	success	Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-24	success	Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-39	success	Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36	success	Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38	success	Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37	success	Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40	success	Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
netdev/series_format	success	Single patches do not need cover letters
netdev/tree_selection	success	Clearly marked for bpf-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 5 this patch: 5
netdev/build_tools	success	Errors and warnings before: 0 (+0) this patch: 0 (+0)
netdev/cc_maintainers	warning	8 maintainers not CCed: kpsingh@kernel.org martin.lau@linux.dev sdf@fomichev.me john.fastabend@gmail.com song@kernel.org jolsa@kernel.org haoluo@google.com yonghong.song@linux.dev
netdev/build_clang	success	Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 68 this patch: 68
netdev/checkpatch	warning	WARNING: line length of 102 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-3	success	Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-2	success	Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-0	success	Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1	success	Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5	success	Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-12	success	Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4	success	Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10	success	Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-9	success	Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6	success	Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7	success	Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8	success	Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26	success	Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-19	success	Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-17	success	Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-20	success	Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-16	success	Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11	success	Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-35	success	Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-28	success	Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-33	success	Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-41	success	Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-15	success	Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-18	success	Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27	success	Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34	success	Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-14	success	Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-13	success	Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-29	success	Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30	success	Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32	success	Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17

Context

Check

Description

bpf/vmtest-bpf-next-PR

success

PR summary

bpf/vmtest-bpf-next-VM_Test-23

success

Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-21

success

Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-22

success

Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-25

success

Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-31

success

Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17

bpf/vmtest-bpf-next-VM_Test-24

success

Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-39

success

Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18

bpf/vmtest-bpf-next-VM_Test-36

success

Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18

bpf/vmtest-bpf-next-VM_Test-38

success

Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18

bpf/vmtest-bpf-next-VM_Test-37

success

Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18

bpf/vmtest-bpf-next-VM_Test-40

success

Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18

netdev/series_format

success

Single patches do not need cover letters

netdev/tree_selection

success

Clearly marked for bpf-next

netdev/ynl

success

Generated files up to date; no warnings/errors; no diff in generated;

netdev/fixes_present

success

Fixes tag not required for -next series

netdev/header_inline

success

No static functions without inline keyword in header files

netdev/build_32bit

success

Errors and warnings before: 5 this patch: 5

netdev/build_tools

success

Errors and warnings before: 0 (+0) this patch: 0 (+0)

netdev/cc_maintainers

warning

8 maintainers not CCed: kpsingh@kernel.org martin.lau@linux.dev sdf@fomichev.me john.fastabend@gmail.com song@kernel.org jolsa@kernel.org haoluo@google.com yonghong.song@linux.dev

netdev/build_clang

success

Errors and warnings before: 3 this patch: 3

netdev/verify_signedoff

success

Signed-off-by tag matches author and committer

netdev/deprecated_api

success

None detected

netdev/check_selftest

success

No net selftest shell script

netdev/verify_fixes

success

No Fixes tag

netdev/build_allmodconfig_warn

success

Errors and warnings before: 68 this patch: 68

netdev/checkpatch

warning

WARNING: line length of 102 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns

netdev/build_clang_rust

success

No Rust files in patch. Skipping build

netdev/kdoc

success

Errors and warnings before: 0 this patch: 0

netdev/source_inline

success

Was 0 now: 0

bpf/vmtest-bpf-next-VM_Test-3

success

Logs for Validate matrix.py

bpf/vmtest-bpf-next-VM_Test-2

success

Logs for Unittests

bpf/vmtest-bpf-next-VM_Test-0

success

Logs for Lint

bpf/vmtest-bpf-next-VM_Test-1

success

Logs for ShellCheck

bpf/vmtest-bpf-next-VM_Test-5

success

Logs for aarch64-gcc / build-release

bpf/vmtest-bpf-next-VM_Test-12

success

Logs for s390x-gcc / build-release

bpf/vmtest-bpf-next-VM_Test-4

success

Logs for aarch64-gcc / build / build for aarch64 with gcc

bpf/vmtest-bpf-next-VM_Test-10

success

Logs for aarch64-gcc / veristat

bpf/vmtest-bpf-next-VM_Test-9

success

Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc

bpf/vmtest-bpf-next-VM_Test-6

success

Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc

bpf/vmtest-bpf-next-VM_Test-7

success

Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc

bpf/vmtest-bpf-next-VM_Test-8

success

Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc

bpf/vmtest-bpf-next-VM_Test-26

success

Logs for x86_64-llvm-17 / veristat

bpf/vmtest-bpf-next-VM_Test-19

success

Logs for x86_64-gcc / build-release

bpf/vmtest-bpf-next-VM_Test-17

success

Logs for set-matrix

bpf/vmtest-bpf-next-VM_Test-20

success

Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17

bpf/vmtest-bpf-next-VM_Test-16

success

Logs for s390x-gcc / veristat

bpf/vmtest-bpf-next-VM_Test-11

success

Logs for s390x-gcc / build / build for s390x with gcc

bpf/vmtest-bpf-next-VM_Test-35

success

Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2

bpf/vmtest-bpf-next-VM_Test-28

success

Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2

bpf/vmtest-bpf-next-VM_Test-33

success

Logs for x86_64-llvm-17 / veristat

bpf/vmtest-bpf-next-VM_Test-41

success

Logs for x86_64-llvm-18 / veristat

bpf/vmtest-bpf-next-VM_Test-15

success

Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc

bpf/vmtest-bpf-next-VM_Test-18

success

Logs for x86_64-gcc / build / build for x86_64 with gcc

bpf/vmtest-bpf-next-VM_Test-27

success

Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17

bpf/vmtest-bpf-next-VM_Test-34

success

Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18

bpf/vmtest-bpf-next-VM_Test-14

success

Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc

bpf/vmtest-bpf-next-VM_Test-13

success

Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc

bpf/vmtest-bpf-next-VM_Test-29

success

Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17

bpf/vmtest-bpf-next-VM_Test-30

success

Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17

bpf/vmtest-bpf-next-VM_Test-32

success

Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17

Commit Message

Kumar Kartikeya Dwivedi Nov. 4, 2024, 3:17 p.m. UTC

When bpf_spin_lock was introduced originally, there was deliberation on
whether to use an array of lock IDs, but since bpf_spin_lock is limited
to holding a single lock at any given time, we've been using a single ID
to identify the held lock.

In preparation for introducing spin locks that can be taken multiple
times, introduce support for acquiring multiple lock IDs. For this
purpose, reuse the acquired_refs array and store both lock and pointer
references. We tag the entry with REF_TYPE_PTR or REF_TYPE_BPF_LOCK to
disambiguate and find the relevant entry. The ptr field is used to track
the map_ptr or btf (for bpf_obj_new allocations) to ensure locks can be
matched with protected fields within the same "allocation", i.e.
bpf_obj_new object or map value.

The struct active_lock is changed to a boolean as the state is part of
the acquired_refs array, and we only need active_lock as a cheap way
of detecting lock presence.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
Changelog:
v2 -> v3
v2: https://lore.kernel.org/bpf/20241103212252.547071-1-memxor@gmail.com

 * Rebase on bpf-next to resolve merge conflict

v1 -> v2
v1: https://lore.kernel.org/bpf/20241103205856.345580-1-memxor@gmail.com

 * Fix refsafe state comparison to check callback_ref and ptr separately.
---
 include/linux/bpf_verifier.h |  34 ++++++---
 kernel/bpf/verifier.c        | 138 ++++++++++++++++++++++++++---------
 2 files changed, 126 insertions(+), 46 deletions(-)

--
2.43.5

Comments

Alexei Starovoitov Nov. 4, 2024, 7:33 p.m. UTC | #1

On Mon, Nov 4, 2024 at 7:17 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> When bpf_spin_lock was introduced originally, there was deliberation on
> whether to use an array of lock IDs, but since bpf_spin_lock is limited
> to holding a single lock at any given time, we've been using a single ID
> to identify the held lock.
>
> In preparation for introducing spin locks that can be taken multiple
> times, introduce support for acquiring multiple lock IDs. For this
> purpose, reuse the acquired_refs array and store both lock and pointer
> references. We tag the entry with REF_TYPE_PTR or REF_TYPE_BPF_LOCK to
> disambiguate and find the relevant entry. The ptr field is used to track
> the map_ptr or btf (for bpf_obj_new allocations) to ensure locks can be
> matched with protected fields within the same "allocation", i.e.
> bpf_obj_new object or map value.
>
> The struct active_lock is changed to a boolean as the state is part of
> the acquired_refs array, and we only need active_lock as a cheap way
> of detecting lock presence.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
> Changelog:
> v2 -> v3
> v2: https://lore.kernel.org/bpf/20241103212252.547071-1-memxor@gmail.com
>
>  * Rebase on bpf-next to resolve merge conflict
>
> v1 -> v2
> v1: https://lore.kernel.org/bpf/20241103205856.345580-1-memxor@gmail.com
>
>  * Fix refsafe state comparison to check callback_ref and ptr separately.
> ---
>  include/linux/bpf_verifier.h |  34 ++++++---
>  kernel/bpf/verifier.c        | 138 ++++++++++++++++++++++++++---------
>  2 files changed, 126 insertions(+), 46 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 4513372c5bc8..1e7e1803d78b 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -266,6 +266,10 @@ struct bpf_stack_state {
>  };
>
>  struct bpf_reference_state {
> +       /* Each reference object has a type. Ensure REF_TYPE_PTR is zero to
> +        * default to pointer reference on zero initialization of a state.
> +        */
> +       enum { REF_TYPE_PTR = 0, REF_TYPE_BPF_LOCK } type;

In the future commit two more values will be added, right?
And it may push it over line limit.
I think cleaner to do each value on separate line:
  enum ref_state_type {
    REF_TYPE_PTR = 0,
    REF_TYPE_BPF_LOCK
  } type;

easier to read and less churn.
Also name that enum type to use later.

Also I'd drop _BPF_ from the middle.
Just REF_TYPE_LOCK would do.

>         /* Track each reference created with a unique id, even if the same
>          * instruction creates the reference multiple times (eg, via CALL).
>          */
> @@ -274,17 +278,23 @@ struct bpf_reference_state {
>          * is used purely to inform the user of a reference leak.
>          */
>         int insn_idx;
> -       /* There can be a case like:
> -        * main (frame 0)
> -        *  cb (frame 1)
> -        *   func (frame 3)
> -        *    cb (frame 4)
> -        * Hence for frame 4, if callback_ref just stored boolean, it would be
> -        * impossible to distinguish nested callback refs. Hence store the
> -        * frameno and compare that to callback_ref in check_reference_leak when
> -        * exiting a callback function.
> -        */
> -       int callback_ref;
> +       union {
> +               /* There can be a case like:
> +                * main (frame 0)
> +                *  cb (frame 1)
> +                *   func (frame 3)
> +                *    cb (frame 4)
> +                * Hence for frame 4, if callback_ref just stored boolean, it would be
> +                * impossible to distinguish nested callback refs. Hence store the
> +                * frameno and compare that to callback_ref in check_reference_leak when
> +                * exiting a callback function.
> +                */
> +               int callback_ref;
> +               /* Use to keep track of the source object of a lock, to ensure
> +                * it matches on unlock.
> +                */
> +               void *ptr;
> +       };
>  };
>
>  struct bpf_retval_range {
> @@ -434,7 +444,7 @@ struct bpf_verifier_state {
>         u32 insn_idx;
>         u32 curframe;
>
> -       struct bpf_active_lock active_lock;

remove struct bpf_active_lock as well.

> +       bool active_lock;

In the next patch it becomes 'int',
so let's make it 'int' right away and move it to bpf_func_state
next to:
        int acquired_refs;
        struct bpf_reference_state *refs;

since it's counting locks in this array.
'bool' is just a weird from of 1 or 0.
So 'int' is cleaner in this patch too.

>         bool speculative;
>         bool active_rcu_lock;
>         u32 active_preempt_lock;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index ba800c7611e3..ea8ad320e6cc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1335,6 +1335,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
>         if (err)
>                 return err;
>         id = ++env->id_gen;
> +       state->refs[new_ofs].type = REF_TYPE_PTR;
>         state->refs[new_ofs].id = id;
>         state->refs[new_ofs].insn_idx = insn_idx;
>         state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
> @@ -1342,6 +1343,23 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
>         return id;
>  }
>
> +static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, int type, int id, void *ptr)

enum ref_state_type instead of 'int type' ?

> +{
> +       struct bpf_func_state *state = cur_func(env);
> +       int new_ofs = state->acquired_refs;
> +       int err;
> +
> +       err = resize_reference_state(state, state->acquired_refs + 1);
> +       if (err)
> +               return err;
> +       state->refs[new_ofs].type = type;
> +       state->refs[new_ofs].id = id;
> +       state->refs[new_ofs].insn_idx = insn_idx;
> +       state->refs[new_ofs].ptr = ptr;
> +
> +       return 0;
> +}
> +
>  /* release function corresponding to acquire_reference_state(). Idempotent. */
>  static int release_reference_state(struct bpf_func_state *state, int ptr_id)
>  {
> @@ -1349,6 +1367,8 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
>
>         last_idx = state->acquired_refs - 1;
>         for (i = 0; i < state->acquired_refs; i++) {
> +               if (state->refs[i].type != REF_TYPE_PTR)
> +                       continue;
>                 if (state->refs[i].id == ptr_id) {
>                         /* Cannot release caller references in callbacks */
>                         if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> @@ -1364,6 +1384,43 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
>         return -EINVAL;
>  }
>
> +static int release_lock_state(struct bpf_func_state *state, int type, int id, void *ptr)
> +{
> +       int i, last_idx;
> +
> +       last_idx = state->acquired_refs - 1;
> +       for (i = 0; i < state->acquired_refs; i++) {
> +               if (state->refs[i].type != type)
> +                       continue;
> +               if (state->refs[i].id == id && state->refs[i].ptr == ptr) {
> +                       if (last_idx && i != last_idx)
> +                               memcpy(&state->refs[i], &state->refs[last_idx],
> +                                      sizeof(*state->refs));
> +                       memset(&state->refs[last_idx], 0, sizeof(*state->refs));
> +                       state->acquired_refs--;
> +                       return 0;
> +               }
> +       }
> +       return -EINVAL;
> +}
> +
> +static struct bpf_reference_state *find_lock_state(struct bpf_verifier_env *env, int id, void *ptr)
> +{
> +       struct bpf_func_state *state = cur_func(env);
> +       int i;
> +
> +       for (i = 0; i < state->acquired_refs; i++) {
> +               struct bpf_reference_state *s = &state->refs[i];
> +
> +               if (s->type == REF_TYPE_PTR)
> +                       continue;

Let's pass 'enum ref_state_type type' and compare here
to make it similar to release_lock_state().
I think it will clean up future patches too.

> +
> +               if (s->id == id && s->ptr == ptr)
> +                       return s;
> +       }
> +       return NULL;
> +}
> +
>  static void free_func_state(struct bpf_func_state *state)
>  {
>         if (!state)
> @@ -1430,12 +1487,11 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
>                 dst_state->frame[i] = NULL;
>         }
>         dst_state->speculative = src->speculative;
> +       dst_state->active_lock = src->active_lock;
>         dst_state->active_rcu_lock = src->active_rcu_lock;
>         dst_state->active_preempt_lock = src->active_preempt_lock;
>         dst_state->in_sleepable = src->in_sleepable;
>         dst_state->curframe = src->curframe;
> -       dst_state->active_lock.ptr = src->active_lock.ptr;
> -       dst_state->active_lock.id = src->active_lock.id;
>         dst_state->branches = src->branches;
>         dst_state->parent = src->parent;
>         dst_state->first_insn_idx = src->first_insn_idx;
> @@ -5423,7 +5479,7 @@ static bool in_sleepable(struct bpf_verifier_env *env)
>  static bool in_rcu_cs(struct bpf_verifier_env *env)
>  {
>         return env->cur_state->active_rcu_lock ||
> -              env->cur_state->active_lock.ptr ||
> +              env->cur_state->active_lock ||
>                !in_sleepable(env);
>  }
>
> @@ -7698,6 +7754,7 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
>         struct bpf_map *map = NULL;
>         struct btf *btf = NULL;
>         struct btf_record *rec;
> +       int err;
>
>         if (!is_const) {
>                 verbose(env,
> @@ -7729,16 +7786,27 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
>                 return -EINVAL;
>         }
>         if (is_lock) {
> -               if (cur->active_lock.ptr) {
> +               void *ptr;
> +
> +               if (map)
> +                       ptr = map;
> +               else
> +                       ptr = btf;
> +
> +               if (cur->active_lock) {
>                         verbose(env,
>                                 "Locking two bpf_spin_locks are not allowed\n");
>                         return -EINVAL;
>                 }
> -               if (map)
> -                       cur->active_lock.ptr = map;
> -               else
> -                       cur->active_lock.ptr = btf;
> -               cur->active_lock.id = reg->id;
> +               err = acquire_lock_state(env, env->insn_idx, REF_TYPE_BPF_LOCK, reg->id, ptr);
> +               if (err < 0) {
> +                       verbose(env, "Failed to acquire lock state\n");
> +                       return err;
> +               }
> +               /* It is not safe to allow multiple bpf_spin_lock calls, so
> +                * disallow them until this lock has been unlocked.
> +                */
> +               cur->active_lock = true;
>         } else {
>                 void *ptr;
>
> @@ -7747,20 +7815,18 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
>                 else
>                         ptr = btf;
>
> -               if (!cur->active_lock.ptr) {
> +               if (!cur->active_lock) {
>                         verbose(env, "bpf_spin_unlock without taking a lock\n");
>                         return -EINVAL;
>                 }
> -               if (cur->active_lock.ptr != ptr ||
> -                   cur->active_lock.id != reg->id) {
> +
> +               if (release_lock_state(cur_func(env), REF_TYPE_BPF_LOCK, reg->id, ptr)) {
>                         verbose(env, "bpf_spin_unlock of different lock\n");
>                         return -EINVAL;
>                 }
>
>                 invalidate_non_owning_refs(env);
> -
> -               cur->active_lock.ptr = NULL;
> -               cur->active_lock.id = 0;
> +               cur->active_lock = false;
>         }
>         return 0;
>  }
> @@ -9818,7 +9884,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                 const char *sub_name = subprog_name(env, subprog);
>
>                 /* Only global subprogs cannot be called with a lock held. */
> -               if (env->cur_state->active_lock.ptr) {
> +               if (env->cur_state->active_lock) {
>                         verbose(env, "global function calls are not allowed while holding a lock,\n"
>                                      "use static function instead\n");
>                         return -EINVAL;
> @@ -10343,6 +10409,8 @@ static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exi
>                 return 0;
>
>         for (i = 0; i < state->acquired_refs; i++) {
> +               if (state->refs[i].type != REF_TYPE_PTR)
> +                       continue;

why ?
check_reference_leak() should complain about unreleased locks too.

>                 if (!exception_exit && state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
>                         continue;
>                 verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
> @@ -10356,7 +10424,7 @@ static int check_resource_leak(struct bpf_verifier_env *env, bool exception_exit
>  {
>         int err;
>
> -       if (check_lock && env->cur_state->active_lock.ptr) {
> +       if (check_lock && env->cur_state->active_lock) {
>                 verbose(env, "%s cannot be used inside bpf_spin_lock-ed region\n", prefix);
>                 return -EINVAL;
>         }
> @@ -11580,7 +11648,7 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state
>         struct bpf_verifier_state *state = env->cur_state;
>         struct btf_record *rec = reg_btf_record(reg);
>
> -       if (!state->active_lock.ptr) {
> +       if (!state->active_lock) {
>                 verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n");
>                 return -EFAULT;
>         }
> @@ -11677,6 +11745,7 @@ static int ref_convert_owning_non_owning(struct bpf_verifier_env *env, u32 ref_o
>   */
>  static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
> +       struct bpf_reference_state *s;
>         void *ptr;
>         u32 id;
>
> @@ -11693,10 +11762,10 @@ static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_
>         }
>         id = reg->id;
>
> -       if (!env->cur_state->active_lock.ptr)
> +       if (!env->cur_state->active_lock)
>                 return -EINVAL;
> -       if (env->cur_state->active_lock.ptr != ptr ||
> -           env->cur_state->active_lock.id != id) {
> +       s = find_lock_state(env, id, ptr);
> +       if (!s) {
>                 verbose(env, "held lock and object are not in the same allocation\n");
>                 return -EINVAL;
>         }
> @@ -17561,8 +17630,19 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
>                 return false;
>
>         for (i = 0; i < old->acquired_refs; i++) {
> -               if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap))
> +               if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap) ||
> +                   old->refs[i].type != cur->refs[i].type)
>                         return false;
> +               switch (old->refs[i].type) {
> +               case REF_TYPE_PTR:
> +                       if (old->refs[i].callback_ref != cur->refs[i].callback_ref)
> +                               return false;
> +                       break;
> +               default:
> +                       if (old->refs[i].ptr != cur->refs[i].ptr)
> +                               return false;
> +                       break;

I'd future proof a bit by explicitly handling all enum values
and WARN_ONCE in default.

pw-bot: cr

Kumar Kartikeya Dwivedi Nov. 4, 2024, 8:26 p.m. UTC | #2

On Mon, 4 Nov 2024 at 13:33, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 4, 2024 at 7:17 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >
> > When bpf_spin_lock was introduced originally, there was deliberation on
> > whether to use an array of lock IDs, but since bpf_spin_lock is limited
> > to holding a single lock at any given time, we've been using a single ID
> > to identify the held lock.
> >
> > In preparation for introducing spin locks that can be taken multiple
> > times, introduce support for acquiring multiple lock IDs. For this
> > purpose, reuse the acquired_refs array and store both lock and pointer
> > references. We tag the entry with REF_TYPE_PTR or REF_TYPE_BPF_LOCK to
> > disambiguate and find the relevant entry. The ptr field is used to track
> > the map_ptr or btf (for bpf_obj_new allocations) to ensure locks can be
> > matched with protected fields within the same "allocation", i.e.
> > bpf_obj_new object or map value.
> >
> > The struct active_lock is changed to a boolean as the state is part of
> > the acquired_refs array, and we only need active_lock as a cheap way
> > of detecting lock presence.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> > Changelog:
> > v2 -> v3
> > v2: https://lore.kernel.org/bpf/20241103212252.547071-1-memxor@gmail.com
> >
> >  * Rebase on bpf-next to resolve merge conflict
> >
> > v1 -> v2
> > v1: https://lore.kernel.org/bpf/20241103205856.345580-1-memxor@gmail.com
> >
> >  * Fix refsafe state comparison to check callback_ref and ptr separately.
> > ---
> >  include/linux/bpf_verifier.h |  34 ++++++---
> >  kernel/bpf/verifier.c        | 138 ++++++++++++++++++++++++++---------
> >  2 files changed, 126 insertions(+), 46 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 4513372c5bc8..1e7e1803d78b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -266,6 +266,10 @@ struct bpf_stack_state {
> >  };
> >
> >  struct bpf_reference_state {
> > +       /* Each reference object has a type. Ensure REF_TYPE_PTR is zero to
> > +        * default to pointer reference on zero initialization of a state.
> > +        */
> > +       enum { REF_TYPE_PTR = 0, REF_TYPE_BPF_LOCK } type;
>
> In the future commit two more values will be added, right?
> And it may push it over line limit.
> I think cleaner to do each value on separate line:
>   enum ref_state_type {
>     REF_TYPE_PTR = 0,
>     REF_TYPE_BPF_LOCK
>   } type;
>
> easier to read and less churn.
> Also name that enum type to use later.
>
> Also I'd drop _BPF_ from the middle.
> Just REF_TYPE_LOCK would do.

We're going to have a different lock type though.
But I can rename it when I add it later.

>
> >         /* Track each reference created with a unique id, even if the same
> >          * instruction creates the reference multiple times (eg, via CALL).
> >          */
> > @@ -274,17 +278,23 @@ struct bpf_reference_state {
> >          * is used purely to inform the user of a reference leak.
> >          */
> >         int insn_idx;
> > -       /* There can be a case like:
> > -        * main (frame 0)
> > -        *  cb (frame 1)
> > -        *   func (frame 3)
> > -        *    cb (frame 4)
> > -        * Hence for frame 4, if callback_ref just stored boolean, it would be
> > -        * impossible to distinguish nested callback refs. Hence store the
> > -        * frameno and compare that to callback_ref in check_reference_leak when
> > -        * exiting a callback function.
> > -        */
> > -       int callback_ref;
> > +       union {
> > +               /* There can be a case like:
> > +                * main (frame 0)
> > +                *  cb (frame 1)
> > +                *   func (frame 3)
> > +                *    cb (frame 4)
> > +                * Hence for frame 4, if callback_ref just stored boolean, it would be
> > +                * impossible to distinguish nested callback refs. Hence store the
> > +                * frameno and compare that to callback_ref in check_reference_leak when
> > +                * exiting a callback function.
> > +                */
> > +               int callback_ref;
> > +               /* Use to keep track of the source object of a lock, to ensure
> > +                * it matches on unlock.
> > +                */
> > +               void *ptr;
> > +       };
> >  };
> >
> >  struct bpf_retval_range {
> > @@ -434,7 +444,7 @@ struct bpf_verifier_state {
> >         u32 insn_idx;
> >         u32 curframe;
> >
> > -       struct bpf_active_lock active_lock;
>
> remove struct bpf_active_lock as well.
>

Ack.

> > +       bool active_lock;
>
> In the next patch it becomes 'int',
> so let's make it 'int' right away and move it to bpf_func_state
> next to:
>         int acquired_refs;
>         struct bpf_reference_state *refs;
>
> since it's counting locks in this array.
> 'bool' is just a weird from of 1 or 0.
> So 'int' is cleaner in this patch too.
>

Ack.

> >         bool speculative;
> >         bool active_rcu_lock;
> >         u32 active_preempt_lock;
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index ba800c7611e3..ea8ad320e6cc 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -1335,6 +1335,7 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
> >         if (err)
> >                 return err;
> >         id = ++env->id_gen;
> > +       state->refs[new_ofs].type = REF_TYPE_PTR;
> >         state->refs[new_ofs].id = id;
> >         state->refs[new_ofs].insn_idx = insn_idx;
> >         state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
> > @@ -1342,6 +1343,23 @@ static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
> >         return id;
> >  }
> >
> > +static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, int type, int id, void *ptr)
>
> enum ref_state_type instead of 'int type' ?
>

Ack, but eventually it may require passing OR of flags, which I
believe makes the compiler unhappy.
I can make that change later though.

> > +{
> > +       struct bpf_func_state *state = cur_func(env);
> > +       int new_ofs = state->acquired_refs;
> > +       int err;
> > +
> > +       err = resize_reference_state(state, state->acquired_refs + 1);
> > +       if (err)
> > +               return err;
> > +       state->refs[new_ofs].type = type;
> > +       state->refs[new_ofs].id = id;
> > +       state->refs[new_ofs].insn_idx = insn_idx;
> > +       state->refs[new_ofs].ptr = ptr;
> > +
> > +       return 0;
> > +}
> > +
> >  /* release function corresponding to acquire_reference_state(). Idempotent. */
> >  static int release_reference_state(struct bpf_func_state *state, int ptr_id)
> >  {
> > @@ -1349,6 +1367,8 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
> >
> >         last_idx = state->acquired_refs - 1;
> >         for (i = 0; i < state->acquired_refs; i++) {
> > +               if (state->refs[i].type != REF_TYPE_PTR)
> > +                       continue;
> >                 if (state->refs[i].id == ptr_id) {
> >                         /* Cannot release caller references in callbacks */
> >                         if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> > @@ -1364,6 +1384,43 @@ static int release_reference_state(struct bpf_func_state *state, int ptr_id)
> >         return -EINVAL;
> >  }
> >
> > +static int release_lock_state(struct bpf_func_state *state, int type, int id, void *ptr)
> > +{
> > +       int i, last_idx;
> > +
> > +       last_idx = state->acquired_refs - 1;
> > +       for (i = 0; i < state->acquired_refs; i++) {
> > +               if (state->refs[i].type != type)
> > +                       continue;
> > +               if (state->refs[i].id == id && state->refs[i].ptr == ptr) {
> > +                       if (last_idx && i != last_idx)
> > +                               memcpy(&state->refs[i], &state->refs[last_idx],
> > +                                      sizeof(*state->refs));
> > +                       memset(&state->refs[last_idx], 0, sizeof(*state->refs));
> > +                       state->acquired_refs--;
> > +                       return 0;
> > +               }
> > +       }
> > +       return -EINVAL;
> > +}
> > +
> > +static struct bpf_reference_state *find_lock_state(struct bpf_verifier_env *env, int id, void *ptr)
> > +{
> > +       struct bpf_func_state *state = cur_func(env);
> > +       int i;
> > +
> > +       for (i = 0; i < state->acquired_refs; i++) {
> > +               struct bpf_reference_state *s = &state->refs[i];
> > +
> > +               if (s->type == REF_TYPE_PTR)
> > +                       continue;
>
> Let's pass 'enum ref_state_type type' and compare here
> to make it similar to release_lock_state().
> I think it will clean up future patches too.
>

Ok.

> > +
> > +               if (s->id == id && s->ptr == ptr)
> > +                       return s;
> > +       }
> > +       return NULL;
> > +}
> > +
> >  static void free_func_state(struct bpf_func_state *state)
> >  {
> >         if (!state)
> > @@ -1430,12 +1487,11 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
> >                 dst_state->frame[i] = NULL;
> >         }
> >         dst_state->speculative = src->speculative;
> > +       dst_state->active_lock = src->active_lock;
> >         dst_state->active_rcu_lock = src->active_rcu_lock;
> >         dst_state->active_preempt_lock = src->active_preempt_lock;
> >         dst_state->in_sleepable = src->in_sleepable;
> >         dst_state->curframe = src->curframe;
> > -       dst_state->active_lock.ptr = src->active_lock.ptr;
> > -       dst_state->active_lock.id = src->active_lock.id;
> >         dst_state->branches = src->branches;
> >         dst_state->parent = src->parent;
> >         dst_state->first_insn_idx = src->first_insn_idx;
> > @@ -5423,7 +5479,7 @@ static bool in_sleepable(struct bpf_verifier_env *env)
> >  static bool in_rcu_cs(struct bpf_verifier_env *env)
> >  {
> >         return env->cur_state->active_rcu_lock ||
> > -              env->cur_state->active_lock.ptr ||
> > +              env->cur_state->active_lock ||
> >                !in_sleepable(env);
> >  }
> >
> > @@ -7698,6 +7754,7 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
> >         struct bpf_map *map = NULL;
> >         struct btf *btf = NULL;
> >         struct btf_record *rec;
> > +       int err;
> >
> >         if (!is_const) {
> >                 verbose(env,
> > @@ -7729,16 +7786,27 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
> >                 return -EINVAL;
> >         }
> >         if (is_lock) {
> > -               if (cur->active_lock.ptr) {
> > +               void *ptr;
> > +
> > +               if (map)
> > +                       ptr = map;
> > +               else
> > +                       ptr = btf;
> > +
> > +               if (cur->active_lock) {
> >                         verbose(env,
> >                                 "Locking two bpf_spin_locks are not allowed\n");
> >                         return -EINVAL;
> >                 }
> > -               if (map)
> > -                       cur->active_lock.ptr = map;
> > -               else
> > -                       cur->active_lock.ptr = btf;
> > -               cur->active_lock.id = reg->id;
> > +               err = acquire_lock_state(env, env->insn_idx, REF_TYPE_BPF_LOCK, reg->id, ptr);
> > +               if (err < 0) {
> > +                       verbose(env, "Failed to acquire lock state\n");
> > +                       return err;
> > +               }
> > +               /* It is not safe to allow multiple bpf_spin_lock calls, so
> > +                * disallow them until this lock has been unlocked.
> > +                */
> > +               cur->active_lock = true;
> >         } else {
> >                 void *ptr;
> >
> > @@ -7747,20 +7815,18 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
> >                 else
> >                         ptr = btf;
> >
> > -               if (!cur->active_lock.ptr) {
> > +               if (!cur->active_lock) {
> >                         verbose(env, "bpf_spin_unlock without taking a lock\n");
> >                         return -EINVAL;
> >                 }
> > -               if (cur->active_lock.ptr != ptr ||
> > -                   cur->active_lock.id != reg->id) {
> > +
> > +               if (release_lock_state(cur_func(env), REF_TYPE_BPF_LOCK, reg->id, ptr)) {
> >                         verbose(env, "bpf_spin_unlock of different lock\n");
> >                         return -EINVAL;
> >                 }
> >
> >                 invalidate_non_owning_refs(env);
> > -
> > -               cur->active_lock.ptr = NULL;
> > -               cur->active_lock.id = 0;
> > +               cur->active_lock = false;
> >         }
> >         return 0;
> >  }
> > @@ -9818,7 +9884,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> >                 const char *sub_name = subprog_name(env, subprog);
> >
> >                 /* Only global subprogs cannot be called with a lock held. */
> > -               if (env->cur_state->active_lock.ptr) {
> > +               if (env->cur_state->active_lock) {
> >                         verbose(env, "global function calls are not allowed while holding a lock,\n"
> >                                      "use static function instead\n");
> >                         return -EINVAL;
> > @@ -10343,6 +10409,8 @@ static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exi
> >                 return 0;
> >
> >         for (i = 0; i < state->acquired_refs; i++) {
> > +               if (state->refs[i].type != REF_TYPE_PTR)
> > +                       continue;
>
> why ?
> check_reference_leak() should complain about unreleased locks too.
>

I'm trying to change that here. Locks will be checked with the
active_locks counter everywhere, reference types are separate, and
only REF_TYPE_PTR.
Now that we've unified everything in check_resource_leak, it should
not be problematic.

> >                 if (!exception_exit && state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
> >                         continue;
> >                 verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
> > @@ -10356,7 +10424,7 @@ static int check_resource_leak(struct bpf_verifier_env *env, bool exception_exit
> >  {
> >         int err;
> >
> > -       if (check_lock && env->cur_state->active_lock.ptr) {
> > +       if (check_lock && env->cur_state->active_lock) {
> >                 verbose(env, "%s cannot be used inside bpf_spin_lock-ed region\n", prefix);
> >                 return -EINVAL;
> >         }
> > @@ -11580,7 +11648,7 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state
> >         struct bpf_verifier_state *state = env->cur_state;
> >         struct btf_record *rec = reg_btf_record(reg);
> >
> > -       if (!state->active_lock.ptr) {
> > +       if (!state->active_lock) {
> >                 verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n");
> >                 return -EFAULT;
> >         }
> > @@ -11677,6 +11745,7 @@ static int ref_convert_owning_non_owning(struct bpf_verifier_env *env, u32 ref_o
> >   */
> >  static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> > +       struct bpf_reference_state *s;
> >         void *ptr;
> >         u32 id;
> >
> > @@ -11693,10 +11762,10 @@ static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_
> >         }
> >         id = reg->id;
> >
> > -       if (!env->cur_state->active_lock.ptr)
> > +       if (!env->cur_state->active_lock)
> >                 return -EINVAL;
> > -       if (env->cur_state->active_lock.ptr != ptr ||
> > -           env->cur_state->active_lock.id != id) {
> > +       s = find_lock_state(env, id, ptr);
> > +       if (!s) {
> >                 verbose(env, "held lock and object are not in the same allocation\n");
> >                 return -EINVAL;
> >         }
> > @@ -17561,8 +17630,19 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
> >                 return false;
> >
> >         for (i = 0; i < old->acquired_refs; i++) {
> > -               if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap))
> > +               if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap) ||
> > +                   old->refs[i].type != cur->refs[i].type)
> >                         return false;
> > +               switch (old->refs[i].type) {
> > +               case REF_TYPE_PTR:
> > +                       if (old->refs[i].callback_ref != cur->refs[i].callback_ref)
> > +                               return false;
> > +                       break;
> > +               default:
> > +                       if (old->refs[i].ptr != cur->refs[i].ptr)
> > +                               return false;
> > +                       break;
>
> I'd future proof a bit by explicitly handling all enum values
> and WARN_ONCE in default.
>

Ok.

> pw-bot: cr

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 4513372c5bc8..1e7e1803d78b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -266,6 +266,10 @@  struct bpf_stack_state {
 };

 struct bpf_reference_state {
+	/* Each reference object has a type. Ensure REF_TYPE_PTR is zero to
+	 * default to pointer reference on zero initialization of a state.
+	 */
+	enum { REF_TYPE_PTR = 0, REF_TYPE_BPF_LOCK } type;
 	/* Track each reference created with a unique id, even if the same
 	 * instruction creates the reference multiple times (eg, via CALL).
 	 */
@@ -274,17 +278,23 @@  struct bpf_reference_state {
 	 * is used purely to inform the user of a reference leak.
 	 */
 	int insn_idx;
-	/* There can be a case like:
-	 * main (frame 0)
-	 *  cb (frame 1)
-	 *   func (frame 3)
-	 *    cb (frame 4)
-	 * Hence for frame 4, if callback_ref just stored boolean, it would be
-	 * impossible to distinguish nested callback refs. Hence store the
-	 * frameno and compare that to callback_ref in check_reference_leak when
-	 * exiting a callback function.
-	 */
-	int callback_ref;
+	union {
+		/* There can be a case like:
+		 * main (frame 0)
+		 *  cb (frame 1)
+		 *   func (frame 3)
+		 *    cb (frame 4)
+		 * Hence for frame 4, if callback_ref just stored boolean, it would be
+		 * impossible to distinguish nested callback refs. Hence store the
+		 * frameno and compare that to callback_ref in check_reference_leak when
+		 * exiting a callback function.
+		 */
+		int callback_ref;
+		/* Use to keep track of the source object of a lock, to ensure
+		 * it matches on unlock.
+		 */
+		void *ptr;
+	};
 };

 struct bpf_retval_range {
@@ -434,7 +444,7 @@  struct bpf_verifier_state {
 	u32 insn_idx;
 	u32 curframe;

-	struct bpf_active_lock active_lock;
+	bool active_lock;
 	bool speculative;
 	bool active_rcu_lock;
 	u32 active_preempt_lock;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ba800c7611e3..ea8ad320e6cc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1335,6 +1335,7 @@  static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
 	if (err)
 		return err;
 	id = ++env->id_gen;
+	state->refs[new_ofs].type = REF_TYPE_PTR;
 	state->refs[new_ofs].id = id;
 	state->refs[new_ofs].insn_idx = insn_idx;
 	state->refs[new_ofs].callback_ref = state->in_callback_fn ? state->frameno : 0;
@@ -1342,6 +1343,23 @@  static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
 	return id;
 }

+static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, int type, int id, void *ptr)
+{
+	struct bpf_func_state *state = cur_func(env);
+	int new_ofs = state->acquired_refs;
+	int err;
+
+	err = resize_reference_state(state, state->acquired_refs + 1);
+	if (err)
+		return err;
+	state->refs[new_ofs].type = type;
+	state->refs[new_ofs].id = id;
+	state->refs[new_ofs].insn_idx = insn_idx;
+	state->refs[new_ofs].ptr = ptr;
+
+	return 0;
+}
+
 /* release function corresponding to acquire_reference_state(). Idempotent. */
 static int release_reference_state(struct bpf_func_state *state, int ptr_id)
 {
@@ -1349,6 +1367,8 @@  static int release_reference_state(struct bpf_func_state *state, int ptr_id)

 	last_idx = state->acquired_refs - 1;
 	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->refs[i].type != REF_TYPE_PTR)
+			continue;
 		if (state->refs[i].id == ptr_id) {
 			/* Cannot release caller references in callbacks */
 			if (state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
@@ -1364,6 +1384,43 @@  static int release_reference_state(struct bpf_func_state *state, int ptr_id)
 	return -EINVAL;
 }

+static int release_lock_state(struct bpf_func_state *state, int type, int id, void *ptr)
+{
+	int i, last_idx;
+
+	last_idx = state->acquired_refs - 1;
+	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->refs[i].type != type)
+			continue;
+		if (state->refs[i].id == id && state->refs[i].ptr == ptr) {
+			if (last_idx && i != last_idx)
+				memcpy(&state->refs[i], &state->refs[last_idx],
+				       sizeof(*state->refs));
+			memset(&state->refs[last_idx], 0, sizeof(*state->refs));
+			state->acquired_refs--;
+			return 0;
+		}
+	}
+	return -EINVAL;
+}
+
+static struct bpf_reference_state *find_lock_state(struct bpf_verifier_env *env, int id, void *ptr)
+{
+	struct bpf_func_state *state = cur_func(env);
+	int i;
+
+	for (i = 0; i < state->acquired_refs; i++) {
+		struct bpf_reference_state *s = &state->refs[i];
+
+		if (s->type == REF_TYPE_PTR)
+			continue;
+
+		if (s->id == id && s->ptr == ptr)
+			return s;
+	}
+	return NULL;
+}
+
 static void free_func_state(struct bpf_func_state *state)
 {
 	if (!state)
@@ -1430,12 +1487,11 @@  static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 		dst_state->frame[i] = NULL;
 	}
 	dst_state->speculative = src->speculative;
+	dst_state->active_lock = src->active_lock;
 	dst_state->active_rcu_lock = src->active_rcu_lock;
 	dst_state->active_preempt_lock = src->active_preempt_lock;
 	dst_state->in_sleepable = src->in_sleepable;
 	dst_state->curframe = src->curframe;
-	dst_state->active_lock.ptr = src->active_lock.ptr;
-	dst_state->active_lock.id = src->active_lock.id;
 	dst_state->branches = src->branches;
 	dst_state->parent = src->parent;
 	dst_state->first_insn_idx = src->first_insn_idx;
@@ -5423,7 +5479,7 @@  static bool in_sleepable(struct bpf_verifier_env *env)
 static bool in_rcu_cs(struct bpf_verifier_env *env)
 {
 	return env->cur_state->active_rcu_lock ||
-	       env->cur_state->active_lock.ptr ||
+	       env->cur_state->active_lock ||
 	       !in_sleepable(env);
 }

@@ -7698,6 +7754,7 @@  static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 	struct bpf_map *map = NULL;
 	struct btf *btf = NULL;
 	struct btf_record *rec;
+	int err;

 	if (!is_const) {
 		verbose(env,
@@ -7729,16 +7786,27 @@  static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 		return -EINVAL;
 	}
 	if (is_lock) {
-		if (cur->active_lock.ptr) {
+		void *ptr;
+
+		if (map)
+			ptr = map;
+		else
+			ptr = btf;
+
+		if (cur->active_lock) {
 			verbose(env,
 				"Locking two bpf_spin_locks are not allowed\n");
 			return -EINVAL;
 		}
-		if (map)
-			cur->active_lock.ptr = map;
-		else
-			cur->active_lock.ptr = btf;
-		cur->active_lock.id = reg->id;
+		err = acquire_lock_state(env, env->insn_idx, REF_TYPE_BPF_LOCK, reg->id, ptr);
+		if (err < 0) {
+			verbose(env, "Failed to acquire lock state\n");
+			return err;
+		}
+		/* It is not safe to allow multiple bpf_spin_lock calls, so
+		 * disallow them until this lock has been unlocked.
+		 */
+		cur->active_lock = true;
 	} else {
 		void *ptr;

@@ -7747,20 +7815,18 @@  static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 		else
 			ptr = btf;

-		if (!cur->active_lock.ptr) {
+		if (!cur->active_lock) {
 			verbose(env, "bpf_spin_unlock without taking a lock\n");
 			return -EINVAL;
 		}
-		if (cur->active_lock.ptr != ptr ||
-		    cur->active_lock.id != reg->id) {
+
+		if (release_lock_state(cur_func(env), REF_TYPE_BPF_LOCK, reg->id, ptr)) {
 			verbose(env, "bpf_spin_unlock of different lock\n");
 			return -EINVAL;
 		}

 		invalidate_non_owning_refs(env);
-
-		cur->active_lock.ptr = NULL;
-		cur->active_lock.id = 0;
+		cur->active_lock = false;
 	}
 	return 0;
 }
@@ -9818,7 +9884,7 @@  static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		const char *sub_name = subprog_name(env, subprog);

 		/* Only global subprogs cannot be called with a lock held. */
-		if (env->cur_state->active_lock.ptr) {
+		if (env->cur_state->active_lock) {
 			verbose(env, "global function calls are not allowed while holding a lock,\n"
 				     "use static function instead\n");
 			return -EINVAL;
@@ -10343,6 +10409,8 @@  static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exi
 		return 0;

 	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->refs[i].type != REF_TYPE_PTR)
+			continue;
 		if (!exception_exit && state->in_callback_fn && state->refs[i].callback_ref != state->frameno)
 			continue;
 		verbose(env, "Unreleased reference id=%d alloc_insn=%d\n",
@@ -10356,7 +10424,7 @@  static int check_resource_leak(struct bpf_verifier_env *env, bool exception_exit
 {
 	int err;

-	if (check_lock && env->cur_state->active_lock.ptr) {
+	if (check_lock && env->cur_state->active_lock) {
 		verbose(env, "%s cannot be used inside bpf_spin_lock-ed region\n", prefix);
 		return -EINVAL;
 	}
@@ -11580,7 +11648,7 @@  static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state
 	struct bpf_verifier_state *state = env->cur_state;
 	struct btf_record *rec = reg_btf_record(reg);

-	if (!state->active_lock.ptr) {
+	if (!state->active_lock) {
 		verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n");
 		return -EFAULT;
 	}
@@ -11677,6 +11745,7 @@  static int ref_convert_owning_non_owning(struct bpf_verifier_env *env, u32 ref_o
  */
 static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
+	struct bpf_reference_state *s;
 	void *ptr;
 	u32 id;

@@ -11693,10 +11762,10 @@  static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_
 	}
 	id = reg->id;

-	if (!env->cur_state->active_lock.ptr)
+	if (!env->cur_state->active_lock)
 		return -EINVAL;
-	if (env->cur_state->active_lock.ptr != ptr ||
-	    env->cur_state->active_lock.id != id) {
+	s = find_lock_state(env, id, ptr);
+	if (!s) {
 		verbose(env, "held lock and object are not in the same allocation\n");
 		return -EINVAL;
 	}
@@ -17561,8 +17630,19 @@  static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
 		return false;

 	for (i = 0; i < old->acquired_refs; i++) {
-		if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap))
+		if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap) ||
+		    old->refs[i].type != cur->refs[i].type)
 			return false;
+		switch (old->refs[i].type) {
+		case REF_TYPE_PTR:
+			if (old->refs[i].callback_ref != cur->refs[i].callback_ref)
+				return false;
+			break;
+		default:
+			if (old->refs[i].ptr != cur->refs[i].ptr)
+				return false;
+			break;
+		}
 	}

 	return true;
@@ -17640,17 +17720,7 @@  static bool states_equal(struct bpf_verifier_env *env,
 	if (old->speculative && !cur->speculative)
 		return false;

-	if (old->active_lock.ptr != cur->active_lock.ptr)
-		return false;
-
-	/* Old and cur active_lock's have to be either both present
-	 * or both absent.
-	 */
-	if (!!old->active_lock.id != !!cur->active_lock.id)
-		return false;
-
-	if (old->active_lock.id &&
-	    !check_ids(old->active_lock.id, cur->active_lock.id, &env->idmap_scratch))
+	if (old->active_lock != cur->active_lock)
 		return false;

 	if (old->active_rcu_lock != cur->active_rcu_lock)
@@ -18551,7 +18621,7 @@  static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}

-				if (env->cur_state->active_lock.ptr) {
+				if (env->cur_state->active_lock) {
 					if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) ||
 					    (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
 					     (insn->off != 0 || !is_bpf_graph_api_kfunc(insn->imm)))) {

[bpf-next,v3] bpf: Refactor active lock management

Checks

Commit Message

Comments

Patch