diff mbox series

[bpf-next,v3] docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE

Message ID 20221207102721.33378-1-donald.hunter@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next,v3] docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 8 maintainers not CCed: kpsingh@kernel.org haoluo@google.com song@kernel.org yhs@fb.com martin.lau@linux.dev sdf@google.com john.fastabend@gmail.com jolsa@kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc

Commit Message

Donald Hunter Dec. 7, 2022, 10:27 a.m. UTC
Add documentation for the BPF_MAP_TYPE_SK_STORAGE including
kernel version introduced, usage and examples.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
---
v2 -> v3:
- Fix void * return, reported by Yonghong Song
- Add tracing programs to API note, reported by Yonghong Song
v1 -> v2:
- Fix bpf_sk_storage_* function signatures, reported by Yonghong Song
- Fix NULL return on failure, reported by Yonghong Song

 Documentation/bpf/map_sk_storage.rst | 142 +++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 Documentation/bpf/map_sk_storage.rst

Comments

Yonghong Song Dec. 7, 2022, 4:41 p.m. UTC | #1
On 12/7/22 2:27 AM, Donald Hunter wrote:
> Add documentation for the BPF_MAP_TYPE_SK_STORAGE including
> kernel version introduced, usage and examples.
> 
> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>

Acked-by: Yonghong Song <yhs@fb.com>
David Vernet Dec. 7, 2022, 9:13 p.m. UTC | #2
On Wed, Dec 07, 2022 at 10:27:21AM +0000, Donald Hunter wrote:
> Add documentation for the BPF_MAP_TYPE_SK_STORAGE including
> kernel version introduced, usage and examples.
> 
> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
> ---
> v2 -> v3:
> - Fix void * return, reported by Yonghong Song
> - Add tracing programs to API note, reported by Yonghong Song
> v1 -> v2:
> - Fix bpf_sk_storage_* function signatures, reported by Yonghong Song
> - Fix NULL return on failure, reported by Yonghong Song
> 
>  Documentation/bpf/map_sk_storage.rst | 142 +++++++++++++++++++++++++++
>  1 file changed, 142 insertions(+)
>  create mode 100644 Documentation/bpf/map_sk_storage.rst
> 
> diff --git a/Documentation/bpf/map_sk_storage.rst b/Documentation/bpf/map_sk_storage.rst
> new file mode 100644
> index 000000000000..955b287bb7de
> --- /dev/null
> +++ b/Documentation/bpf/map_sk_storage.rst
> @@ -0,0 +1,142 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +.. Copyright (C) 2022 Red Hat, Inc.
> +
> +=======================
> +BPF_MAP_TYPE_SK_STORAGE
> +=======================
> +
> +.. note::
> +   - ``BPF_MAP_TYPE_SK_STORAGE`` was introduced in kernel version 5.2
> +
> +``BPF_MAP_TYPE_SK_STORAGE`` is used to provide socket-local storage for BPF programs. A map of
> +type ``BPF_MAP_TYPE_SK_STORAGE`` declares the type of storage to be provided and acts as the
> +handle for accessing the socket-local storage from a BPF program. The key type must be ``int``
> +and ``max_entries`` must be set to ``0``.
> +
> +The ``BPF_F_NO_PREALLOC`` must be used when creating a map for socket-local storage. The kernel
> +is responsible for allocating storage for a socket when requested and for freeing the storage
> +when either the map or the socket is deleted.

Hi Donald,

Thanks for writing these excellent docs.

Would you mind please wrapping your text to 80 columns (throughout the
document)? It's not a hard and fast rule, but it would be ideal to keep
text that can wrap without formatting issues to 80 columns. For
documentation where you're e.g. showing a function signature (such as
bpf_sk_storage_get() below), it's fine to exceed it.

> +
> +Usage
> +=====
> +
> +Kernel BPF
> +----------
> +
> +bpf_sk_storage_get()
> +~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: c
> +
> +   void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags)
> +
> +Socket-local storage can be retrieved using the ``bpf_sk_storage_get()`` helper. The helper gets
> +the storage from ``sk`` that is identified by ``map``.  If the
> +``BPF_LOCAL_STORAGE_GET_F_CREATE`` flag is used then ``bpf_sk_storage_get()`` will create the
> +storage for ``sk`` if it does not already exist. ``value`` can be used together with
> +``BPF_LOCAL_STORAGE_GET_F_CREATE`` to initialize the storage value, otherwise it will be zero
> +initialized. Returns a pointer to the storage on success, or ``NULL`` in case of failure.
> +
> +.. note::
> +   - ``sk`` is a kernel ``struct sock`` pointer for LSM or tracing programs.
> +   - ``sk`` is a ``struct bpf_sock`` pointer for other program types.
> +
> +bpf_sk_storage_delete()
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: c
> +
> +   long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
> +
> +Socket-local storage can be deleted using the ``bpf_sk_storage_delete()`` helper. The helper
> +deletes the storage from ``sk`` that is identified by ``map``. Returns ``0`` on success, or negative
> +error in case of failure.
> +
> +User space
> +----------
> +
> +bpf_map_update_elem()
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: c
> +
> +   int bpf_map_update_elem(int map_fd, const void *key, const void *value, __u64 flags)
> +
> +Socket-local storage with type identified by ``map_fd`` for the socket identified by ``key`` can

Could you please clarify what you mean by "with type identified by"?
``map_fd`` just corresponds to the fd of the map, correct? Not following
what's meant by "type".

> +be added or updated using the ``bpf_map_update_elem()`` libbpf function. ``key`` must be a
> +pointer to a valid ``fd`` in the user space program. The ``flags`` parameter can be used to
> +control the update behaviour:
> +
> +- ``BPF_ANY`` will create storage for ``fd`` or update existing storage.
> +- ``BPF_NOEXIST`` will create storage for ``fd`` only if it did not already
> +  exist

I believe that if BPF_NOEXIST is specified, if storage already exists
then in addition to storage not being created, the call will fail with
-EEXIST.

> +- ``BPF_EXIST`` will update existing storage for ``fd``

Can we also mention that if BPF_EXIST is specified, and no element is
present with the specified ``key``, that the call will fail with
-ENOENT?

> +
> +Returns ``0`` on success, or negative error in case of failure.
> +
> +bpf_map_lookup_elem()
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: c
> +
> +   int bpf_map_lookup_elem(int map_fd, const void *key, void *value)
> +
> +Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be
> +retrieved using the ``bpf_map_lookup_elem()`` libbpf function. ``key`` must be a pointer to a
> +valid ``fd`` in the user space program. Returns ``0`` on success, or negative error in case of
> +failure.
> +
> +bpf_map_delete_elem()
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +.. code-block:: c
> +
> +   int bpf_map_delete_elem (int map_fd, const void *key)

Extra space between _elem and (.

> +
> +Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be deleted
> +using the ``bpf_map_delete_elem()`` libbpf function. Returns ``0`` on success, or negative error
> +in case of failure.
> +
> +Examples
> +========
> +
> +Kernel BPF
> +----------
> +
> +This snippet shows how to declare socket-local storage in a BPF program:
> +
> +.. code-block:: c
> +
> +    struct {
> +            __uint(type, BPF_MAP_TYPE_SK_STORAGE);
> +            __uint(map_flags, BPF_F_NO_PREALLOC);
> +            __type(key, int);
> +            __type(value, struct my_storage);
> +    } socket_storage SEC(".maps");
> +
> +This snippet shows how to retrieve socket-local storage in a BPF program:
> +
> +.. code-block:: c
> +
> +    SEC("sockops")
> +    int _sockops(struct bpf_sock_ops *ctx)
> +    {
> +            struct my_storage *storage;
> +            struct bpf_sock *sk;
> +
> +            sk = ctx->sk;
> +            if (!sk)
> +                    return 1;

Don't feel strongly about this one, but IMO it's nice for examples to
illustrate code that's as close to real and pristine as possible. To
that point, should this example perhaps be updated to return -ENOENT
here, and -ENOMEM below?

> +
> +            storage = bpf_sk_storage_get(&socket_storage, sk, 0,
> +                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
> +            if (!storage)
> +                    return 1;
> +
> +            /* Use 'storage' here */

Let's return 0 at the end to make the example program technically
correct.

> +    }

Thanks,
David
Donald Hunter Dec. 8, 2022, 12:05 p.m. UTC | #3
David Vernet <void@manifault.com> writes:

> On Wed, Dec 07, 2022 at 10:27:21AM +0000, Donald Hunter wrote:
>> Add documentation for the BPF_MAP_TYPE_SK_STORAGE including
>> kernel version introduced, usage and examples.
>> 
>> Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
>> ---
>> v2 -> v3:
>> - Fix void * return, reported by Yonghong Song
>> - Add tracing programs to API note, reported by Yonghong Song
>> v1 -> v2:
>> - Fix bpf_sk_storage_* function signatures, reported by Yonghong Song
>> - Fix NULL return on failure, reported by Yonghong Song
>> 
>>  Documentation/bpf/map_sk_storage.rst | 142 +++++++++++++++++++++++++++
>>  1 file changed, 142 insertions(+)
>>  create mode 100644 Documentation/bpf/map_sk_storage.rst
>> 
>> diff --git a/Documentation/bpf/map_sk_storage.rst b/Documentation/bpf/map_sk_storage.rst
>> new file mode 100644
>> index 000000000000..955b287bb7de
>> --- /dev/null
>> +++ b/Documentation/bpf/map_sk_storage.rst
>> @@ -0,0 +1,142 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +.. Copyright (C) 2022 Red Hat, Inc.
>> +
>> +=======================
>> +BPF_MAP_TYPE_SK_STORAGE
>> +=======================
>> +
>> +.. note::
>> +   - ``BPF_MAP_TYPE_SK_STORAGE`` was introduced in kernel version 5.2
>> +
>> +``BPF_MAP_TYPE_SK_STORAGE`` is used to provide socket-local storage for BPF programs. A map of
>> +type ``BPF_MAP_TYPE_SK_STORAGE`` declares the type of storage to be provided and acts as the
>> +handle for accessing the socket-local storage from a BPF program. The key type must be ``int``
>> +and ``max_entries`` must be set to ``0``.
>> +
>> +The ``BPF_F_NO_PREALLOC`` must be used when creating a map for socket-local storage. The kernel
>> +is responsible for allocating storage for a socket when requested and for freeing the storage
>> +when either the map or the socket is deleted.
>
> Hi Donald,
>
> Thanks for writing these excellent docs.
>
> Would you mind please wrapping your text to 80 columns (throughout the
> document)? It's not a hard and fast rule, but it would be ideal to keep
> text that can wrap without formatting issues to 80 columns. For
> documentation where you're e.g. showing a function signature (such as
> bpf_sk_storage_get() below), it's fine to exceed it.

Yes, of course. That was an oversight on my part.

>> +
>> +Usage
>> +=====
>> +
>> +Kernel BPF
>> +----------
>> +
>> +bpf_sk_storage_get()
>> +~~~~~~~~~~~~~~~~~~~~
>> +
>> +.. code-block:: c
>> +
>> +   void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags)
>> +
>> +Socket-local storage can be retrieved using the ``bpf_sk_storage_get()`` helper. The helper gets
>> +the storage from ``sk`` that is identified by ``map``.  If the
>> +``BPF_LOCAL_STORAGE_GET_F_CREATE`` flag is used then ``bpf_sk_storage_get()`` will create the
>> +storage for ``sk`` if it does not already exist. ``value`` can be used together with
>> +``BPF_LOCAL_STORAGE_GET_F_CREATE`` to initialize the storage value, otherwise it will be zero
>> +initialized. Returns a pointer to the storage on success, or ``NULL`` in case of failure.
>> +
>> +.. note::
>> +   - ``sk`` is a kernel ``struct sock`` pointer for LSM or tracing programs.
>> +   - ``sk`` is a ``struct bpf_sock`` pointer for other program types.
>> +
>> +bpf_sk_storage_delete()
>> +~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +.. code-block:: c
>> +
>> +   long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
>> +
>> +Socket-local storage can be deleted using the ``bpf_sk_storage_delete()`` helper. The helper
>> +deletes the storage from ``sk`` that is identified by ``map``. Returns ``0`` on success, or negative
>> +error in case of failure.
>> +
>> +User space
>> +----------
>> +
>> +bpf_map_update_elem()
>> +~~~~~~~~~~~~~~~~~~~~~
>> +
>> +.. code-block:: c
>> +
>> +   int bpf_map_update_elem(int map_fd, const void *key, const void *value, __u64 flags)
>> +
>> +Socket-local storage with type identified by ``map_fd`` for the socket identified by ``key`` can
>
> Could you please clarify what you mean by "with type identified by"?
> ``map_fd`` just corresponds to the fd of the map, correct? Not following
> what's meant by "type".

A clumsy attempt to say that the map definition declares the type of the
storage that gets added to the socket, as opposed to behaving like a
traditional map. I'll rework this to be more clear.

>> +be added or updated using the ``bpf_map_update_elem()`` libbpf function. ``key`` must be a
>> +pointer to a valid ``fd`` in the user space program. The ``flags`` parameter can be used to
>> +control the update behaviour:
>> +
>> +- ``BPF_ANY`` will create storage for ``fd`` or update existing storage.
>> +- ``BPF_NOEXIST`` will create storage for ``fd`` only if it did not already
>> +  exist
>
> I believe that if BPF_NOEXIST is specified, if storage already exists
> then in addition to storage not being created, the call will fail with
> -EEXIST.

Good point, I'll add this.

>> +- ``BPF_EXIST`` will update existing storage for ``fd``
>
> Can we also mention that if BPF_EXIST is specified, and no element is
> present with the specified ``key``, that the call will fail with
> -ENOENT?

I'll add this too.

>> +
>> +Returns ``0`` on success, or negative error in case of failure.
>> +
>> +bpf_map_lookup_elem()
>> +~~~~~~~~~~~~~~~~~~~~~
>> +
>> +.. code-block:: c
>> +
>> +   int bpf_map_lookup_elem(int map_fd, const void *key, void *value)
>> +
>> +Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be
>> +retrieved using the ``bpf_map_lookup_elem()`` libbpf function. ``key`` must be a pointer to a
>> +valid ``fd`` in the user space program. Returns ``0`` on success, or negative error in case of
>> +failure.
>> +
>> +bpf_map_delete_elem()
>> +~~~~~~~~~~~~~~~~~~~~~
>> +
>> +.. code-block:: c
>> +
>> +   int bpf_map_delete_elem (int map_fd, const void *key)
>
> Extra space between _elem and (.

Good catch, thx.

>> +
>> +Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be deleted
>> +using the ``bpf_map_delete_elem()`` libbpf function. Returns ``0`` on success, or negative error
>> +in case of failure.
>> +
>> +Examples
>> +========
>> +
>> +Kernel BPF
>> +----------
>> +
>> +This snippet shows how to declare socket-local storage in a BPF program:
>> +
>> +.. code-block:: c
>> +
>> +    struct {
>> +            __uint(type, BPF_MAP_TYPE_SK_STORAGE);
>> +            __uint(map_flags, BPF_F_NO_PREALLOC);
>> +            __type(key, int);
>> +            __type(value, struct my_storage);
>> +    } socket_storage SEC(".maps");
>> +
>> +This snippet shows how to retrieve socket-local storage in a BPF program:
>> +
>> +.. code-block:: c
>> +
>> +    SEC("sockops")
>> +    int _sockops(struct bpf_sock_ops *ctx)
>> +    {
>> +            struct my_storage *storage;
>> +            struct bpf_sock *sk;
>> +
>> +            sk = ctx->sk;
>> +            if (!sk)
>> +                    return 1;
>
> Don't feel strongly about this one, but IMO it's nice for examples to
> illustrate code that's as close to real and pristine as possible. To
> that point, should this example perhaps be updated to return -ENOENT
> here, and -ENOMEM below?

Will do.

>> +
>> +            storage = bpf_sk_storage_get(&socket_storage, sk, 0,
>> +                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
>> +            if (!storage)
>> +                    return 1;
>> +
>> +            /* Use 'storage' here */
>
> Let's return 0 at the end to make the example program technically
> correct.

Will do.

>> +    }
>
> Thanks,
> David
Donald Hunter Dec. 8, 2022, 4:35 p.m. UTC | #4
Donald Hunter <donald.hunter@gmail.com> writes:

> David Vernet <void@manifault.com> writes:
>
>> On Wed, Dec 07, 2022 at 10:27:21AM +0000, Donald Hunter wrote:
>>> +
>>> +This snippet shows how to retrieve socket-local storage in a BPF program:
>>> +
>>> +.. code-block:: c
>>> +
>>> +    SEC("sockops")
>>> +    int _sockops(struct bpf_sock_ops *ctx)
>>> +    {
>>> +            struct my_storage *storage;
>>> +            struct bpf_sock *sk;
>>> +
>>> +            sk = ctx->sk;
>>> +            if (!sk)
>>> +                    return 1;
>>
>> Don't feel strongly about this one, but IMO it's nice for examples to
>> illustrate code that's as close to real and pristine as possible. To
>> that point, should this example perhaps be updated to return -ENOENT
>> here, and -ENOMEM below?
>
> Will do.
>

After digging into this a bit more I notice that the sockops programs in
tools/testing/selftests/bpf/progs mostly return 1 in all cases.

I'm assuming that sockops programs should return valid values for
some op types such as BPF_SOCK_OPS_TIMEOUT_INIT. Other than that I can't
find a definitive list. Do you know if valid return values are
enumerated anywhere, or do I need to dig some more?

>>> +
>>> +            storage = bpf_sk_storage_get(&socket_storage, sk, 0,
>>> +                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
>>> +            if (!storage)
>>> +                    return 1;
>>> +
>>> +            /* Use 'storage' here */
>>
>> Let's return 0 at the end to make the example program technically
>> correct.
>
> Will do.
>
>>> +    }
>>
>> Thanks,
>> David
Yonghong Song Dec. 8, 2022, 5:19 p.m. UTC | #5
On 12/8/22 8:35 AM, Donald Hunter wrote:
> Donald Hunter <donald.hunter@gmail.com> writes:
> 
>> David Vernet <void@manifault.com> writes:
>>
>>> On Wed, Dec 07, 2022 at 10:27:21AM +0000, Donald Hunter wrote:
>>>> +
>>>> +This snippet shows how to retrieve socket-local storage in a BPF program:
>>>> +
>>>> +.. code-block:: c
>>>> +
>>>> +    SEC("sockops")
>>>> +    int _sockops(struct bpf_sock_ops *ctx)
>>>> +    {
>>>> +            struct my_storage *storage;
>>>> +            struct bpf_sock *sk;
>>>> +
>>>> +            sk = ctx->sk;
>>>> +            if (!sk)
>>>> +                    return 1;
>>>
>>> Don't feel strongly about this one, but IMO it's nice for examples to
>>> illustrate code that's as close to real and pristine as possible. To
>>> that point, should this example perhaps be updated to return -ENOENT
>>> here, and -ENOMEM below?
>>
>> Will do.
>>
> 
> After digging into this a bit more I notice that the sockops programs in
> tools/testing/selftests/bpf/progs mostly return 1 in all cases.
> 
> I'm assuming that sockops programs should return valid values for
> some op types such as BPF_SOCK_OPS_TIMEOUT_INIT. Other than that I can't
> find a definitive list. Do you know if valid return values are
> enumerated anywhere, or do I need to dig some more?

It can return any integer.

static inline u32 tcp_timeout_init(struct sock *sk)
{
         int timeout;

         timeout = tcp_call_bpf(sk, BPF_SOCK_OPS_TIMEOUT_INIT, 0, NULL);

         if (timeout <= 0)
                 timeout = TCP_TIMEOUT_INIT;
         return min_t(int, timeout, TCP_RTO_MAX);
}

In uapi bpf.h,

         BPF_SOCK_OPS_TIMEOUT_INIT,      /* Should return SYN-RTO value 
to use or
                                          * -1 if default value should 
be used
                                          */

I think the above code is from selftests tcp_rtt.c. You can add a
reference to provide more context. If people are really interested,
they can go to selftest to find more.

> 
>>>> +
>>>> +            storage = bpf_sk_storage_get(&socket_storage, sk, 0,
>>>> +                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
>>>> +            if (!storage)
>>>> +                    return 1;
>>>> +
>>>> +            /* Use 'storage' here */
>>>
>>> Let's return 0 at the end to make the example program technically
>>> correct.
>>
>> Will do.

In this case, 'return 0' probably not correct. I suggest keep the
code as is to sync with selftests. Add a reference to selftest
for more reference.

>>
>>>> +    }
>>>
>>> Thanks,
>>> David
diff mbox series

Patch

diff --git a/Documentation/bpf/map_sk_storage.rst b/Documentation/bpf/map_sk_storage.rst
new file mode 100644
index 000000000000..955b287bb7de
--- /dev/null
+++ b/Documentation/bpf/map_sk_storage.rst
@@ -0,0 +1,142 @@ 
+.. SPDX-License-Identifier: GPL-2.0-only
+.. Copyright (C) 2022 Red Hat, Inc.
+
+=======================
+BPF_MAP_TYPE_SK_STORAGE
+=======================
+
+.. note::
+   - ``BPF_MAP_TYPE_SK_STORAGE`` was introduced in kernel version 5.2
+
+``BPF_MAP_TYPE_SK_STORAGE`` is used to provide socket-local storage for BPF programs. A map of
+type ``BPF_MAP_TYPE_SK_STORAGE`` declares the type of storage to be provided and acts as the
+handle for accessing the socket-local storage from a BPF program. The key type must be ``int``
+and ``max_entries`` must be set to ``0``.
+
+The ``BPF_F_NO_PREALLOC`` must be used when creating a map for socket-local storage. The kernel
+is responsible for allocating storage for a socket when requested and for freeing the storage
+when either the map or the socket is deleted.
+
+Usage
+=====
+
+Kernel BPF
+----------
+
+bpf_sk_storage_get()
+~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+   void *bpf_sk_storage_get(struct bpf_map *map, void *sk, void *value, u64 flags)
+
+Socket-local storage can be retrieved using the ``bpf_sk_storage_get()`` helper. The helper gets
+the storage from ``sk`` that is identified by ``map``.  If the
+``BPF_LOCAL_STORAGE_GET_F_CREATE`` flag is used then ``bpf_sk_storage_get()`` will create the
+storage for ``sk`` if it does not already exist. ``value`` can be used together with
+``BPF_LOCAL_STORAGE_GET_F_CREATE`` to initialize the storage value, otherwise it will be zero
+initialized. Returns a pointer to the storage on success, or ``NULL`` in case of failure.
+
+.. note::
+   - ``sk`` is a kernel ``struct sock`` pointer for LSM or tracing programs.
+   - ``sk`` is a ``struct bpf_sock`` pointer for other program types.
+
+bpf_sk_storage_delete()
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+   long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
+
+Socket-local storage can be deleted using the ``bpf_sk_storage_delete()`` helper. The helper
+deletes the storage from ``sk`` that is identified by ``map``. Returns ``0`` on success, or negative
+error in case of failure.
+
+User space
+----------
+
+bpf_map_update_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+   int bpf_map_update_elem(int map_fd, const void *key, const void *value, __u64 flags)
+
+Socket-local storage with type identified by ``map_fd`` for the socket identified by ``key`` can
+be added or updated using the ``bpf_map_update_elem()`` libbpf function. ``key`` must be a
+pointer to a valid ``fd`` in the user space program. The ``flags`` parameter can be used to
+control the update behaviour:
+
+- ``BPF_ANY`` will create storage for ``fd`` or update existing storage.
+- ``BPF_NOEXIST`` will create storage for ``fd`` only if it did not already
+  exist
+- ``BPF_EXIST`` will update existing storage for ``fd``
+
+Returns ``0`` on success, or negative error in case of failure.
+
+bpf_map_lookup_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+   int bpf_map_lookup_elem(int map_fd, const void *key, void *value)
+
+Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be
+retrieved using the ``bpf_map_lookup_elem()`` libbpf function. ``key`` must be a pointer to a
+valid ``fd`` in the user space program. Returns ``0`` on success, or negative error in case of
+failure.
+
+bpf_map_delete_elem()
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+   int bpf_map_delete_elem (int map_fd, const void *key)
+
+Socket-local storage for the socket identified by ``key`` belonging to ``map_fd`` can be deleted
+using the ``bpf_map_delete_elem()`` libbpf function. Returns ``0`` on success, or negative error
+in case of failure.
+
+Examples
+========
+
+Kernel BPF
+----------
+
+This snippet shows how to declare socket-local storage in a BPF program:
+
+.. code-block:: c
+
+    struct {
+            __uint(type, BPF_MAP_TYPE_SK_STORAGE);
+            __uint(map_flags, BPF_F_NO_PREALLOC);
+            __type(key, int);
+            __type(value, struct my_storage);
+    } socket_storage SEC(".maps");
+
+This snippet shows how to retrieve socket-local storage in a BPF program:
+
+.. code-block:: c
+
+    SEC("sockops")
+    int _sockops(struct bpf_sock_ops *ctx)
+    {
+            struct my_storage *storage;
+            struct bpf_sock *sk;
+
+            sk = ctx->sk;
+            if (!sk)
+                    return 1;
+
+            storage = bpf_sk_storage_get(&socket_storage, sk, 0,
+                                         BPF_LOCAL_STORAGE_GET_F_CREATE);
+            if (!storage)
+                    return 1;
+
+            /* Use 'storage' here */
+    }
+
+References
+==========
+
+https://lwn.net/ml/netdev/20190426171103.61892-1-kafai@fb.com/