[v2,bpf-next,0/8] Allocate bpf trampoline on bpf_prog_pack

Message ID 20230925215324.2962716-1-song@kernel.org (mailing list archive)

Message

Song Liu Sept. 25, 2023, 9:53 p.m. UTC
This set enables allocating the bpf trampoline from bpf_prog_pack on x86.
Most of the work, however, is refactoring the trampoline code. The
refactoring is needed because the code has to handle 4 archs and 2 users
(trampoline and struct_ops).

1/8 is a dependency that is already applied to the bpf tree.
2/8 through 7/8 refactor the trampoline code and add a few helpers.
8/8 finally lets the bpf trampoline on x86 use bpf_prog_pack.

Changes in v2:
1. Add missing changes in net/bpf/bpf_dummy_struct_ops.c.
2. Reduce one dry run in arch_prepare_bpf_trampoline. (Xu Kuohai)
3. Other small fixes.

Song Liu (8):
  s390/bpf: Let arch_prepare_bpf_trampoline return program size
  bpf: Let bpf_prog_pack_free handle any pointer
  bpf: Adjust argument names of arch_prepare_bpf_trampoline()
  bpf: Add helpers for trampoline image management
  bpf, x86: Adjust arch_prepare_bpf_trampoline return value
  bpf: Add arch_bpf_trampoline_size()
  bpf: Use arch_bpf_trampoline_size
  x86, bpf: Use bpf_prog_pack for bpf trampoline

 arch/arm64/net/bpf_jit_comp.c   |  55 +++++++++-----
 arch/riscv/net/bpf_jit_comp64.c |  24 ++++---
 arch/s390/net/bpf_jit_comp.c    |  43 ++++++-----
 arch/x86/net/bpf_jit_comp.c     | 124 +++++++++++++++++++++++++-------
 include/linux/bpf.h             |  12 +++-
 include/linux/filter.h          |   2 +-
 kernel/bpf/bpf_struct_ops.c     |  19 +++--
 kernel/bpf/core.c               |  21 +++---
 kernel/bpf/dispatcher.c         |   5 +-
 kernel/bpf/trampoline.c         |  93 ++++++++++++++++++------
 net/bpf/bpf_dummy_struct_ops.c  |   7 +-
 11 files changed, 277 insertions(+), 128 deletions(-)

--
2.34.1

Comments

Ilya Leoshkevich Sept. 26, 2023, 12:26 p.m. UTC | #1
Regarding the s390x part, arch_prepare_bpf_trampoline() needs to call
__arch_prepare_bpf_trampoline() twice: the first time in order to
compute various offsets, the second time to actually emit the code. So
I would suggest to either keep the loop or use the following fixup:

--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2645,7 +2645,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
        struct bpf_tramp_jit tjit;
        int ret;
 
+       /* Compute offsets. */
        memset(&tjit, 0, sizeof(tjit));
+       ret = __arch_prepare_bpf_trampoline(im, &tjit, m, flags,
+                                           tlinks, func_addr);
+       if (ret < 0)
+               return ret;
+
+       /* Generate the code. */
+       tjit.common.prg = 0;
        tjit.common.prg_buf = image;
        ret = __arch_prepare_bpf_trampoline(im, &tjit, m, flags,
                                            tlinks, func_addr);

With that:

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>  # on s390x

for the series.
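
The two-call pattern described in this comment (a first call with no output buffer to compute offsets, a second call to emit the code) can be sketched in standalone C. This is a minimal illustration under stated assumptions, not the s390 JIT: struct jit_state, emit(), build(), and prepare() are hypothetical names invented for the sketch, and the one-byte "forward reference" stands in for whatever offsets the real generator depends on.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical JIT state (not the kernel's struct bpf_tramp_jit):
 * prg is the current output offset, prg_buf is the output buffer
 * (NULL during the dry run), done_off is an offset learned in pass 1.
 */
struct jit_state {
	uint8_t *prg_buf;
	size_t prg;
	size_t done_off;
};

/* Append one byte; during the dry run only the offset advances. */
static void emit(struct jit_state *jit, uint8_t byte)
{
	if (jit->prg_buf)
		jit->prg_buf[jit->prg] = byte;
	jit->prg++;
}

/*
 * Emit a toy program: a forward reference to done_off, nbody filler
 * bytes, then the target itself. Returns the program size.
 */
static int build(struct jit_state *jit, int nbody)
{
	int i;

	/* Forward reference: only correct once done_off is known. */
	emit(jit, (uint8_t)jit->done_off);
	for (i = 0; i < nbody; i++)
		emit(jit, 0x90);
	jit->done_off = jit->prg;
	emit(jit, 0xc3);
	return (int)jit->prg;
}

/* Returns the emitted size, or -1 if the image is too small. */
static int prepare(uint8_t *image, size_t image_size, int nbody)
{
	struct jit_state jit = {0};
	int size;

	/* Pass 1: compute offsets; nothing is written. */
	size = build(&jit, nbody);
	if (size < 0 || (size_t)size > image_size)
		return -1;

	/* Pass 2: rewind the offset and generate the code. */
	jit.prg = 0;
	jit.prg_buf = image;
	return build(&jit, nbody);
}
```

With a 16-byte image and nbody = 3, prepare() returns 5 and image[0] holds 4, the offset of the final byte. A single pass that wrote straight into the buffer would store 0 there, because done_off is not yet known when the forward reference is emitted.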

Song Liu Sept. 26, 2023, 3:56 p.m. UTC | #2
> On Sep 26, 2023, at 5:26 AM, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> 
> Regarding the s390x part, arch_prepare_bpf_trampoline() needs to call
> __arch_prepare_bpf_trampoline() twice: the first time in order to
> compute various offsets, the second time to actually emit the code. So
> I would suggest to either keep the loop or use the following fixup:

Thanks for the test and the fix!

I will fold the fix in and send v3.

Song
