[bpf-next,0/2] Fix bpf_perf_event_data ABI breakage

Message ID	20220206145350.2069779-1-iii@linux.ibm.com (mailing list archive)
Headers	show Return-Path: <bpf-owner@kernel.org> From: Ilya Leoshkevich <iii@linux.ibm.com> To: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii.nakryiko@gmail.com>, Heiko Carstens <hca@linux.ibm.com>, Vasily Gorbik <gor@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Alexander Gordeev <agordeev@linux.ibm.com>, Catalin Marinas <catalin.marinas@arm.com> Cc: bpf@vger.kernel.org, Ilya Leoshkevich <iii@linux.ibm.com> Subject: [PATCH bpf-next 0/2] Fix bpf_perf_event_data ABI breakage Date: Sun, 6 Feb 2022 15:53:48 +0100 Message-Id: <20220206145350.2069779-1-iii@linux.ibm.com> Content-Transfer-Encoding: 8bit MIME-Version: 1.0 Precedence: bulk
Series	Fix bpf_perf_event_data ABI breakage \| expand [bpf-next,0/2] Fix bpf_perf_event_data ABI breakage [bpf-next,1/2] s390/bpf: Introduce user_pt_regs_v2 [bpf-next,2/2] arm64/bpf: Introduce struct user_pt_regs_v2

Ilya Leoshkevich Feb. 6, 2022, 2:53 p.m. UTC

libbpf CI noticed that my recent changes broke bpf_perf_event_data ABI
on s390 [1]. Testing shows that they introduced a similar breakage on
arm64. The problem is that we are not allowed to extend user_pt_regs,
since it's used by bpf_perf_event_data.

This series fixes these problems by removing the new members and
introducing user_pt_regs_v2 instead.

[1] https://github.com/libbpf/libbpf/runs/5079938810

Ilya Leoshkevich (2):
  s390/bpf: Introduce user_pt_regs_v2
  arm64/bpf: Introduce struct user_pt_regs_v2

 arch/arm64/include/asm/ptrace.h      |  1 +
 arch/arm64/include/uapi/asm/ptrace.h |  7 +++++++
 arch/s390/include/asm/ptrace.h       |  1 +
 arch/s390/include/uapi/asm/ptrace.h  | 10 ++++++++--
 tools/lib/bpf/bpf_tracing.h          | 10 ++++++----
 5 files changed, 23 insertions(+), 6 deletions(-)

Andrii Nakryiko Feb. 6, 2022, 7:31 p.m. UTC | #1

On Sun, Feb 6, 2022 at 6:54 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> libbpf CI noticed that my recent changes broke bpf_perf_event_data ABI
> on s390 [1]. Testing shows that they introduced a similar breakage on
> arm64. The problem is that we are not allowed to extend user_pt_regs,
> since it's used by bpf_perf_event_data.
>
> This series fixes these problems by removing the new members and
> introducing user_pt_regs_v2 instead.
>
> [1] https://github.com/libbpf/libbpf/runs/5079938810
>
> Ilya Leoshkevich (2):
>   s390/bpf: Introduce user_pt_regs_v2
>   arm64/bpf: Introduce struct user_pt_regs_v2

Given it is bpf_perf_event_data and thus bpf_user_pt_regs_t
definitions that are set in stone now, wouldn't it be better to
instead just change

typedef user_pt_regs bpf_user_pt_regs_t; (s390x)
typedef struct user_pt_regs bpf_user_pt_regs_t; (arm64)

to just define that fixed layout instead of reusing user_ptr_regs?

This whole v2 business looks really ugly.


>
>  arch/arm64/include/asm/ptrace.h      |  1 +
>  arch/arm64/include/uapi/asm/ptrace.h |  7 +++++++
>  arch/s390/include/asm/ptrace.h       |  1 +
>  arch/s390/include/uapi/asm/ptrace.h  | 10 ++++++++--
>  tools/lib/bpf/bpf_tracing.h          | 10 ++++++----
>  5 files changed, 23 insertions(+), 6 deletions(-)
>
> --
> 2.34.1
>

Ilya Leoshkevich Feb. 6, 2022, 7:57 p.m. UTC | #2

On Sun, 2022-02-06 at 11:31 -0800, Andrii Nakryiko wrote:
> On Sun, Feb 6, 2022 at 6:54 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > libbpf CI noticed that my recent changes broke bpf_perf_event_data
> > ABI
> > on s390 [1]. Testing shows that they introduced a similar breakage
> > on
> > arm64. The problem is that we are not allowed to extend
> > user_pt_regs,
> > since it's used by bpf_perf_event_data.
> > 
> > This series fixes these problems by removing the new members and
> > introducing user_pt_regs_v2 instead.
> > 
> > [1] https://github.com/libbpf/libbpf/runs/5079938810
> > 
> > Ilya Leoshkevich (2):
> >   s390/bpf: Introduce user_pt_regs_v2
> >   arm64/bpf: Introduce struct user_pt_regs_v2
> 
> Given it is bpf_perf_event_data and thus bpf_user_pt_regs_t
> definitions that are set in stone now, wouldn't it be better to
> instead just change
> 
> typedef user_pt_regs bpf_user_pt_regs_t; (s390x)
> typedef struct user_pt_regs bpf_user_pt_regs_t; (arm64)
> 
> to just define that fixed layout instead of reusing user_ptr_regs?
> 
> This whole v2 business looks really ugly.

Wouldn't it break compilation of code like this?

    bpf_perf_event_data data;
    user_pt_regs *regs = &data.regs;

Additionaly, after this I'm no longer sure I haven't missed any other
places where user_pt_regs might be used. For example, arm64 seems to be
using it not only for BPF, but also for ptrace?

static int gpr_get(struct task_struct *target,
                   const struct user_regset *regset,
                   struct membuf to)
{
        struct user_pt_regs *uregs = &task_pt_regs(target)->user_regs;
        return membuf_write(&to, uregs, sizeof(*uregs));
}

and then in e.g. gdb:

static void
aarch64_fill_gregset (struct regcache *regcache, void *buf)
{
  struct user_pt_regs *regset = (struct user_pt_regs *) buf;
  ...

I'm also not a big fan of the _v2 solution, but it looked the safest
to me. At least for s390, a viable alternative that Vasily proposed
would be to go ahead with replacing args[1] with orig_gpr2 and then
also backporting the patch, so that the new libbpf would still work on
the old stable kernels. But this won't work for arm64.

Andrii Nakryiko Feb. 7, 2022, 6:23 a.m. UTC | #3

On Sun, Feb 6, 2022 at 11:57 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Sun, 2022-02-06 at 11:31 -0800, Andrii Nakryiko wrote:
> > On Sun, Feb 6, 2022 at 6:54 AM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > libbpf CI noticed that my recent changes broke bpf_perf_event_data
> > > ABI
> > > on s390 [1]. Testing shows that they introduced a similar breakage
> > > on
> > > arm64. The problem is that we are not allowed to extend
> > > user_pt_regs,
> > > since it's used by bpf_perf_event_data.
> > >
> > > This series fixes these problems by removing the new members and
> > > introducing user_pt_regs_v2 instead.
> > >
> > > [1] https://github.com/libbpf/libbpf/runs/5079938810
> > >
> > > Ilya Leoshkevich (2):
> > >   s390/bpf: Introduce user_pt_regs_v2
> > >   arm64/bpf: Introduce struct user_pt_regs_v2
> >
> > Given it is bpf_perf_event_data and thus bpf_user_pt_regs_t
> > definitions that are set in stone now, wouldn't it be better to
> > instead just change
> >
> > typedef user_pt_regs bpf_user_pt_regs_t; (s390x)
> > typedef struct user_pt_regs bpf_user_pt_regs_t; (arm64)
> >
> > to just define that fixed layout instead of reusing user_ptr_regs?
> >
> > This whole v2 business looks really ugly.
>
> Wouldn't it break compilation of code like this?
>
>     bpf_perf_event_data data;
>     user_pt_regs *regs = &data.regs;

why would it break? user_pt_regs gained extra fields at the end, so
whoever works with the assumption of an old definition of user_pt_regs
*through pointer* should be totally fine. The problem with
bpf_perf_event_data is that user_pt_regs are embedded in the struct
directly, so adding anything to it changes bpf_perf_event_data layout.

I, of course, can't know if this breaks any other use case (including
ones you mentioned below), but using user_pt_regs_v2 will probably not
work with CO-RE, because older kernels won't have such type defined
(and thus relocations will fail).

I'm not sure the origins of the need for user_pt_regs (as opposed to
using pt_regs directly, like x86-64 does), but with CO-RE and
vmlinux.h it would be more reliable and straightforward to just stick
to kernel-internal struct pt_regs everywhere. And for non-CO-RE macros
maybe just using an offset within struct pt_regs (i.e.,
offsetofend(gprs)) would do it?

>
> Additionaly, after this I'm no longer sure I haven't missed any other
> places where user_pt_regs might be used. For example, arm64 seems to be
> using it not only for BPF, but also for ptrace?
>
> static int gpr_get(struct task_struct *target,
>                    const struct user_regset *regset,
>                    struct membuf to)
> {
>         struct user_pt_regs *uregs = &task_pt_regs(target)->user_regs;
>         return membuf_write(&to, uregs, sizeof(*uregs));
> }
>
> and then in e.g. gdb:
>
> static void
> aarch64_fill_gregset (struct regcache *regcache, void *buf)
> {
>   struct user_pt_regs *regset = (struct user_pt_regs *) buf;
>   ...
>
> I'm also not a big fan of the _v2 solution, but it looked the safest
> to me. At least for s390, a viable alternative that Vasily proposed
> would be to go ahead with replacing args[1] with orig_gpr2 and then
> also backporting the patch, so that the new libbpf would still work on
> the old stable kernels. But this won't work for arm64.

Heiko Carstens Feb. 7, 2022, 9:46 a.m. UTC | #4

On Sun, Feb 06, 2022 at 10:23:19PM -0800, Andrii Nakryiko wrote:
> I'm not sure the origins of the need for user_pt_regs (as opposed to
> using pt_regs directly, like x86-64 does), but with CO-RE and
> vmlinux.h it would be more reliable and straightforward to just stick
> to kernel-internal struct pt_regs everywhere. And for non-CO-RE macros
> maybe just using an offset within struct pt_regs (i.e.,
> offsetofend(gprs)) would do it?

user_pt_regs was introduced on s390 because struct pt_regs is _not_
stable. Only the first n entries (aka user_pt_regs) are supposed to be
stable.

We could of course reduce struct pt_regs to the bare minimum, which seems
to be the current user_pt_regs plus orig_gpr2; which semantically would
match more or less what x86 has.

Then move pt_regs to uapi, so it is clear that it cannot be changed
anymore, and have additional data in a separate structure on the stack,
which has pt_regs at the beginning, and access this additional data with
container_of & friends.

I guess that could work, even though this requires to keep user_pt_regs
"for historical reasons".

Ilya Leoshkevich Feb. 7, 2022, 11:52 a.m. UTC | #5

On Sun, 2022-02-06 at 22:23 -0800, Andrii Nakryiko wrote:
> On Sun, Feb 6, 2022 at 11:57 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > On Sun, 2022-02-06 at 11:31 -0800, Andrii Nakryiko wrote:
> > > On Sun, Feb 6, 2022 at 6:54 AM Ilya Leoshkevich
> > > <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > libbpf CI noticed that my recent changes broke
> > > > bpf_perf_event_data
> > > > ABI
> > > > on s390 [1]. Testing shows that they introduced a similar
> > > > breakage
> > > > on
> > > > arm64. The problem is that we are not allowed to extend
> > > > user_pt_regs,
> > > > since it's used by bpf_perf_event_data.
> > > > 
> > > > This series fixes these problems by removing the new members
> > > > and
> > > > introducing user_pt_regs_v2 instead.
> > > > 
> > > > [1] https://github.com/libbpf/libbpf/runs/5079938810
> > > > 
> > > > Ilya Leoshkevich (2):
> > > >   s390/bpf: Introduce user_pt_regs_v2
> > > >   arm64/bpf: Introduce struct user_pt_regs_v2
> > > 
> > > Given it is bpf_perf_event_data and thus bpf_user_pt_regs_t
> > > definitions that are set in stone now, wouldn't it be better to
> > > instead just change
> > > 
> > > typedef user_pt_regs bpf_user_pt_regs_t; (s390x)
> > > typedef struct user_pt_regs bpf_user_pt_regs_t; (arm64)
> > > 
> > > to just define that fixed layout instead of reusing
> > > user_ptr_regs?
> > > 
> > > This whole v2 business looks really ugly.
> > 
> > Wouldn't it break compilation of code like this?
> > 
> >     bpf_perf_event_data data;
> >     user_pt_regs *regs = &data.regs;
> 
> why would it break? user_pt_regs gained extra fields at the end, so
> whoever works with the assumption of an old definition of
> user_pt_regs
> *through pointer* should be totally fine. The problem with
> bpf_perf_event_data is that user_pt_regs are embedded in the struct
> directly, so adding anything to it changes bpf_perf_event_data
> layout.

I meant only building from source, at runtime it should be fine. At
compile time, the compiler should at least warn that pointer types
don't match.

> I, of course, can't know if this breaks any other use case (including
> ones you mentioned below), but using user_pt_regs_v2 will probably
> not
> work with CO-RE, because older kernels won't have such type defined
> (and thus relocations will fail).
> 
> I'm not sure the origins of the need for user_pt_regs (as opposed to
> using pt_regs directly, like x86-64 does), but with CO-RE and
> vmlinux.h it would be more reliable and straightforward to just stick
> to kernel-internal struct pt_regs everywhere. And for non-CO-RE
> macros
> maybe just using an offset within struct pt_regs (i.e.,
> offsetofend(gprs)) would do it?

offsetofend sounds like a nice compromise. I'll give it a try, thanks.

> > 
> > Additionaly, after this I'm no longer sure I haven't missed any
> > other
> > places where user_pt_regs might be used. For example, arm64 seems
> > to be
> > using it not only for BPF, but also for ptrace?
> > 
> > static int gpr_get(struct task_struct *target,
> >                    const struct user_regset *regset,
> >                    struct membuf to)
> > {
> >         struct user_pt_regs *uregs = &task_pt_regs(target)-
> > >user_regs;
> >         return membuf_write(&to, uregs, sizeof(*uregs));
> > }
> > 
> > and then in e.g. gdb:
> > 
> > static void
> > aarch64_fill_gregset (struct regcache *regcache, void *buf)
> > {
> >   struct user_pt_regs *regset = (struct user_pt_regs *) buf;
> >   ...
> > 
> > I'm also not a big fan of the _v2 solution, but it looked the
> > safest
> > to me. At least for s390, a viable alternative that Vasily proposed
> > would be to go ahead with replacing args[1] with orig_gpr2 and then
> > also backporting the patch, so that the new libbpf would still work
> > on
> > the old stable kernels. But this won't work for arm64.

Andrii Nakryiko Feb. 7, 2022, 8:08 p.m. UTC | #6

On Mon, Feb 7, 2022 at 3:52 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Sun, 2022-02-06 at 22:23 -0800, Andrii Nakryiko wrote:
> > On Sun, Feb 6, 2022 at 11:57 AM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > On Sun, 2022-02-06 at 11:31 -0800, Andrii Nakryiko wrote:
> > > > On Sun, Feb 6, 2022 at 6:54 AM Ilya Leoshkevich
> > > > <iii@linux.ibm.com>
> > > > wrote:
> > > > >
> > > > > libbpf CI noticed that my recent changes broke
> > > > > bpf_perf_event_data
> > > > > ABI
> > > > > on s390 [1]. Testing shows that they introduced a similar
> > > > > breakage
> > > > > on
> > > > > arm64. The problem is that we are not allowed to extend
> > > > > user_pt_regs,
> > > > > since it's used by bpf_perf_event_data.
> > > > >
> > > > > This series fixes these problems by removing the new members
> > > > > and
> > > > > introducing user_pt_regs_v2 instead.
> > > > >
> > > > > [1] https://github.com/libbpf/libbpf/runs/5079938810
> > > > >
> > > > > Ilya Leoshkevich (2):
> > > > >   s390/bpf: Introduce user_pt_regs_v2
> > > > >   arm64/bpf: Introduce struct user_pt_regs_v2
> > > >
> > > > Given it is bpf_perf_event_data and thus bpf_user_pt_regs_t
> > > > definitions that are set in stone now, wouldn't it be better to
> > > > instead just change
> > > >
> > > > typedef user_pt_regs bpf_user_pt_regs_t; (s390x)
> > > > typedef struct user_pt_regs bpf_user_pt_regs_t; (arm64)
> > > >
> > > > to just define that fixed layout instead of reusing
> > > > user_ptr_regs?
> > > >
> > > > This whole v2 business looks really ugly.
> > >
> > > Wouldn't it break compilation of code like this?
> > >
> > >     bpf_perf_event_data data;
> > >     user_pt_regs *regs = &data.regs;
> >
> > why would it break? user_pt_regs gained extra fields at the end, so
> > whoever works with the assumption of an old definition of
> > user_pt_regs
> > *through pointer* should be totally fine. The problem with
> > bpf_perf_event_data is that user_pt_regs are embedded in the struct
> > directly, so adding anything to it changes bpf_perf_event_data
> > layout.
>
> I meant only building from source, at runtime it should be fine. At
> compile time, the compiler should at least warn that pointer types
> don't match.

Oh, you meant that cast would be necessary. Well, strictly speaking
code like in your example is broken, it should use the type specified
in struct bpf_perf_event_data: bpf_user_pt_regs_t. But the fix to
satisfy compilation is trivial as well, so doesn't matter much.

>
> > I, of course, can't know if this breaks any other use case (including
> > ones you mentioned below), but using user_pt_regs_v2 will probably
> > not
> > work with CO-RE, because older kernels won't have such type defined
> > (and thus relocations will fail).
> >
> > I'm not sure the origins of the need for user_pt_regs (as opposed to
> > using pt_regs directly, like x86-64 does), but with CO-RE and
> > vmlinux.h it would be more reliable and straightforward to just stick
> > to kernel-internal struct pt_regs everywhere. And for non-CO-RE
> > macros
> > maybe just using an offset within struct pt_regs (i.e.,
> > offsetofend(gprs)) would do it?
>
> offsetofend sounds like a nice compromise. I'll give it a try, thanks.

It's kind of dangerous as well, let's maybe leave a comment in pt_regs
that this orig_gpr2 location is assumed by libbpf's tracing macros so
shouldn't be willy-nilly moved

>
> > >
> > > Additionaly, after this I'm no longer sure I haven't missed any
> > > other
> > > places where user_pt_regs might be used. For example, arm64 seems
> > > to be
> > > using it not only for BPF, but also for ptrace?
> > >
> > > static int gpr_get(struct task_struct *target,
> > >                    const struct user_regset *regset,
> > >                    struct membuf to)
> > > {
> > >         struct user_pt_regs *uregs = &task_pt_regs(target)-
> > > >user_regs;
> > >         return membuf_write(&to, uregs, sizeof(*uregs));
> > > }
> > >
> > > and then in e.g. gdb:
> > >
> > > static void
> > > aarch64_fill_gregset (struct regcache *regcache, void *buf)
> > > {
> > >   struct user_pt_regs *regset = (struct user_pt_regs *) buf;
> > >   ...
> > >
> > > I'm also not a big fan of the _v2 solution, but it looked the
> > > safest
> > > to me. At least for s390, a viable alternative that Vasily proposed
> > > would be to go ahead with replacing args[1] with orig_gpr2 and then
> > > also backporting the patch, so that the new libbpf would still work
> > > on
> > > the old stable kernels. But this won't work for arm64.
>

Andrii Nakryiko Feb. 7, 2022, 8:09 p.m. UTC | #7

On Mon, Feb 7, 2022 at 1:46 AM Heiko Carstens <hca@linux.ibm.com> wrote:
>
> On Sun, Feb 06, 2022 at 10:23:19PM -0800, Andrii Nakryiko wrote:
> > I'm not sure the origins of the need for user_pt_regs (as opposed to
> > using pt_regs directly, like x86-64 does), but with CO-RE and
> > vmlinux.h it would be more reliable and straightforward to just stick
> > to kernel-internal struct pt_regs everywhere. And for non-CO-RE macros
> > maybe just using an offset within struct pt_regs (i.e.,
> > offsetofend(gprs)) would do it?
>
> user_pt_regs was introduced on s390 because struct pt_regs is _not_
> stable. Only the first n entries (aka user_pt_regs) are supposed to be
> stable.
>
> We could of course reduce struct pt_regs to the bare minimum, which seems
> to be the current user_pt_regs plus orig_gpr2; which semantically would
> match more or less what x86 has.
>
> Then move pt_regs to uapi, so it is clear that it cannot be changed
> anymore, and have additional data in a separate structure on the stack,
> which has pt_regs at the beginning, and access this additional data with
> container_of & friends.
>
> I guess that could work, even though this requires to keep user_pt_regs
> "for historical reasons".

It's just surprising and constraining that UAPI struct can be extended
by adding new fields at the end :( I'll let you guys decide it for
s390 and arm64. From libbpf's side, we have somewhat hacky ways to
work around that, it seems.

[bpf-next,0/2] Fix bpf_perf_event_data ABI breakage

Message

Comments