Message ID | 20230907210023.2467151-1-sdf@google.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | BPF |
Series | [bpf-next] selftests/bpf: Future-proof connect4_prog.c |
On Thu, Sep 7, 2023 at 2:00 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> With the new internal clang version I see the following optimization
> that makes connect4 program unverifiable.
>
> The following code:
>
> int do_bind()

Yonghong added __weak to do_bind a few months ago ([0]), which makes
it illegal for the compiler to assume 0 or 1 return. Can you please
double check that this is the issue with __weak?

  [0] https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com/

> {
>   if (bpf_bind() != 0)
>     return 0;
>   return 1;
> }

[...]
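To illustrate the __weak point from the reply above, here is a minimal user-space sketch. It is not the selftest itself: fake_bpf_bind(), fake_bind_result, connect_prog() and check_weak_example() are hypothetical names made up for this example, and the attribute is spelled out as __attribute__((weak)), which is what __weak expands to in the BPF helper headers. Because a weak definition may be replaced by another definition at link time, the compiler should not assume that callers always see this body, so it has no basis for dropping the caller's own normalization.

```c
#include <assert.h>

/* Hypothetical stand-in for the bpf_bind() helper: 0 on success, non-zero on error. */
static int fake_bind_result;

static int fake_bpf_bind(void)
{
	return fake_bind_result;
}

/*
 * Weak definition: another object file may provide a different do_bind(),
 * so the return value cannot be assumed to be limited to 0 or 1.
 */
__attribute__((weak)) int do_bind(void)
{
	if (fake_bpf_bind() != 0)
		return 0;
	return 1;
}

/* The caller keeps its own normalization of the result to 0 or 1. */
int connect_prog(void)
{
	return do_bind() ? 1 : 0;
}

/* Small self-check of the behaviour sketched above. */
void check_weak_example(void)
{
	fake_bind_result = 0;	/* bind succeeded */
	assert(connect_prog() == 1);
	fake_bind_result = -1;	/* bind failed */
	assert(connect_prog() == 0);
}
```

Calling check_weak_example() from a small main() is enough to exercise both paths.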
On Fri, Sep 8, 2023 at 4:42 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 7, 2023 at 2:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > With the new internal clang version I see the following optimization
> > that makes connect4 program unverifiable.
> >
> > The following code:
> >
> > int do_bind()
>
> Yonghong added __weak to do_bind a few months ago ([0]), which makes
> it illegal for the compiler to assume 0 or 1 return. Can you please
> double check that this is the issue with __weak?
>
>   [0] https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com/

It does indeed fix it for me, thank you! Mystery solved on "why I can't
repro this on the upstream" :-) I've completely missed that extra __weak..

[...]
diff --git a/tools/testing/selftests/bpf/progs/connect4_prog.c b/tools/testing/selftests/bpf/progs/connect4_prog.c
index 7ef49ec04838..b7fc46a0787b 100644
--- a/tools/testing/selftests/bpf/progs/connect4_prog.c
+++ b/tools/testing/selftests/bpf/progs/connect4_prog.c
@@ -41,10 +41,7 @@ int do_bind(struct bpf_sock_addr *ctx)
 	sa.sin_port = bpf_htons(0);
 	sa.sin_addr.s_addr = bpf_htonl(SRC_REWRITE_IP4);
 
-	if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0)
-		return 0;
-
-	return 1;
+	return bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa));
 }
 
 static __inline int verify_cc(struct bpf_sock_addr *ctx,
@@ -194,7 +191,7 @@ int connect_v4_prog(struct bpf_sock_addr *ctx)
 	ctx->user_ip4 = bpf_htonl(DST_REWRITE_IP4);
 	ctx->user_port = bpf_htons(DST_REWRITE_PORT4);
 
-	return do_bind(ctx) ? 1 : 0;
+	return do_bind(ctx) ? 0 : 1;
 }
 
 char _license[] SEC("license") = "GPL";
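As a rough user-space check that the diff above keeps the program's verdict unchanged, the two shapes can be compared side by side. This is only a sketch: stub_bind() is a hypothetical replacement for bpf_bind() that returns whatever result it is given, and the *_old/*_new names are invented for the comparison.

```c
#include <assert.h>

/* Hypothetical stub for bpf_bind(): returns the result it is handed. */
static int stub_bind(int result)
{
	return result;
}

/* Shape before the patch: do_bind() collapses the result to 0 or 1. */
static int do_bind_old(int result)
{
	if (stub_bind(result) != 0)
		return 0;
	return 1;
}

static int prog_old(int result)
{
	return do_bind_old(result) ? 1 : 0;
}

/* Shape after the patch: do_bind() forwards the raw result. */
static int do_bind_new(int result)
{
	return stub_bind(result);
}

static int prog_new(int result)
{
	/* Zero from the helper means success, so the sense is inverted here. */
	return do_bind_new(result) ? 0 : 1;
}

/* Both shapes should give the same 0/1 verdict for any helper result. */
void check_same_verdict(void)
{
	static const int results[] = { 0, -1, -13, -22, 1 };
	unsigned int i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		assert(prog_old(results[i]) == prog_new(results[i]));
}
```

The inverted ternary in prog_new() mirrors the second hunk: once do_bind() forwards the helper result, a non-zero value means failure rather than success.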
With the new internal clang version I see the following optimization
that makes the connect4 program unverifiable.

The following code:

  int do_bind()
  {
    if (bpf_bind() != 0)
      return 0;
    return 1;
  }

  int connect_v4_prog()
  {
    return do_bind() ? 1 : 0;
  }

Becomes:

  int do_bind()
  {
    if (bpf_bind() != 0)
      return 0;
    return 1;
  }

  int connect_v4_prog()
  {
    return do_bind();
  }

IOW, it looks like clang is able to see that do_bind returns only 0 and
1, so the extra branch around 'return do_bind' is not needed.
This, however, seems to break the verifier, which assumes that
bpf2bpf calls can return 0-0xffffffff.

Note, I can produce those programs only with the internal fork of clang.
The latest one from git still produces correct bytecode. It might be
some options/optimizations that we enable and that are still
disabled for the general upstream users, not sure. I've decided
to send this patch out anyway since it seems like a correct optimization
the compiler might do.

So to be future-proof, reshape the code a bit to return the bpf_bind
result directly. This will not give clang any hint about
the return value and will force it to generate that '? 0 : 1' branch
in the caller.

Good program:

  0000000000000000 <do_bind>:
         0: b4 02 00 00 7f 00 00 04 w2 = 0x400007f
         1: 63 2a f4 ff 00 00 00 00 *(u32 *)(r10 - 0xc) = r2
         2: b4 02 00 00 02 00 00 00 w2 = 0x2
         3: 63 2a f0 ff 00 00 00 00 *(u32 *)(r10 - 0x10) = r2
         4: b7 02 00 00 00 00 00 00 r2 = 0x0
         5: 63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 0x4) = r2
         6: 63 2a f8 ff 00 00 00 00 *(u32 *)(r10 - 0x8) = r2
         7: bf a2 00 00 00 00 00 00 r2 = r10
         8: 07 02 00 00 f0 ff ff ff r2 += -0x10
         9: b4 03 00 00 10 00 00 00 w3 = 0x10
        10: 85 00 00 00 40 00 00 00 call 0x40
        11: bf 01 00 00 00 00 00 00 r1 = r0
        12: b4 00 00 00 01 00 00 00 w0 = 0x1
        13: 15 01 01 00 00 00 00 00 if r1 == 0x0 goto +0x1 <LBB0_2>
        14: b4 00 00 00 00 00 00 00 w0 = 0x0

  00000000000001b0 <LBB1_30>:
        54: bc 60 00 00 00 00 00 00 w0 = w6
        55: 95 00 00 00 00 00 00 00 exit

  0000000000000578 <LBB1_28>:
  ...
       180: 85 10 00 00 ff ff ff ff call -0x1
       181: b4 06 00 00 01 00 00 00 w6 = 0x1
       182: 56 00 7f ff 00 00 00 00 if w0 != 0x0 goto -0x81 <LBB1_30>
       183: b4 06 00 00 00 00 00 00 w6 = 0x0
       184: 05 00 7d ff 00 00 00 00 goto -0x83 <LBB1_30>

Bad program:

  0000000000000000 <do_bind>:
         0: b4 02 00 00 7f 00 00 04 w2 = 0x400007f
         1: 63 2a f4 ff 00 00 00 00 *(u32 *)(r10 - 0xc) = r2
         2: b4 02 00 00 02 00 00 00 w2 = 0x2
         3: 63 2a f0 ff 00 00 00 00 *(u32 *)(r10 - 0x10) = r2
         4: b7 02 00 00 00 00 00 00 r2 = 0x0
         5: 63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 0x4) = r2
         6: 63 2a f8 ff 00 00 00 00 *(u32 *)(r10 - 0x8) = r2
         7: bf a2 00 00 00 00 00 00 r2 = r10
         8: 07 02 00 00 f0 ff ff ff r2 += -0x10
         9: b4 03 00 00 10 00 00 00 w3 = 0x10
        10: 85 00 00 00 40 00 00 00 call 0x40
        11: bf 01 00 00 00 00 00 00 r1 = r0
        12: b4 00 00 00 01 00 00 00 w0 = 0x1
        13: 15 01 01 00 00 00 00 00 if r1 == 0x0 goto +0x1 <LBB0_2>
        14: b4 00 00 00 00 00 00 00 w0 = 0x0

  00000000000001b0 <LBB1_3>:
        54: bc 60 00 00 00 00 00 00 w0 = w6
        55: 95 00 00 00 00 00 00 00 exit

  0000000000000578 <LBB1_28>:
  ...
       180: 85 10 00 00 ff ff ff ff call -0x1
       181: bc 06 00 00 00 00 00 00 w6 = w0
       182: 05 00 7f ff 00 00 00 00 goto -0x81 <LBB1_3>

Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/progs/connect4_prog.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
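The difference between the two listings can be pictured with a toy model of how a verifier might track the range of the exit value. This is a deliberately simplified sketch and not kernel code: struct range and every function below are hypothetical, and the rule that the program must exit with 0 or 1 is an assumption made for the example. The only fact taken from the message above is that the result of a bpf2bpf call is treated as anything in 0-0xffffffff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the value range tracked for a register. */
struct range {
	uint32_t min;
	uint32_t max;
};

/* Per the message above: a bpf2bpf call result may be anything in 0..0xffffffff. */
static struct range after_subprog_call(void)
{
	struct range r = { 0, UINT32_MAX };

	return r;
}

/* 'x ? 1 : 0' yields 0 or 1 whatever the range of x was. */
static struct range after_ternary(struct range in)
{
	struct range r = { 0, 1 };

	(void)in;
	return r;
}

/* Assumption for this sketch: the exit value must lie in [0, 1]. */
static int exit_value_ok(struct range r)
{
	return r.max <= 1;
}

/* "Good program": the caller branches on the call result before exiting. */
int good_program_ok(void)
{
	return exit_value_ok(after_ternary(after_subprog_call()));
}

/* "Bad program": the call result is copied to the exit value unchanged. */
int bad_program_ok(void)
{
	return exit_value_ok(after_subprog_call());
}

/* Self-check: only the shape with the explicit branch passes the toy check. */
void check_range_model(void)
{
	assert(good_program_ok() == 1);
	assert(bad_program_ok() == 0);
}
```

In this model the 'w6 = w0' copy in the bad listing carries the full 0-0xffffffff range to the exit, while the 'w6 = 0x1' / 'w6 = 0x0' pair in the good listing pins it to two constants.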