Message ID | 20230830011128.1415752-2-iii@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | BPF |
Series | Implement cpuv4 support for s390x |
On 8/29/23 9:07 PM, Ilya Leoshkevich wrote:
> On the architectures that use bpf_jit_needs_zext(), e.g., s390x, the
> verifier incorrectly inserts a zero-extension after BPF_MEMSX, leading
> to miscompilations like the one below:
>
>       24:       89 1a ff fe 00 00 00 00 "r1 = *(s16 *)(r10 - 2);"       # zext_dst set
>     0x3ff7fdb910e:       lgh     %r2,-2(%r13,%r0)                        # load halfword
>     0x3ff7fdb9114:       llgfr   %r2,%r2                                 # wrong!
>       25:       65 10 00 03 00 00 7f ff if r1 s> 32767 goto +3 <l0_1>   # check_cond_jmp_op()
>
> Disable such zero-extensions. The JITs need to insert sign-extension
> themselves, if necessary.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>

Acked-by: Yonghong Song <yonghong.song@linux.dev>

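The miscompilation in the quoted commit message can be modelled in a few lines of C. This is only a sketch: the function names are hypothetical, and the stack slot value of -2 used in the note below is an assumption chosen for illustration, not something stated in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of s390x "lgh": load a halfword and sign-extend it to 64 bits. */
int64_t load_s16_sext(const int16_t *p)
{
	return (int64_t)*p;
}

/* Model of s390x "llgfr": keep the low 32 bits, zero the upper 32 bits. */
int64_t zext32(int64_t r)
{
	return (int64_t)(uint64_t)(uint32_t)r;
}

/* Model of the BPF branch "if r1 s> 32767": a signed 64-bit comparison. */
bool branch_taken(int64_t r1)
{
	return r1 > 32767;
}
```

With a stored value of -2, load_s16_sext() yields -2 and the branch is not taken, which matches what the verifier assumes for an s16 load. Applying zext32() afterwards turns the register into 0xfffffffe, so the same signed comparison now takes the branch.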
Hi Ilya

On Wed, Aug 30, 2023 at 3:12 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On the architectures that use bpf_jit_needs_zext(), e.g., s390x, the
> verifier incorrectly inserts a zero-extension after BPF_MEMSX, leading
> to miscompilations like the one below:
>
>       24:       89 1a ff fe 00 00 00 00 "r1 = *(s16 *)(r10 - 2);"       # zext_dst set
>     0x3ff7fdb910e:       lgh     %r2,-2(%r13,%r0)                        # load halfword
>     0x3ff7fdb9114:       llgfr   %r2,%r2                                 # wrong!
>       25:       65 10 00 03 00 00 7f ff if r1 s> 32767 goto +3 <l0_1>   # check_cond_jmp_op()
>
> Disable such zero-extensions. The JITs need to insert sign-extension
> themselves, if necessary.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  kernel/bpf/verifier.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index bb78212fa5b2..097985a46edc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3110,7 +3110,9 @@ static void mark_insn_zext(struct bpf_verifier_env *env,
>  {
>  	s32 def_idx = reg->subreg_def;
>
> -	if (def_idx == DEF_NOT_SUBREG)

The problem here is that reg->subreg_def should be set as DEF_NOT_SUBREG for
registers that are used as destination registers of BPF_LDX | BPF_MEMSX.
I am seeing the same problem on ARM32 and was going to send a patch today.

The problem is that is_reg64() returns false for destination registers
of BPF_LDX | BPF_MEMSX. But BPF_LDX | BPF_MEMSX always loads a 64 bit value
because of the sign extension so is_reg64() should return true.

I have written a patch that I will be sending as a reply to this.
Please let me know if that makes sense.

> +	if (def_idx == DEF_NOT_SUBREG ||
> +	    (BPF_CLASS(env->prog->insnsi[def_idx - 1].code) == BPF_LDX &&
> +	     BPF_MODE(env->prog->insnsi[def_idx - 1].code) == BPF_MEMSX))
>  		return;
>
>  	env->insn_aux_data[def_idx - 1].zext_dst = true;
> --
> 2.41.0
>

Thanks,
Puranjay

On Fri, Sep 01 2023, Puranjay Mohan wrote:
> The problem here is that reg->subreg_def should be set as DEF_NOT_SUBREG for
> registers that are used as destination registers of BPF_LDX | BPF_MEMSX.
> I am seeing the same problem on ARM32 and was going to send a patch today.
>
> The problem is that is_reg64() returns false for destination registers
> of BPF_LDX | BPF_MEMSX. But BPF_LDX | BPF_MEMSX always loads a 64 bit value
> because of the sign extension so is_reg64() should return true.
>
> I have written a patch that I will be sending as a reply to this.
> Please let me know if that makes sense.
>

The check_reg_arg() function will mark reg->subreg_def = DEF_NOT_SUBREG for destination
registers if is_reg64() returns true for these registers. My patch below makes is_reg64()
return true for destination registers of BPF_LDX with mode = BPF_MEMSX. I feel this is the
correct way to fix this problem.

Here is my patch:

--- 8< ---
From cf1bf5282183cf721926ab14d968d3d4097b89b8 Mon Sep 17 00:00:00 2001
From: Puranjay Mohan <puranjay12@gmail.com>
Date: Fri, 1 Sep 2023 11:18:59 +0000
Subject: [PATCH bpf] bpf: verifier: mark destination of sign-extended load as 64 bit

The verifier can emit instructions to zero-extend destination registers
when the register is being used to keep 32 bit values. This behaviour is
enabled only when the JIT sets bpf_jit_needs_zext() -> true. In the case
of a sign extended load instruction, the destination register always has a
64-bit value, therefore the verifier should not emit zero-extend
instructions for it.

Change is_reg64() to return true if the register under consideration is a
destination register of LDX instruction with mode = BPF_MEMSX.

Fixes: 1f9a1ea821ff ("bpf: Support new sign-extension load insns")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
---
 kernel/bpf/verifier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bb78212fa5b2..93f84b868ccc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3029,7 +3029,7 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,

 	if (class == BPF_LDX) {
 		if (t != SRC_OP)
-			return BPF_SIZE(code) == BPF_DW;
+			return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);
 		/* LDX source must be ptr. */
 		return true;
 	}

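The effect of the one-line change on the LDX destination check can be seen in isolation with the following sketch. The two function names are hypothetical stand-ins for the changed branch of is_reg64(), and the opcode field macros are restated here by hand (as they appear in the kernel UAPI headers) so that the example compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* BPF opcode fields, restated so the sketch is standalone. */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_LDX		0x01
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_W		0x00
#define BPF_H		0x08
#define BPF_B		0x10
#define BPF_DW		0x18
#define BPF_MODE(code)	((code) & 0xe0)
#define BPF_MEM		0x60
#define BPF_MEMSX	0x80

/* Rule before the patch: only 8-byte loads define a full 64-bit register. */
bool ldx_dst_is_reg64_old(uint8_t code)
{
	return BPF_SIZE(code) == BPF_DW;
}

/* Rule after the patch: sign-extending loads also define all 64 bits. */
bool ldx_dst_is_reg64_new(uint8_t code)
{
	return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;
}
```

For opcode 0x89 from the disassembly earlier in the thread (BPF_LDX | BPF_H | BPF_MEMSX), the old rule returns false, which is what allows zext_dst to be set, while the new rule returns true. A plain 4-byte load such as 0x61 is classified as a 32-bit definition by both rules.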
On Fri, 2023-09-01 at 16:19 +0200, Puranjay Mohan wrote:
> [...]
> The problem here is that reg->subreg_def should be set as
> DEF_NOT_SUBREG for registers that are used as destination registers
> of BPF_LDX | BPF_MEMSX. I am seeing the same problem on ARM32 and was
> going to send a patch today.
>
> The problem is that is_reg64() returns false for destination registers
> of BPF_LDX | BPF_MEMSX. But BPF_LDX | BPF_MEMSX always loads a 64 bit
> value because of the sign extension so is_reg64() should return true.
>
> I have written a patch that I will be sending as a reply to this.
> Please let me know if that makes sense.
> [...]

Hi,

I also considered doing this, and I think both approaches are
functionally equivalent and work in practice.

However, I can envision that, just like we have the zext_dst
optimization today, we might want a sext_dst optimization in the
future. Therefore I think it's better to fix this by not setting
zext_dst instead of not setting subreg_def.

Best regards,
Ilya

On Fri, Sep 1, 2023 at 7:57 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> [...]
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index bb78212fa5b2..93f84b868ccc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3029,7 +3029,7 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
>
>  	if (class == BPF_LDX) {
>  		if (t != SRC_OP)
> -			return BPF_SIZE(code) == BPF_DW;
> +			return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);

Looks like we have a bug here for normal LDX too.
This 'if' condition was inserting unnecessary zext for LDX.
It was harmless for LDX and broken for LDSX.
Both LDX and LDSX write all bits of 64-bit register.

I think the proper fix is to remove above two lines.
wdyt?

On Wed, Sep 06 2023, Alexei Starovoitov wrote:
> [...]
>>  	if (class == BPF_LDX) {
>>  		if (t != SRC_OP)
>> -			return BPF_SIZE(code) == BPF_DW;
>> +			return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);
>
> Looks like we have a bug here for normal LDX too.
> This 'if' condition was inserting unnecessary zext for LDX.
> It was harmless for LDX and broken for LDSX.
> Both LDX and LDSX write all bits of 64-bit register.
>
> I think the proper fix is to remove above two lines.
> wdyt?

For LDX this returns true only if it is with a BPF_DW, for others it returns false.
This means a zext is inserted for BPF_LDX | BPF_B/H/W.

This is not a bug because LDX writes 64 bits of the register only with BPF_DW.
With BPF_B/H/W it only writes the lower 32 bits and needs zext for upper 32 bits.

On 32 bit architectures where a 64-bit BPF register is simulated with two 32-bit registers,
explicit zext is required for BPF_LDX | BPF_B/H/W.

So, we should not remove this.

Thanks,
Puranjay

On Thu, Sep 7, 2023 at 12:33 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> [...]
> For LDX this returns true only if it is with a BPF_DW, for others it returns false.
> This means a zext is inserted for BPF_LDX | BPF_B/H/W.
>
> This is not a bug because LDX writes 64 bits of the register only with BPF_DW.
> With BPF_B/H/W it only writes the lower 32 bits and needs zext for upper 32 bits.

No. The interpreter writes all 64-bit for any LDX insn.
All JITs must do it as well.

> On 32 bit architectures where a 64-bit BPF register is simulated with two 32-bit registers,
> explicit zext is required for BPF_LDX | BPF_B/H/W.

zext JIT-aid done by the verifier has nothing to do with 32-bit architecture.
It's necessary on 64-bit as well when HW doesn't automatically zero out
upper 32-bit like it does on arm64 and x86-64

> So, we should not remove this.

I still think we should.

On Thu, Sep 07 2023, Alexei Starovoitov wrote:
> [...]
> zext JIT-aid done by the verifier has nothing to do with 32-bit architecture.
> It's necessary on 64-bit as well when HW doesn't automatically zero out
> upper 32-bit like it does on arm64 and x86-64

Yes, I agree that zext JIT-aid is required for all 32-bit architectures and some 64-bit architectures
that can't automatically zero out the upper 32-bits.
Basically any architecture that sets bpf_jit_needs_zext() -> true.

>> So, we should not remove this.
>
> I still think we should.

If we remove this then some JITs will not zero extend the upper 32-bits for BPF_LDX | BPF_B/H/W.

My understanding is that Verifier sets prog->aux->verifier_zext if it emits zext instructions. If the verifier
doesn't emit zext for LDX but sets prog->aux->verifier_zext that would cause wrong behavior for some JITs:

Example code from ARM32 jit doing BPF_LDX | BPF_MEM | BPF_B:

	case BPF_B:
		/* Load a Byte */
		emit(ARM_LDRB_I(rd[1], rm, off), ctx);
		if (!ctx->prog->aux->verifier_zext)
			emit_a32_mov_i(rd[0], 0, ctx);
		break;

Here if ctx->prog->aux->verifier_zext is set by the verifier, and zext was not emitted for LDX, JIT will not zero
the upper 32-bits.

RISCV32, PowerPC32, x86-32 JITs have similar code paths. Only MIPS32 JIT zero-extends for LDX without checking
prog->aux->verifier_zext.

So, if we want to stop emitting zext for LDX then we would need to modify all these JITs to always zext for LDX.

Let me know if my understanding has some gaps, also if we decide to remove it, I am happy to send patches for it
and fix the JITs that need modifications.

Thanks,
Puranjay

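The interaction between verifier_zext and a 32-bit JIT described above can be modelled as follows. This is a simplified sketch and not the ARM32 JIT itself: struct reg_pair, jit_load_byte(), verifier_inserted_zext() and reg_value() are hypothetical names, and the register pair is modelled as plain memory rather than emitted instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 64-bit BPF register as a 32-bit JIT sees it: two 32-bit halves. */
struct reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Mirrors the logic of the quoted BPF_B case: the low half is always
 * loaded, the high half is cleared only when the JIT does not rely on
 * the verifier to have inserted an explicit zero-extension.
 */
void jit_load_byte(struct reg_pair *dst, const uint8_t *src, bool verifier_zext)
{
	dst->lo = *src;
	if (!verifier_zext)
		dst->hi = 0;
}

/* Stand-in for the explicit zext instruction the verifier may insert. */
void verifier_inserted_zext(struct reg_pair *dst)
{
	dst->hi = 0;
}

/* Reassemble the 64-bit value the BPF program would observe. */
uint64_t reg_value(const struct reg_pair *r)
{
	return ((uint64_t)r->hi << 32) | r->lo;
}
```

If verifier_zext is true and verifier_inserted_zext() is never called after the load, whatever was previously in the high half survives. That is the failure mode being described: the verifier would stop emitting zext for LDX while JITs written this way still depend on it.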
On Thu, Sep 7, 2023 at 9:39 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> [...]
> If we remove this then some JITs will not zero extend the upper 32-bits for BPF_LDX | BPF_B/H/W.
>
> My understanding is that Verifier sets prog->aux->verifier_zext if it emits zext instructions. If the verifier
> doesn't emit zext for LDX but sets prog->aux->verifier_zext that would cause wrong behavior for some JITs:
>
> Example code from ARM32 jit doing BPF_LDX | BPF_MEM | BPF_B:
>
> 	case BPF_B:
> 		/* Load a Byte */
> 		emit(ARM_LDRB_I(rd[1], rm, off), ctx);
> 		if (!ctx->prog->aux->verifier_zext)
> 			emit_a32_mov_i(rd[0], 0, ctx);
> 		break;
>
> Here if ctx->prog->aux->verifier_zext is set by the verifier, and zext was not emitted for LDX, JIT will not zero
> the upper 32-bits.
>
> RISCV32, PowerPC32, x86-32 JITs have similar code paths. Only MIPS32 JIT zero-extends for LDX without checking
> prog->aux->verifier_zext.
>
> So, if we want to stop emitting zext for LDX then we would need to modify all these JITs to always zext for LDX.

I guess we never clearly defined what 'needs_zext' is supposed to be,
so it wouldn't be fair to call 32-bit JITs buggy.
But we better address this issue now.
This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
I believe all 4 JITs emit proper zero extension into 64-bit register
by using single cpu instruction,
but they also define bpf_jit_needs_zext() as true,
so extra BPF_ZEXT_REG() is added by the verifier
and it is a pure run-time overhead.

It's better to remove

	if (t != SRC_OP)
		return BPF_SIZE(code) == BPF_DW;

from is_reg64() to avoid adding BPF_ZEXT_REG() insn
and fix 32-bit JITs at the same time.
RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
to always zero upper 32-bit after LDX and
then 4th patch to remove these two lines.

> Let me know if my understanding has some gaps, also if we decide to remove it, I am happy to send patches for it
> and fix the JITs that need modifications.

Thank you for working on it!

cc-ing JIT experts.

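Why the extra BPF_ZEXT_REG() is pure overhead on such 64-bit JITs can be illustrated with a small model. This is a sketch under the assumption stated in the message above, namely that the JIT's load already zero-extends into the full 64-bit register; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Model of a 64-bit JIT's LDX for B/H/W: a single zero-extending load. */
uint64_t ldx_u8(const void *p)
{
	uint8_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

uint64_t ldx_u16(const void *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

uint64_t ldx_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Model of the verifier-inserted zext: clear the upper 32 bits. */
uint64_t zext_reg(uint64_t r)
{
	return (uint32_t)r;
}
```

For every load size up to 32 bits the result already has its upper 32 bits clear, so zext_reg(ldx_u8(p)) equals ldx_u8(p), and likewise for the 16-bit and 32-bit loads. The second instruction changes nothing and only costs run time.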
On Fri, Sep 8, 2023 at 12:45 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Sep 7, 2023 at 9:39 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> >
> > On Thu, Sep 07 2023, Alexei Starovoitov wrote:
> >
> > > On Thu, Sep 7, 2023 at 12:33 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> > >>
> > >> On Wed, Sep 06 2023, Alexei Starovoitov wrote:
> > >>
> > >> > On Fri, Sep 1, 2023 at 7:57 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> > >> >>
> > >> >> On Fri, Sep 01 2023, Puranjay Mohan wrote:
> > >> >>
> > >> >> > The problem here is that reg->subreg_def should be set as DEF_NOT_SUBREG for
> > >> >> > registers that are used as destination registers of BPF_LDX |
> > >> >> > BPF_MEMSX. I am seeing
> > >> >> > the same problem on ARM32 and was going to send a patch today.
> > >> >> >
> > >> >> > The problem is that is_reg64() returns false for destination registers
> > >> >> > of BPF_LDX | BPF_MEMSX.
> > >> >> > But BPF_LDX | BPF_MEMSX always loads a 64 bit value because of the
> > >> >> > sign extension so
> > >> >> > is_reg64() should return true.
> > >> >> >
> > >> >> > I have written a patch that I will be sending as a reply to this.
> > >> >> > Please let me know if that makes sense.
> > >> >> >
> > >> >>
> > >> >> The check_reg_arg() function will mark reg->subreg_def = DEF_NOT_SUBREG for destination
> > >> >> registers if is_reg64() returns true for these registers. My patch below makes is_reg64()
> > >> >> return true for destination registers of BPF_LDX with mode = BPF_MEMSX. I feel this is the
> > >> >> correct way to fix this problem.
> > >> >>
> > >> >> Here is my patch:
> > >> >>
> > >> >> --- 8< ---
> > >> >> From cf1bf5282183cf721926ab14d968d3d4097b89b8 Mon Sep 17 00:00:00 2001
> > >> >> From: Puranjay Mohan <puranjay12@gmail.com>
> > >> >> Date: Fri, 1 Sep 2023 11:18:59 +0000
> > >> >> Subject: [PATCH bpf] bpf: verifier: mark destination of sign-extended load as
> > >> >> 64 bit
> > >> >>
> > >> >> The verifier can emit instructions to zero-extend destination registers
> > >> >> when the register is being used to keep 32 bit values. This behaviour is
> > >> >> enabled only when the JIT sets bpf_jit_needs_zext() -> true. In the case
> > >> >> of a sign extended load instruction, the destination register always has a
> > >> >> 64-bit value, therefore the verifier should not emit zero-extend
> > >> >> instructions for it.
> > >> >>
> > >> >> Change is_reg64() to return true if the register under consideration is a
> > >> >> destination register of LDX instruction with mode = BPF_MEMSX.
> > >> >>
> > >> >> Fixes: 1f9a1ea821ff ("bpf: Support new sign-extension load insns")
> > >> >> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > >> >> ---
> > >> >> kernel/bpf/verifier.c | 2 +-
> > >> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> > >> >>
> > >> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > >> >> index bb78212fa5b2..93f84b868ccc 100644
> > >> >> --- a/kernel/bpf/verifier.c
> > >> >> +++ b/kernel/bpf/verifier.c
> > >> >> @@ -3029,7 +3029,7 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > >> >>
> > >> >>         if (class == BPF_LDX) {
> > >> >>                 if (t != SRC_OP)
> > >> >> -                       return BPF_SIZE(code) == BPF_DW;
> > >> >> +                       return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);
> > >> >
> > >> > Looks like we have a bug here for normal LDX too.
> > >> > This 'if' condition was inserting unnecessary zext for LDX.
> > >> > It was harmless for LDX and broken for LDSX.
> > >> > Both LDX and LDSX write all bits of 64-bit register.
> > >> >
> > >> > I think the proper fix is to remove above two lines.
> > >> > wdyt?
> > >>
> > >> For LDX this returns true only if it is with a BPF_DW, for others it returns false.
> > >> This means a zext is inserted for BPF_LDX | BPF_B/H/W.
> > >>
> > >> This is not a bug because LDX writes 64 bits of the register only with BPF_DW.
> > >> With BPF_B/H/W it only writes the lower 32 bits and needs zext for upper 32 bits.
> > >
> > > No. The interpreter writes all 64-bit for any LDX insn.
> > > All JITs must do it as well.
> > >
> > >> On 32 bit architectures where a 64-bit BPF register is simulated with two 32-bit registers,
> > >> explicit zext is required for BPF_LDX | BPF_B/H/W.
> > >
> > > zext JIT-aid done by the verifier has nothing to do with 32-bit architecture.
> > > It's necessary on 64-bit as well when HW doesn't automatically zero out
> > > upper 32-bit like it does on arm64 and x86-64
> >
> > Yes, I agree that zext JIT-aid is required for all 32-bit architectures and some 64-bit architectures
> > that can't automatically zero out the upper 32-bits.
> > Basically any architecture that sets bpf_jit_needs_zext() -> true.
> >
> > >> So, we should not remove this.
> > >
> > > I still think we should.
> >
> > If we remove this then some JITs will not zero extend the upper 32-bits for BPF_LDX | BPF_B/H/W.
> >
> > My understanding is that Verifier sets prog->aux->verifier_zext if it emits zext instructions. If the verifier
> > doesn't emit zext for LDX but sets prog->aux->verifier_zext that would cause wrong behavior for some JITs:
> >
> > Example code from ARM32 jit doing BPF_LDX | BPF_MEM | BPF_B:
> >
> >         case BPF_B:
> >                 /* Load a Byte */
> >                 emit(ARM_LDRB_I(rd[1], rm, off), ctx);
> >                 if (!ctx->prog->aux->verifier_zext)
> >                         emit_a32_mov_i(rd[0], 0, ctx);
> >                 break;
> >
> > Here if ctx->prog->aux->verifier_zext is set by the verifier, and zext was not emitted for LDX, JIT will not zero
> > the upper 32-bits.
> >
> > RISCV32, PowerPC32, x86-32 JITs have similar code paths. Only MIPS32 JIT zero-extends for LDX without checking
> > prog->aux->verifier_zext.
> >
> > So, if we want to stop emitting zext for LDX then we would need to modify all these JITs to always zext for LDX.
>
> I guess we never clearly defined what 'needs_zext' is supposed to be,
> so it wouldn't be fair to call 32-bit JITs buggy.
> But we better address this issue now.
> This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
> I believe all 4 JITs emit proper zero extension into 64-bit register
> by using single cpu instruction,
> but they also define bpf_jit_needs_zext() as true,
> so extra BPF_ZEXT_REG() is added by the verifier
> and it is a pure run-time overhead.
>
> It's better to remove
>         if (t != SRC_OP)
>                 return BPF_SIZE(code) == BPF_DW;
> from is_reg64() to avoid adding BPF_ZEXT_REG() insn
> and fix 32-bit JITs at the same time.
> RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
> to always zero upper 32-bit after LDX and
> then 4th patch to remove these two lines.
>
> > Let me know if my understanding has some gaps, also if we decide to remove it, I am happy to send patches for it
> > and fix the JITs that need modifications.
>
> Thank you for working on it!
>
> cc-ing JIT experts.

Thanks for the detailed explanation.
I agree with this approach. I will be sending the patches for this soon.

Thanks,
Puranjay
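The verifier_zext contract discussed in this message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct reg_pair, jit_load_byte(), verifier_zext_insn(), reg_value() and zext_contract_demo() are hypothetical names and are not kernel code. The model mirrors the quoted ARM32 snippet, where the low half (rd[1]) is always written and the high half (rd[0]) is cleared only when verifier_zext is not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit BPF register kept in two 32-bit halves. */
struct reg_pair {
	uint32_t hi;	/* upper 32 bits, rd[0] in the quoted ARM32 snippet */
	uint32_t lo;	/* lower 32 bits, rd[1] in the quoted ARM32 snippet */
};

/*
 * Models the BPF_LDX | BPF_MEM | BPF_B path: the low half is always
 * written, the high half is cleared only if the JIT believes the verifier
 * has not taken over zero-extension.
 */
static void jit_load_byte(struct reg_pair *rd, const uint8_t *mem,
			  bool verifier_zext)
{
	rd->lo = *mem;
	if (!verifier_zext)
		rd->hi = 0;
}

/* Models the explicit zext insn the verifier may place after the load. */
static void verifier_zext_insn(struct reg_pair *rd)
{
	rd->hi = 0;
}

static uint64_t reg_value(const struct reg_pair *rd)
{
	return ((uint64_t)rd->hi << 32) | rd->lo;
}

/* Walks through the failure mode and the two working combinations. */
void zext_contract_demo(void)
{
	const uint8_t byte = 0x7f;
	struct reg_pair r = { .hi = 0xdeadbeef, .lo = 0 };

	/* verifier_zext set, but no zext insn follows: stale upper half. */
	jit_load_byte(&r, &byte, true);
	assert(reg_value(&r) != 0x7f);

	/* verifier_zext set and the zext insn is present: correct value. */
	verifier_zext_insn(&r);
	assert(reg_value(&r) == 0x7f);

	/* verifier_zext clear: the JIT zeroes the upper half itself. */
	r.hi = 0xdeadbeef;
	jit_load_byte(&r, &byte, false);
	assert(reg_value(&r) == 0x7f);
}
```

In this model the first case is the one the message warns about: dropping the verifier's zext for LDX while verifier_zext stays set would leave the upper half untouched unless the 32-bit JITs are changed as well.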
Hi Alexei,

[...]

> I guess we never clearly defined what 'needs_zext' is supposed to be,
> so it wouldn't be fair to call 32-bit JITs buggy.
> But we better address this issue now.
> This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
> I believe all 4 JITs emit proper zero extension into 64-bit register
> by using single cpu instruction,
> but they also define bpf_jit_needs_zext() as true,
> so extra BPF_ZEXT_REG() is added by the verifier
> and it is a pure run-time overhead.

I just realised that these zext instructions will not be a runtime
overhead because the JITs ignore them.
Like
s390 does:
        case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
        case BPF_LDX | BPF_PROBE_MEM | BPF_B:
                /* llgc %dst,0(off,%src) */
                EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
                jit->seen |= SEEN_MEM;
                if (insn_is_zext(&insn[1]))
                        insn_count = 2; /* this will skip the next zext instruction */
                break;

powerpc does after LDX:
        if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
                addrs[++i] = ctx->idx * 4;

> It's better to remove
>         if (t != SRC_OP)
>                 return BPF_SIZE(code) == BPF_DW;
> from is_reg64() to avoid adding BPF_ZEXT_REG() insn
> and fix 32-bit JITs at the same time.
> RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
> to always zero upper 32-bit after LDX and
> then 4th patch to remove these two lines.

I have sent the patches for above, although I think this optimization
is useful because zero extension after LDX is only required when the
loaded value is later being used as a 64-bit value. If it is not the
case then the verifier will not emit the zext and 32-bit JITs will emit
1 less instruction because they expect the verifier to do the zext for
them where required.

Link to patch series: https://lore.kernel.org/bpf/20230912224654.6556-1-puranjay12@gmail.com/T/#t

Thanks,
Puranjay
On Tue, Sep 12, 2023 at 3:49 PM Puranjay Mohan <puranjay12@gmail.com> wrote:
>
> Hi Alexei,
>
> [...]
>
> > I guess we never clearly defined what 'needs_zext' is supposed to be,
> > so it wouldn't be fair to call 32-bit JITs buggy.
> > But we better address this issue now.
> > This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
> > I believe all 4 JITs emit proper zero extension into 64-bit register
> > by using single cpu instruction,
> > but they also define bpf_jit_needs_zext() as true,
> > so extra BPF_ZEXT_REG() is added by the verifier
> > and it is a pure run-time overhead.
>
> I just realised that these zext instructions will not be a runtime
> overhead because the JITs ignore them.
> Like
> s390 does:
>         case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
>         case BPF_LDX | BPF_PROBE_MEM | BPF_B:
>                 /* llgc %dst,0(off,%src) */
>                 EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
>                 jit->seen |= SEEN_MEM;
>                 if (insn_is_zext(&insn[1]))
>                         insn_count = 2; /* this will skip the next zext instruction */
>                 break;
>
> powerpc does after LDX:
>         if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
>                 addrs[++i] = ctx->idx * 4;

I see. Indeed the 64-bit JITs ignore this special zext insn after LDX.

> > It's better to remove
> >         if (t != SRC_OP)
> >                 return BPF_SIZE(code) == BPF_DW;
> > from is_reg64() to avoid adding BPF_ZEXT_REG() insn
> > and fix 32-bit JITs at the same time.
> > RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
> > to always zero upper 32-bit after LDX and
> > then 4th patch to remove these two lines.
>
> I have sent the patches for above, although I think this optimization
> is useful because
> zero extension after LDX is only required when the loaded value is
> later being used as
> a 64-bit value. If it is not the case then the verifier will not emit
> the zext and 32-bit JITs will emit
> 1 less instruction because they expect the verifier to do the zext for
> them where required.

You're correct.
Ok. Let's keep zext for LDX as-is.
On Wed, Sep 13, 2023 at 2:09 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Sep 12, 2023 at 3:49 PM Puranjay Mohan <puranjay12@gmail.com> wrote:
> >
> > Hi Alexei,
> >
> > [...]
> >
> > > I guess we never clearly defined what 'needs_zext' is supposed to be,
> > > so it wouldn't be fair to call 32-bit JITs buggy.
> > > But we better address this issue now.
> > > This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
> > > I believe all 4 JITs emit proper zero extension into 64-bit register
> > > by using single cpu instruction,
> > > but they also define bpf_jit_needs_zext() as true,
> > > so extra BPF_ZEXT_REG() is added by the verifier
> > > and it is a pure run-time overhead.
> >
> > I just realised that these zext instructions will not be a runtime
> > overhead because the JITs ignore them.
> > Like
> > s390 does:
> >         case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
> >         case BPF_LDX | BPF_PROBE_MEM | BPF_B:
> >                 /* llgc %dst,0(off,%src) */
> >                 EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
> >                 jit->seen |= SEEN_MEM;
> >                 if (insn_is_zext(&insn[1]))
> >                         insn_count = 2; /* this will skip the next zext instruction */
> >                 break;
> >
> > powerpc does after LDX:
> >         if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
> >                 addrs[++i] = ctx->idx * 4;
>
>
> I see. Indeed the 64-bit JITs ignore this special zext insn after LDX.
>
> > > It's better to remove
> > >         if (t != SRC_OP)
> > >                 return BPF_SIZE(code) == BPF_DW;
> > > from is_reg64() to avoid adding BPF_ZEXT_REG() insn
> > > and fix 32-bit JITs at the same time.
> > > RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
> > > to always zero upper 32-bit after LDX and
> > > then 4th patch to remove these two lines.
> >
> > I have sent the patches for above, although I think this optimization
> > is useful because
> > zero extension after LDX is only required when the loaded value is
> > later being used as
> > a 64-bit value. If it is not the case then the verifier will not emit
> > the zext and 32-bit JITs will emit
> > 1 less instruction because they expect the verifier to do the zext for
> > them where required.
>
> You're correct.
> Ok. Let's keep zext for LDX as-is.

Yes, let's do

        if (class == BPF_LDX) {
                if (t != SRC_OP)
-                       return BPF_SIZE(code) == BPF_DW;
+                       return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);

Thanks,
Puranjay
On Tue, Sep 12, 2023 at 5:22 PM Puranjay Mohan <puranjay12@gmail.com> wrote:
>
> On Wed, Sep 13, 2023 at 2:09 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Sep 12, 2023 at 3:49 PM Puranjay Mohan <puranjay12@gmail.com> wrote:
> > >
> > > Hi Alexei,
> > >
> > > [...]
> > >
> > > > I guess we never clearly defined what 'needs_zext' is supposed to be,
> > > > so it wouldn't be fair to call 32-bit JITs buggy.
> > > > But we better address this issue now.
> > > > This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
> > > > I believe all 4 JITs emit proper zero extension into 64-bit register
> > > > by using single cpu instruction,
> > > > but they also define bpf_jit_needs_zext() as true,
> > > > so extra BPF_ZEXT_REG() is added by the verifier
> > > > and it is a pure run-time overhead.
> > >
> > > I just realised that these zext instructions will not be a runtime
> > > overhead because the JITs ignore them.
> > > Like
> > > s390 does:
> > >         case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
> > >         case BPF_LDX | BPF_PROBE_MEM | BPF_B:
> > >                 /* llgc %dst,0(off,%src) */
> > >                 EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
> > >                 jit->seen |= SEEN_MEM;
> > >                 if (insn_is_zext(&insn[1]))
> > >                         insn_count = 2; /* this will skip the next zext instruction */
> > >                 break;
> > >
> > > powerpc does after LDX:
> > >         if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
> > >                 addrs[++i] = ctx->idx * 4;
> >
> >
> > I see. Indeed the 64-bit JITs ignore this special zext insn after LDX.
> >
> > > > It's better to remove
> > > >         if (t != SRC_OP)
> > > >                 return BPF_SIZE(code) == BPF_DW;
> > > > from is_reg64() to avoid adding BPF_ZEXT_REG() insn
> > > > and fix 32-bit JITs at the same time.
> > > > RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
> > > > to always zero upper 32-bit after LDX and
> > > > then 4th patch to remove these two lines.
> > >
> > > I have sent the patches for above, although I think this optimization
> > > is useful because
> > > zero extension after LDX is only required when the loaded value is
> > > later being used as
> > > a 64-bit value. If it is not the case then the verifier will not emit
> > > the zext and 32-bit JITs will emit
> > > 1 less instruction because they expect the verifier to do the zext for
> > > them where required.
> >
> > You're correct.
> > Ok. Let's keep zext for LDX as-is.
>
> Yes,
> let's do
>         if (class == BPF_LDX) {
>                 if (t != SRC_OP)
> -                       return BPF_SIZE(code) == BPF_DW;
> +                       return (BPF_SIZE(code) == BPF_DW ||
> BPF_MODE(code) == BPF_MEMSX);

Agree. imo that's a cleaner approach vs changing mark_insn_zext().
On 13/09/2023 at 02:22, Puranjay Mohan wrote:
> On Wed, Sep 13, 2023 at 2:09 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Tue, Sep 12, 2023 at 3:49 PM Puranjay Mohan <puranjay12@gmail.com> wrote:
>>>
>>> Hi Alexei,
>>>
>>> [...]
>>>
>>>> I guess we never clearly defined what 'needs_zext' is supposed to be,
>>>> so it wouldn't be fair to call 32-bit JITs buggy.
>>>> But we better address this issue now.
>>>> This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
>>>> I believe all 4 JITs emit proper zero extension into 64-bit register
>>>> by using single cpu instruction,
>>>> but they also define bpf_jit_needs_zext() as true,
>>>> so extra BPF_ZEXT_REG() is added by the verifier
>>>> and it is a pure run-time overhead.
>>>
>>> I just realised that these zext instructions will not be a runtime
>>> overhead because the JITs ignore them.
>>> Like
>>> s390 does:
>>>         case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
>>>         case BPF_LDX | BPF_PROBE_MEM | BPF_B:
>>>                 /* llgc %dst,0(off,%src) */
>>>                 EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
>>>                 jit->seen |= SEEN_MEM;
>>>                 if (insn_is_zext(&insn[1]))
>>>                         insn_count = 2; /* this will skip the next zext instruction */
>>>                 break;
>>>
>>> powerpc does after LDX:
>>>         if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
>>>                 addrs[++i] = ctx->idx * 4;
>>
>>
>> I see. Indeed the 64-bit JITs ignore this special zext insn after LDX.
>>
>>>> It's better to remove
>>>>         if (t != SRC_OP)
>>>>                 return BPF_SIZE(code) == BPF_DW;
>>>> from is_reg64() to avoid adding BPF_ZEXT_REG() insn
>>>> and fix 32-bit JITs at the same time.
>>>> RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
>>>> to always zero upper 32-bit after LDX and
>>>> then 4th patch to remove these two lines.
>>>
>>> I have sent the patches for above, although I think this optimization
>>> is useful because
>>> zero extension after LDX is only required when the loaded value is
>>> later being used as
>>> a 64-bit value. If it is not the case then the verifier will not emit
>>> the zext and 32-bit JITs will emit
>>> 1 less instruction because they expect the verifier to do the zext for
>>> them where required.
>>
>> You're correct.
>> Ok. Let's keep zext for LDX as-is.
>
> Yes,
> let's do
>         if (class == BPF_LDX) {
>                 if (t != SRC_OP)
> -                       return BPF_SIZE(code) == BPF_DW;
> +                       return (BPF_SIZE(code) == BPF_DW ||
> BPF_MODE(code) == BPF_MEMSX);

You don't need the parentheses, just do

        return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;

Christophe
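For reference, the predicate agreed on in this exchange can be exercised in isolation. The following is a minimal user-space sketch, not the kernel function: ldx_dst_is_reg64() is a hypothetical helper covering only the LDX destination branch of is_reg64(), and the opcode field macros are copied by value from the BPF uapi headers so the block builds without kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode field encodings, copied by value from the BPF uapi headers. */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_LDX		0x01
#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_W		0x00
#define BPF_H		0x08
#define BPF_B		0x10
#define BPF_DW		0x18
#define BPF_MODE(code)	((code) & 0xe0)
#define BPF_MEM		0x60
#define BPF_MEMSX	0x80

/*
 * Hypothetical helper: the LDX destination branch of is_reg64() with the
 * proposed change applied. A destination counts as a full 64-bit
 * definition for a 64-bit load or for any sign-extending load.
 */
static bool ldx_dst_is_reg64(uint8_t code)
{
	assert(BPF_CLASS(code) == BPF_LDX);
	return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;
}
```

With this predicate, opcode 0x89 from the original report (BPF_LDX | BPF_H | BPF_MEMSX) is treated as a 64-bit definition, while a plain BPF_LDX | BPF_MEM | BPF_H load (0x69) is still a 32-bit sub-register definition and remains eligible for zext.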
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bb78212fa5b2..097985a46edc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3110,7 +3110,9 @@ static void mark_insn_zext(struct bpf_verifier_env *env,
 {
 	s32 def_idx = reg->subreg_def;
 
-	if (def_idx == DEF_NOT_SUBREG)
+	if (def_idx == DEF_NOT_SUBREG ||
+	    (BPF_CLASS(env->prog->insnsi[def_idx - 1].code) == BPF_LDX &&
+	     BPF_MODE(env->prog->insnsi[def_idx - 1].code) == BPF_MEMSX))
 		return;
 
 	env->insn_aux_data[def_idx - 1].zext_dst = true;
On the architectures that use bpf_jit_needs_zext(), e.g., s390x, the
verifier incorrectly inserts a zero-extension after BPF_MEMSX, leading
to miscompilations like the one below:

      24:       89 1a ff fe 00 00 00 00 "r1 = *(s16 *)(r10 - 2);"       # zext_dst set
   0x3ff7fdb910e:       lgh     %r2,-2(%r13,%r0)                        # load halfword
   0x3ff7fdb9114:       llgfr   %r2,%r2                                 # wrong!
      25:       65 10 00 03 00 00 7f ff if r1 s> 32767 goto +3 <l0_1>   # check_cond_jmp_op()

Disable such zero-extensions. The JITs need to insert sign-extension
themselves, if necessary.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 kernel/bpf/verifier.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
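The effect of the stray llgfr in the listing above can be reproduced in plain C. This is a sketch under stated assumptions: load_s16_sext(), zext32(), sgt_32767() and miscompile_demo() are hypothetical helpers that model lgh, llgfr and the signed compare of insn 25, and the stack value -2 is an assumed input because the listing does not show what is stored at r10 - 2. The unsigned-to-signed casts rely on two's complement behaviour, which mainstream compilers provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models lgh: load a halfword and sign-extend it to 64 bits. */
static uint64_t load_s16_sext(int16_t mem)
{
	return (uint64_t)(int64_t)mem;
}

/* Models llgfr: keep the low 32 bits and clear the upper 32 bits. */
static uint64_t zext32(uint64_t reg)
{
	return (uint32_t)reg;
}

/* Models insn 25: the signed 64-bit comparison "r1 s> 32767". */
static bool sgt_32767(uint64_t reg)
{
	return (int64_t)reg > 32767;
}

/* Compares the branch outcome with and without the extra zext. */
void miscompile_demo(void)
{
	uint64_t r1 = load_s16_sext(-2);	/* assumed stack contents */

	/* Without the zext the register holds -2, branch not taken. */
	assert(!sgt_32767(r1));

	/* With the zext the register holds 0xfffffffe, branch taken. */
	assert(zext32(r1) == 0xfffffffeULL);
	assert(sgt_32767(zext32(r1)));
}
```

Any negative s16 input shows the same flip, since clearing the upper 32 bits turns the sign-extended value into a large positive 64-bit number.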