
[bpf-next,1/2] bpf, x86: Fix tail call count offset calculation on bpf2bpf call

Message ID 20220615151721.404596-2-jakub@cloudflare.com (mailing list archive)
State Superseded
Delegated to: BPF
Series Fix tail call counting with bpf2bpf

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success
netdev/cover_letter success Series has a cover letter
netdev/patch_count success
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 14 maintainers not CCed: tglx@linutronix.de bp@alien8.de songliubraving@fb.com x86@kernel.org mingo@redhat.com yoshfuji@linux-ipv6.org hpa@zytor.com yhs@fb.com davem@davemloft.net john.fastabend@gmail.com kafai@fb.com dsahern@kernel.org dave.hansen@linux.intel.com kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch warning WARNING: line length of 88 exceeds 80 columns; WARNING: line length of 90 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 fail Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for Kernel LATEST on z15 with gcc

Commit Message

Jakub Sitnicki June 15, 2022, 3:17 p.m. UTC
On x86-64 the tail call count is passed from one BPF function to another
through %rax. Additionally, on function entry, the tail call count value is
stored on the stack right after the BPF program stack, due to register
shortage.
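
For context, the prologue emitted by the x86-64 JIT for a tail-call-reachable
program looks roughly like this (a simplified sketch; see the disassembly
below for the real emitted code):

	push rbp
	mov  rbp, rsp
	sub  rsp, round_up(stack_depth, 8)	/* BPF program stack */
	push rax				/* tail call count */

That is, the count lives at rbp - round_up(stack_depth, 8) - 8.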

The stored count is later loaded from the stack either when performing a
tail call - to check if we have not reached the tail call limit - or before
calling another BPF function, in order to pass it via %rax.

In the latter case, we miscalculate the offset at which the tail call count
was stored on function entry. The JIT does not take into account that the
size of the allocated BPF program stack is always a multiple of 8 on x86,
while the
actual stack depth does not have to be.

This leads to a load from an offset that belongs to the BPF stack, as shown
in the example below:

/* subprog_tail() performs the tail call; definition omitted here */
int subprog_tail(struct __sk_buff *skb);

SEC("tc")
int entry(struct __sk_buff *skb)
{
	/* Have data on the stack whose size is not a multiple of 8 */
	volatile char arr[1] = {};
	return subprog_tail(skb);
}

int entry(struct __sk_buff * skb):
   0: (b4) w2 = 0
   1: (73) *(u8 *)(r10 -1) = r2
   2: (85) call pc+1#bpf_prog_ce2f79bb5f3e06dd_F
   3: (95) exit

int entry(struct __sk_buff * skb):
   0xffffffffa0201788:  nop    DWORD PTR [rax+rax*1+0x0]
   0xffffffffa020178d:  xor    eax,eax
   0xffffffffa020178f:  push   rbp
   0xffffffffa0201790:  mov    rbp,rsp
   0xffffffffa0201793:  sub    rsp,0x8
   0xffffffffa020179a:  push   rax
   0xffffffffa020179b:  xor    esi,esi
   0xffffffffa020179d:  mov    BYTE PTR [rbp-0x1],sil
   0xffffffffa02017a1:  mov    rax,QWORD PTR [rbp-0x9]	!!! tail call count
   0xffffffffa02017a8:  call   0xffffffffa02017d8       !!! is at rbp-0x10
   0xffffffffa02017ad:  leave
   0xffffffffa02017ae:  ret

Fix it by rounding up the BPF stack depth to a multiple of 8 when
calculating the tail call count offset on the stack.
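
To make the arithmetic concrete, take the example above with stack_depth = 1:

	old: -(stack_depth + 8)            = -(1 + 8) = -9    /* rbp-0x9, inside the BPF stack */
	new: -round_up(stack_depth, 8) - 8 = -(8 + 8) = -16   /* rbp-0x10, where push rax stored the count */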

Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 arch/x86/net/bpf_jit_comp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Daniel Borkmann June 16, 2022, 2:45 p.m. UTC | #1
On 6/15/22 5:17 PM, Jakub Sitnicki wrote:
[...]
> int entry(struct __sk_buff * skb):
>     0xffffffffa0201788:  nop    DWORD PTR [rax+rax*1+0x0]
>     0xffffffffa020178d:  xor    eax,eax
>     0xffffffffa020178f:  push   rbp
>     0xffffffffa0201790:  mov    rbp,rsp
>     0xffffffffa0201793:  sub    rsp,0x8
>     0xffffffffa020179a:  push   rax
>     0xffffffffa020179b:  xor    esi,esi
>     0xffffffffa020179d:  mov    BYTE PTR [rbp-0x1],sil
>     0xffffffffa02017a1:  mov    rax,QWORD PTR [rbp-0x9]	!!! tail call count
>     0xffffffffa02017a8:  call   0xffffffffa02017d8       !!! is at rbp-0x10
>     0xffffffffa02017ad:  leave
>     0xffffffffa02017ae:  ret
> 
> Fix it by rounding up the BPF stack depth to a multiple of 8, when
> calculating the tail call count offset on stack.
> 
> Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>   arch/x86/net/bpf_jit_comp.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index f298b18a9a3d..c98b8c0ed3b8 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1420,8 +1420,9 @@ st:			if (is_imm8(insn->off))
>   		case BPF_JMP | BPF_CALL:
>   			func = (u8 *) __bpf_call_base + imm32;
>   			if (tail_call_reachable) {
> +				/* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
>   				EMIT3_off32(0x48, 0x8B, 0x85,
> -					    -(bpf_prog->aux->stack_depth + 8));
> +					    -round_up(bpf_prog->aux->stack_depth, 8) - 8);

Lgtm, great catch by the way!
Maciej Fijalkowski June 16, 2022, 3:01 p.m. UTC | #2
On Thu, Jun 16, 2022 at 04:45:09PM +0200, Daniel Borkmann wrote:
> On 6/15/22 5:17 PM, Jakub Sitnicki wrote:
> [...]
> > int entry(struct __sk_buff * skb):
> >     0xffffffffa0201788:  nop    DWORD PTR [rax+rax*1+0x0]
> >     0xffffffffa020178d:  xor    eax,eax
> >     0xffffffffa020178f:  push   rbp
> >     0xffffffffa0201790:  mov    rbp,rsp
> >     0xffffffffa0201793:  sub    rsp,0x8
> >     0xffffffffa020179a:  push   rax
> >     0xffffffffa020179b:  xor    esi,esi
> >     0xffffffffa020179d:  mov    BYTE PTR [rbp-0x1],sil
> >     0xffffffffa02017a1:  mov    rax,QWORD PTR [rbp-0x9]	!!! tail call count
> >     0xffffffffa02017a8:  call   0xffffffffa02017d8       !!! is at rbp-0x10
> >     0xffffffffa02017ad:  leave
> >     0xffffffffa02017ae:  ret
> > 
> > Fix it by rounding up the BPF stack depth to a multiple of 8, when
> > calculating the tail call count offset on stack.
> > 
> > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > ---
> >   arch/x86/net/bpf_jit_comp.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index f298b18a9a3d..c98b8c0ed3b8 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1420,8 +1420,9 @@ st:			if (is_imm8(insn->off))
> >   		case BPF_JMP | BPF_CALL:
> >   			func = (u8 *) __bpf_call_base + imm32;
> >   			if (tail_call_reachable) {
> > +				/* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
> >   				EMIT3_off32(0x48, 0x8B, 0x85,
> > -					    -(bpf_prog->aux->stack_depth + 8));
> > +					    -round_up(bpf_prog->aux->stack_depth, 8) - 8);
> 
> Lgtm, great catch by the way!

Indeed!

Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

I was wondering if it would be possible for the JIT to work only with the
stack depth rounded up to 8, since that is what we do everywhere we use it...
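
For illustration, such a refactor might look roughly like the sketch below
(hypothetical, not a posted patch; the local variable is made up):

	/* hypothetical: round the stack depth up once, use it everywhere */
	u32 stack_depth = round_up(bpf_prog->aux->stack_depth, 8);
	...
	/* mov rax, qword ptr [rbp - stack_depth - 8] */
	EMIT3_off32(0x48, 0x8B, 0x85, -(stack_depth + 8));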

Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index f298b18a9a3d..c98b8c0ed3b8 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1420,8 +1420,9 @@  st:			if (is_imm8(insn->off))
 		case BPF_JMP | BPF_CALL:
 			func = (u8 *) __bpf_call_base + imm32;
 			if (tail_call_reachable) {
+				/* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
 				EMIT3_off32(0x48, 0x8B, 0x85,
-					    -(bpf_prog->aux->stack_depth + 8));
+					    -round_up(bpf_prog->aux->stack_depth, 8) - 8);
 				if (!imm32 || emit_call(&prog, func, image + addrs[i - 1] + 7))
 					return -EINVAL;
 			} else {