diff mbox series

4-year old off-by-two bug in the BPF verifier's boundary checks?

Message ID 279b0a91-506d-e657-022d-aad52c17dfc6@nvidia.com (mailing list archive)
State RFC
Delegated to: BPF
Headers show
Series 4-year old off-by-two bug in the BPF verifier's boundary checks? | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR fail merge-conflict
bpf/vmtest-bpf-PR fail merge-conflict
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Maxim Mikityanskiy Nov. 2, 2021, 3:12 p.m. UTC
Hi guys,

I think I found cases where the BPF verifier mistakenly rejects valid 
BPF programs when doing pkt_end boundary checks, and the selftests for 
these cases test wrong things as well.

Daniel's commit fb2a311a31d3 ("bpf: fix off by one for range markings 
with L{T, E} patterns") [1] attempts to fix an off-by-one bug in 
boundary checks, but I think it shifts the index by 1 in a wrong 
direction, so instead of fixing, the bug becomes off-by-two.

A following commit b37242c773b2 ("bpf: add test cases to bpf selftests 
to cover all access tests") [2] adds unit tests to check the new 
behavior, but the tests look also wrong to me.

Let me analyze these two tests:

{
         "XDP pkt read, pkt_data' > pkt_end, good access",
         .insns = {
         BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
xdp_md, data)),
         BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
                     offsetof(struct xdp_md, data_end)),
         BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
         BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
         BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 1),
         BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
         BPF_MOV64_IMM(BPF_REG_0, 0),
         BPF_EXIT_INSN(),
         },
         .result = ACCEPT,
         .prog_type = BPF_PROG_TYPE_XDP,
         .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
},

{
         "XDP pkt read, pkt_data' >= pkt_end, bad access 1",
         .insns = {
         BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
xdp_md, data)),
         BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
                     offsetof(struct xdp_md, data_end)),
         BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
         BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
         BPF_JMP_REG(BPF_JGE, BPF_REG_1, BPF_REG_3, 1),
         BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
         BPF_MOV64_IMM(BPF_REG_0, 0),
         BPF_EXIT_INSN(),
         },
         .errstr = "R1 offset is outside of the packet",
         .result = REJECT,
         .prog_type = BPF_PROG_TYPE_XDP,
         .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
},

The first program looks good both to me and the verifier: if data + 8 > 
data_end, we bail out, otherwise, if data + 8 <= data_end, we read 8 
bytes: [data; data+7].

The second program doesn't pass the verifier, and the test expects it to 
be rejected, but the program itself still looks fine to me: if data + 8 
 >= data_end, we bail out, otherwise, if data + 8 < data_end, we read 8 
bytes: [data; data+7], and this is fine, because data + 7 is for sure < 
data_end. The verifier considers data + 7 to be out of bounds, although 
both data + 7 and data + 8 are still valid offsets, hence the off-by-two 
bug.

Are my considerations valid, or am I stupidly missing anything?

I suggest to fix it like this:

          *

I don't think this bug poses any security threat, since the checks are 
stricter than needed, but it's a huge functional issue.

Thanks,
Max

[1]: 
https://patchwork.ozlabs.org/project/netdev/patch/3df9cce096b139eb0efb3b0c7bf9fcc5c5dc6629.1508545543.git.daniel@iogearbox.net/
[2]: 
https://patchwork.ozlabs.org/project/netdev/patch/3bc01f5985324b0e233e86616f4fe171c0d4ca8b.1508545543.git.daniel@iogearbox.net/

Comments

Alexei Starovoitov Nov. 5, 2021, 2:14 a.m. UTC | #1
On Tue, Nov 2, 2021 at 8:12 AM Maxim Mikityanskiy <maximmi@nvidia.com> wrote:
>
> Hi guys,
>
> I think I found cases where the BPF verifier mistakenly rejects valid
> BPF programs when doing pkt_end boundary checks, and the selftests for
> these cases test wrong things as well.
>
> Daniel's commit fb2a311a31d3 ("bpf: fix off by one for range markings
> with L{T, E} patterns") [1] attempts to fix an off-by-one bug in
> boundary checks, but I think it shifts the index by 1 in a wrong
> direction, so instead of fixing, the bug becomes off-by-two.
>
> A following commit b37242c773b2 ("bpf: add test cases to bpf selftests
> to cover all access tests") [2] adds unit tests to check the new
> behavior, but the tests look also wrong to me.
>
> Let me analyze these two tests:
>
> {
>          "XDP pkt read, pkt_data' > pkt_end, good access",
>          .insns = {
>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct
> xdp_md, data)),
>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>                      offsetof(struct xdp_md, data_end)),
>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>          BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 1),
>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>          BPF_MOV64_IMM(BPF_REG_0, 0),
>          BPF_EXIT_INSN(),
>          },
>          .result = ACCEPT,
>          .prog_type = BPF_PROG_TYPE_XDP,
>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
> },
>
> {
>          "XDP pkt read, pkt_data' >= pkt_end, bad access 1",
>          .insns = {
>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct
> xdp_md, data)),
>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>                      offsetof(struct xdp_md, data_end)),
>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>          BPF_JMP_REG(BPF_JGE, BPF_REG_1, BPF_REG_3, 1),
>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>          BPF_MOV64_IMM(BPF_REG_0, 0),
>          BPF_EXIT_INSN(),
>          },
>          .errstr = "R1 offset is outside of the packet",
>          .result = REJECT,
>          .prog_type = BPF_PROG_TYPE_XDP,
>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
> },
>
> The first program looks good both to me and the verifier: if data + 8 >
> data_end, we bail out, otherwise, if data + 8 <= data_end, we read 8
> bytes: [data; data+7].
>
> The second program doesn't pass the verifier, and the test expects it to
> be rejected, but the program itself still looks fine to me: if data + 8
>  >= data_end, we bail out, otherwise, if data + 8 < data_end, we read 8
> bytes: [data; data+7], and this is fine, because data + 7 is for sure <
> data_end. The verifier considers data + 7 to be out of bounds, although
> both data + 7 and data + 8 are still valid offsets, hence the off-by-two
> bug.
>
> Are my considerations valid, or am I stupidly missing anything?
>
> I suggest to fix it like this:
>
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -8492,7 +8492,7 @@ static void find_good_pkt_pointers(struct
> bpf_verifier_state *vstate,
>
>          new_range = dst_reg->off;
>          if (range_right_open)
> -               new_range--;
> +               new_range++;
>
>          /* Examples for register markings:
>           *
>
> I don't think this bug poses any security threat, since the checks are
> stricter than needed, but it's a huge functional issue.

Thanks for the analysis.
It looks correct to me.
Hopefully Daniel will take a look soon.
Daniel Borkmann Nov. 9, 2021, 11:34 a.m. UTC | #2
Hi Maxim,

On 11/2/21 4:12 PM, Maxim Mikityanskiy wrote:
> Hi guys,
> 
> I think I found cases where the BPF verifier mistakenly rejects valid BPF programs when doing pkt_end boundary checks, and the selftests for these cases test wrong things as well.
> 
> Daniel's commit fb2a311a31d3 ("bpf: fix off by one for range markings with L{T, E} patterns") [1] attempts to fix an off-by-one bug in boundary checks, but I think it shifts the index by 1 in a wrong direction, so instead of fixing, the bug becomes off-by-two.
> 
> A following commit b37242c773b2 ("bpf: add test cases to bpf selftests to cover all access tests") [2] adds unit tests to check the new behavior, but the tests look also wrong to me.
> 
> Let me analyze these two tests:
> 
> {
>          "XDP pkt read, pkt_data' > pkt_end, good access",
>          .insns = {
>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct xdp_md, data)),
>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>                      offsetof(struct xdp_md, data_end)),
>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>          BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 1),
>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>          BPF_MOV64_IMM(BPF_REG_0, 0),
>          BPF_EXIT_INSN(),
>          },
>          .result = ACCEPT,
>          .prog_type = BPF_PROG_TYPE_XDP,
>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
> },
> 
> {
>          "XDP pkt read, pkt_data' >= pkt_end, bad access 1",
>          .insns = {
>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct xdp_md, data)),
>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>                      offsetof(struct xdp_md, data_end)),
>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>          BPF_JMP_REG(BPF_JGE, BPF_REG_1, BPF_REG_3, 1),
>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>          BPF_MOV64_IMM(BPF_REG_0, 0),
>          BPF_EXIT_INSN(),
>          },
>          .errstr = "R1 offset is outside of the packet",
>          .result = REJECT,
>          .prog_type = BPF_PROG_TYPE_XDP,
>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
> },
> 
> The first program looks good both to me and the verifier: if data + 8 > data_end, we bail out, otherwise, if data + 8 <= data_end, we read 8 bytes: [data; data+7].
> 
> The second program doesn't pass the verifier, and the test expects it to be rejected, but the program itself still looks fine to me: if data + 8 >= data_end, we bail out, otherwise, if data + 8 < data_end, we read 8 bytes: [data; data+7], and this is fine, because data + 7 is for sure < data_end. The verifier considers data + 7 to be out of bounds, although both data + 7 and data + 8 are still valid offsets, hence the off-by-two bug.
> 
> Are my considerations valid, or am I stupidly missing anything?

Sorry for my late reply, bit too swamped lately. So we have the two variants:

   r2 = data;
   r2 += 8;
   if (r2 > data_end) goto <handle exception>
     <access okay>

   r2 = data;
   r2 += 8;
   if (r2 >= data_end) goto <handle exception>
     <access okay>

Technically, the first option is the more correct way to check, meaning, we have 8 bytes of
access in the <access okay> branch. The second one is overly pessimistic in that if r2 equals
data_end we bail out even though we wouldn't have to. So in that case <access okay> branch
would have 9 bytes for access since r2 with offset 8 is already < data_end.

Anyway, please send a fix and updated test cases. Thanks Maxim!
Maxim Mikityanskiy Nov. 25, 2021, 2:33 p.m. UTC | #3
On 2021-11-09 13:34, Daniel Borkmann wrote:
> Hi Maxim,
> 
> On 11/2/21 4:12 PM, Maxim Mikityanskiy wrote:
>> Hi guys,
>>
>> I think I found cases where the BPF verifier mistakenly rejects valid 
>> BPF programs when doing pkt_end boundary checks, and the selftests for 
>> these cases test wrong things as well.
>>
>> Daniel's commit fb2a311a31d3 ("bpf: fix off by one for range markings 
>> with L{T, E} patterns") [1] attempts to fix an off-by-one bug in 
>> boundary checks, but I think it shifts the index by 1 in a wrong 
>> direction, so instead of fixing, the bug becomes off-by-two.
>>
>> A following commit b37242c773b2 ("bpf: add test cases to bpf selftests 
>> to cover all access tests") [2] adds unit tests to check the new 
>> behavior, but the tests look also wrong to me.
>>
>> Let me analyze these two tests:
>>
>> {
>>          "XDP pkt read, pkt_data' > pkt_end, good access",
>>          .insns = {
>>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
>> xdp_md, data)),
>>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>>                      offsetof(struct xdp_md, data_end)),
>>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>>          BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 1),
>>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>>          BPF_MOV64_IMM(BPF_REG_0, 0),
>>          BPF_EXIT_INSN(),
>>          },
>>          .result = ACCEPT,
>>          .prog_type = BPF_PROG_TYPE_XDP,
>>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
>> },
>>
>> {
>>          "XDP pkt read, pkt_data' >= pkt_end, bad access 1",
>>          .insns = {
>>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
>> xdp_md, data)),
>>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>>                      offsetof(struct xdp_md, data_end)),
>>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>>          BPF_JMP_REG(BPF_JGE, BPF_REG_1, BPF_REG_3, 1),
>>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>>          BPF_MOV64_IMM(BPF_REG_0, 0),
>>          BPF_EXIT_INSN(),
>>          },
>>          .errstr = "R1 offset is outside of the packet",
>>          .result = REJECT,
>>          .prog_type = BPF_PROG_TYPE_XDP,
>>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
>> },
>>
>> The first program looks good both to me and the verifier: if data + 8 
>> > data_end, we bail out, otherwise, if data + 8 <= data_end, we read 8 
>> bytes: [data; data+7].
>>
>> The second program doesn't pass the verifier, and the test expects it 
>> to be rejected, but the program itself still looks fine to me: if data 
>> + 8 >= data_end, we bail out, otherwise, if data + 8 < data_end, we 
>> read 8 bytes: [data; data+7], and this is fine, because data + 7 is 
>> for sure < data_end. The verifier considers data + 7 to be out of 
>> bounds, although both data + 7 and data + 8 are still valid offsets, 
>> hence the off-by-two bug.
>>
>> Are my considerations valid, or am I stupidly missing anything?
> 
> Sorry for my late reply, bit too swamped lately. So we have the two 
> variants:
> 
>    r2 = data;
>    r2 += 8;
>    if (r2 > data_end) goto <handle exception>
>      <access okay>
> 
>    r2 = data;
>    r2 += 8;
>    if (r2 >= data_end) goto <handle exception>
>      <access okay>
> 
> Technically, the first option is the more correct way to check, meaning, 
> we have 8 bytes of
> access in the <access okay> branch. The second one is overly pessimistic 
> in that if r2 equals
> data_end we bail out even though we wouldn't have to. So in that case 
> <access okay> branch
> would have 9 bytes for access since r2 with offset 8 is already < data_end.
> 
> Anyway, please send a fix and updated test cases. Thanks Maxim!

Just pinging with my status: I'm still on it, I returned from vacation 
and back to work, but I'm currently struggling with running the BPF 
selftests.

I'm using tools/testing/selftests/bpf/vmtest.sh, I've hit a few issues 
trying to make it work, especially the glibc version issue (I have glibc 
2.33 on my host, but the VM image has 2.32 and can't run binaries 
compiled on the host), for which I applied this workaround to build the 
test progs statically:

https://www.spinics.net/lists/bpf/msg41647.html

However, the test suite just hangs after:

...
+ /etc/rcS.d/S50-startup
./test_progs
[    1.639277] bpf_testmod: loading out-of-tree module taints kernel.
#1 align:OK
#2 atomic_bounds:OK
[    1.824515] tsc: Refined TSC clocksource calibration: 2399.983 MHz
[    1.826421] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 
0x22982765f14, max_idle_ns: 440795222551 ns
[    1.829486] clocksource: Switched to clocksource tsc
#3 atomics:OK
#4 attach_probe:OK
#5 autoload:OK
#6 bind_perm:OK
#7 bloom_filter_map:OK
#8 bpf_cookie:OK

Any hint would be much appreciated. I'm trying to do my debugging too.

Thanks,
Max
Maxim Mikityanskiy Nov. 26, 2021, 5:07 p.m. UTC | #4
On 2021-11-25 16:33, Maxim Mikityanskiy wrote:
> On 2021-11-09 13:34, Daniel Borkmann wrote:
>> Hi Maxim,
>>
>> On 11/2/21 4:12 PM, Maxim Mikityanskiy wrote:
>>> Hi guys,
>>>
>>> I think I found cases where the BPF verifier mistakenly rejects valid 
>>> BPF programs when doing pkt_end boundary checks, and the selftests 
>>> for these cases test wrong things as well.
>>>
>>> Daniel's commit fb2a311a31d3 ("bpf: fix off by one for range markings 
>>> with L{T, E} patterns") [1] attempts to fix an off-by-one bug in 
>>> boundary checks, but I think it shifts the index by 1 in a wrong 
>>> direction, so instead of fixing, the bug becomes off-by-two.
>>>
>>> A following commit b37242c773b2 ("bpf: add test cases to bpf 
>>> selftests to cover all access tests") [2] adds unit tests to check 
>>> the new behavior, but the tests look also wrong to me.
>>>
>>> Let me analyze these two tests:
>>>
>>> {
>>>          "XDP pkt read, pkt_data' > pkt_end, good access",
>>>          .insns = {
>>>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
>>> xdp_md, data)),
>>>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>>>                      offsetof(struct xdp_md, data_end)),
>>>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>>>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>>>          BPF_JMP_REG(BPF_JGT, BPF_REG_1, BPF_REG_3, 1),
>>>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>>>          BPF_MOV64_IMM(BPF_REG_0, 0),
>>>          BPF_EXIT_INSN(),
>>>          },
>>>          .result = ACCEPT,
>>>          .prog_type = BPF_PROG_TYPE_XDP,
>>>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
>>> },
>>>
>>> {
>>>          "XDP pkt read, pkt_data' >= pkt_end, bad access 1",
>>>          .insns = {
>>>          BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct 
>>> xdp_md, data)),
>>>          BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
>>>                      offsetof(struct xdp_md, data_end)),
>>>          BPF_MOV64_REG(BPF_REG_1, BPF_REG_2),
>>>          BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
>>>          BPF_JMP_REG(BPF_JGE, BPF_REG_1, BPF_REG_3, 1),
>>>          BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8),
>>>          BPF_MOV64_IMM(BPF_REG_0, 0),
>>>          BPF_EXIT_INSN(),
>>>          },
>>>          .errstr = "R1 offset is outside of the packet",
>>>          .result = REJECT,
>>>          .prog_type = BPF_PROG_TYPE_XDP,
>>>          .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
>>> },
>>>
>>> The first program looks good both to me and the verifier: if data + 8 
>>> > data_end, we bail out, otherwise, if data + 8 <= data_end, we read 
>>> 8 bytes: [data; data+7].
>>>
>>> The second program doesn't pass the verifier, and the test expects it 
>>> to be rejected, but the program itself still looks fine to me: if 
>>> data + 8 >= data_end, we bail out, otherwise, if data + 8 < data_end, 
>>> we read 8 bytes: [data; data+7], and this is fine, because data + 7 
>>> is for sure < data_end. The verifier considers data + 7 to be out of 
>>> bounds, although both data + 7 and data + 8 are still valid offsets, 
>>> hence the off-by-two bug.
>>>
>>> Are my considerations valid, or am I stupidly missing anything?
>>
>> Sorry for my late reply, bit too swamped lately. So we have the two 
>> variants:
>>
>>    r2 = data;
>>    r2 += 8;
>>    if (r2 > data_end) goto <handle exception>
>>      <access okay>
>>
>>    r2 = data;
>>    r2 += 8;
>>    if (r2 >= data_end) goto <handle exception>
>>      <access okay>
>>
>> Technically, the first option is the more correct way to check, 
>> meaning, we have 8 bytes of
>> access in the <access okay> branch. The second one is overly 
>> pessimistic in that if r2 equals
>> data_end we bail out even though we wouldn't have to. So in that case 
>> <access okay> branch
>> would have 9 bytes for access since r2 with offset 8 is already < 
>> data_end.
>>
>> Anyway, please send a fix and updated test cases. Thanks Maxim!
> 
> Just pinging with my status: I'm still on it, I returned from vacation 
> and back to work, but I'm currently struggling with running the BPF 
> selftests.
> 
> I'm using tools/testing/selftests/bpf/vmtest.sh, I've hit a few issues 
> trying to make it work, especially the glibc version issue (I have glibc 
> 2.33 on my host, but the VM image has 2.32 and can't run binaries 
> compiled on the host), for which I applied this workaround to build the 
> test progs statically:
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.spinics.net%2Flists%2Fbpf%2Fmsg41647.html&amp;data=04%7C01%7Cmaximmi%40nvidia.com%7C13976fd5b93e4df1a6ca08d9b020eaaf%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637734477702442983%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=sklOvJNabJJtUzktnCw1s0M4pLb7UJnd0xezhvvH8os%3D&amp;reserved=0 
> 
> 
> However, the test suite just hangs after:
> 
> ...
> + /etc/rcS.d/S50-startup
> ./test_progs
> [    1.639277] bpf_testmod: loading out-of-tree module taints kernel.
> #1 align:OK
> #2 atomic_bounds:OK
> [    1.824515] tsc: Refined TSC clocksource calibration: 2399.983 MHz
> [    1.826421] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 
> 0x22982765f14, max_idle_ns: 440795222551 ns
> [    1.829486] clocksource: Switched to clocksource tsc
> #3 atomics:OK
> #4 attach_probe:OK
> #5 autoload:OK
> #6 bind_perm:OK
> #7 bloom_filter_map:OK
> #8 bpf_cookie:OK

I figured out that I actually had to run test_verifier, rather than 
test_progs, but I also found the issue with test_progs:

tools/testing/selftests/bpf/prog_tests/bpf_iter.c:

/* Read CMP_BUFFER_SIZE (1kB) from bpf_iter. Read in small chunks
  * to trigger seq_file corner cases. The expected output is much
  * longer than 1kB, so the while loop will terminate.
  */
len = 0;
while (len < CMP_BUFFER_SIZE) {
         err = read_fd_into_buffer(iter_fd, task_vma_output + len,
                                   min(read_size, CMP_BUFFER_SIZE - len));
         if (CHECK(err < 0, "read_iter_fd", "read_iter_fd failed\n"))
                 goto out;
         len += err;
}

The output was actually shorter than 1K, err was 0, and the loop was 
infinite. I think a simple `if (!err) break;` should fix it. I'll submit 
it as well.

> 
> Any hint would be much appreciated. I'm trying to do my debugging too.
> 
> Thanks,
> Max
diff mbox series

Patch

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8492,7 +8492,7 @@  static void find_good_pkt_pointers(struct 
bpf_verifier_state *vstate,

         new_range = dst_reg->off;
         if (range_right_open)
-               new_range--;
+               new_range++;

         /* Examples for register markings: