diff mbox series

[bpf-next,1/2] bpf: Allow reads from uninit stack

Message ID 20230216183606.2483834-2-eddyz87@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series Allow reads from uninit stack

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers warning 15 maintainers not CCed: linux-kselftest@vger.kernel.org john.fastabend@gmail.com colin.i.king@gmail.com sdf@google.com shuah@kernel.org jolsa@kernel.org kuba@kernel.org netdev@vger.kernel.org memxor@gmail.com song@kernel.org mykolal@fb.com haoluo@google.com hawk@kernel.org kpsingh@kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 93 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix

Commit Message

Eduard Zingerman Feb. 16, 2023, 6:36 p.m. UTC
This commit updates the following functions to allow reads from
uninitialized stack locations when the env->allow_uninit_stack option
is enabled (a sketch of such an access follows the list):
- check_stack_read_fixed_off()
- check_stack_range_initialized(), called from:
  - check_stack_read_var_off()
  - check_helper_mem_access()
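
Here is a minimal sketch of the kind of access that becomes legal for
privileged loaders (this program is not part of the patch; the program
type and names are arbitrary):

    /* A fixed-offset read from a never-written stack slot. Before this
     * change check_stack_read_fixed_off() rejects it with
     * "invalid read from stack off ..."; with env->allow_uninit_stack
     * set it is accepted.
     */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    char _license[] SEC("license") = "GPL";

    SEC("raw_tracepoint/sys_enter")
    int read_uninit_stack(void *ctx)
    {
        __u64 buf[8]; /* never written: its slots stay STACK_INVALID */

        /* the volatile cast keeps the compiler from dropping the load */
        return *(volatile __u64 *)&buf[3] ? 1 : 0;
    }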

Such a change makes it possible to relax the logic in stacksafe() so
that STACK_MISC and STACK_INVALID are treated in the same way, making
the following stack slot configurations equivalent:

  |  Cached state    |  Current state   |
  |   stack slot     |   stack slot     |
  |------------------+------------------|
  | STACK_INVALID or | STACK_INVALID or |
  | STACK_MISC       | STACK_SPILL   or |
  |                  | STACK_MISC    or |
  |                  | STACK_ZERO    or |
  |                  | STACK_DYNPTR     |

This leads to significant verification speed gains (see below).

The idea was suggested by Andrii Nakryiko [1] and the initial patch
was created by Alexei Starovoitov [2].

Currently, env->allow_uninit_stack is enabled only for programs loaded
by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
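
For reference, the flag is derived from the loader's capabilities when
verification starts; roughly (paraphrased from kernel/bpf/verifier.c,
details may vary between kernel versions):

    static bool bpf_allow_uninit_stack(void)
    {
        return perfmon_capable(); /* true for CAP_PERFMON or CAP_SYS_ADMIN */
    }

    /* ... and in bpf_check(): */
    env->allow_uninit_stack = bpf_allow_uninit_stack();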

A number of test cases in verifier/*.c expected uninitialized stack
access to be an error. These test cases were updated to execute in
unprivileged mode (thus preserving the tests).

The test progs/test_global_func10.c expected the "invalid indirect
access to stack" error message because of an access to an
uninitialized memory region. The test is updated to provoke the same
error message by accessing the stack outside of the allocated range.

The following tests had to be removed because they can't be made
unprivileged:
- verifier/sock.c:
  - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
  stack_value"
  BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
- verifier/var_off.c:
  - "indirect variable-offset stack access, max_off+size > max_initialized"
  - "indirect variable-offset stack access, uninitialized"
  These tests verify that access to uninitialized stack values is
  detected when the stack offset is not a constant. However,
  variable-offset stack access is prohibited in unprivileged mode, so
  these tests are no longer valid.

 * * *

Here is a veristat log comparing this patch with current master on a
set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
and on cilium BPF binaries (see [3]):

$ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
File                        Program                     States (A)  States (B)  States    (DIFF)
--------------------------  --------------------------  ----------  ----------  ----------------
bpf_host.o                  tail_handle_ipv6_from_host         349         244    -105 (-30.09%)
bpf_host.o                  tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
bpf_lxc.o                   tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
bpf_sock.o                  cil_sock4_connect                   70          48     -22 (-31.43%)
bpf_sock.o                  cil_sock4_sendmsg                   68          46     -22 (-32.35%)
bpf_xdp.o                   tail_handle_nat_fwd_ipv4          1554         803    -751 (-48.33%)
bpf_xdp.o                   tail_lb_ipv4                      6457        2473   -3984 (-61.70%)
bpf_xdp.o                   tail_lb_ipv6                      7249        3908   -3341 (-46.09%)
pyperf600_bpf_loop.bpf.o    on_event                           287         145    -142 (-49.48%)
strobemeta.bpf.o            on_event                         15915        4772  -11143 (-70.02%)
strobemeta_nounroll2.bpf.o  on_event                         17087        3820  -13267 (-77.64%)
xdp_synproxy_kern.bpf.o     syncookie_tc                     21271        6635  -14636 (-68.81%)
xdp_synproxy_kern.bpf.o     syncookie_xdp                    23122        6024  -17098 (-73.95%)
--------------------------  --------------------------  ----------  ----------  ----------------

Note: I limited the selection to programs with states_pct < -30%.

Inspection of the differences in pyperf600_bpf_loop behavior shows
that the following patch to the test removes almost all of them:

    - a/tools/testing/selftests/bpf/progs/pyperf.h
    + b/tools/testing/selftests/bpf/progs/pyperf.h
    @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
            }

            if (event->pthread_match || !pidData->use_tls) {
    -               void* frame_ptr;
    -               FrameData frame;
    +               void* frame_ptr = 0;
    +               FrameData frame = {};
                    Symbol sym = {};
                    int cur_cpu = bpf_get_smp_processor_id();

Without this patch the difference comes from the following pattern
(repeated for different variables):

    static bool get_frame_data(... FrameData *frame ...)
    {
        ...
        bpf_probe_read_user(&frame->f_code, ...);
        if (!frame->f_code)
            return false;
        ...
        bpf_probe_read_user(&frame->co_name, ...);
        if (frame->co_name)
            ...;
    }

    int __on_event(struct bpf_raw_tracepoint_args *ctx)
    {
        FrameData frame;
        ...
        get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
        ...
    }

    SEC("raw_tracepoint/kfree_skb")
    int on_event(struct bpf_raw_tracepoint_args* ctx)
    {
        ...
        ret |= __on_event(ctx);
        ret |= __on_event(ctx);
        ...
    }

With regard to the value `frame->co_name` the following is important:
- Because of the conditional `if (!frame->f_code)` each call to
  __on_event() produces two states, one with `frame->co_name` marked
  as STACK_MISC, the other with it left as is (marked STACK_INVALID on
  the first call).
- The call to bpf_probe_read_user() does not mark the stack slots
  corresponding to `&frame->co_name` as REG_LIVE_WRITTEN, but it marks
  these slots as STACK_MISC; this happens because of the following
  loop in check_helper_call():

	for (i = 0; i < meta.access_size; i++) {
		err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
				       BPF_WRITE, -1, false);
		if (err)
			return err;
	}

  Note the size of the write: it is a one-byte write for each byte
  touched by the helper. Such BPF_B writes do not lead to write marks
  for the target stack slot.
- This means that, without this patch, when the second __on_event()
  call is verified, `if (frame->co_name)` propagates read marks first
  to a stack slot with STACK_MISC marks and then to a stack slot with
  STACK_INVALID marks, and these states are considered different (a
  condensed reproducer is sketched below).
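
The following self-contained sketch condenses the pattern above into a
compilable form (illustrative only, not taken from the selftests; the
bpf_loop indirection is omitted):

    /* The stack slot backing frame.co_name is either written via
     * bpf_probe_read_user() (STACK_MISC) or left untouched
     * (STACK_INVALID) depending on the branch taken in get_frame_data().
     */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    char _license[] SEC("license") = "GPL";

    struct frame_data {
        long f_code;
        long co_name;
    };

    static __noinline int get_frame_data(struct frame_data *frame)
    {
        long addr = bpf_get_prandom_u32();

        bpf_probe_read_user(&frame->f_code, sizeof(frame->f_code), (void *)addr);
        if (!frame->f_code)
            return 0; /* frame->co_name is left untouched on this path */
        bpf_probe_read_user(&frame->co_name, sizeof(frame->co_name), (void *)addr);
        return frame->co_name != 0; /* propagates a read mark to the co_name slot */
    }

    SEC("raw_tracepoint/kfree_skb")
    int on_event(void *ctx)
    {
        struct frame_data frame; /* both slots start as STACK_INVALID */
        int ret = 0;

        ret |= get_frame_data(&frame);
        /* without this patch the states reaching the second call keep
         * co_name as either STACK_MISC or STACK_INVALID and do not
         * prune against each other; with the patch they do
         */
        ret |= get_frame_data(&frame);
        return ret;
    }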

[1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
[3] git@github.com:anakryiko/cilium.git

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c                         |  10 ++
 .../selftests/bpf/progs/test_global_func10.c  |   6 +-
 tools/testing/selftests/bpf/verifier/calls.c  |  13 ++-
 .../bpf/verifier/helper_access_var_len.c      | 104 ++++++++++++------
 .../testing/selftests/bpf/verifier/int_ptr.c  |   9 +-
 .../selftests/bpf/verifier/search_pruning.c   |  13 ++-
 tools/testing/selftests/bpf/verifier/sock.c   |  27 -----
 .../selftests/bpf/verifier/spill_fill.c       |   7 +-
 .../testing/selftests/bpf/verifier/var_off.c  |  52 ---------
 9 files changed, 107 insertions(+), 134 deletions(-)

Comments

Andrii Nakryiko Feb. 17, 2023, 12:36 a.m. UTC | #1
On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> [...]
>
> @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
>                         }
>                         goto mark;
>                 }
> +               if (*stype == STACK_INVALID && env->allow_uninit_stack)
> +                       goto mark;

should we support clobber and conversion to STACK_MISC like we do for
STACK_ZERO? If yes, probably cleaner to just extend condition to

if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
env->allow_uninit_stack))

?


Other than that, looks good:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>
>                 if (is_spilled_reg(&state->stack[spi]) &&
>                     (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>                 if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
>                         continue;
>
> +               if (env->allow_uninit_stack &&
> +                   old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> +                       continue;
> +
>                 /* explored stack has more populated slots than current stack
>                  * and these slots were used
>                  */

[...]
Eduard Zingerman Feb. 17, 2023, 1:13 p.m. UTC | #2
On Thu, 2023-02-16 at 16:36 -0800, Andrii Nakryiko wrote:
> On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > [...]
> >
> > @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
> >                         }
> >                         goto mark;
> >                 }
> > +               if (*stype == STACK_INVALID && env->allow_uninit_stack)
> > +                       goto mark;
> 
> should we support clobber and conversion to STACK_MISC like we do for
> STACK_ZERO? If yes, probably cleaner to just extend condition to
> 
> if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
> env->allow_uninit_stack))
> 
> ?

As far as I understand, conversion of STACK_ZERO to STACK_MISC is
necessary for safety reasons (like we can't be sure that memory will
remain STACK_ZERO after clobber call).

However for STACK_INVALID -> STACK_MISC case, I don't think there is a
way to observe such change (apart from log output). After this patch
there would be no difference between STACK_INVALID and STACK_MISC in
privileged mode.

Hence, such change is a matter of style and does not affect verifier
behavior. If you think that the following is more concise:

		if ((*stype == STACK_ZERO) ||
		    (*stype == STACK_INVALID && env->allow_uninit_stack)) {
			if (clobber) {
				/* helper can write anything into the stack */
				*stype = STACK_MISC;
			}
			goto mark;
		}

I can make this update and add an appropriate test checking the log
output. Personally, I think the intent is clearer if the current
notation is preserved.

> 
> 
> Other than that, looks good:
> 
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> 
> > 
> >                 if (is_spilled_reg(&state->stack[spi]) &&
> >                     (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> > @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> >                 if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
> >                         continue;
> > 
> > +               if (env->allow_uninit_stack &&
> > +                   old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> > +                       continue;
> > +
> >                 /* explored stack has more populated slots than current stack
> >                  * and these slots were used
> >                  */
> 
> [...]
Andrii Nakryiko Feb. 17, 2023, 9:58 p.m. UTC | #3
On Fri, Feb 17, 2023 at 5:13 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-02-16 at 16:36 -0800, Andrii Nakryiko wrote:
> > On Thu, Feb 16, 2023 at 10:36 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > [...]
> > >
> > > @@ -5761,6 +5765,8 @@ static int check_stack_range_initialized(
> > >                         }
> > >                         goto mark;
> > >                 }
> > > +               if (*stype == STACK_INVALID && env->allow_uninit_stack)
> > > +                       goto mark;
> >
> > should we support clobber and conversion to STACK_MISC like we do for
> > STACK_ZERO? If yes, probably cleaner to just extend condition to
> >
> > if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID &&
> > env->allow_uninit_stack))
> >
> > ?
>
> As far as I understand, conversion of STACK_ZERO to STACK_MISC is
> necessary for safety reasons (like we can't be sure that memory will
> remain STACK_ZERO after clobber call).
>
> However for STACK_INVALID -> STACK_MISC case, I don't think there is a
> way to observe such change (apart from log output). After this patch
> there would be no difference between STACK_INVALID and STACK_MISC in
> privileged mode.
>
> Hence, such change is a matter of style and does not affect verifier
> behavior. If you think that the following is more concise:
>
>                 if ((*stype == STACK_ZERO) ||
>                     (*stype == STACK_INVALID && env->allow_uninit_stack)) {
>                         if (clobber) {
>                                 /* helper can write anything into the stack */
>                                 *stype = STACK_MISC;
>                         }
>                         goto mark;
>                 }
>
> I can make this update and add an appropriate test checking the log
> output. Personally, I think the intent is clearer if the current
> notation is preserved.

It seems like the clobber flag is used when a helper is writing out
data into stack memory, so it makes sense to represent that memory as
initialized but unknown, that is, STACK_MISC. My point is that it's
not INVALID anymore after the helper call.

>
> >
> >
> > Other than that, looks good:
> >
> > Acked-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > >
> > >                 if (is_spilled_reg(&state->stack[spi]) &&
> > >                     (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
> > > @@ -13936,6 +13942,10 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> > >                 if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
> > >                         continue;
> > >
> > > +               if (env->allow_uninit_stack &&
> > > +                   old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
> > > +                       continue;
> > > +
> > >                 /* explored stack has more populated slots than current stack
> > >                  * and these slots were used
> > >                  */
> >
> > [...]
>
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 272563a0b770..6fbd0e25ccab 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3826,6 +3826,8 @@  static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
 						continue;
 					if (type == STACK_MISC)
 						continue;
+					if (type == STACK_INVALID && env->allow_uninit_stack)
+						continue;
 					verbose(env, "invalid read from stack off %d+%d size %d\n",
 						off, i, size);
 					return -EACCES;
@@ -3863,6 +3865,8 @@  static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
 				continue;
 			if (type == STACK_ZERO)
 				continue;
+			if (type == STACK_INVALID && env->allow_uninit_stack)
+				continue;
 			verbose(env, "invalid read from stack off %d+%d size %d\n",
 				off, i, size);
 			return -EACCES;
@@ -5761,6 +5765,8 @@  static int check_stack_range_initialized(
 			}
 			goto mark;
 		}
+		if (*stype == STACK_INVALID && env->allow_uninit_stack)
+			goto mark;
 
 		if (is_spilled_reg(&state->stack[spi]) &&
 		    (state->stack[spi].spilled_ptr.type == SCALAR_VALUE ||
@@ -13936,6 +13942,10 @@  static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID)
 			continue;
 
+		if (env->allow_uninit_stack &&
+		    old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_MISC)
+			continue;
+
 		/* explored stack has more populated slots than current stack
 		 * and these slots were used
 		 */
diff --git a/tools/testing/selftests/bpf/progs/test_global_func10.c b/tools/testing/selftests/bpf/progs/test_global_func10.c
index 97b7031d0e22..9eb28721fdbc 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func10.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func10.c
@@ -4,12 +4,12 @@ 
 #include <bpf/bpf_helpers.h>
 
 struct Small {
-	int x;
+	long x;
 };
 
 struct Big {
-	int x;
-	int y;
+	long x;
+	long y;
 };
 
 __noinline int foo(const struct Big *big)
diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
index 9d993926bf0e..289ed202ec66 100644
--- a/tools/testing/selftests/bpf/verifier/calls.c
+++ b/tools/testing/selftests/bpf/verifier/calls.c
@@ -2221,19 +2221,22 @@ 
 	 * that fp-8 stack slot was unused in the fall-through
 	 * branch and will accept the program incorrectly
 	 */
-	BPF_JMP_IMM(BPF_JGT, BPF_REG_1, 2, 2),
+	BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+	BPF_JMP_IMM(BPF_JGT, BPF_REG_0, 2, 2),
 	BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
 	BPF_JMP_IMM(BPF_JA, 0, 0, 0),
 	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
 	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
 	BPF_LD_MAP_FD(BPF_REG_1, 0),
 	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.fixup_map_hash_48b = { 6 },
-	.errstr = "invalid indirect read from stack R2 off -8+0 size 8",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_XDP,
+	.fixup_map_hash_48b = { 7 },
+	.errstr_unpriv = "invalid indirect read from stack R2 off -8+0 size 8",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"calls: ctx read at start of subprog",
diff --git a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
index a6c869a7319c..9c4885885aba 100644
--- a/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
+++ b/tools/testing/selftests/bpf/verifier/helper_access_var_len.c
@@ -29,19 +29,30 @@ 
 {
 	"helper access to variable memory: stack, bitwise AND, zero included",
 	.insns = {
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
-	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
-	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
-	BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128),
-	BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 64),
-	BPF_MOV64_IMM(BPF_REG_3, 0),
-	BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
+	/* set max stack size */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+	/* set r3 to a random value */
+	BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+	/* use bitwise AND to limit r3 range to [0, 64] */
+	BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 64),
+	BPF_LD_MAP_FD(BPF_REG_1, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
+	BPF_MOV64_IMM(BPF_REG_4, 0),
+	/* Call bpf_ringbuf_output(), it is one of a few helper functions with
+	 * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+	 * For unpriv this should signal an error, because memory at &fp[-64] is
+	 * not initialized.
+	 */
+	BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
 	BPF_EXIT_INSN(),
 	},
-	.errstr = "invalid indirect read from stack R1 off -64+0 size 64",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.fixup_map_ringbuf = { 4 },
+	.errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"helper access to variable memory: stack, bitwise AND + JMP, wrong max",
@@ -183,20 +194,31 @@ 
 {
 	"helper access to variable memory: stack, JMP, no min check",
 	.insns = {
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
-	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
-	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
-	BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, -128),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, -128),
-	BPF_JMP_IMM(BPF_JGT, BPF_REG_2, 64, 3),
-	BPF_MOV64_IMM(BPF_REG_3, 0),
-	BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
+	/* set max stack size */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+	/* set r3 to a random value */
+	BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+	/* use JMP to limit r3 range to [0, 64] */
+	BPF_JMP_IMM(BPF_JGT, BPF_REG_3, 64, 6),
+	BPF_LD_MAP_FD(BPF_REG_1, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
+	BPF_MOV64_IMM(BPF_REG_4, 0),
+	/* Call bpf_ringbuf_output(), it is one of a few helper functions with
+	 * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+	 * For unpriv this should signal an error, because memory at &fp[-64] is
+	 * not initialized.
+	 */
+	BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
 	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr = "invalid indirect read from stack R1 off -64+0 size 64",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.fixup_map_ringbuf = { 4 },
+	.errstr_unpriv = "invalid indirect read from stack R2 off -64+0 size 64",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"helper access to variable memory: stack, JMP (signed), no min check",
@@ -564,29 +586,41 @@ 
 {
 	"helper access to variable memory: 8 bytes leak",
 	.insns = {
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 8),
-	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
-	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -64),
+	/* set max stack size */
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -128, 0),
+	/* set r3 to a random value */
+	BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_0),
+	BPF_LD_MAP_FD(BPF_REG_1, 0),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -64),
 	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -64),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -56),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -48),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -40),
+	/* Note: fp[-32] left uninitialized */
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -24),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -16),
 	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_0, -8),
-	BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_2, -128),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_10, -128),
-	BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 63),
-	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 1),
-	BPF_MOV64_IMM(BPF_REG_3, 0),
-	BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_10, -16),
+	/* Limit r3 range to [1, 64] */
+	BPF_ALU64_IMM(BPF_AND, BPF_REG_3, 63),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 1),
+	BPF_MOV64_IMM(BPF_REG_4, 0),
+	/* Call bpf_ringbuf_output(), it is one of a few helper functions with
+	 * ARG_CONST_SIZE_OR_ZERO parameter allowed in unpriv mode.
+	 * For unpriv this should signal an error, because memory region [1, 64]
+	 * at &fp[-64] is not fully initialized.
+	 */
+	BPF_EMIT_CALL(BPF_FUNC_ringbuf_output),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr = "invalid indirect read from stack R1 off -64+32 size 64",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.fixup_map_ringbuf = { 3 },
+	.errstr_unpriv = "invalid indirect read from stack R2 off -64+32 size 64",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"helper access to variable memory: 8 bytes no leak (init memory)",
diff --git a/tools/testing/selftests/bpf/verifier/int_ptr.c b/tools/testing/selftests/bpf/verifier/int_ptr.c
index 070893fb2900..02d9e004260b 100644
--- a/tools/testing/selftests/bpf/verifier/int_ptr.c
+++ b/tools/testing/selftests/bpf/verifier/int_ptr.c
@@ -54,12 +54,13 @@ 
 		/* bpf_strtoul() */
 		BPF_EMIT_CALL(BPF_FUNC_strtoul),
 
-		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
 		BPF_EXIT_INSN(),
 	},
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_CGROUP_SYSCTL,
-	.errstr = "invalid indirect read from stack R4 off -16+4 size 8",
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "invalid indirect read from stack R4 off -16+4 size 8",
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"ARG_PTR_TO_LONG misaligned",
diff --git a/tools/testing/selftests/bpf/verifier/search_pruning.c b/tools/testing/selftests/bpf/verifier/search_pruning.c
index d63fd8991b03..745d6b5842fd 100644
--- a/tools/testing/selftests/bpf/verifier/search_pruning.c
+++ b/tools/testing/selftests/bpf/verifier/search_pruning.c
@@ -128,9 +128,10 @@ 
 		BPF_EXIT_INSN(),
 	},
 	.fixup_map_hash_8b = { 3 },
-	.errstr = "invalid read from stack off -16+0 size 8",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_TRACEPOINT,
+	.errstr_unpriv = "invalid read from stack off -16+0 size 8",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"precision tracking for u32 spill/fill",
@@ -258,6 +259,8 @@ 
 	BPF_EXIT_INSN(),
 	},
 	.flags = BPF_F_TEST_STATE_FREQ,
-	.errstr = "invalid read from stack off -8+1 size 8",
-	.result = REJECT,
+	.errstr_unpriv = "invalid read from stack off -8+1 size 8",
+	.result_unpriv = REJECT,
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
index d11d0b28be41..108dd3ee1edd 100644
--- a/tools/testing/selftests/bpf/verifier/sock.c
+++ b/tools/testing/selftests/bpf/verifier/sock.c
@@ -530,33 +530,6 @@ 
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = ACCEPT,
 },
-{
-	"sk_storage_get(map, skb->sk, &stack_value, 1): partially init stack_value",
-	.insns = {
-	BPF_MOV64_IMM(BPF_REG_2, 0),
-	BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct __sk_buff, sk)),
-	BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2),
-	BPF_MOV64_IMM(BPF_REG_0, 0),
-	BPF_EXIT_INSN(),
-	BPF_EMIT_CALL(BPF_FUNC_sk_fullsock),
-	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
-	BPF_MOV64_IMM(BPF_REG_0, 0),
-	BPF_EXIT_INSN(),
-	BPF_MOV64_IMM(BPF_REG_4, 1),
-	BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
-	BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -8),
-	BPF_MOV64_REG(BPF_REG_2, BPF_REG_0),
-	BPF_LD_MAP_FD(BPF_REG_1, 0),
-	BPF_EMIT_CALL(BPF_FUNC_sk_storage_get),
-	BPF_MOV64_IMM(BPF_REG_0, 0),
-	BPF_EXIT_INSN(),
-	},
-	.fixup_sk_storage_map = { 14 },
-	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
-	.result = REJECT,
-	.errstr = "invalid indirect read from stack",
-},
 {
 	"bpf_map_lookup_elem(smap, &key)",
 	.insns = {
diff --git a/tools/testing/selftests/bpf/verifier/spill_fill.c b/tools/testing/selftests/bpf/verifier/spill_fill.c
index 9bb302dade23..d1463bf4949a 100644
--- a/tools/testing/selftests/bpf/verifier/spill_fill.c
+++ b/tools/testing/selftests/bpf/verifier/spill_fill.c
@@ -171,9 +171,10 @@ 
 	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.result = REJECT,
-	.errstr = "invalid read from stack off -4+0 size 4",
-	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "invalid read from stack off -4+0 size 4",
+	/* in privileged mode reads from uninitialized stack locations are permitted */
+	.result = ACCEPT,
 },
 {
 	"Spill a u32 const scalar.  Refill as u16.  Offset to skb->data",
diff --git a/tools/testing/selftests/bpf/verifier/var_off.c b/tools/testing/selftests/bpf/verifier/var_off.c
index d37f512fad16..b183e26c03f1 100644
--- a/tools/testing/selftests/bpf/verifier/var_off.c
+++ b/tools/testing/selftests/bpf/verifier/var_off.c
@@ -212,31 +212,6 @@ 
 	.result = REJECT,
 	.prog_type = BPF_PROG_TYPE_LWT_IN,
 },
-{
-	"indirect variable-offset stack access, max_off+size > max_initialized",
-	.insns = {
-	/* Fill only the second from top 8 bytes of the stack. */
-	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0),
-	/* Get an unknown value. */
-	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 0),
-	/* Make it small and 4-byte aligned. */
-	BPF_ALU64_IMM(BPF_AND, BPF_REG_2, 4),
-	BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, 16),
-	/* Add it to fp.  We now have either fp-12 or fp-16, but we don't know
-	 * which. fp-12 size 8 is partially uninitialized stack.
-	 */
-	BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_10),
-	/* Dereference it indirectly. */
-	BPF_LD_MAP_FD(BPF_REG_1, 0),
-	BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
-	BPF_MOV64_IMM(BPF_REG_0, 0),
-	BPF_EXIT_INSN(),
-	},
-	.fixup_map_hash_8b = { 5 },
-	.errstr = "invalid indirect read from stack R2 var_off",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_LWT_IN,
-},
 {
 	"indirect variable-offset stack access, min_off < min_initialized",
 	.insns = {
@@ -289,33 +264,6 @@ 
 	.result = ACCEPT,
 	.prog_type = BPF_PROG_TYPE_CGROUP_SKB,
 },
-{
-	"indirect variable-offset stack access, uninitialized",
-	.insns = {
-	BPF_MOV64_IMM(BPF_REG_2, 6),
-	BPF_MOV64_IMM(BPF_REG_3, 28),
-	/* Fill the top 16 bytes of the stack. */
-	BPF_ST_MEM(BPF_W, BPF_REG_10, -16, 0),
-	BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
-	/* Get an unknown value. */
-	BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_1, 0),
-	/* Make it small and 4-byte aligned. */
-	BPF_ALU64_IMM(BPF_AND, BPF_REG_4, 4),
-	BPF_ALU64_IMM(BPF_SUB, BPF_REG_4, 16),
-	/* Add it to fp.  We now have either fp-12 or fp-16, we don't know
-	 * which, but either way it points to initialized stack.
-	 */
-	BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_10),
-	BPF_MOV64_IMM(BPF_REG_5, 8),
-	/* Dereference it indirectly. */
-	BPF_EMIT_CALL(BPF_FUNC_getsockopt),
-	BPF_MOV64_IMM(BPF_REG_0, 0),
-	BPF_EXIT_INSN(),
-	},
-	.errstr = "invalid indirect read from stack R4 var_off",
-	.result = REJECT,
-	.prog_type = BPF_PROG_TYPE_SOCK_OPS,
-},
 {
 	"indirect variable-offset stack access, ok",
 	.insns = {