From patchwork Wed Aug 7 23:57:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kui-Feng Lee
X-Patchwork-Id: 13756871
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kui-Feng Lee
To: bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev,
 song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Cc: sinquersw@gmail.com, kuifeng@meta.com, Kui-Feng Lee
Subject: [RFC bpf-next 1/5] bpf: Parse and support "kptr_user" tag.
Date: Wed, 7 Aug 2024 16:57:51 -0700
Message-Id: <20240807235755.1435806-2-thinker.li@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240807235755.1435806-1-thinker.li@gmail.com>
References: <20240807235755.1435806-1-thinker.li@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

Parse the "kptr_user" tag from BTF, map it to BPF_KPTR_USER, and support
it in related functions.

Signed-off-by: Kui-Feng Lee
---
 include/linux/bpf.h  | 8 +++++++-
 kernel/bpf/btf.c     | 5 +++++
 kernel/bpf/syscall.c | 2 ++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b9425e410bcb..87d5f98249e2 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -194,7 +194,6 @@ enum btf_field_type {
 	BPF_KPTR_UNREF = (1 << 2),
 	BPF_KPTR_REF = (1 << 3),
 	BPF_KPTR_PERCPU = (1 << 4),
-	BPF_KPTR = BPF_KPTR_UNREF | BPF_KPTR_REF | BPF_KPTR_PERCPU,
 	BPF_LIST_HEAD = (1 << 5),
 	BPF_LIST_NODE = (1 << 6),
 	BPF_RB_ROOT = (1 << 7),
@@ -203,6 +202,8 @@ enum btf_field_type {
 	BPF_GRAPH_ROOT = BPF_RB_ROOT | BPF_LIST_HEAD,
 	BPF_REFCOUNT = (1 << 9),
 	BPF_WORKQUEUE = (1 << 10),
+	BPF_KPTR_USER = (1 << 11),
+	BPF_KPTR = BPF_KPTR_UNREF | BPF_KPTR_REF | BPF_KPTR_PERCPU | BPF_KPTR_USER,
 };
 
 typedef void (*btf_dtor_kfunc_t)(void *);
@@ -322,6 +323,8 @@ static inline const char *btf_field_type_name(enum btf_field_type type)
 		return "kptr";
 	case BPF_KPTR_PERCPU:
 		return "percpu_kptr";
+	case BPF_KPTR_USER:
+		return "user_kptr";
 	case BPF_LIST_HEAD:
 		return "bpf_list_head";
 	case BPF_LIST_NODE:
@@ -350,6 +353,7 @@ static inline u32 btf_field_type_size(enum btf_field_type type)
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
+	case BPF_KPTR_USER:
 		return sizeof(u64);
 	case BPF_LIST_HEAD:
 		return sizeof(struct bpf_list_head);
@@ -379,6 +383,7 @@ static inline u32 btf_field_type_align(enum btf_field_type type)
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
+	case BPF_KPTR_USER:
 		return __alignof__(u64);
 	case BPF_LIST_HEAD:
 		return __alignof__(struct bpf_list_head);
@@ -419,6 +424,7 @@ static inline void bpf_obj_init_field(const struct btf_field *field, void *addr)
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
+	case BPF_KPTR_USER:
 		break;
 	default:
 		WARN_ON_ONCE(1);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 95426d5b634e..3b0f555fbbe6 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3361,6 +3361,8 @@ static int btf_find_kptr(const struct btf *btf, const struct btf_type *t,
 		type = BPF_KPTR_REF;
 	else if (!strcmp("percpu_kptr", __btf_name_by_offset(btf, t->name_off)))
 		type = BPF_KPTR_PERCPU;
+	else if (!strcmp("kptr_user", __btf_name_by_offset(btf, t->name_off)))
+		type = BPF_KPTR_USER;
 	else
 		return -EINVAL;
 
@@ -3538,6 +3540,7 @@ static int btf_repeat_fields(struct btf_field_info *info,
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 		case BPF_KPTR_PERCPU:
+		case BPF_KPTR_USER:
 		case BPF_LIST_HEAD:
 		case BPF_RB_ROOT:
 			break;
@@ -3664,6 +3667,7 @@ static int btf_find_field_one(const struct btf *btf,
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
+	case BPF_KPTR_USER:
 		ret = btf_find_kptr(btf, var_type, off, sz,
 				    info_cnt ? &info[0] : &tmp);
 		if (ret < 0)
@@ -3988,6 +3992,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 		case BPF_KPTR_PERCPU:
+		case BPF_KPTR_USER:
 			ret = btf_parse_kptr(btf, &rec->fields[i], &info_arr[i]);
 			if (ret < 0)
 				goto end;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index bf6c5f685ea2..90a25307480e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -548,6 +548,7 @@ void btf_record_free(struct btf_record *rec)
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 		case BPF_KPTR_PERCPU:
+		case BPF_KPTR_USER:
 			if (rec->fields[i].kptr.module)
 				module_put(rec->fields[i].kptr.module);
 			btf_put(rec->fields[i].kptr.btf);
@@ -596,6 +597,7 @@ struct btf_record *btf_record_dup(const struct btf_record *rec)
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 		case BPF_KPTR_PERCPU:
+		case BPF_KPTR_USER:
 			btf_get(fields[i].kptr.btf);
 			if (fields[i].kptr.module && !try_module_get(fields[i].kptr.module)) {
 				ret = -ENXIO;

From patchwork Wed Aug 7 23:57:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kui-Feng Lee
X-Patchwork-Id: 13756872
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kui-Feng Lee
To: bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev,
 song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Cc: sinquersw@gmail.com, kuifeng@meta.com, Kui-Feng Lee
Subject: [RFC bpf-next 2/5] bpf: Handle BPF_KPTR_USER in verifier.
Date: Wed, 7 Aug 2024 16:57:52 -0700
Message-Id: <20240807235755.1435806-3-thinker.li@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240807235755.1435806-1-thinker.li@gmail.com>
References: <20240807235755.1435806-1-thinker.li@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

Give PTR_MAYBE_NULL | PTR_UNTRUSTED | MEM_ALLOC | NON_OWN_REF to
kptr_user to make the memory pointed to by it readable and writable.

Signed-off-by: Kui-Feng Lee
---
 kernel/bpf/verifier.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index df3be12096cf..84647e599595 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5340,6 +5340,10 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
 	int perm_flags;
 	const char *reg_name = "";
 
+	if (kptr_field->type == BPF_KPTR_USER)
+		/* BPF programs should not change any user kptr */
+		return -EACCES;
+
 	if (btf_is_kernel(reg->btf)) {
 		perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED | MEM_RCU;
 
@@ -5483,6 +5487,12 @@ static u32 btf_ld_kptr_type(struct bpf_verifier_env *env, struct btf_field *kptr
 			ret |= NON_OWN_REF;
 	} else {
 		ret |= PTR_UNTRUSTED;
+		if (kptr_field->type == BPF_KPTR_USER)
+			/* In order to access directly from bpf
+			 * programs. NON_OWN_REF makes the memory
+			 * writable. Check check_ptr_to_btf_access().
+			 */
+			ret |= MEM_ALLOC | NON_OWN_REF;
 	}
 
 	return ret;
@@ -5576,6 +5586,7 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 		case BPF_KPTR_PERCPU:
+		case BPF_KPTR_USER:
 			if (src != ACCESS_DIRECT) {
 				verbose(env, "kptr cannot be accessed indirectly by helper\n");
 				return -EACCES;

From patchwork Wed Aug 7 23:57:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kui-Feng Lee
X-Patchwork-Id: 13756873
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kui-Feng Lee
To: bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev,
 song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Cc: sinquersw@gmail.com, kuifeng@meta.com, Kui-Feng Lee,
 linux-mm@kvack.org
Subject: [RFC bpf-next 3/5] bpf: pin, translate, and unpin __kptr_user from syscalls.
Date: Wed, 7 Aug 2024 16:57:53 -0700
Message-Id: <20240807235755.1435806-4-thinker.li@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240807235755.1435806-1-thinker.li@gmail.com>
References: <20240807235755.1435806-1-thinker.li@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

User kptrs are pinned by pin_user_pages_fast() and translated to an
address in the kernel when the value is updated by user programs (by
calling bpf_map_update_elem() from user programs). The pinned pages are
unpinned if the value of a user kptr is overwritten or if the values of
maps are deleted/destroyed.

The pages are mapped through vmap() in order to get a contiguous space in
the kernel if the memory pointed to by a user kptr resides in two or more
pages. For the single-page case, page_address() is called to get the
address of the page in the kernel.

User kptrs are only supported by task storage maps.

One user kptr can pin at most KPTR_USER_MAX_PAGES (16) physical pages.
This is an arbitrarily picked number for safety. We actually can remove
this restriction totally.

User kptrs can only be set by user programs through syscalls. Any attempt
by a BPF program to update the value of a map with a __kptr_user in it
ignores the values of the user kptrs: the user kptrs keep their previous
values when the new values come from BPF programs rather than from user
programs.

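The page-count arithmetic behind the KPTR_USER_MAX_PAGES limit can be
checked in user space. What follows is a minimal sketch and not code from
this series: it assumes 4 KiB pages, and every sketch_* name is a
hypothetical stand-in for the PAGE_SHIFT/PAGE_MASK arithmetic used by
bpf_obj_trans_pin_uaddr().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_MAX_PAGES  16 /* mirrors KPTR_USER_MAX_PAGES */

/* Number of pages touched by the byte range [uaddr, uaddr + size). */
int sketch_span_pages(uintptr_t uaddr, uint32_t size)
{
	/* Round the start down to its page. */
	uintptr_t first = uaddr & SKETCH_PAGE_MASK;
	/* Adding PAGE_SIZE - 1 rounds the end up to a page boundary
	 * once the result is shifted.
	 */
	uintptr_t end = uaddr + size + ~SKETCH_PAGE_MASK;

	/* Zero-sized types are handled separately in the patch. */
	assert(size != 0);
	return (int)((end - first) >> SKETCH_PAGE_SHIFT);
}

/* Offset inside the first page; it is added back to the kernel
 * address returned by page_address() or vmap().
 */
uintptr_t sketch_page_offset(uintptr_t uaddr)
{
	return uaddr & ~SKETCH_PAGE_MASK;
}

/* Non-zero if the range could be pinned under the 16-page limit. */
int sketch_within_pin_limit(uintptr_t uaddr, uint32_t size)
{
	return sketch_span_pages(uaddr, size) <= SKETCH_MAX_PAGES;
}
```

For example, a 16-byte object that starts 8 bytes before a page boundary
spans two pages, so it would take the vmap() path instead of
page_address(). An unaligned 64 KiB object spans 17 pages and would be
rejected with -E2BIG.
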
Cc: linux-mm@kvack.org Signed-off-by: Kui-Feng Lee --- include/linux/bpf.h | 35 +++++- include/linux/bpf_local_storage.h | 2 +- kernel/bpf/bpf_local_storage.c | 18 +-- kernel/bpf/helpers.c | 12 +- kernel/bpf/local_storage.c | 2 +- kernel/bpf/syscall.c | 177 +++++++++++++++++++++++++++++- net/core/bpf_sk_storage.c | 2 +- 7 files changed, 227 insertions(+), 21 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 87d5f98249e2..f4ad0bc183cb 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -30,6 +30,7 @@ #include #include #include +#include struct bpf_verifier_env; struct bpf_verifier_log; @@ -477,10 +478,12 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size) data_race(*ldst++ = *lsrc++); } +void bpf_obj_unpin_uaddr(const struct btf_field *field, void *addr); + /* copy everything but bpf_spin_lock, bpf_timer, and kptrs. There could be one of each. */ static inline void bpf_obj_memcpy(struct btf_record *rec, void *dst, void *src, u32 size, - bool long_memcpy) + bool long_memcpy, bool from_user) { u32 curr_off = 0; int i; @@ -496,21 +499,40 @@ static inline void bpf_obj_memcpy(struct btf_record *rec, for (i = 0; i < rec->cnt; i++) { u32 next_off = rec->fields[i].offset; u32 sz = next_off - curr_off; + void *addr; memcpy(dst + curr_off, src + curr_off, sz); + if (from_user && rec->fields[i].type == BPF_KPTR_USER) { + /* Unpin old address. + * + * Alignments are guaranteed by btf_find_field_one(). 
+ */ + addr = *(void **)(dst + next_off); + if (virt_addr_valid(addr)) + bpf_obj_unpin_uaddr(&rec->fields[i], addr); + else if (addr) + WARN_ON_ONCE(1); + + *(void **)(dst + next_off) = *(void **)(src + next_off); + } curr_off += rec->fields[i].size + sz; } memcpy(dst + curr_off, src + curr_off, size - curr_off); } +static inline void copy_map_value_user(struct bpf_map *map, void *dst, void *src, bool from_user) +{ + bpf_obj_memcpy(map->record, dst, src, map->value_size, false, from_user); +} + static inline void copy_map_value(struct bpf_map *map, void *dst, void *src) { - bpf_obj_memcpy(map->record, dst, src, map->value_size, false); + bpf_obj_memcpy(map->record, dst, src, map->value_size, false, false); } static inline void copy_map_value_long(struct bpf_map *map, void *dst, void *src) { - bpf_obj_memcpy(map->record, dst, src, map->value_size, true); + bpf_obj_memcpy(map->record, dst, src, map->value_size, true, false); } static inline void bpf_obj_memzero(struct btf_record *rec, void *dst, u32 size) @@ -538,6 +560,8 @@ static inline void zero_map_value(struct bpf_map *map, void *dst) bpf_obj_memzero(map->record, dst, map->value_size); } +void copy_map_value_locked_user(struct bpf_map *map, void *dst, void *src, + bool lock_src, bool from_user); void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, bool lock_src); void bpf_timer_cancel_and_free(void *timer); @@ -775,6 +799,11 @@ enum bpf_arg_type { }; static_assert(__BPF_ARG_TYPE_MAX <= BPF_BASE_TYPE_LIMIT); +#define BPF_MAP_UPDATE_FLAG_BITS 3 +enum bpf_map_update_flag { + BPF_FROM_USER = BIT(0 + BPF_MAP_UPDATE_FLAG_BITS) +}; + /* type of values returned from helper functions */ enum bpf_return_type { RET_INTEGER, /* function returns integer */ diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index dcddb0aef7d8..d337df68fa23 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -181,7 +181,7 @@ void bpf_selem_link_map(struct 
bpf_local_storage_map *smap, struct bpf_local_storage_elem * bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value, - bool charge_mem, gfp_t gfp_flags); + bool charge_mem, gfp_t gfp_flags, bool from_user); void bpf_selem_free(struct bpf_local_storage_elem *selem, struct bpf_local_storage_map *smap, diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index c938dea5ddbf..c4cf09e27a19 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -73,7 +73,7 @@ static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem) struct bpf_local_storage_elem * bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, - void *value, bool charge_mem, gfp_t gfp_flags) + void *value, bool charge_mem, gfp_t gfp_flags, bool from_user) { struct bpf_local_storage_elem *selem; @@ -100,7 +100,7 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, if (selem) { if (value) - copy_map_value(&smap->map, SDATA(selem)->data, value); + copy_map_value_user(&smap->map, SDATA(selem)->data, value, from_user); /* No need to call check_and_init_map_value as memory is zero init */ return selem; } @@ -530,9 +530,11 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, struct bpf_local_storage_elem *alloc_selem, *selem = NULL; struct bpf_local_storage *local_storage; unsigned long flags; + bool from_user = map_flags & BPF_FROM_USER; int err; /* BPF_EXIST and BPF_NOEXIST cannot be both set */ + map_flags &= ~BPF_FROM_USER; if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) || /* BPF_F_LOCK can only be used in a value with spin_lock */ unlikely((map_flags & BPF_F_LOCK) && @@ -550,7 +552,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, if (err) return ERR_PTR(err); - selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); + selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags, from_user); if (!selem) return ERR_PTR(-ENOMEM); @@ -575,8 
+577,8 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, if (err) return ERR_PTR(err); if (old_sdata && selem_linked_to_storage_lockless(SELEM(old_sdata))) { - copy_map_value_locked(&smap->map, old_sdata->data, - value, false); + copy_map_value_locked_user(&smap->map, old_sdata->data, + value, false, from_user); return old_sdata; } } @@ -584,7 +586,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, /* A lookup has just been done before and concluded a new selem is * needed. The chance of an unnecessary alloc is unlikely. */ - alloc_selem = selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); + alloc_selem = selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags, from_user); if (!alloc_selem) return ERR_PTR(-ENOMEM); @@ -607,8 +609,8 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, goto unlock; if (old_sdata && (map_flags & BPF_F_LOCK)) { - copy_map_value_locked(&smap->map, old_sdata->data, value, - false); + copy_map_value_locked_user(&smap->map, old_sdata->data, value, + false, from_user); selem = SELEM(old_sdata); goto unlock; } diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index d02ae323996b..4aef86209fdd 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -372,8 +372,8 @@ const struct bpf_func_proto bpf_spin_unlock_proto = { .arg1_btf_id = BPF_PTR_POISON, }; -void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, - bool lock_src) +void copy_map_value_locked_user(struct bpf_map *map, void *dst, void *src, + bool lock_src, bool from_user) { struct bpf_spin_lock *lock; @@ -383,11 +383,17 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src, lock = dst + map->record->spin_lock_off; preempt_disable(); __bpf_spin_lock_irqsave(lock); - copy_map_value(map, dst, src); + copy_map_value_user(map, dst, src, from_user); __bpf_spin_unlock_irqrestore(lock); preempt_enable(); } +void copy_map_value_locked(struct bpf_map *map, 
void *dst, void *src, + bool lock_src) +{ + copy_map_value_locked_user(map, dst, src, lock_src, false); +} + BPF_CALL_0(bpf_jiffies64) { return get_jiffies_64(); diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index 3969eb0382af..62a12fa8ce9e 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -147,7 +147,7 @@ static long cgroup_storage_update_elem(struct bpf_map *map, void *key, struct bpf_cgroup_storage *storage; struct bpf_storage_buffer *new; - if (unlikely(flags & ~(BPF_F_LOCK | BPF_EXIST))) + if (unlikely(flags & ~BPF_F_LOCK)) return -EINVAL; if (unlikely((flags & BPF_F_LOCK) && diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 90a25307480e..eaa2a9d13265 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -155,8 +155,134 @@ static void maybe_wait_bpf_programs(struct bpf_map *map) synchronize_rcu(); } -static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, - void *key, void *value, __u64 flags) +static void *trans_addr_pages(struct page **pages, int npages) +{ + if (npages == 1) + return page_address(pages[0]); + /* For multiple pages, we need to use vmap() to get a contiguous + * virtual address range. 
+ */ + return vmap(pages, npages, VM_MAP, PAGE_KERNEL); +} + +#define KPTR_USER_MAX_PAGES 16 + +static int bpf_obj_trans_pin_uaddr(struct btf_field *field, void **addr) +{ + const struct btf_type *t; + struct page *pages[KPTR_USER_MAX_PAGES]; + void *ptr, *kern_addr; + u32 type_id, tsz; + int r, npages; + + ptr = *addr; + type_id = field->kptr.btf_id; + t = btf_type_id_size(field->kptr.btf, &type_id, &tsz); + if (!t) + return -EINVAL; + if (tsz == 0) { + *addr = NULL; + return 0; + } + + npages = (((intptr_t)ptr + tsz + ~PAGE_MASK) - + ((intptr_t)ptr & PAGE_MASK)) >> PAGE_SHIFT; + if (npages > KPTR_USER_MAX_PAGES) + return -E2BIG; + r = pin_user_pages_fast((intptr_t)ptr & PAGE_MASK, npages, 0, pages); + if (r != npages) + return -EINVAL; + kern_addr = trans_addr_pages(pages, npages); + if (!kern_addr) + return -ENOMEM; + *addr = kern_addr + ((intptr_t)ptr & ~PAGE_MASK); + return 0; +} + +void bpf_obj_unpin_uaddr(const struct btf_field *field, void *addr) +{ + struct page *pages[KPTR_USER_MAX_PAGES]; + int npages, i; + u32 size, type_id; + void *ptr; + + type_id = field->kptr.btf_id; + btf_type_id_size(field->kptr.btf, &type_id, &size); + if (size == 0) + return; + + ptr = (void *)((intptr_t)addr & PAGE_MASK); + npages = (((intptr_t)addr + size + ~PAGE_MASK) - (intptr_t)ptr) >> PAGE_SHIFT; + for (i = 0; i < npages; i++) { + pages[i] = virt_to_page(ptr); + ptr += PAGE_SIZE; + } + if (npages > 1) + /* Paired with vmap() in trans_addr_pages() */ + vunmap((void *)((intptr_t)addr & PAGE_MASK)); + unpin_user_pages(pages, npages); +} + +static int bpf_obj_trans_pin_uaddrs(struct btf_record *rec, void *src, u32 size) +{ + u32 next_off; + int i, err; + + if (IS_ERR_OR_NULL(rec)) + return 0; + + if (!btf_record_has_field(rec, BPF_KPTR_USER)) + return 0; + + for (i = 0; i < rec->cnt; i++) { + if (rec->fields[i].type != BPF_KPTR_USER) + continue; + + next_off = rec->fields[i].offset; + if (next_off + sizeof(void *) > size) + return -EINVAL; + err = 
bpf_obj_trans_pin_uaddr(&rec->fields[i], src + next_off); + if (!err) + continue; + + /* Rollback */ + for (i--; i >= 0; i--) { + if (rec->fields[i].type != BPF_KPTR_USER) + continue; + next_off = rec->fields[i].offset; + bpf_obj_unpin_uaddr(&rec->fields[i], *(void **)(src + next_off)); + *(void **)(src + next_off) = NULL; + } + + return err; + } + + return 0; +} + +static void bpf_obj_unpin_uaddrs(struct btf_record *rec, void *src) +{ + u32 next_off; + int i; + + if (IS_ERR_OR_NULL(rec)) + return; + + if (!btf_record_has_field(rec, BPF_KPTR_USER)) + return; + + for (i = 0; i < rec->cnt; i++) { + if (rec->fields[i].type != BPF_KPTR_USER) + continue; + + next_off = rec->fields[i].offset; + bpf_obj_unpin_uaddr(&rec->fields[i], *(void **)(src + next_off)); + *(void **)(src + next_off) = NULL; + } +} + +static int bpf_map_update_value_inner(struct bpf_map *map, struct file *map_file, + void *key, void *value, __u64 flags) { int err; @@ -208,6 +334,29 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, return err; } +static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, + void *key, void *value, __u64 flags) +{ + int err; + + if (flags & BPF_FROM_USER) { + /* Pin user memory can lead to context switch, so we need + * to do it before potential RCU lock. 
+		 */
+		err = bpf_obj_trans_pin_uaddrs(map->record, value,
+					       bpf_map_value_size(map));
+		if (err)
+			return err;
+	}
+
+	err = bpf_map_update_value_inner(map, map_file, key, value, flags);
+
+	if (err && (flags & BPF_FROM_USER))
+		bpf_obj_unpin_uaddrs(map->record, value);
+
+	return err;
+}
+
 static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
 			      __u64 flags)
 {
@@ -714,6 +863,11 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
 				field->kptr.dtor(xchgd_field);
 			}
 			break;
+		case BPF_KPTR_USER:
+			if (virt_addr_valid(*(void **)field_ptr))
+				bpf_obj_unpin_uaddr(field, *(void **)field_ptr);
+			*(void **)field_ptr = NULL;
+			break;
 		case BPF_LIST_HEAD:
 			if (WARN_ON_ONCE(rec->spin_lock_off < 0))
 				continue;
@@ -1155,6 +1309,12 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
 					goto free_map_tab;
 				}
 				break;
+			case BPF_KPTR_USER:
+				if (map->map_type != BPF_MAP_TYPE_TASK_STORAGE) {
+					ret = -EOPNOTSUPP;
+					goto free_map_tab;
+				}
+				break;
 			case BPF_LIST_HEAD:
 			case BPF_RB_ROOT:
 				if (map->map_type != BPF_MAP_TYPE_HASH &&
@@ -1618,11 +1778,15 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
 	struct bpf_map *map;
 	void *key, *value;
 	u32 value_size;
+	u64 extra_flags = 0;
 	struct fd f;
 	int err;

 	if (CHECK_ATTR(BPF_MAP_UPDATE_ELEM))
 		return -EINVAL;
+	/* Prevent userspace from setting any internal flags */
+	if (attr->flags & ~(BIT(BPF_MAP_UPDATE_FLAG_BITS) - 1))
+		return -EINVAL;

 	f = fdget(ufd);
 	map = __bpf_map_get(f);
@@ -1653,7 +1817,9 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
 		goto free_key;
 	}

-	err = bpf_map_update_value(map, f.file, key, value, attr->flags);
+	if (map->map_type == BPF_MAP_TYPE_TASK_STORAGE)
+		extra_flags |= BPF_FROM_USER;
+	err = bpf_map_update_value(map, f.file, key, value, attr->flags | extra_flags);
 	if (!err)
 		maybe_wait_bpf_programs(map);

@@ -1852,6 +2018,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 	void __user *keys = u64_to_user_ptr(attr->batch.keys);
 	u32 value_size, cp, max_count;
 	void *key, *value;
+	u64 extra_flags = 0;
 	int err = 0;

 	if (attr->batch.elem_flags & ~BPF_F_LOCK)
@@ -1881,6 +2048,8 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 		return -ENOMEM;
 	}

+	if (map->map_type == BPF_MAP_TYPE_TASK_STORAGE)
+		extra_flags |= BPF_FROM_USER;
 	for (cp = 0; cp < max_count; cp++) {
 		err = -EFAULT;
 		if (copy_from_user(key, keys + cp * map->key_size,
@@ -1889,7 +2058,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 			break;

 		err = bpf_map_update_value(map, map_file, key, value,
-					   attr->batch.elem_flags);
+					   attr->batch.elem_flags | extra_flags);

 		if (err)
 			break;
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index bc01b3aa6b0f..db5281384e6a 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -137,7 +137,7 @@ bpf_sk_storage_clone_elem(struct sock *newsk,
 {
 	struct bpf_local_storage_elem *copy_selem;

-	copy_selem = bpf_selem_alloc(smap, newsk, NULL, true, GFP_ATOMIC);
+	copy_selem = bpf_selem_alloc(smap, newsk, NULL, true, GFP_ATOMIC, false);
 	if (!copy_selem)
 		return NULL;


From patchwork Wed Aug 7 23:57:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kui-Feng Lee
X-Patchwork-Id: 13756874
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kui-Feng Lee
To: bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev, song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Cc: sinquersw@gmail.com, kuifeng@meta.com, Kui-Feng Lee
Subject: [RFC bpf-next 4/5] libbpf: define __kptr_user.
Date: Wed, 7 Aug 2024 16:57:54 -0700
Message-Id: <20240807235755.1435806-5-thinker.li@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240807235755.1435806-1-thinker.li@gmail.com>
References: <20240807235755.1435806-1-thinker.li@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

Make __kptr_user available to BPF programs to enable them to define user
kptrs.

Signed-off-by: Kui-Feng Lee
---
 tools/lib/bpf/bpf_helpers.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 305c62817dd3..8f7fb00b90e3 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -185,6 +185,7 @@ enum libbpf_tristate {
 #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted")))
 #define __kptr __attribute__((btf_type_tag("kptr")))
 #define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr")))
+#define __kptr_user __attribute__((btf_type_tag("kptr_user")))

 #if defined (__clang__)
 #define bpf_ksym_exists(sym) ({ \

From patchwork Wed Aug 7 23:57:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kui-Feng Lee
X-Patchwork-Id: 13756875
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kui-Feng Lee
To: bpf@vger.kernel.org, ast@kernel.org, martin.lau@linux.dev, song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Cc: sinquersw@gmail.com, kuifeng@meta.com, Kui-Feng Lee
Subject: [RFC bpf-next 5/5] selftests/bpf: test __kptr_user on the value of a task storage map.
Date: Wed, 7 Aug 2024 16:57:55 -0700
Message-Id: <20240807235755.1435806-6-thinker.li@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240807235755.1435806-1-thinker.li@gmail.com>
References: <20240807235755.1435806-1-thinker.li@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
X-Patchwork-State: RFC

Make sure the memory of user kptrs has been mapped to the kernel
properly. Also ensure the values of user kptrs in the kernel haven't been
copied to userspace.

Signed-off-by: Kui-Feng Lee
---
 .../bpf/prog_tests/task_local_storage.c | 122 ++++++++++++++++++
 .../selftests/bpf/progs/task_ls_kptr_user.c | 72 +++++++++++
 2 files changed, 194 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/task_ls_kptr_user.c

diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
index c33c05161a9e..17221024fb28 100644
--- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
+++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c
@@ -5,8 +5,10 @@
 #include
 #include
 #include
+#include
 #include /* For SYS_xxx definitions */
 #include
+#include
 #include
 #include "task_local_storage_helpers.h"
 #include "task_local_storage.skel.h"
@@ -14,6 +16,21 @@
 #include "task_ls_recursion.skel.h"
 #include "task_storage_nodeadlock.skel.h"

+struct user_data {
+	int a;
+	int b;
+	int result;
+};
+
+struct value_type {
+	struct user_data *udata_mmap;
+	struct user_data *udata;
+};
+
+#define MAGIC_VALUE 0xabcd1234
+
+#include "task_ls_kptr_user.skel.h"
+
 static void test_sys_enter_exit(void)
 {
 	struct task_local_storage *skel;
@@ -40,6 +57,109 @@ static void test_sys_enter_exit(void)
 	task_local_storage__destroy(skel);
 }

+static void test_kptr_user(void)
+{
+	struct user_data user_data = { .a = 1, .b = 2, .result = 0 };
+	struct user_data user_data_mmap_v = { .a = 7, .b = 7 };
+	struct task_ls_kptr_user *skel = NULL;
+	struct user_data *user_data_mmap;
+	int task_fd = -1, ev_fd = -1;
+	struct value_type value;
+	pid_t pid;
+	int err, wstatus;
+	__u64 dummy = 1;
+
+	user_data_mmap = mmap(NULL, sizeof(*user_data_mmap), PROT_READ | PROT_WRITE,
+			      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (!ASSERT_NEQ(user_data_mmap, MAP_FAILED, "mmap"))
+		return;
+
+	memcpy(user_data_mmap, &user_data_mmap_v, sizeof(*user_data_mmap));
+	value.udata_mmap = user_data_mmap;
+	value.udata = &user_data;
+
+	task_fd = sys_pidfd_open(getpid(), 0);
+	if (!ASSERT_NEQ(task_fd, -1, "sys_pidfd_open"))
+		goto out;
+
+	ev_fd = eventfd(0, 0);
+	if (!ASSERT_NEQ(ev_fd, -1, "eventfd"))
+		goto out;
+
+	skel = task_ls_kptr_user__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_and_load"))
+		goto out;
+
+	err = bpf_map_update_elem(bpf_map__fd(skel->maps.datamap), &task_fd, &value, 0);
+	if (!ASSERT_OK(err, "update_datamap"))
+		exit(1);
+
+	/* Make sure the BPF program can access the user_data_mmap even if
+	 * it's munmapped already.
+	 */
+	munmap(user_data_mmap, sizeof(*user_data_mmap));
+	user_data_mmap = NULL;
+
+	err = task_ls_kptr_user__attach(skel);
+	if (!ASSERT_OK(err, "skel_attach"))
+		goto out;
+
+	fflush(stdout);
+	fflush(stderr);
+
+	pid = fork();
+	if (pid < 0)
+		goto out;
+
+	/* Call syscall in the child process, but access the map value of
+	 * the parent process in the BPF program to check if the user kptr
+	 * is translated/mapped correctly.
+	 */
+	if (pid == 0) {
+		/* child */
+
+		/* Overwrite the user_data in the child process to check if
+		 * the BPF program accesses the user_data of the parent.
+		 */
+		user_data.a = 0;
+		user_data.b = 0;
+
+		/* Wait for the parent to set child_pid */
+		read(ev_fd, &dummy, sizeof(dummy));
+
+		exit(0);
+	}
+
+	skel->bss->parent_pid = syscall(SYS_gettid);
+	skel->bss->child_pid = pid;
+
+	write(ev_fd, &dummy, sizeof(dummy));
+
+	err = waitpid(pid, &wstatus, 0);
+	ASSERT_EQ(err, pid, "waitpid");
+	skel->bss->child_pid = 0;
+
+	ASSERT_EQ(MAGIC_VALUE + user_data.a + user_data.b +
+		  user_data_mmap_v.a + user_data_mmap_v.b,
+		  user_data.result, "result");
+
+	/* Check if user programs can access the value of user kptrs
+	 * through bpf_map_lookup_elem(). Make sure the kernel value is not
+	 * leaked.
+	 */
+	err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.datamap), &task_fd, &value);
+	if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+		goto out;
+	ASSERT_EQ(value.udata_mmap, NULL, "lookup_udata_mmap");
+	ASSERT_EQ(value.udata, NULL, "lookup_udata");
+
+out:
+	task_ls_kptr_user__destroy(skel);
+	close(ev_fd);
+	close(task_fd);
+	munmap(user_data_mmap, sizeof(*user_data_mmap));
+}
+
 static void test_exit_creds(void)
 {
 	struct task_local_storage_exit_creds *skel;
@@ -237,4 +357,6 @@ void test_task_local_storage(void)
 		test_recursion();
 	if (test__start_subtest("nodeadlock"))
 		test_nodeadlock();
+	if (test__start_subtest("kptr_user"))
+		test_kptr_user();
 }
diff --git a/tools/testing/selftests/bpf/progs/task_ls_kptr_user.c b/tools/testing/selftests/bpf/progs/task_ls_kptr_user.c
new file mode 100644
index 000000000000..ff5ca3a5da1e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/task_ls_kptr_user.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include
+#include
+#include "task_kfunc_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct user_data {
+	int a;
+	int b;
+	int result;
+};
+
+struct value_type {
+	struct user_data __kptr_user *udata_mmap;
+	struct user_data __kptr_user *udata;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct value_type);
+} datamap SEC(".maps");
+
+#define MAGIC_VALUE 0xabcd1234
+
+/* This is a workaround to avoid clang generating a forward reference for
+ * struct user_data. This is a known issue and will be fixed in the future.
+ */
+struct user_data __dummy;
+
+pid_t child_pid = 0;
+pid_t parent_pid = 0;
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+{
+	struct task_struct *task, *data_task;
+	struct value_type *ptr;
+	struct user_data *udata;
+	int acc;
+
+	task = bpf_get_current_task_btf();
+	if (task->pid != child_pid)
+		return 0;
+
+	data_task = bpf_task_from_pid(parent_pid);
+	if (!data_task)
+		return 0;
+
+	ptr = bpf_task_storage_get(&datamap, data_task, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	bpf_task_release(data_task);
+	if (!ptr)
+		return 0;
+
+	udata = ptr->udata_mmap;
+	if (!udata)
+		return 0;
+	acc = udata->a + udata->b;
+
+	udata = ptr->udata;
+	if (!udata)
+		return 0;
+	udata->result = MAGIC_VALUE + udata->a + udata->b + acc;
+
+	return 0;
+}
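
Note on the page-count arithmetic used by bpf_obj_trans_pin_uaddr() and
bpf_obj_unpin_uaddr() in patch 3/5: both compute how many pages the object
at a user address touches by rounding the end of the object up to a page
boundary and subtracting the start of the first page. The following is a
minimal userspace sketch of that calculation, not code from the series. It
assumes 4 KiB pages, and the SKETCH_* macros, sketch_span_pages() and
sketch_self_check() are hypothetical names introduced only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel takes these from arch headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical helper: number of pages touched by [addr, addr + size).
 * Mirrors the expression in the patch; size is assumed to be non-zero,
 * since the patch returns early when the type size is 0.
 */
static size_t sketch_span_pages(uintptr_t addr, size_t size)
{
	uintptr_t first = addr & SKETCH_PAGE_MASK;       /* start of first page */
	uintptr_t end = addr + size + ~SKETCH_PAGE_MASK; /* end + (PAGE_SIZE - 1) */

	return (end - first) >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical self-check covering a few boundary cases. */
static void sketch_self_check(void)
{
	assert(sketch_span_pages(0, 1) == 1);       /* single byte, one page */
	assert(sketch_span_pages(4096, 4096) == 1); /* exactly one aligned page */
	assert(sketch_span_pages(4090, 12) == 2);   /* straddles a page boundary */
	assert(sketch_span_pages(4095, 2) == 2);    /* last byte of one page, first of next */
}
```

Under these assumptions, a 12-byte object such as struct user_data from the
selftest spans at most two pages, well below KPTR_USER_MAX_PAGES.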