From patchwork Tue Feb 6 22:04:26 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547819
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
    memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com,
    hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 01/16] bpf: Allow kfuncs return 'void *'
Date: Tue, 6 Feb 2024 14:04:26 -0800
Message-Id: <20240206220441.38311-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Recognize return of 'void *' from kfunc as returning unknown scalar.

Signed-off-by: Alexei Starovoitov
Acked-by: Andrii Nakryiko
Acked-by: David Vernet
---
 kernel/bpf/verifier.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ddaf09db1175..d9c2dbb3939f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12353,6 +12353,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 					meta.func_name);
 				return -EFAULT;
 			}
+		} else if (btf_type_is_void(ptr_type)) {
+			/* kfunc returning 'void *' is equivalent to returning scalar */
+			mark_reg_unknown(env, regs, BPF_REG_0);
 		} else if (!__btf_type_is_struct(ptr_type)) {
 			if (!meta.r0_size) {
 				__u32 sz;
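
To illustrate what the patch above enables from the bpf-program side, here is a
minimal sketch. bpf_example_cookie() is a hypothetical kfunc used purely for
illustration (it is not part of this series); the point is that its 'void *'
return value is now accepted and treated as an opaque scalar:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical kfunc returning 'void *'; with this patch the verifier
 * marks r0 as an unknown scalar instead of rejecting the call.
 */
extern void *bpf_example_cookie(void) __ksym;

SEC("tc")
int use_cookie(struct __sk_buff *skb)
{
	void *cookie = bpf_example_cookie();

	/* Scalar uses (compare, store, pass around) are fine ... */
	if (!cookie)
		return 0;
	/* ... but dereferencing it, e.g. *(long *)cookie, is still rejected. */
	return 1;
}

char LICENSE[] SEC("license") = "GPL";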
From patchwork Tue Feb 6 22:04:27 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547820
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
    memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com,
    hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 02/16] bpf: Recognize '__map' suffix in kfunc arguments
Date: Tue, 6 Feb 2024 14:04:27 -0800
Message-Id: <20240206220441.38311-3-alexei.starovoitov@gmail.com>
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Recognize a 'void *p__map' kfunc argument as 'struct bpf_map *p__map'.
It allows a kfunc to have a 'void *' argument for maps, since bpf progs
will call it as:

struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        ...
} arena SEC(".maps");

bpf_kfunc_with_map(... &arena ...);

Underneath, libbpf will load CONST_PTR_TO_MAP into the register via an
ld_imm64 insn. If the kfunc was defined with 'struct bpf_map *' it would
pass the verifier, but the bpf prog would need to use '(void *)&arena',
which is not clean.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/verifier.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d9c2dbb3939f..db569ce89fb1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10741,6 +10741,11 @@ static bool is_kfunc_arg_ignore(const struct btf *btf, const struct btf_param *a
 	return __kfunc_param_match_suffix(btf, arg, "__ign");
 }
 
+static bool is_kfunc_arg_map(const struct btf *btf, const struct btf_param *arg)
+{
+	return __kfunc_param_match_suffix(btf, arg, "__map");
+}
+
 static bool is_kfunc_arg_alloc_obj(const struct btf *btf, const struct btf_param *arg)
 {
 	return __kfunc_param_match_suffix(btf, arg, "__alloc");
@@ -11064,7 +11069,7 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
 		return KF_ARG_PTR_TO_CONST_STR;
 
 	if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) {
-		if (!btf_type_is_struct(ref_t)) {
+		if (!btf_type_is_struct(ref_t) && !btf_type_is_void(ref_t)) {
 			verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n",
 				meta->func_name, argno, btf_type_str(ref_t), ref_tname);
 			return -EINVAL;
@@ -11660,6 +11665,13 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 		if (kf_arg_type < 0)
 			return kf_arg_type;
 
+		if (is_kfunc_arg_map(btf, &args[i])) {
+			/* If argument has '__map' suffix expect 'struct bpf_map *' */
+			ref_id = *reg2btf_ids[CONST_PTR_TO_MAP];
+			ref_t = btf_type_by_id(btf_vmlinux, ref_id);
+			ref_tname = btf_name_by_offset(btf, ref_t->name_off);
+		}
+
 		switch (kf_arg_type) {
 		case KF_ARG_PTR_TO_NULL:
 			continue;
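
To make the kernel-side calling convention concrete, here is a hedged sketch of
a kfunc using the new suffix. bpf_example_with_map() is invented for
illustration and modeled on bpf_arena_alloc_pages() later in this series; the
'__map' suffix makes the verifier require a CONST_PTR_TO_MAP argument even
though the C prototype says 'void *':

#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/errno.h>

__bpf_kfunc_start_defs();

/* 'p__map' is declared 'void *', but because of the '__map' suffix the
 * verifier only accepts 'struct bpf_map *' (e.g. '&arena' from the bpf
 * program), so the plain assignment below is safe.
 */
__bpf_kfunc int bpf_example_with_map(void *p__map)
{
	struct bpf_map *map = p__map;

	return map->map_type == BPF_MAP_TYPE_ARENA ? 0 : -EINVAL;
}

__bpf_kfunc_end_defs();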
From patchwork Tue Feb 6 22:04:28 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547821
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
    memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com,
    hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 03/16] mm: Expose vmap_pages_range() to the rest of the kernel.
Date: Tue, 6 Feb 2024 14:04:28 -0800
Message-Id: <20240206220441.38311-4-alexei.starovoitov@gmail.com>
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

The next commit will introduce bpf_arena, which is a sparsely populated
shared memory region between a bpf program and a user space process.
It will function similarly to vmalloc()/vm_map_ram():
- get_vm_area()
- alloc_pages()
- vmap_pages_range()

Signed-off-by: Alexei Starovoitov
---
 include/linux/vmalloc.h | 2 ++
 mm/vmalloc.c            | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c720be70c8dd..bafb87c69e3d 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -233,6 +233,8 @@ static inline bool is_vm_area_hugepages(const void *addr)
 
 #ifdef CONFIG_MMU
 void vunmap_range(unsigned long addr, unsigned long end);
+int vmap_pages_range(unsigned long addr, unsigned long end,
+		     pgprot_t prot, struct page **pages, unsigned int page_shift);
 static inline void set_vm_flush_reset_perms(void *addr)
 {
 	struct vm_struct *vm = find_vm_area(addr);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..eae93d575d1b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -625,8 +625,8 @@ int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
  * RETURNS:
  * 0 on success, -errno on failure.
  */
-static int vmap_pages_range(unsigned long addr, unsigned long end,
-		pgprot_t prot, struct page **pages, unsigned int page_shift)
+int vmap_pages_range(unsigned long addr, unsigned long end,
+		pgprot_t prot, struct page **pages, unsigned int page_shift)
 {
 	int err;
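
As a rough illustration of the allocation pattern named in the commit message
above (get_vm_area() + alloc_pages() + vmap_pages_range()), a kernel-side
sketch could look like the following. example_map_one_page() is not part of the
patch and drops cleanup on error paths for brevity:

#include <linux/vmalloc.h>
#include <linux/gfp.h>
#include <linux/mm.h>

static void *example_map_one_page(void)
{
	struct vm_struct *area = get_vm_area(PAGE_SIZE, VM_MAP);
	struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);
	unsigned long addr;

	if (!area || !page)
		return NULL;

	addr = (unsigned long)area->addr;
	/* The newly exported helper maps the page into the reserved range. */
	if (vmap_pages_range(addr, addr + PAGE_SIZE, PAGE_KERNEL, &page, PAGE_SHIFT))
		return NULL;

	return area->addr;
}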
From patchwork Tue Feb 6 22:04:29 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547822
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
    memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com,
    hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 04/16] bpf: Introduce bpf_arena.
Date: Tue, 6 Feb 2024 14:04:29 -0800
Message-Id: <20240206220441.38311-5-alexei.starovoitov@gmail.com>
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce bpf_arena, which is a sparse shared memory region between the
bpf program and user space.

Use cases:
1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed
   anonymous region, like memcached or any key/value storage. The bpf
   program implements an in-kernel accelerator. An XDP prog can search
   for a key in bpf_arena and return a value without going to user space.
2. The bpf program builds arbitrary data structures in bpf_arena (hash
   tables, rb-trees, sparse arrays), while user space occasionally
   consumes it.
3. bpf_arena is a "heap" of memory from the bpf program's point of view.
   It is not shared with user space.

Initially, the kernel vm_area and user vma are not populated. User space
can fault in pages within the range. While servicing a page fault,
bpf_arena logic will insert a new page into the kernel and user vmas. The
bpf program can allocate pages from that region via
bpf_arena_alloc_pages(). This kernel function will insert pages into the
kernel vm_area. The subsequent fault-in from user space will populate
that page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena
creation time can be used to prevent fault-in from user space. In such a
case, if a page is not allocated by the bpf program and not present in
the kernel vm_area, the user process will segfault. This is useful for
use cases 2 and 3 above.

bpf_arena_alloc_pages() is similar to user space mmap(). It allocates
pages either at a specific address within the arena or at a range chosen
by the maple tree. bpf_arena_free_pages() is analogous to munmap(): it
frees pages and removes the range from the kernel vm_area and from user
process vmas.

bpf_arena can be used as a bpf program "heap" of up to 4GB. The memory is
not shared with user space. This is use case 3. In such a case, the
BPF_F_NO_USER_CONV flag is recommended. It will tell the verifier to
treat the rX = bpf_arena_cast_user(rY) instruction as a 32-bit move
wX = wY, which will improve bpf prog performance. Otherwise,
bpf_arena_cast_user is translated by the JIT to conditionally add the
upper 32 bits of user vm_start (if the pointer is not NULL) to arena
pointers before they are stored into memory. This way, user space sees
them as valid 64-bit pointers.

Diff https://github.com/llvm/llvm-project/pull/79902 taught the LLVM BPF
backend to generate the bpf_cast_kern() instruction before a dereference
of an arena pointer and the bpf_cast_user() instruction when an arena
pointer is formed. In a typical bpf program there will be very few
bpf_cast_user() calls. From LLVM's point of view, arena pointers are
tagged as __attribute__((address_space(1))). Hence, clang provides
helpful diagnostics when pointers cross address spaces. Libbpf and the
kernel support only address_space == 1. All other address space
identifiers are reserved.

rX = bpf_cast_kern(rY, addr_space) tells the verifier that
rX->type = PTR_TO_ARENA. Any further operations on a PTR_TO_ARENA
register have to be in the 32-bit domain. The verifier will mark
load/store through PTR_TO_ARENA with PROBE_MEM32. The JIT will generate
them as kern_vm_start + 32bit_addr memory accesses. The behavior is
similar to copy_from_kernel_nofault() except that no address checks are
necessary. The address is guaranteed to be in the 4GB range. If the page
is not present, the destination register is zeroed on read, and the
operation is ignored on write.

rX = bpf_cast_user(rY, addr_space) tells the verifier that
rX->type = unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV
set, then the verifier converts cast_user to mov32. Otherwise, the JIT
will emit native code equivalent to:

rX = (u32)rY;
if (rX)
        rX |= arena->user_vm_start & ~(u64)~0U;

After such a conversion, the pointer becomes a valid user pointer within
the bpf_arena range. The user process can access data structures created
in bpf_arena without any additional computations. For example, a linked
list built by a bpf program can be walked natively by user space.
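
To make the program-facing API concrete, here is a hedged usage sketch of the
arena from the bpf side. The map definition and extern declarations follow the
shape described above and in the selftests later in this series; details such
as how max_entries is interpreted for arenas may differ, so treat this as
illustrative only:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(key_size, 8);
	__uint(value_size, 8);
	__uint(max_entries, 1);
} arena SEC(".maps");

extern void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, __u32 page_cnt,
				   int node_id, __u64 flags) __ksym;
extern void bpf_arena_free_pages(void *p__map, void *ptr__ign, __u32 page_cnt) __ksym;

SEC("syscall")
int alloc_and_free_one_page(void *ctx)
{
	/* node_id == -1: no NUMA preference (NUMA_NO_NODE) */
	void *page = bpf_arena_alloc_pages(&arena, NULL, 1, -1, 0);

	if (!page)
		return 1;
	bpf_arena_free_pages(&arena, page, 1);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";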
Signed-off-by: Alexei Starovoitov --- include/linux/bpf.h | 5 +- include/linux/bpf_types.h | 1 + include/uapi/linux/bpf.h | 7 + kernel/bpf/Makefile | 3 + kernel/bpf/arena.c | 518 +++++++++++++++++++++++++++++++++ kernel/bpf/core.c | 11 + kernel/bpf/syscall.c | 3 + kernel/bpf/verifier.c | 1 + tools/include/uapi/linux/bpf.h | 7 + 9 files changed, 554 insertions(+), 2 deletions(-) create mode 100644 kernel/bpf/arena.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 1ebbee1d648e..42f22bc881f0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -37,6 +37,7 @@ struct perf_event; struct bpf_prog; struct bpf_prog_aux; struct bpf_map; +struct bpf_arena; struct sock; struct seq_file; struct btf; @@ -531,8 +532,8 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head, struct bpf_spin_lock *spin_lock); void bpf_rb_root_free(const struct btf_field *field, void *rb_root, struct bpf_spin_lock *spin_lock); - - +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena); +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena); int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size); struct bpf_offload_dev; diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 94baced5a1ad..9f2a6b83b49e 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -132,6 +132,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_STRUCT_OPS, bpf_struct_ops_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops) +BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops) BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint) BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d96708380e52..f6648851eae6 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -983,6 +983,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + BPF_MAP_TYPE_ARENA, __MAX_BPF_MAP_TYPE }; @@ -1370,6 +1371,12 @@ enum { /* BPF token FD is passed in a corresponding command's token_fd field */ BPF_F_TOKEN_FD = (1U << 16), + +/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */ + BPF_F_SEGV_ON_FAULT = (1U << 17), + +/* Do not translate kernel bpf_arena pointers to user pointers */ + BPF_F_NO_USER_CONV = (1U << 18), }; /* Flags for BPF_PROG_QUERY. */ diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 4ce95acfcaa7..368c5d86b5b7 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -15,6 +15,9 @@ obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o +ifeq ($(CONFIG_MMU)$(CONFIG_64BIT),yy) +obj-$(CONFIG_BPF_SYSCALL) += arena.o +endif obj-$(CONFIG_BPF_JIT) += dispatcher.o ifeq ($(CONFIG_NET),y) obj-$(CONFIG_BPF_SYSCALL) += devmap.o diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c new file mode 100644 index 000000000000..9db720321700 --- /dev/null +++ b/kernel/bpf/arena.c @@ -0,0 +1,518 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#include +#include +#include +#include +#include +#include + +/* + * bpf_arena is a sparsely populated shared memory region between bpf program and + * user space process. 
+ * + * For example on x86-64 the values could be: + * user_vm_start 7f7d26200000 // picked by mmap() + * kern_vm_start ffffc90001e69000 // picked by get_vm_area() + * For user space all pointers within the arena are normal 8-byte addresses. + * In this example 7f7d26200000 is the address of the first page (pgoff=0). + * The bpf program will access it as: kern_vm_start + lower_32bit_of_user_ptr + * (u32)7f7d26200000 -> 26200000 + * hence + * ffffc90001e69000 + 26200000 == ffffc90028069000 is "pgoff=0" within 4Gb + * kernel memory region. + * + * BPF JITs generate the following code to access arena: + * mov eax, eax // eax has lower 32-bit of user pointer + * mov word ptr [rax + r12 + off], bx + * where r12 == kern_vm_start and off is s16. + * Hence allocate 4Gb + GUARD_SZ/2 on each side. + * + * Initially kernel vm_area and user vma are not populated. + * User space can fault-in any address which will insert the page + * into kernel and user vma. + * bpf program can allocate a page via bpf_arena_alloc_pages() kfunc + * which will insert it into kernel vm_area. + * The later fault-in from user space will populate that page into user vma. + */ + +/* number of bytes addressable by LDX/STX insn with 16-bit 'off' field */ +#define GUARD_SZ (1ull << sizeof(((struct bpf_insn *)0)->off) * 8) +#define KERN_VM_SZ ((1ull << 32) + GUARD_SZ) + +struct bpf_arena { + struct bpf_map map; + u64 user_vm_start; + u64 user_vm_end; + struct vm_struct *kern_vm; + struct maple_tree mt; + struct list_head vma_list; + struct mutex lock; +}; + +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena) +{ + return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0; +} + +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena) +{ + return arena ? arena->user_vm_start : 0; +} + +static long arena_map_peek_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags) +{ + return -EOPNOTSUPP; +} + +static long arena_map_pop_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static long arena_map_delete_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key) +{ + return -EOPNOTSUPP; +} + +static long compute_pgoff(struct bpf_arena *arena, long uaddr) +{ + return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT; +} + +#define MT_ENTRY ((void *)&arena_map_ops) /* unused. has to be valid pointer */ + +/* + * Reserve a "zero page", so that bpf prog and user space never see + * a pointer to arena with lower 32 bits being zero. + * bpf_cast_user() promotes it to full 64-bit NULL. 
+ */ +static int reserve_zero_page(struct bpf_arena *arena) +{ + long pgoff = compute_pgoff(arena, 0); + + return mtree_insert(&arena->mt, pgoff, MT_ENTRY, GFP_KERNEL); +} + +static struct bpf_map *arena_map_alloc(union bpf_attr *attr) +{ + struct vm_struct *kern_vm; + int numa_node = bpf_map_attr_numa_node(attr); + struct bpf_arena *arena; + int err = -ENOMEM; + + if (attr->key_size != 8 || attr->value_size != 8 || + /* BPF_F_MMAPABLE must be set */ + !(attr->map_flags & BPF_F_MMAPABLE) || + /* No unsupported flags present */ + (attr->map_flags & ~(BPF_F_SEGV_ON_FAULT | BPF_F_MMAPABLE | BPF_F_NO_USER_CONV))) + return ERR_PTR(-EINVAL); + + if (attr->map_extra & ~PAGE_MASK) + /* If non-zero the map_extra is an expected user VMA start address */ + return ERR_PTR(-EINVAL); + + kern_vm = get_vm_area(KERN_VM_SZ, VM_MAP | VM_USERMAP); + if (!kern_vm) + return ERR_PTR(-ENOMEM); + + arena = bpf_map_area_alloc(sizeof(*arena), numa_node); + if (!arena) + goto err; + + INIT_LIST_HEAD(&arena->vma_list); + arena->kern_vm = kern_vm; + arena->user_vm_start = attr->map_extra; + bpf_map_init_from_attr(&arena->map, attr); + mt_init_flags(&arena->mt, MT_FLAGS_ALLOC_RANGE); + mutex_init(&arena->lock); + if (arena->user_vm_start) { + err = reserve_zero_page(arena); + if (err) { + bpf_map_area_free(arena); + goto err; + } + } + + return &arena->map; +err: + free_vm_area(kern_vm); + return ERR_PTR(err); +} + +static int for_each_pte(pte_t *ptep, unsigned long addr, void *data) +{ + struct page *page; + pte_t pte; + + pte = ptep_get(ptep); + if (!pte_present(pte)) + return 0; + page = pte_page(pte); + /* + * We do not update pte here: + * 1. Nobody should be accessing bpf_arena's range outside of a kernel bug + * 2. TLB flushing is batched or deferred. Even if we clear pte, + * the TLB entries can stick around and continue to permit access to + * the freed page. So it all relies on 1. + */ + __free_page(page); + return 0; +} + +static void arena_map_free(struct bpf_map *map) +{ + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + /* + * Check that user vma-s are not around when bpf map is freed. + * mmap() holds vm_file which holds bpf_map refcnt. + * munmap() must have happened on vma followed by arena_vm_close() + * which would clear arena->vma_list. + */ + if (WARN_ON_ONCE(!list_empty(&arena->vma_list))) + return; + + /* + * free_vm_area() calls remove_vm_area() that calls free_unmap_vmap_area(). + * It unmaps everything from vmalloc area and clears pgtables. + * Call apply_to_existing_page_range() first to find populated ptes and + * free those pages. 
+ */ + apply_to_existing_page_range(&init_mm, bpf_arena_get_kern_vm_start(arena), + KERN_VM_SZ - GUARD_SZ / 2, for_each_pte, NULL); + free_vm_area(arena->kern_vm); + mtree_destroy(&arena->mt); + bpf_map_area_free(arena); +} + +static void *arena_map_lookup_elem(struct bpf_map *map, void *key) +{ + return ERR_PTR(-EINVAL); +} + +static long arena_map_update_elem(struct bpf_map *map, void *key, + void *value, u64 flags) +{ + return -EOPNOTSUPP; +} + +static int arena_map_check_btf(const struct bpf_map *map, const struct btf *btf, + const struct btf_type *key_type, const struct btf_type *value_type) +{ + return 0; +} + +static u64 arena_map_mem_usage(const struct bpf_map *map) +{ + return 0; +} + +struct vma_list { + struct vm_area_struct *vma; + struct list_head head; +}; + +static int remember_vma(struct bpf_arena *arena, struct vm_area_struct *vma) +{ + struct vma_list *vml; + + vml = kmalloc(sizeof(*vml), GFP_KERNEL); + if (!vml) + return -ENOMEM; + vma->vm_private_data = vml; + vml->vma = vma; + list_add(&vml->head, &arena->vma_list); + return 0; +} + +static void arena_vm_close(struct vm_area_struct *vma) +{ + struct vma_list *vml; + + vml = vma->vm_private_data; + list_del(&vml->head); + vma->vm_private_data = NULL; + kfree(vml); +} + +static vm_fault_t arena_vm_fault(struct vm_fault *vmf) +{ + struct bpf_map *map = vmf->vma->vm_file->private_data; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + struct page *page; + long kbase, kaddr; + int ret; + + kbase = bpf_arena_get_kern_vm_start(arena); + kaddr = kbase + (u32)(vmf->address & PAGE_MASK); + + guard(mutex)(&arena->lock); + page = vmalloc_to_page((void *)kaddr); + if (page) + /* already have a page vmap-ed */ + goto out; + + if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT) + /* User space requested to segfault when page is not allocated by bpf prog */ + return VM_FAULT_SIGSEGV; + + ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL); + if (ret == -EEXIST) + return VM_FAULT_RETRY; + if (ret) + return VM_FAULT_SIGSEGV; + + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) { + mtree_erase(&arena->mt, vmf->pgoff); + return VM_FAULT_SIGSEGV; + } + + ret = vmap_pages_range(kaddr, kaddr + PAGE_SIZE, PAGE_KERNEL, &page, PAGE_SHIFT); + if (ret) { + mtree_erase(&arena->mt, vmf->pgoff); + __free_page(page); + return VM_FAULT_SIGSEGV; + } +out: + page_ref_add(page, 1); + vmf->page = page; + return 0; +} + +static const struct vm_operations_struct arena_vm_ops = { + .close = arena_vm_close, + .fault = arena_vm_fault, +}; + +static int arena_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) +{ + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + int err; + + if (arena->user_vm_start && arena->user_vm_start != vma->vm_start) + /* + * 1st user process can do mmap(NULL, ...) 
to pick user_vm_start + * 2nd user process must pass the same addr to mmap(addr, MAP_FIXED..); + * or + * specify addr in map_extra at map creation time and + * use the same addr later with mmap(addr, MAP_FIXED..); + */ + return -EBUSY; + + if (arena->user_vm_end && arena->user_vm_end != vma->vm_end) + /* all user processes must have the same size of mmap-ed region */ + return -EBUSY; + + if (vma->vm_end - vma->vm_start > 1ull << 32) + /* Must not be bigger than 4Gb */ + return -E2BIG; + + if (remember_vma(arena, vma)) + return -ENOMEM; + + if (!arena->user_vm_start) { + arena->user_vm_start = vma->vm_start; + err = reserve_zero_page(arena); + if (err) + return err; + } + arena->user_vm_end = vma->vm_end; + /* + * bpf_map_mmap() checks that it's being mmaped as VM_SHARED and + * clears VM_MAYEXEC. Set VM_DONTEXPAND as well to avoid + * potential change of user_vm_start. + */ + vm_flags_set(vma, VM_DONTEXPAND); + vma->vm_ops = &arena_vm_ops; + return 0; +} + +BTF_ID_LIST_SINGLE(bpf_arena_map_btf_ids, struct, bpf_arena) +const struct bpf_map_ops arena_map_ops = { + .map_meta_equal = bpf_map_meta_equal, + .map_alloc = arena_map_alloc, + .map_free = arena_map_free, + .map_mmap = arena_map_mmap, + .map_get_next_key = arena_map_get_next_key, + .map_push_elem = arena_map_push_elem, + .map_peek_elem = arena_map_peek_elem, + .map_pop_elem = arena_map_pop_elem, + .map_lookup_elem = arena_map_lookup_elem, + .map_update_elem = arena_map_update_elem, + .map_delete_elem = arena_map_delete_elem, + .map_check_btf = arena_map_check_btf, + .map_mem_usage = arena_map_mem_usage, + .map_btf_id = &bpf_arena_map_btf_ids[0], +}; + +static u64 clear_lo32(u64 val) +{ + return val & ~(u64)~0U; +} + +/* + * Allocate pages and vmap them into kernel vmalloc area. + * Later the pages will be mmaped into user space vma. 
+ */ +static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id) +{ + long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT; + u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena); + long pgoff = 0, kaddr, nr_pages = 0; + struct page **pages; + int ret, i; + + if (page_cnt >= page_cnt_max) + return 0; + + if (uaddr) { + if (uaddr & ~PAGE_MASK) + return 0; + pgoff = compute_pgoff(arena, uaddr); + if (pgoff + page_cnt > page_cnt_max) + /* requested address will be outside of user VMA */ + return 0; + } + + /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */ + pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL); + if (!pages) + return 0; + + guard(mutex)(&arena->lock); + + if (uaddr) + ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt, + MT_ENTRY, GFP_KERNEL); + else + ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY, + page_cnt, 0, page_cnt_max, GFP_KERNEL); + if (ret) + goto out_free_pages; + + nr_pages = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_ZERO, node_id, page_cnt, pages); + if (nr_pages != page_cnt) + goto out; + + kaddr = kern_vm_start + (u32)(arena->user_vm_start + pgoff * PAGE_SIZE); + ret = vmap_pages_range(kaddr, kaddr + PAGE_SIZE * page_cnt, PAGE_KERNEL, + pages, PAGE_SHIFT); + if (ret) + goto out; + kvfree(pages); + return clear_lo32(arena->user_vm_start) + (u32)(kaddr - kern_vm_start); +out: + mtree_erase(&arena->mt, pgoff); +out_free_pages: + if (pages) + for (i = 0; i < nr_pages; i++) + __free_page(pages[i]); + kvfree(pages); + return 0; +} + +/* + * If page is present in vmalloc area, unmap it from vmalloc area, + * unmap it from all user space vma-s, + * and free it. + */ +static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt) +{ + struct vma_list *vml; + + list_for_each_entry(vml, &arena->vma_list, head) + zap_page_range_single(vml->vma, uaddr, + PAGE_SIZE * page_cnt, NULL); +} + +static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt) +{ + u64 full_uaddr, uaddr_end; + long kaddr, pgoff, i; + struct page *page; + + /* only aligned lower 32-bit are relevant */ + uaddr = (u32)uaddr; + uaddr &= PAGE_MASK; + full_uaddr = clear_lo32(arena->user_vm_start) + uaddr; + uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT)); + if (full_uaddr >= uaddr_end) + return; + + page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT; + + kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr; + + guard(mutex)(&arena->lock); + + pgoff = compute_pgoff(arena, uaddr); + /* clear range */ + mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt, NULL, GFP_KERNEL); + + if (page_cnt > 1) + /* bulk zap if multiple pages being freed */ + zap_pages(arena, full_uaddr, page_cnt); + + for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) { + page = vmalloc_to_page((void *)kaddr); + if (!page) + continue; + if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */ + zap_pages(arena, full_uaddr, 1); + vunmap_range(kaddr, kaddr + PAGE_SIZE); + __free_page(page); + } +} + +__bpf_kfunc_start_defs(); + +__bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt, + int node_id, u64 flags) +{ + struct bpf_map *map = p__map; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + if (map->map_type != BPF_MAP_TYPE_ARENA || !arena->user_vm_start || flags) + return NULL; + + return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id); +} + 
+__bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt) +{ + struct bpf_map *map = p__map; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + if (map->map_type != BPF_MAP_TYPE_ARENA || !arena->user_vm_start) + return; + arena_free_pages(arena, (long)ptr__ign, page_cnt); +} +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(arena_kfuncs) +BTF_ID_FLAGS(func, bpf_arena_alloc_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE) +BTF_ID_FLAGS(func, bpf_arena_free_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE) +BTF_KFUNCS_END(arena_kfuncs) + +static const struct btf_kfunc_id_set common_kfunc_set = { + .owner = THIS_MODULE, + .set = &arena_kfuncs, +}; + +static int __init kfunc_init(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &common_kfunc_set); +} +late_initcall(kfunc_init); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 71c459a51d9e..2539d9bfe369 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2970,6 +2970,17 @@ void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, { } +/* for configs without MMU or 32-bit */ +__weak const struct bpf_map_ops arena_map_ops; +__weak u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena) +{ + return 0; +} +__weak u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena) +{ + return 0; +} + #ifdef CONFIG_BPF_SYSCALL static int __init bpf_global_ma_init(void) { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index b2750b79ac80..ac0e4a8bb852 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -164,6 +164,7 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, if (bpf_map_is_offloaded(map)) { return bpf_map_offload_update_elem(map, key, value, flags); } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || + map->map_type == BPF_MAP_TYPE_ARENA || map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { return map->ops->map_update_elem(map, key, value, flags); } else if (map->map_type == BPF_MAP_TYPE_SOCKHASH || @@ -1160,6 +1161,7 @@ static int map_create(union bpf_attr *attr) } if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER && + attr->map_type != BPF_MAP_TYPE_ARENA && attr->map_extra != 0) return -EINVAL; @@ -1249,6 +1251,7 @@ static int map_create(union bpf_attr *attr) case BPF_MAP_TYPE_LRU_PERCPU_HASH: case BPF_MAP_TYPE_STRUCT_OPS: case BPF_MAP_TYPE_CPUMAP: + case BPF_MAP_TYPE_ARENA: if (!bpf_token_capable(token, CAP_BPF)) goto put_token; break; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index db569ce89fb1..3c77a3ab1192 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -18047,6 +18047,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, case BPF_MAP_TYPE_SK_STORAGE: case BPF_MAP_TYPE_TASK_STORAGE: case BPF_MAP_TYPE_CGRP_STORAGE: + case BPF_MAP_TYPE_ARENA: break; default: verbose(env, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d96708380e52..f6648851eae6 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -983,6 +983,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + BPF_MAP_TYPE_ARENA, __MAX_BPF_MAP_TYPE }; @@ -1370,6 +1371,12 @@ enum { /* BPF token FD is passed in a corresponding command's token_fd field */ BPF_F_TOKEN_FD = (1U << 16), + +/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */ + BPF_F_SEGV_ON_FAULT = (1U << 17), + +/* Do not translate kernel bpf_arena pointers to user pointers */ + BPF_F_NO_USER_CONV = (1U << 18), }; 
 /* Flags for BPF_PROG_QUERY. */

From patchwork Tue Feb 6 22:04:30 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547823
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org,
    memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com,
    hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 05/16] bpf: Disasm support for cast_kern/user instructions.
Date: Tue, 6 Feb 2024 14:04:30 -0800
Message-Id: <20240206220441.38311-6-alexei.starovoitov@gmail.com>
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

LLVM generates rX = bpf_cast_kern/_user(rY, address_space) instructions
when pointers in a non-zero address space are used by the bpf program.

Signed-off-by: Alexei Starovoitov
---
 include/uapi/linux/bpf.h       |  5 +++++
 kernel/bpf/disasm.c            | 11 +++++++++++
 tools/include/uapi/linux/bpf.h |  5 +++++
 3 files changed, 21 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f6648851eae6..3de1581379d4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1313,6 +1313,11 @@ enum {
  */
 #define BPF_PSEUDO_KFUNC_CALL	2
 
+enum bpf_arena_cast_kinds {
+	BPF_ARENA_CAST_KERN = 1,
+	BPF_ARENA_CAST_USER = 2,
+};
+
 /* flags for BPF_MAP_UPDATE_ELEM command */
 enum {
 	BPF_ANY		= 0, /* create new element or update existing */
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 49940c26a227..37d9b37b34f7 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -166,6 +166,12 @@ static bool is_movsx(const struct bpf_insn *insn)
 		(insn->off == 8 || insn->off == 16 || insn->off == 32);
 }
 
+static bool is_arena_cast(const struct bpf_insn *insn)
+{
+	return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) &&
+	       (insn->off == BPF_ARENA_CAST_KERN || insn->off == BPF_ARENA_CAST_USER);
+}
+
 void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 		    const struct bpf_insn *insn,
 		    bool allow_ptr_leaks)
@@ -184,6 +190,11 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 			insn->code, class == BPF_ALU ? 'w' : 'r',
 			insn->dst_reg, class == BPF_ALU ? 'w' : 'r',
 			insn->dst_reg);
+	} else if (is_arena_cast(insn)) {
+		verbose(cbs->private_data, "(%02x) r%d = cast_%s(r%d, %d)\n",
+			insn->code, insn->dst_reg,
+			insn->off == BPF_ARENA_CAST_KERN ? "kern" : "user",
+			insn->src_reg, insn->imm);
 	} else if (BPF_SRC(insn->code) == BPF_X) {
 		verbose(cbs->private_data, "(%02x) %c%d %s %s%c%d\n",
 			insn->code, class == BPF_ALU ? 'w' : 'r',
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f6648851eae6..3de1581379d4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1313,6 +1313,11 @@ enum {
  */
 #define BPF_PSEUDO_KFUNC_CALL	2
 
+enum bpf_arena_cast_kinds {
+	BPF_ARENA_CAST_KERN = 1,
+	BPF_ARENA_CAST_USER = 2,
+};
+
 /* flags for BPF_MAP_UPDATE_ELEM command */
 enum {
 	BPF_ANY		= 0, /* create new element or update existing */
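
A small host-side sketch of the instruction shape the disassembler recognizes
above (the arena_cast() helper is invented for illustration): an arena cast is
an ordinary BPF_ALU64 | BPF_MOV | BPF_X insn whose 'off' field selects the cast
kind and whose 'imm' carries the address space, which the new branch prints as
"rX = cast_kern(rY, 1)":

#include <linux/bpf.h>

/* Build an arena cast instruction as described in the commit message:
 * dst = bpf_cast_kern(src, addr_space) or bpf_cast_user(src, addr_space).
 */
static inline struct bpf_insn arena_cast(__u8 dst, __u8 src, __s16 kind, __s32 addr_space)
{
	return (struct bpf_insn) {
		.code    = BPF_ALU64 | BPF_MOV | BPF_X,
		.dst_reg = dst,
		.src_reg = src,
		.off     = kind,	/* BPF_ARENA_CAST_KERN or BPF_ARENA_CAST_USER */
		.imm     = addr_space,	/* only address_space == 1 is supported */
	};
}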
b=EuZiFkK7A7WZdW1pKercCX5bLl1DxbekV1r7tHVb8xID3tng3Cy1Xcq70t4EkwJ6Ak qXw+dkp9lKKV2zIQTfsfoBgQInEyIcgp7MplxXlCkVM9j3dwZbtQc8BvQ9siozPhbQ9x IgQj5HOTcZWSKKACQoGjEdyDtpfix2ZTqdvTbgD7TBDIFf6n7pg5gWbmuXZb1Y6h7vks CBsY8b8FNUU3LAgXsUV3Aojmged3u8xeCRSKz2rz4hNostNVwr4SO9nYBDJ4pgVwnXBc p7ZbWTzXAt9aJZESKIyE5TANQFsmyK37vl44S6GZLTjBM5CvxLlWiTpF08VlRwGU87HB TVGQ== X-Gm-Message-State: AOJu0Yxj6wkQMDSUX9qVILt8Kr94nEXQ7tefvXfrjh0Yxtwh+Drnqu/9 InPPNwC0/XW+q9MVEV3q8ybqVyPW7m/QZQD3Ng6TAEY6xSHyvaH46yeorODf X-Google-Smtp-Source: AGHT+IFJTMJzd37fFoOyeiC8Vlz2bibOaLXp0NCvLZHx/AUz1x1BcqEdisFhlYf4KrANb3jkuOV5FQ== X-Received: by 2002:a17:90a:fc82:b0:296:a70f:e96b with SMTP id ci2-20020a17090afc8200b00296a70fe96bmr810904pjb.46.1707257109520; Tue, 06 Feb 2024 14:05:09 -0800 (PST) X-Forwarded-Encrypted: i=0; AJvYcCWTlE2SqXoxgbHIhq0G2pAApG/U2NZcA5v2musiC9iAqVjKj1tmWN0x2jXiNzl+bSHzBLchN9EcCVA65jMPupfxr+IPyW3uB1BbgGJucFcciAKDPSma9ckJXdxkGwr5dryD0TmNxrXOI2eGFoxgjmbe1V9Glj5vQ9bUH+pRBiCH+njQ0ko5+lmzLDx4II82tzdKWQNT8vvskJIIsUrU5j8iu7kXOy+/2GbiZmkj3OH58FJns8wJnBYFa+KhP1PyQYlDzNsXXw9Uq3cD256jC+iitZhaf55/mvjP Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id pd3-20020a17090b1dc300b00290f9e8b4f9sm2275950pjb.46.2024.02.06.14.05.07 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:09 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 06/16] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions. Date: Tue, 6 Feb 2024 14:04:31 -0800 Message-Id: <20240206220441.38311-7-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW] instructions. They are similar to PROBE_MEM instructions with the following differences: - PROBE_MEM has to check that the address is in the kernel range with src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE check - PROBE_MEM doesn't support store - PROBE_MEM32 relies on the verifier to clear upper 32-bit in the register - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in %r12 in the prologue) Due to bpf_arena constructions such %r12 + %reg + off16 access is guaranteed to be within arena virtual range, so no address check at run-time. - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When LDX faults the destination register is zeroed. 
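To make the addressing and fault semantics concrete, a rough C model of an 8-byte
PROBE_MEM32 load could look like the sketch below (illustrative only, not part of the
patch; the function and variable names are made up, and the nofault helper merely stands
in for the exception-table fixup that the JIT actually emits):

	/* kern_vm_start is the value the prologue keeps in %r12, addr32 is the
	 * lower 32 bits of the arena pointer (the verifier cleared the upper
	 * bits), off is insn->off.
	 */
	static u64 probe_mem32_ldx_dw(u64 kern_vm_start, u32 addr32, s16 off)
	{
		void *addr = (void *)(kern_vm_start + addr32 + off);
		u64 val;

		if (copy_from_kernel_nofault(&val, addr, sizeof(val)))
			val = 0;	/* a faulting LDX zeroes the destination */
		return val;
	}

A faulting PROBE_MEM32 store, by contrast, is simply skipped.
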
Signed-off-by: Alexei Starovoitov --- arch/x86/net/bpf_jit_comp.c | 183 +++++++++++++++++++++++++++++++++++- include/linux/bpf.h | 1 + include/linux/filter.h | 3 + 3 files changed, 186 insertions(+), 1 deletion(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index e1390d1e331b..883b7f604b9a 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -113,6 +113,7 @@ static int bpf_size_to_x86_bytes(int bpf_size) /* Pick a register outside of BPF range for JIT internal work */ #define AUX_REG (MAX_BPF_JIT_REG + 1) #define X86_REG_R9 (MAX_BPF_JIT_REG + 2) +#define X86_REG_R12 (MAX_BPF_JIT_REG + 3) /* * The following table maps BPF registers to x86-64 registers. @@ -139,6 +140,7 @@ static const int reg2hex[] = { [BPF_REG_AX] = 2, /* R10 temp register */ [AUX_REG] = 3, /* R11 temp register */ [X86_REG_R9] = 1, /* R9 register, 6th function argument */ + [X86_REG_R12] = 4, /* R12 callee saved */ }; static const int reg2pt_regs[] = { @@ -167,6 +169,7 @@ static bool is_ereg(u32 reg) BIT(BPF_REG_8) | BIT(BPF_REG_9) | BIT(X86_REG_R9) | + BIT(X86_REG_R12) | BIT(BPF_REG_AX)); } @@ -205,6 +208,17 @@ static u8 add_2mod(u8 byte, u32 r1, u32 r2) return byte; } +static u8 add_3mod(u8 byte, u32 r1, u32 r2, u32 index) +{ + if (is_ereg(r1)) + byte |= 1; + if (is_ereg(index)) + byte |= 2; + if (is_ereg(r2)) + byte |= 4; + return byte; +} + /* Encode 'dst_reg' register into x86-64 opcode 'byte' */ static u8 add_1reg(u8 byte, u32 dst_reg) { @@ -887,6 +901,18 @@ static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off) *pprog = prog; } +static void emit_insn_suffix_SIB(u8 **pprog, u32 ptr_reg, u32 val_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + if (is_imm8(off)) { + EMIT3(add_2reg(0x44, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off); + } else { + EMIT2_off32(add_2reg(0x84, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off); + } + *pprog = prog; +} + /* * Emit a REX byte if it will be necessary to address these registers */ @@ -968,6 +994,37 @@ static void emit_ldsx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) *pprog = prog; } +static void emit_ldx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* movzx rax, byte ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB6); + break; + case BPF_H: + /* movzx rax, word ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB7); + break; + case BPF_W: + /* mov eax, dword ptr [rax + r12 + off] */ + EMIT2(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x8B); + break; + case BPF_DW: + /* mov rax, qword ptr [rax + r12 + off] */ + EMIT2(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x8B); + break; + } + emit_insn_suffix_SIB(&prog, src_reg, dst_reg, index_reg, off); + *pprog = prog; +} + +static void emit_ldx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) +{ + emit_ldx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off); +} + /* STX: *(u8*)(dst_reg + off) = src_reg */ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { @@ -1002,6 +1059,71 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) *pprog = prog; } +/* STX: *(u8*)(dst_reg + index_reg + off) = src_reg */ +static void emit_stx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* mov byte 
ptr [rax + r12 + off], al */ + EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x88); + break; + case BPF_H: + /* mov word ptr [rax + r12 + off], ax */ + EMIT3(0x66, add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89); + break; + case BPF_W: + /* mov dword ptr [rax + r12 + 1], eax */ + EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89); + break; + case BPF_DW: + /* mov qword ptr [rax + r12 + 1], rax */ + EMIT2(add_3mod(0x48, dst_reg, src_reg, index_reg), 0x89); + break; + } + emit_insn_suffix_SIB(&prog, dst_reg, src_reg, index_reg, off); + *pprog = prog; +} + +static void emit_stx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) +{ + emit_stx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off); +} + +/* ST: *(u8*)(dst_reg + index_reg + off) = imm32 */ +static void emit_st_index(u8 **pprog, u32 size, u32 dst_reg, u32 index_reg, int off, int imm) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* mov byte ptr [rax + r12 + off], imm8 */ + EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC6); + break; + case BPF_H: + /* mov word ptr [rax + r12 + off], imm16 */ + EMIT3(0x66, add_3mod(0x40, dst_reg, 0, index_reg), 0xC7); + break; + case BPF_W: + /* mov dword ptr [rax + r12 + 1], imm32 */ + EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC7); + break; + case BPF_DW: + /* mov qword ptr [rax + r12 + 1], imm32 */ + EMIT2(add_3mod(0x48, dst_reg, 0, index_reg), 0xC7); + break; + } + emit_insn_suffix_SIB(&prog, dst_reg, 0, index_reg, off); + EMIT(imm, bpf_size_to_x86_bytes(size)); + *pprog = prog; +} + +static void emit_st_r12(u8 **pprog, u32 size, u32 dst_reg, int off, int imm) +{ + emit_st_index(pprog, size, dst_reg, X86_REG_R12, off, imm); +} + static int emit_atomic(u8 **pprog, u8 atomic_op, u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size) { @@ -1043,12 +1165,15 @@ static int emit_atomic(u8 **pprog, u8 atomic_op, return 0; } +#define DONT_CLEAR 1 + bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs) { u32 reg = x->fixup >> 8; /* jump over faulting load and clear dest register */ - *(unsigned long *)((void *)regs + reg) = 0; + if (reg != DONT_CLEAR) + *(unsigned long *)((void *)regs + reg) = 0; regs->ip += x->fixup & 0xff; return true; } @@ -1147,11 +1272,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image bool tail_call_seen = false; bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; + u64 arena_vm_start; int i, excnt = 0; int ilen, proglen = 0; u8 *prog = temp; int err; + arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); + detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1172,8 +1300,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image push_r12(&prog); push_callee_regs(&prog, all_callee_regs_used); } else { + if (arena_vm_start) + push_r12(&prog); push_callee_regs(&prog, callee_regs_used); } + if (arena_vm_start) + emit_mov_imm64(&prog, X86_REG_R12, + arena_vm_start >> 32, (u32) arena_vm_start); ilen = prog - temp; if (rw_image) @@ -1564,6 +1697,52 @@ st: if (is_imm8(insn->off)) emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); break; + case BPF_ST | BPF_PROBE_MEM32 | BPF_B: + case BPF_ST | BPF_PROBE_MEM32 | BPF_H: + case BPF_ST | BPF_PROBE_MEM32 | BPF_W: + case BPF_ST | BPF_PROBE_MEM32 | BPF_DW: + start_of_ldx = prog; + emit_st_r12(&prog, BPF_SIZE(insn->code), dst_reg, insn->off, insn->imm); + goto populate_extable; + + /* LDX: dst_reg = *(u8*)(src_reg + r12 + off) */ + case BPF_LDX | 
BPF_PROBE_MEM32 | BPF_B: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_H: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_W: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW: + case BPF_STX | BPF_PROBE_MEM32 | BPF_B: + case BPF_STX | BPF_PROBE_MEM32 | BPF_H: + case BPF_STX | BPF_PROBE_MEM32 | BPF_W: + case BPF_STX | BPF_PROBE_MEM32 | BPF_DW: + start_of_ldx = prog; + if (BPF_CLASS(insn->code) == BPF_LDX) + emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); + else + emit_stx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); +populate_extable: + { + struct exception_table_entry *ex; + u8 *_insn = image + proglen + (start_of_ldx - temp); + s64 delta; + + if (!bpf_prog->aux->extable) + break; + + ex = &bpf_prog->aux->extable[excnt++]; + + delta = _insn - (u8 *)&ex->insn; + /* switch ex to rw buffer for writes */ + ex = (void *)rw_image + ((void *)ex - (void *)image); + + ex->insn = delta; + + ex->data = EX_TYPE_BPF; + + ex->fixup = (prog - start_of_ldx) | + ((BPF_CLASS(insn->code) == BPF_LDX ? reg2pt_regs[dst_reg] : DONT_CLEAR) << 8); + } + break; + /* LDX: dst_reg = *(u8*)(src_reg + off) */ case BPF_LDX | BPF_MEM | BPF_B: case BPF_LDX | BPF_PROBE_MEM | BPF_B: @@ -2036,6 +2215,8 @@ st: if (is_imm8(insn->off)) pop_r12(&prog); } else { pop_callee_regs(&prog, callee_regs_used); + if (arena_vm_start) + pop_r12(&prog); } EMIT1(0xC9); /* leave */ emit_return(&prog, image + addrs[i - 1] + (prog - temp)); diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 42f22bc881f0..a0d737bb86d1 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1460,6 +1460,7 @@ struct bpf_prog_aux { bool xdp_has_frags; bool exception_cb; bool exception_boundary; + struct bpf_arena *arena; /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */ const struct btf_type *attach_func_proto; /* function name for valid attach_btf_id */ diff --git a/include/linux/filter.h b/include/linux/filter.h index fee070b9826e..cd76d43412d0 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -72,6 +72,9 @@ struct ctl_table_header; /* unused opcode to mark special ldsx instruction. Same as BPF_IND */ #define BPF_PROBE_MEMSX 0x40 +/* unused opcode to mark special load instruction. 
Same as BPF_MSH */ +#define BPF_PROBE_MEM32 0xa0 + /* unused opcode to mark call to interpreter with arguments */ #define BPF_CALL_ARGS 0xe0 From patchwork Tue Feb 6 22:04:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13547825 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f53.google.com (mail-pj1-f53.google.com [209.85.216.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55C011C68A for ; Tue, 6 Feb 2024 22:05:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257116; cv=none; b=u2LlnPJQ3MEzii0hJObSHP3L1JeQ961zWF9J9DAoiwiUuKuVIHKPsPUsF8eRPs9BIKAw2bPWGy++G5NpVjZ62nNUDZxmOyKcB1+UIGtHkgNyhg5rQWl9GkbXESPo+D/FXHOri3et8/9WGmz4rR3EDSNfBhc1bJpgZnnvbOAW0UU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257116; c=relaxed/simple; bh=J1UZVDn6uI5UtDGhH2/mRbk7fjIbvzhRqLYTqIkwRcE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=vASa/WsDG5L8ziWDpxhlD0bdCo8eC2fPDNw55G+xQSJsxHtnBtBxgypIhWFish1p6OufPEFnl4UwYAeuCP2roVjKWQ9U903SKJRllJN4bw9FWszjytctrYjntMUsS5NI7taJ91b8X3L7z23Bi+AizoR61zMcmSQ406sWtVlygDg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=T/thiNcO; arc=none smtp.client-ip=209.85.216.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="T/thiNcO" Received: by mail-pj1-f53.google.com with SMTP id 98e67ed59e1d1-2909a632e40so4683520a91.0 for ; Tue, 06 Feb 2024 14:05:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707257113; x=1707861913; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GxmcZQXB6hILUuERStuFRZ0MPT4uFMATBLnu6rXBqLE=; b=T/thiNcOXgEhyEUgcmaRUCDCLqMy+LSsGRa4ywXlH9SovKEdN0mC4gaT45jjhoYZJZ ltGC/b3pzbjHTg8x70a46N5F0RaKDelxHeyhdT7RlhhZj6ia53rZ5ejqTc9gEma9uQd/ HULtAg72CK5fiQgitJCSQAsfWoO4qkdlwMOz8I2GZS+TtVWIq2wm3v31mAzRGpjBhSf+ WLrijyavBXcHGRW4Bp7Qg/aCHTBQnzRGD7yPa65oCJ4PmclXJm0vU6DtCicEgIFH33B2 HZxoQt7bWE3rV/WqzJn2sAQShZQfQ2ToUnmTxMXvY9wCuCXzrZZ/Hw9SMpxTKTGFeorp AUjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707257113; x=1707861913; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GxmcZQXB6hILUuERStuFRZ0MPT4uFMATBLnu6rXBqLE=; b=n8bHtUTdl3BiGMeQbJtekdjTEvdmJOOS+iAZX6ePi7WmB/80CQ7u1MWKxuzMvkjNpL n9VFjjwShCXEKrhbuRYVFbzladGGoIVS2kkGMLWNCezslBDDTM6janofBVdA+NkG6ydu 1MGFQ0hk1hoh0EwAzRseREFLnl5qo/b6f0EEe78wcznv2VyFOY3A4MYB5S0isLTTVidb IMj/vKgAUw4f7GCjIt/8bZa7N2rCyVyegu3raJgDULTvZ3T81RzFEW+/1aKRoBjxhWjx d6RlvFAMPhdBshs15R8pAVwgcZJhgBDHnQCKQiPVeBqIz+bYwcWwPeD0LRvmjJLVRPa8 4SMg== X-Gm-Message-State: 
AOJu0YypOfjfEGw2Fh8MxHky6wwAU//JAwCKhQrV1gCcRgsZPR91E1GL iUxDv0CPnIQClHxxLemzTgo7uiglqsS1pMQUrJt4ZFXjzWXp+6kS7m++PBXU X-Google-Smtp-Source: AGHT+IG8oAHzR3nsr++sOa64aFqWnOY2L1+MMxXsvlNE4aY5ZiS0oyor0jQVKD0uoMrje1RiSvSdsA== X-Received: by 2002:a17:90b:350c:b0:296:c1a:f651 with SMTP id ls12-20020a17090b350c00b002960c1af651mr884749pjb.36.1707257113308; Tue, 06 Feb 2024 14:05:13 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCXfwYB7FRrQvWfTk6tEr+uUBGbmaCq7+V+6esU7E5tJSAj1KK9tCNvR2xsEZBB081Xgy46YpR2sEOIbwIiK/1CsFdSNyQsZddCFwiLLxjRVFRaNhwuCFDq191OQ8uonJA+2UerpQK+JRy9zZAlFHa1wanOWMMIS+Hq7SpgXQH9IqaaVBQYUqYpa2d39hQfaUwO1B6BULxhIfZ+C2s4IrW++qdX5AoMMGyC9BAWf+AY6fWAWyCtugRHgQWZ74iaZkRKe4wMe8lkrZkmmvR1EPCVE9kBbPDe3DnDo Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id li4-20020a17090b48c400b00296b2f99946sm7168pjb.30.2024.02.06.14.05.11 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:12 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 07/16] bpf: Add x86-64 JIT support for bpf_cast_user instruction. Date: Tue, 6 Feb 2024 14:04:32 -0800 Message-Id: <20240206220441.38311-8-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov LLVM generates bpf_cast_kern and bpf_cast_user instructions while translating pointers with __attribute__((address_space(1))). rX = cast_kern(rY) is processed by the verifier and converted to normal 32-bit move: wX = wY bpf_cast_user has to be converted by JIT. 
rX = cast_user(rY) is aux_reg = upper_32_bits of arena->user_vm_start aux_reg <<= 32 wX = wY // clear upper 32 bits of dst register if (wX) // if not zero add upper bits of user_vm_start wX |= aux_reg JIT can do it more efficiently: mov dst_reg32, src_reg32 // 32-bit move shl dst_reg, 32 or dst_reg, user_vm_start rol dst_reg, 32 xor r11, r11 test dst_reg32, dst_reg32 // check if lower 32-bit are zero cmove r11, dst_reg // if so, set dst_reg to zero // Intel swapped src/dst register encoding in CMOVcc Signed-off-by: Alexei Starovoitov --- arch/x86/net/bpf_jit_comp.c | 41 ++++++++++++++++++++++++++++++++++++- include/linux/filter.h | 1 + kernel/bpf/core.c | 5 +++++ 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 883b7f604b9a..a042ed57af7b 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1272,13 +1272,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image bool tail_call_seen = false; bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; - u64 arena_vm_start; + u64 arena_vm_start, user_vm_start; int i, excnt = 0; int ilen, proglen = 0; u8 *prog = temp; int err; arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); + user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena); detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1346,6 +1347,39 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image break; case BPF_ALU64 | BPF_MOV | BPF_X: + if (insn->off == BPF_ARENA_CAST_USER) { + if (dst_reg != src_reg) + /* 32-bit mov */ + emit_mov_reg(&prog, false, dst_reg, src_reg); + /* shl dst_reg, 32 */ + maybe_emit_1mod(&prog, dst_reg, true); + EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32); + + /* or dst_reg, user_vm_start */ + maybe_emit_1mod(&prog, dst_reg, true); + if (is_axreg(dst_reg)) + EMIT1_off32(0x0D, user_vm_start >> 32); + else + EMIT2_off32(0x81, add_1reg(0xC8, dst_reg), user_vm_start >> 32); + + /* rol dst_reg, 32 */ + maybe_emit_1mod(&prog, dst_reg, true); + EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32); + + /* xor r11, r11 */ + EMIT3(0x4D, 0x31, 0xDB); + + /* test dst_reg32, dst_reg32; check if lower 32-bit are zero */ + maybe_emit_mod(&prog, dst_reg, dst_reg, false); + EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg)); + + /* cmove r11, dst_reg; if so, set dst_reg to zero */ + /* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! 
*/ + maybe_emit_mod(&prog, AUX_REG, dst_reg, true); + EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg)); + break; + } + fallthrough; case BPF_ALU | BPF_MOV | BPF_X: if (insn->off == 0) emit_mov_reg(&prog, @@ -3424,6 +3458,11 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke, } } +bool bpf_jit_supports_arena(void) +{ + return true; +} + bool bpf_jit_supports_ptr_xchg(void) { return true; diff --git a/include/linux/filter.h b/include/linux/filter.h index cd76d43412d0..78ea63002531 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -959,6 +959,7 @@ bool bpf_jit_supports_kfunc_call(void); bool bpf_jit_supports_far_kfunc_call(void); bool bpf_jit_supports_exceptions(void); bool bpf_jit_supports_ptr_xchg(void); +bool bpf_jit_supports_arena(void); void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie); bool bpf_helper_changes_pkt_data(void *func); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 2539d9bfe369..2829077f0461 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2926,6 +2926,11 @@ bool __weak bpf_jit_supports_far_kfunc_call(void) return false; } +bool __weak bpf_jit_supports_arena(void) +{ + return false; +} + /* Return TRUE if the JIT backend satisfies the following two conditions: * 1) JIT backend supports atomic_xchg() on pointer-sized words. * 2) Under the specific arch, the implementation of xchg() is the same From patchwork Tue Feb 6 22:04:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13547826 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f176.google.com (mail-pf1-f176.google.com [209.85.210.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A9E81CF8C for ; Tue, 6 Feb 2024 22:05:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257120; cv=none; b=T4yESmWHMWYSsxj36wp+V3w0iEGcHdjkIEi6ESPZf4SLTa1IcFmS5jKRsBCcHinJgjrD+u/4Rm8P/38b6abtfkynaIoTQPJDdUNI1EO6TnS+TOhhRULEI9E4GYwL7FOB3xes1ugYO6LBQ4pgAX4yq3mo8Hsj5Aml9nnNnHBcWQs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257120; c=relaxed/simple; bh=So2tOH7c4daP8KsByOGdjxk8MyQuuk7BTB9rH1bM94Q=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DIts18ZLDFRVNV0yD/Acy1S4WSSea47C9qzv3hFGwAntScqDBZUhN/GoATpjjvetludeZau9c0ertw4RhD9C7frMkBhqx6F6ppp4TtZf3SfvY23jVqPO1agpjYJ5xqVqtjJHZwCKu3Nm8DPzYwOYN3UmIorwLwUqqYTz5Tp68n4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=WX95WVmX; arc=none smtp.client-ip=209.85.210.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WX95WVmX" Received: by mail-pf1-f176.google.com with SMTP id d2e1a72fcca58-6e05d958b61so7924b3a.2 for ; Tue, 06 Feb 2024 14:05:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707257117; 
x=1707861917; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x70vLSLuXpXSToMN/uastZH05CPVL6I6J8RYkA3WTV8=; b=WX95WVmXLn/R+S9uJH35UJBlsSVywKkzaFBLp1bDELXlZy1/3GBSmtZ/nP1RtTFI1L n0EASt330jlvGlKWV9/ZIFfEitrM3vh6iexXZWmEkF5bL1+DGNOX/k+aEWTFrYuZOJIn lp3DYQ/aGoOGQvfW7gDRnwpN3SaXSLioBn6C4jOnwsw+TJ5T93CfCzgLFORw/h2d3JvJ FHU//LR/glX3qZGmlN7/5tb7uqQTwsL22ksWavl5HF6sP0XYQLQrM2QFrOK818yj0BTY pe5fO2aiALs9A5yloBjit4MfCJ/pDU1mCmAZucOMZCQv5vlkPlN1AznCdgXxEWx3A5MG eGZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707257117; x=1707861917; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x70vLSLuXpXSToMN/uastZH05CPVL6I6J8RYkA3WTV8=; b=iadquntsSA/bqGijsA21KKcMjzxCCkJ03vu1dBsbnLqHMWXRb16/d+tFXjzKIGLEzj fCIuD8P5eflnpvRMeqhuyN0A2mzd5yZxsqZenAQVnm3R8FQsJj26rdQcy3bnFxWFQhAx mllrJhFQOP2VmGyY35ZPpVNQCHhoQJBL6bOErEuIZizyH7O3iVhtYU3AvPUvNjpKZfxC WDEmiN6M55GWL6JQmMIdnoQ8tvjzgpw/6A3rTTSBVI6RoepJhFBEdf4L3ii9h4BhDnLt bZ56EMELb+F7NkX49cOykmA6uHwdYBOCcl1P75DwPf7KGDPkLUfzPAPCECxw7PmPqH7p nVww== X-Gm-Message-State: AOJu0YxpOBDBnkKM5KA6La0rtd79i5xMzcdctAYTm1/lwxqhoxQxfXGN NJhjumbh5H/C0sy6vFABZK3qdfQUrQ7nAyoVdaqlmZvd8+Lku0AdGvmhuNXH X-Google-Smtp-Source: AGHT+IEzPOfG6Yk1/DlYA0HQAsLNnAeAbJiYKnYVhkz+oFjGy38tTVOSA4WEfIoEPimN4Ul29wfrSg== X-Received: by 2002:a62:e807:0:b0:6de:40e:65a3 with SMTP id c7-20020a62e807000000b006de040e65a3mr834468pfi.16.1707257117133; Tue, 06 Feb 2024 14:05:17 -0800 (PST) X-Forwarded-Encrypted: i=0; AJvYcCU1D19B7Hqb61Kyd2+Fvj5kDQspG3/YW/fJlehTD32oKpMmLdiL77sc0UopdB4RBsoXI9kowCg9Cnov8RwV43zLduFGVuvX6yZl8S6HcHqledsiBtXT3bpKw/GTMsaNbtV4bdRY6yq6CzwY5YsdlEO6vvoKEVhMcejghIFGRFTM7Ul0iQBc/fBoCI39YuCHEA9V5wa7G9k0+0lIS2JEyLnad8uR9ktURW/yhgXJJc4/CTw0cK+b8wA47kg3DaAFHICVILo2nlpQyF162HLFXhp3nNGsQp4v3BJ0 Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id w20-20020aa78594000000b006e046c04b81sm2555282pfn.147.2024.02.06.14.05.15 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:16 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 08/16] bpf: Recognize cast_kern/user instructions in the verifier. Date: Tue, 6 Feb 2024 14:04:33 -0800 Message-Id: <20240206220441.38311-9-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov rX = bpf_cast_kern(rY, addr_space) tells the verifier that rX->type = PTR_TO_ARENA. Any further operations on PTR_TO_ARENA register have to be in 32-bit domain. The verifier will mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate them as kern_vm_start + 32bit_addr memory accesses. rX = bpf_cast_user(rY, addr_space) tells the verifier that rX->type = unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set then convert cast_user to mov32 as well. 
Otherwise JIT will convert it to: rX = (u32)rY; if (rX) rX |= arena->user_vm_start & ~(u64)~0U; Signed-off-by: Alexei Starovoitov --- include/linux/bpf.h | 1 + include/linux/bpf_verifier.h | 1 + kernel/bpf/log.c | 3 ++ kernel/bpf/verifier.c | 94 +++++++++++++++++++++++++++++++++--- 4 files changed, 92 insertions(+), 7 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a0d737bb86d1..82f7727e434a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -886,6 +886,7 @@ enum bpf_reg_type { * an explicit null check is required for this struct. */ PTR_TO_MEM, /* reg points to valid memory region */ + PTR_TO_ARENA, PTR_TO_BUF, /* reg points to a read/write buffer */ PTR_TO_FUNC, /* reg points to a bpf program function */ CONST_PTR_TO_DYNPTR, /* reg points to a const struct bpf_dynptr */ diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 84365e6dd85d..43c95e3e2a3c 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -547,6 +547,7 @@ struct bpf_insn_aux_data { u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */ bool zext_dst; /* this insn zero extends dst reg */ + bool needs_zext; /* alu op needs to clear upper bits */ bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */ bool is_iter_next; /* bpf_iter__next() kfunc call */ bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */ diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 594a234f122b..677076c760ff 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -416,6 +416,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type) [PTR_TO_XDP_SOCK] = "xdp_sock", [PTR_TO_BTF_ID] = "ptr_", [PTR_TO_MEM] = "mem", + [PTR_TO_ARENA] = "arena", [PTR_TO_BUF] = "buf", [PTR_TO_FUNC] = "func", [PTR_TO_MAP_KEY] = "map_key", @@ -651,6 +652,8 @@ static void print_reg_state(struct bpf_verifier_env *env, } verbose(env, "%s", reg_type_str(env, t)); + if (t == PTR_TO_ARENA) + return; if (t == PTR_TO_STACK) { if (state->frameno != reg->frameno) verbose(env, "[%d]", reg->frameno); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3c77a3ab1192..6bd5a0f30f72 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4370,6 +4370,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_MEM: case PTR_TO_FUNC: case PTR_TO_MAP_KEY: + case PTR_TO_ARENA: return true; default: return false; @@ -5805,6 +5806,8 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, case PTR_TO_XDP_SOCK: pointer_desc = "xdp_sock "; break; + case PTR_TO_ARENA: + return 0; default: break; } @@ -6906,6 +6909,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn if (!err && value_regno >= 0 && (rdonly_mem || t == BPF_READ)) mark_reg_unknown(env, regs, value_regno); + } else if (reg->type == PTR_TO_ARENA) { + if (t == BPF_READ && value_regno >= 0) + mark_reg_unknown(env, regs, value_regno); } else { verbose(env, "R%d invalid mem access '%s'\n", regno, reg_type_str(env, reg->type)); @@ -8377,6 +8383,7 @@ static int check_func_arg_reg_off(struct bpf_verifier_env *env, case PTR_TO_MEM | MEM_RINGBUF: case PTR_TO_BUF: case PTR_TO_BUF | MEM_RDONLY: + case PTR_TO_ARENA: case SCALAR_VALUE: return 0; /* All the rest must be rejected, except PTR_TO_BTF_ID which allows @@ -13837,6 +13844,21 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, dst_reg = ®s[insn->dst_reg]; 
src_reg = NULL; + + if (dst_reg->type == PTR_TO_ARENA) { + struct bpf_insn_aux_data *aux = cur_aux(env); + + if (BPF_CLASS(insn->code) == BPF_ALU64) + /* + * 32-bit operations zero upper bits automatically. + * 64-bit operations need to be converted to 32. + */ + aux->needs_zext = true; + + /* Any arithmetic operations are allowed on arena pointers */ + return 0; + } + if (dst_reg->type != SCALAR_VALUE) ptr_reg = dst_reg; else @@ -13954,16 +13976,17 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } else if (opcode == BPF_MOV) { if (BPF_SRC(insn->code) == BPF_X) { - if (insn->imm != 0) { - verbose(env, "BPF_MOV uses reserved fields\n"); - return -EINVAL; - } - if (BPF_CLASS(insn->code) == BPF_ALU) { - if (insn->off != 0 && insn->off != 8 && insn->off != 16) { + if ((insn->off != 0 && insn->off != 8 && insn->off != 16) || + insn->imm) { verbose(env, "BPF_MOV uses reserved fields\n"); return -EINVAL; } + } else if (insn->off == BPF_ARENA_CAST_KERN || insn->off == BPF_ARENA_CAST_USER) { + if (!insn->imm) { + verbose(env, "cast_kern/user insn must have non zero imm32\n"); + return -EINVAL; + } } else { if (insn->off != 0 && insn->off != 8 && insn->off != 16 && insn->off != 32) { @@ -13993,7 +14016,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) struct bpf_reg_state *dst_reg = regs + insn->dst_reg; if (BPF_CLASS(insn->code) == BPF_ALU64) { - if (insn->off == 0) { + if (insn->imm) { + /* off == BPF_ARENA_CAST_KERN || off == BPF_ARENA_CAST_USER */ + mark_reg_unknown(env, regs, insn->dst_reg); + if (insn->off == BPF_ARENA_CAST_KERN) + dst_reg->type = PTR_TO_ARENA; + } else if (insn->off == 0) { /* case: R1 = R2 * copy register state to dest reg */ @@ -14059,6 +14087,9 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) dst_reg->subreg_def = env->insn_idx + 1; coerce_subreg_to_size_sx(dst_reg, insn->off >> 3); } + } else if (src_reg->type == PTR_TO_ARENA) { + mark_reg_unknown(env, regs, insn->dst_reg); + dst_reg->type = PTR_TO_ARENA; } else { mark_reg_unknown(env, regs, insn->dst_reg); @@ -16519,6 +16550,8 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, * the same stack frame, since fp-8 in foo != fp-8 in bar */ return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno; + case PTR_TO_ARENA: + return true; default: return regs_exact(rold, rcur, idmap); } @@ -18235,6 +18268,27 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) fdput(f); return -EBUSY; } + if (map->map_type == BPF_MAP_TYPE_ARENA) { + if (env->prog->aux->arena) { + verbose(env, "Only one arena per program\n"); + fdput(f); + return -EBUSY; + } + if (!env->allow_ptr_leaks || !env->bpf_capable) { + verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n"); + fdput(f); + return -EPERM; + } + if (!env->prog->jit_requested) { + verbose(env, "JIT is required to use arena\n"); + return -EOPNOTSUPP; + } + if (!bpf_jit_supports_arena()) { + verbose(env, "JIT doesn't support arena\n"); + return -EOPNOTSUPP; + } + env->prog->aux->arena = (void *)map; + } fdput(f); next_insn: @@ -18799,6 +18853,18 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) insn->code == (BPF_ST | BPF_MEM | BPF_W) || insn->code == (BPF_ST | BPF_MEM | BPF_DW)) { type = BPF_WRITE; + } else if (insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && insn->imm) { + if (insn->off == BPF_ARENA_CAST_KERN || + (((struct bpf_map *)env->prog->aux->arena)->map_flags & BPF_F_NO_USER_CONV)) { + /* convert to 32-bit mov 
that clears upper 32-bit */ + insn->code = BPF_ALU | BPF_MOV | BPF_X; + /* clear off, so it's a normal 'wX = wY' from JIT pov */ + insn->off = 0; + } /* else insn->off == BPF_ARENA_CAST_USER should be handled by JIT */ + continue; + } else if (env->insn_aux_data[i + delta].needs_zext) { + /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ + insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); } else { continue; } @@ -18856,6 +18922,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) env->prog->aux->num_exentries++; } continue; + case PTR_TO_ARENA: + if (BPF_MODE(insn->code) == BPF_MEMSX) { + verbose(env, "sign extending loads from arena are not supported yet\n"); + return -EOPNOTSUPP; + } + insn->code = BPF_CLASS(insn->code) | BPF_PROBE_MEM32 | BPF_SIZE(insn->code); + env->prog->aux->num_exentries++; + continue; default: continue; } @@ -19041,13 +19115,19 @@ static int jit_subprogs(struct bpf_verifier_env *env) func[i]->aux->nr_linfo = prog->aux->nr_linfo; func[i]->aux->jited_linfo = prog->aux->jited_linfo; func[i]->aux->linfo_idx = env->subprog_info[i].linfo_idx; + func[i]->aux->arena = prog->aux->arena; num_exentries = 0; insn = func[i]->insnsi; for (j = 0; j < func[i]->len; j++, insn++) { if (BPF_CLASS(insn->code) == BPF_LDX && (BPF_MODE(insn->code) == BPF_PROBE_MEM || + BPF_MODE(insn->code) == BPF_PROBE_MEM32 || BPF_MODE(insn->code) == BPF_PROBE_MEMSX)) num_exentries++; + if ((BPF_CLASS(insn->code) == BPF_STX || + BPF_CLASS(insn->code) == BPF_ST) && + BPF_MODE(insn->code) == BPF_PROBE_MEM32) + num_exentries++; } func[i]->aux->num_exentries = num_exentries; func[i]->aux->tail_call_reachable = env->subprog_info[i].tail_call_reachable; From patchwork Tue Feb 6 22:04:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13547827 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f182.google.com (mail-pf1-f182.google.com [209.85.210.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1BE8E1BDED for ; Tue, 6 Feb 2024 22:05:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257123; cv=none; b=t9uTi68vdqlm3sc6Al3DetB1IZH14kj70US8U61oL30Zlmap6QVDlnBtvXryFrMTH8hROJgxNAxWOtiLF8hK+H3nlm8lmC5iX3ZM1MrcyNSqZ+4bsB73vVzDcIE2hHXEOEikHUeFPTwUNDl0TDK3TkB+R6vPDfF3Syk8SXX0je8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257123; c=relaxed/simple; bh=eueHQE607LnJ9GtCHljb8AaXRLkGlzLEzibT//2GRiY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=bOgqKOqAq9be1w/ni0lPl+5DLctw4HGx9EHaLzghM1cnq6ruKO5kFyoF/bV0X052Nszb2z5cbYmsxpq0Sd1hWKpNdXfOsrdZ6uTWgawA1n5ozG2s9RSEN9kYVO+RzO6qyC3lPebEJH5zecnmIv5kjkk/8Vd28cGePv6fKKY7Jvw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=cAYU5QCR; arc=none smtp.client-ip=209.85.210.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com 
header.i=@gmail.com header.b="cAYU5QCR" Received: by mail-pf1-f182.google.com with SMTP id d2e1a72fcca58-6e02597a0afso12137b3a.1 for ; Tue, 06 Feb 2024 14:05:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707257121; x=1707861921; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=waJIZfpNliIHPa/9880riUne4F1M98q2cd6wla9GZjk=; b=cAYU5QCR847DECzpKXaWauFdML4N3o8+vpgtWRMWYKi+LIZaRCj4dZzq4YgGzxe0hi qO122rK5pmuktZ0XEwabECQ+K4GRArocMzp9QA6eNEDu6cfznqpyk7NpiKJprs8Naj46 LCtWXpkrLZRKckl/3vRI62M+Xvy5llP0YNjK8kqGbiiqPmbA2esAMei/q+mhdwylYGcg xlfm64D1Ip+2qtkOENZRSHk4wL8nROXDyVj83cHwvQCJM5UW1ZsElpamZTGSmfnol43V zwoEfUV+TunxCgw46GHV2duuqC83wfnNkx2ofRaEKGVMenP8TyjcScD4UEh0bmUn8GXK wCUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707257121; x=1707861921; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=waJIZfpNliIHPa/9880riUne4F1M98q2cd6wla9GZjk=; b=DPwtsRwkVddHhxmdFXvULYpw+JiJjRmHBaGhpt5zequrgKu+Srdu49ls7sx+RCHbR8 XAZMp0MkImrmUSnYzL5UpX1vu3YVGwQH4NTuzluYXVkq90UsN/qArPejLQETBQ/hak0f JBJofJbMWkJbFbBoSHyHv6mRZAbq8iir82+L0Gbfs8dABTg86mJQGaryqnjOF9rKX7CM XIH5ve4QYAVk+g6Va48sjbfFQVq1XFcmwThOJ1Bmto1rT/AjkDQLxIVGROJ6HRB5YDWh KSRec7Ed3XRywJdLazCM6H300E7ZO/QERlCwORc63+sYnQdniDIPQ1qw0ZCq7ZkFfcRm l7eA== X-Gm-Message-State: AOJu0Yyra3MhBnL7JZeXSFyP8/Vhr1OxQXKxoIELc7zYKeB/31w3KFb8 ERMYAxePRDreKB4TARecD0aA2ZvYLlwcWQT/nULzpUMvlFoJmDghlA8vvsqK X-Google-Smtp-Source: AGHT+IGEm41W1mIMfg18EMXRi5EZlNoYfBurQNFDegQZNFeL2j9w6VizaJwjHHTqyxrGN0HSh8f/bg== X-Received: by 2002:a05:6a20:8154:b0:19e:a23f:fa73 with SMTP id u20-20020a056a20815400b0019ea23ffa73mr606797pza.5.1707257121125; Tue, 06 Feb 2024 14:05:21 -0800 (PST) X-Forwarded-Encrypted: i=0; AJvYcCXxN8orindXUKGG1h/zyY3mQyICEUtCIntzBop61kdfolwDs7jXajQYTwYORXc3hFm7NsG4OasP13LHB6c852skW89A5Q3SEJ9IeXDoIlf8HFxDkSEQwtp0/hR6/4zyqyoEUUOyZl7qxTHsrtAYLfY96aJG11wKBcbQfUK+jAghxuHvzl5GfAONh4LbS5QeBT2B9USv2mg4iHKbjiyPWsCkS1ydEmzx+CC1iJngFwB5VSmgSP3SEUQKbB3N9kyl4hqzgAWt3xXJbEfM3pfWaP0AZ0sXpYwjQtuN Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id x17-20020a056a00271100b006e0542f9689sm2474751pfv.103.2024.02.06.14.05.19 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:20 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 09/16] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA. Date: Tue, 6 Feb 2024 14:04:34 -0800 Message-Id: <20240206220441.38311-10-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov In global bpf functions recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA. 
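For example, a global subprog can accept an arena pointer roughly like this
(illustrative sketch; __arg_arena is the bpf_helpers.h wrapper for this tag added later
in the series, and the function/argument names are made up):

	__weak int store_val(int *p __arg_arena)
	{
		if (p)
			*p = 1;
		return 0;
	}

Here the verifier treats 'p' as PTR_TO_ARENA and instruments the store with PROBE_MEM32.
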
Note, when the verifier sees: __weak void foo(struct bar *p) it recognizes 'p' as PTR_TO_MEM and 'struct bar' has to be a struct with scalars. Hence the only way to use arena pointers in global functions is to tag them with "arg:arena". Signed-off-by: Alexei Starovoitov --- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 19 +++++++++++++++---- kernel/bpf/verifier.c | 15 +++++++++++++++ 3 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 82f7727e434a..401c0031090d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -715,6 +715,7 @@ enum bpf_arg_type { * on eBPF program stack */ ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */ + ARG_PTR_TO_ARENA, ARG_CONST_SIZE, /* number of bytes accessed from memory */ ARG_CONST_SIZE_OR_ZERO, /* number of bytes accessed from memory or 0 */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f7725cb6e564..6d2effb65943 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7053,10 +7053,11 @@ static int btf_get_ptr_to_btf_id(struct bpf_verifier_log *log, int arg_idx, } enum btf_arg_tag { - ARG_TAG_CTX = 0x1, - ARG_TAG_NONNULL = 0x2, - ARG_TAG_TRUSTED = 0x4, - ARG_TAG_NULLABLE = 0x8, + ARG_TAG_CTX = BIT_ULL(0), + ARG_TAG_NONNULL = BIT_ULL(1), + ARG_TAG_TRUSTED = BIT_ULL(2), + ARG_TAG_NULLABLE = BIT_ULL(3), + ARG_TAG_ARENA = BIT_ULL(4), }; /* Process BTF of a function to produce high-level expectation of function @@ -7168,6 +7169,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog) tags |= ARG_TAG_NONNULL; } else if (strcmp(tag, "nullable") == 0) { tags |= ARG_TAG_NULLABLE; + } else if (strcmp(tag, "arena") == 0) { + tags |= ARG_TAG_ARENA; } else { bpf_log(log, "arg#%d has unsupported set of tags\n", i); return -EOPNOTSUPP; @@ -7222,6 +7225,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog) sub->args[i].btf_id = kern_type_id; continue; } + if (tags & ARG_TAG_ARENA) { + if (tags & ~ARG_TAG_ARENA) { + bpf_log(log, "arg#%d arena cannot be combined with any other tags\n", i); + return -EINVAL; + } + sub->args[i].arg_type = ARG_PTR_TO_ARENA; + continue; + } if (is_global) { /* generic user data pointer */ u32 mem_size; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 6bd5a0f30f72..07b8eec2f006 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9348,6 +9348,18 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, bpf_log(log, "arg#%d is expected to be non-NULL\n", i); return -EINVAL; } + } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) { + /* + * Can pass any value and the kernel won't crash, but + * only PTR_TO_ARENA or SCALAR make sense. Everything + * else is a bug in the bpf program. Point it out to + * the user at the verification time instead of + * run-time debug nightmare. 
+ */ + if (reg->type != PTR_TO_ARENA && reg->type != SCALAR_VALUE) { + bpf_log(log, "R%d is not a pointer to arena or scalar.\n", regno); + return -EINVAL; + } } else if (arg->arg_type == (ARG_PTR_TO_DYNPTR | MEM_RDONLY)) { ret = process_dynptr_func(env, regno, -1, arg->arg_type, 0); if (ret) @@ -20321,6 +20333,9 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) reg->btf = bpf_get_btf_vmlinux(); /* can't fail at this point */ reg->btf_id = arg->btf_id; reg->id = ++env->id_gen; + } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) { + /* caller can pass either PTR_TO_ARENA or SCALAR */ + mark_reg_unknown(env, regs, i); } else { WARN_ONCE(1, "BUG: unhandled arg#%d type %d\n", i - BPF_REG_1, arg->arg_type); From patchwork Tue Feb 6 22:04:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13547828 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3157E1BF2A for ; Tue, 6 Feb 2024 22:05:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257127; cv=none; b=PdncmG8uaosjumE82yQAsE1ACPwC6H0O6abEUXYRD7vt2JhyCxxoHVKxe2vlmKeLj1tu2QNLn01LVCwhGDqMrQ2ibSCByGpW6qq6L7xq4tQ44r8zvOPmTOjIKi0fq74eJuw0FjYktxKwn2UUS8ErQ5I7EnBr9tCjCmAzDkvjM8o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257127; c=relaxed/simple; bh=B9RYqjZPOWV6C0GMRY3SkdIHMqd4rfBrkE8wPVtghus=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=eX8Mohokw2DakOMZI1OLwdzYaBZ9FF6c24tqeubuS4cBRgctvfTcyms0ACNd5tw3rwEvceerAB1GXB69seGZdx3t0uQ0Qv8RJFRdzu78Ojz3Nm3mmP0b3/yMIsGZAWcjzNNKXE9VG8dd6F8/F4yLHxpRTwZNvL9/0hOdmvyNt70= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Qve+Us7j; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Qve+Us7j" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-1d746ce7d13so251745ad.0 for ; Tue, 06 Feb 2024 14:05:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707257125; x=1707861925; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kyKA4wCImIZZ5xJE04KmfmUkjYhSBJW7+tRHKBcOaBI=; b=Qve+Us7j7oZ9dywQEK4BChKkAEZi4ZQv3yaMVU0tHd4HlXoIUI7KCaqImoNokbERcM 9iY2/3VCOHggrdwvLbEFqrQImtsBKJJobrYUGh76LMMCkL+ktHQ9PznrW7jnfWoHJayI kPmYqRAUqyy8zqi1JztsWd6Sf3WP7M/9XLNUbVdJK7AqVRk50XfiKScLLh4LTpE22tXt dR97s7FrYDwTdjIaSY0u0fkImUpDzwv86S4R0l0MfUiNWWdTYDC2Z+vS0yrkCSpaDm67 z7mczB1/fn50YWP7IBTEPJuRnMTb7khF/egijiXwKKmJo/GGa4fhMXnizP7ztWiMFGiT Flgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707257125; x=1707861925; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kyKA4wCImIZZ5xJE04KmfmUkjYhSBJW7+tRHKBcOaBI=; b=qaUJMYuPC2pGkrOBGdV9qbTGq5xFlYvePJszMRo60zaX4fH7Bm05DFUA5uGUMIAmnH Mn8d0vrLoIFbclmhIkUrwwMA6b8uhkDJQcoMBWEIeXoTFAcn9sLvoMumPpaktVsytXid en+bVL0THKrgTvl14C0sjNQ9cZlNWiqbMre8lNXm0NsT8Fo4nWxB8Egu72JN1utMlS02 z7eAR2Jc53ooczXB3R3dcFlr8rMyxlXuPwSoxUnml3goja4rkJ42rsvrsUnsBBxJm9n8 hYPjVMpHxn3tze4pEblp+MJBEzhTlBupOwUzlp9kVtZgGpUkkhqDqs7Jib0Wq9jRXQbs DWxQ== X-Gm-Message-State: AOJu0YyV5LHxr5G77xWhHjvP/20dXsMDr+GfarZpPwLsyl+286QCV1Kf rLvWlRtRXcdyMeK8Hh9p1cbtAwOs0j8aR37/pQKR/hrer6UY/Y6kqu9dE8kF X-Google-Smtp-Source: AGHT+IFCXZo1wWd12w2djsRFUVDKXDBOg++Oi4WlWVJBWuRKgnQggkaBieLi7zmDvrE2yA+j9R0Veg== X-Received: by 2002:a17:903:2344:b0:1d8:94e4:770a with SMTP id c4-20020a170903234400b001d894e4770amr2939440plh.51.1707257125081; Tue, 06 Feb 2024 14:05:25 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCXTg5DTgGf7yruJqB/WI8D3C35qukvbUy1OcPrsmlNmRj3bMZLrv6ZKXl+1ZliwY5Yh+RSrbBH5jJH4p9m8Da+qCe4R7jQ30NGMqR05TYm6hYqiEn8rHQLlMXL6NgE36kXCg7GqxrsrdtD2zHMjjnwBdDAYT6ffHki+e2V66HSHGIdS3pm2Bi07FjJ0b371OAyhqlw/auRDkPE0BfpAnLuFVa4mCeL/MsiaPAHWAYGJs0SX24t1cZkpz2mX/gCglBvOUJsHd/acn6fNCiNO1QmPODy2sogSJAE7 Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id w20-20020a170902d11400b001d91b617718sm8619plw.98.2024.02.06.14.05.23 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:24 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 10/16] libbpf: Add __arg_arena to bpf_helpers.h Date: Tue, 6 Feb 2024 14:04:35 -0800 Message-Id: <20240206220441.38311-11-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Add __arg_arena to bpf_helpers.h Signed-off-by: Alexei Starovoitov --- tools/lib/bpf/bpf_helpers.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 79eaa581be98..9c777c21da28 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -192,6 +192,7 @@ enum libbpf_tristate { #define __arg_nonnull __attribute((btf_decl_tag("arg:nonnull"))) #define __arg_nullable __attribute((btf_decl_tag("arg:nullable"))) #define __arg_trusted __attribute((btf_decl_tag("arg:trusted"))) +#define __arg_arena __attribute((btf_decl_tag("arg:arena"))) #ifndef ___bpf_concat #define ___bpf_concat(a, b) a ## b From patchwork Tue Feb 6 22:04:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13547829 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0203F1BF26 for ; Tue, 6 Feb 2024 22:05:29 
+0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257131; cv=none; b=H4sjxTkbQRxLuDF9HYYrqEvWcwhIyJwvw7BioVaNmH1QSGFd8bapOyUdEHi/a9PWwKKMOLHGnB0wkFfVe1iVTGIlTe/tLENrE9+vLgdURRcwP+1cozxzWEW4U8snq45rrSJ/yf4zzhud1L6NqwS78ZdQIk4QqRVdjwg5rrH7ktk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707257131; c=relaxed/simple; bh=C8ZKxZ/dzfbNIItzxRQpV0ixzlivUtlS/NIhI15j7WQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=G4Pz88AQ/6uo8ydMmcnDUq/SXXyx62DD4HkGnleJDZ76wxgrznoLhimfrMf+PFXlVETYzC0eGJZXwVHr2uKQ2Me8GAhI69SwdyAGqkWVA0nfW9A/2hdFlFnSCwAFFr+FIg3VkcjRzgGvqdvgiqmvLXvfNDbYUvDZudu5/Kuuui4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=FybHSSRc; arc=none smtp.client-ip=209.85.210.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="FybHSSRc" Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-6e054f674b3so12061b3a.1 for ; Tue, 06 Feb 2024 14:05:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707257129; x=1707861929; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=RiXImkPg8/G5+2DOGxZ3xDCLE7/uYVswnoMrJHCmlE0=; b=FybHSSRcvC0dc9TJpVotdcSSne0AK+AnA5a6Ga8LnEoySX1gY9xazAg13+xj2GAITN yAFqDoKcO5rs6D5b31uBD2jg8i0xjaO1ds6cK2G1ZuzeOpUdLlpuwCrDaBOL3mv/qJ52 XIyj4/AU7W2oVukmSctpZkvfKnQbGh9qKFkSIz5PgYMLk8cJbE5BErQyakg22i7G1t21 Olefqg1AblTHuc9anQsvzvrPjAqJsuIrHHBEaVCoSRkENyYHuNEnPQa8IQ8GKUbKbdd7 osNLHEtb9zTVQvBySP+R+Y9WO7GlPDdGRMkmC0K35vTWCxApE0cEJVJ2KErSiYIhIyrV 0pag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707257129; x=1707861929; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RiXImkPg8/G5+2DOGxZ3xDCLE7/uYVswnoMrJHCmlE0=; b=Sj8QgZvx/YvVfWi2v/YnYrJY7DPdi6I1rv3gGdY1VLrt45VVI221AyQJgylqKJZNp2 iANK+KYdcuUyX5l6vXpeDYoZv8V2niRXiNfFCQrcOAtnQabyGHTSEsKZqo3sGFUIMv2T 4ldn56MVK7GBgkVVnxNjG1KpG3rVpGEbyxHdTE/5yZ+SKk0kUm+j1LeRZCS6KtryCiea SX3FEVt1e7Ok3yPS8Crlfifw6SeZ1C5LzSGDKmVfwhM4IUXOKIxSbIqEOLHE0JWEtShe Q2Sqwx0pTUGI1MueyxGE1BrckRrgpdH7f5JtrRpbVJTodeQtnpUwFxBKndHVTegZK1RZ ylug== X-Gm-Message-State: AOJu0Yx9Q5UcHOjhVlOl9rkzeFMylcC1NAYbfA8Z2fHa3sfp5ixTXmDd vWDo2EeQomVlf7q6sdjVn9IJBmcxyARnq8aDjmbkRu71J4cCtuoMR2PB52/f X-Google-Smtp-Source: AGHT+IEIl4BNu5aTiVeud6z0/oB6Ycj6jmTdFR/O0kus8+9trSKYb2Mt0v8czJwLnLT6tYLaUsGdUQ== X-Received: by 2002:a62:c144:0:b0:6e0:4d28:b3a with SMTP id i65-20020a62c144000000b006e04d280b3amr813386pfg.21.1707257129006; Tue, 06 Feb 2024 14:05:29 -0800 (PST) X-Forwarded-Encrypted: i=0; 
AJvYcCVDvb+r0mk8Q2LDEv9B9JMgXfjhY0opj4ilgwOds4APWie51YIRNJnQsC39i+blOZty5ZIcj+KLRBiH6UTqBDlQcNNR2QBOypbk1SEMOeJJvz51o7vEk16399WLfVIf0km+ypgZ7200F6hdBlSZEkCOCdwTvjt+Egj6l8giEFYADjPdyFEjDSXDrkCv67crZA5GJY52dcWfEwzqtvxfTXFf7LbRxgpF28eLdOexz/BcKOiXaX4VDefpgoWZqFjoHApuj3rGrSfywcJymCoRt8xCDNcKkAZJihbB Received: from localhost.localdomain ([2620:10d:c090:400::4:27bf]) by smtp.gmail.com with ESMTPSA id k138-20020a628490000000b006d0d90edd2csm2588957pfd.42.2024.02.06.14.05.27 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 06 Feb 2024 14:05:28 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH bpf-next 11/16] libbpf: Add support for bpf_arena. Date: Tue, 6 Feb 2024 14:04:36 -0800 Message-Id: <20240206220441.38311-12-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com> References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov mmap() bpf_arena right after creation, since the kernel needs to remember the address returned from mmap. This is user_vm_start. LLVM will generate bpf_arena_cast_user() instructions where necessary and JIT will add upper 32-bit of user_vm_start to such pointers. Use traditional map->value_size * map->max_entries to calculate mmap sz, though it's not the best fit. Also don't set BTF at bpf_arena creation time, since it doesn't support it. Signed-off-by: Alexei Starovoitov --- tools/lib/bpf/libbpf.c | 18 ++++++++++++++++++ tools/lib/bpf/libbpf_probes.c | 6 ++++++ 2 files changed, 24 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 01f407591a92..c5ce5946dc6d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -185,6 +185,7 @@ static const char * const map_type_name[] = { [BPF_MAP_TYPE_BLOOM_FILTER] = "bloom_filter", [BPF_MAP_TYPE_USER_RINGBUF] = "user_ringbuf", [BPF_MAP_TYPE_CGRP_STORAGE] = "cgrp_storage", + [BPF_MAP_TYPE_ARENA] = "arena", }; static const char * const prog_type_name[] = { @@ -4852,6 +4853,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b case BPF_MAP_TYPE_SOCKHASH: case BPF_MAP_TYPE_QUEUE: case BPF_MAP_TYPE_STACK: + case BPF_MAP_TYPE_ARENA: create_attr.btf_fd = 0; create_attr.btf_key_type_id = 0; create_attr.btf_value_type_id = 0; @@ -4908,6 +4910,22 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b if (map->fd == map_fd) return 0; + if (def->type == BPF_MAP_TYPE_ARENA) { + size_t mmap_sz; + + mmap_sz = bpf_map_mmap_sz(def->value_size, def->max_entries); + map->mmaped = mmap((void *)map->map_extra, mmap_sz, PROT_READ | PROT_WRITE, + map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED, + map_fd, 0); + if (map->mmaped == MAP_FAILED) { + err = -errno; + map->mmaped = NULL; + pr_warn("map '%s': failed to mmap bpf_arena: %d\n", + bpf_map__name(map), err); + return err; + } + } + /* Keep placeholder FD value but now point it to the BPF map object. * This way everything that relied on this map's FD (e.g., relocated * ldimm64 instructions) will stay valid and won't need adjustments. 
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ee9b1dbea9eb..cbc7f4c09060 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -338,6 +338,12 @@ static int probe_map_create(enum bpf_map_type map_type)
 		key_size = 0;
 		max_entries = 1;
 		break;
+	case BPF_MAP_TYPE_ARENA:
+		key_size = sizeof(__u64);
+		value_size = sizeof(__u64);
+		opts.map_extra = 0;	/* can mmap() at any address */
+		opts.map_flags = BPF_F_MMAPABLE;
+		break;
 	case BPF_MAP_TYPE_HASH:
 	case BPF_MAP_TYPE_ARRAY:
 	case BPF_MAP_TYPE_PROG_ARRAY:
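[Editorial sketch, not part of the patch: a minimal view of what this change buys user space. The skeleton pointer `skel` and the map name `arena` are placeholders; bpf_map__initial_value() is the accessor the selftests later in this series use to obtain the region that libbpf has already mmap()ed at load time.]

    /* after <name>__open_and_load(): libbpf has mmap()ed the arena map */
    size_t arena_sz;
    void *base = bpf_map__initial_value(skel->maps.arena, &arena_sz);

    if (base)
        *(volatile int *)base = 0x55aa;  /* fault in the first page, as the arena_htab test does */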
From patchwork Tue Feb 6 22:04:37 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547830
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com,
    eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 12/16] libbpf: Allow specifying 64-bit integers in map BTF.
Date: Tue, 6 Feb 2024 14:04:37 -0800
Message-Id: <20240206220441.38311-13-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

The __uint() macro that is used to specify map attributes like:
  __uint(type, BPF_MAP_TYPE_ARRAY);
  __uint(map_flags, BPF_F_MMAPABLE);
is limited to 32-bit values, since BTF_KIND_ARRAY has a u32 "number of elements"
field. Introduce the __ulong() macro that allows specifying values bigger than
32-bit. In the map definition "map_extra" is the only u64 field.
Signed-off-by: Alexei Starovoitov
---
 tools/lib/bpf/bpf_helpers.h |  1 +
 tools/lib/bpf/libbpf.c      | 44 ++++++++++++++++++++++++++++++++++---
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 9c777c21da28..fb909fc6866d 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -13,6 +13,7 @@
 #define __uint(name, val) int (*name)[val]
 #define __type(name, val) typeof(val) *name
 #define __array(name, val) typeof(val) *name[]
+#define __ulong(name, val) enum name##__enum { name##__value = val } name
 
 /*
  * Helper macro to place programs, maps, license in
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index c5ce5946dc6d..a8c89b2315cd 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2229,6 +2229,39 @@ static bool get_map_field_int(const char *map_name, const struct btf *btf,
 	return true;
 }
 
+static bool get_map_field_long(const char *map_name, const struct btf *btf,
+			       const struct btf_member *m, __u64 *res)
+{
+	const struct btf_type *t = skip_mods_and_typedefs(btf, m->type, NULL);
+	const char *name = btf__name_by_offset(btf, m->name_off);
+
+	if (btf_is_ptr(t))
+		return false;
+
+	if (!btf_is_enum(t) && !btf_is_enum64(t)) {
+		pr_warn("map '%s': attr '%s': expected enum or enum64, got %s.\n",
+			map_name, name, btf_kind_str(t));
+		return false;
+	}
+
+	if (btf_vlen(t) != 1) {
+		pr_warn("map '%s': attr '%s': invalid __ulong\n",
+			map_name, name);
+		return false;
+	}
+
+	if (btf_is_enum(t)) {
+		const struct btf_enum *e = btf_enum(t);
+
+		*res = e->val;
+	} else {
+		const struct btf_enum64 *e = btf_enum64(t);
+
+		*res = btf_enum64_value(e);
+	}
+	return true;
+}
+
 static int pathname_concat(char *buf, size_t buf_sz, const char *path, const char *name)
 {
 	int len;
@@ -2462,10 +2495,15 @@ int parse_btf_map_def(const char *map_name, struct btf *btf,
 			map_def->pinning = val;
 			map_def->parts |= MAP_DEF_PINNING;
 		} else if (strcmp(name, "map_extra") == 0) {
-			__u32 map_extra;
+			__u64 map_extra;
 
-			if (!get_map_field_int(map_name, btf, m, &map_extra))
-				return -EINVAL;
+			if (!get_map_field_long(map_name, btf, m, &map_extra)) {
+				__u32 map_extra_u32;
+
+				if (!get_map_field_int(map_name, btf, m, &map_extra_u32))
+					return -EINVAL;
+				map_extra = map_extra_u32;
+			}
 			map_def->map_extra = map_extra;
 			map_def->parts |= MAP_DEF_MAP_EXTRA;
 		} else {
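[Editorial sketch, not part of the patch: a short usage example of __ulong() next to __uint(). It mirrors the arena map definition in the selftest added later in this series, so only the context around it is illustrative.]

    struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        __uint(map_flags, BPF_F_MMAPABLE);
        __uint(max_entries, 1u << 24);
        __ulong(map_extra, 2ull << 44); /* 64-bit value; __uint() would truncate it */
        __type(key, __u64);
        __type(value, __u64);
    } arena SEC(".maps");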
From patchwork Tue Feb 6 22:04:38 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547831
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com,
    eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 13/16] bpf: Tell bpf programs kernel's PAGE_SIZE
Date: Tue, 6 Feb 2024 14:04:38 -0800
Message-Id: <20240206220441.38311-14-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

vmlinux BTF includes all kernel enums. Add __PAGE_SIZE = PAGE_SIZE enum,
so that bpf programs that include vmlinux.h can easily access it.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 2829077f0461..3aa3f56a4310 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -88,13 +88,18 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
 	return NULL;
 }
 
+/* tell bpf programs that include vmlinux.h kernel's PAGE_SIZE */
+enum page_size_enum {
+	__PAGE_SIZE = PAGE_SIZE
+};
+
 struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags)
 {
 	gfp_t gfp_flags = bpf_memcg_flags(GFP_KERNEL | __GFP_ZERO | gfp_extra_flags);
 	struct bpf_prog_aux *aux;
 	struct bpf_prog *fp;
 
-	size = round_up(size, PAGE_SIZE);
+	size = round_up(size, __PAGE_SIZE);
 	fp = __vmalloc(size, gfp_flags);
 	if (fp == NULL)
 		return NULL;
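[Editorial sketch, not part of the patch: how a bpf program picks this up. The fallback #define is exactly what bpf_arena_common.h does later in this series; the include name assumes the usual generated vmlinux.h header.]

    #include "vmlinux.h"   /* vmlinux BTF now carries enum page_size_enum { __PAGE_SIZE } */

    #ifndef PAGE_SIZE
    #define PAGE_SIZE __PAGE_SIZE
    #endif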
From patchwork Tue Feb 6 22:04:39 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547832
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com,
    eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 14/16] bpf: Add helper macro bpf_arena_cast()
Date: Tue, 6 Feb 2024 14:04:39 -0800
Message-Id: <20240206220441.38311-15-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Introduce helper macro bpf_arena_cast() that emits:
  rX = rX
instruction with off = BPF_ARENA_CAST_KERN or off = BPF_ARENA_CAST_USER
and encodes address_space into imm32.

It's useful with older LLVM that doesn't emit this insn automatically.

Signed-off-by: Alexei Starovoitov
---
 .../testing/selftests/bpf/bpf_experimental.h | 41 +++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 0d749006d107..e73b7d48439f 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -331,6 +331,47 @@ l_true:							\
 	asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))
 #endif
 
+/* emit instruction: rX=rX .off = mode .imm32 = address_space */
+#ifndef bpf_arena_cast
+#define bpf_arena_cast(var, mode, addr_space)	\
+	({					\
+	typeof(var) __var = var;		\
+	asm volatile(".byte 0xBF;		\
+		     .ifc %[reg], r0;		\
+		     .byte 0x00;		\
+		     .endif;			\
+		     .ifc %[reg], r1;		\
+		     .byte 0x11;		\
+		     .endif;			\
+		     .ifc %[reg], r2;		\
+		     .byte 0x22;		\
+		     .endif;			\
+		     .ifc %[reg], r3;		\
+		     .byte 0x33;		\
+		     .endif;			\
+		     .ifc %[reg], r4;		\
+		     .byte 0x44;		\
+		     .endif;			\
+		     .ifc %[reg], r5;		\
+		     .byte 0x55;		\
+		     .endif;			\
+		     .ifc %[reg], r6;		\
+		     .byte 0x66;		\
+		     .endif;			\
+		     .ifc %[reg], r7;		\
+		     .byte 0x77;		\
+		     .endif;			\
+		     .ifc %[reg], r8;		\
+		     .byte 0x88;		\
+		     .endif;			\
+		     .ifc %[reg], r9;		\
+		     .byte 0x99;		\
+		     .endif;			\
+		     .short %[off]; .long %[as]"	\
+		     :: [reg]"r"(__var), [off]"i"(mode), [as]"i"(addr_space)); __var;	\
+	})
+#endif
+
 /* Description
  *	Assert that a conditional expression is true.
  * Returns
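[Editorial note, not part of the patch: the arena selftest headers later in this series wrap the macro exactly like this when LLVM cannot emit the cast itself; address space 1 is the value those headers use.]

    #define cast_kern(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_KERN, 1)
    #define cast_user(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_USER, 1)

cast_kern(p) marks p as a kernel-side arena pointer, while cast_user(p) converts
it back to the user_vm_start-based form before it is stored somewhere user space
will read it.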
From patchwork Tue Feb 6 22:04:40 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547833
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com,
    eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 15/16] selftests/bpf: Add bpf_arena_list test.
Date: Tue, 6 Feb 2024 14:04:40 -0800
Message-Id: <20240206220441.38311-16-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

bpf_arena_common.h - common helpers and macros
bpf_arena_alloc.h  - implements a page_frag allocator as a bpf program
bpf_arena_list.h   - doubly linked list as a bpf program

Compiled as a bpf program and as native C code.
Signed-off-by: Alexei Starovoitov
---
 tools/testing/selftests/bpf/DENYLIST.aarch64  |  1 +
 tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
 tools/testing/selftests/bpf/bpf_arena_alloc.h | 58 +++++++++++
 .../testing/selftests/bpf/bpf_arena_common.h  | 70 ++++++++++++++
 tools/testing/selftests/bpf/bpf_arena_list.h  | 95 +++++++++++++++++++
 .../selftests/bpf/prog_tests/arena_list.c     | 65 +++++++++++++
 .../testing/selftests/bpf/progs/arena_list.c  | 75 +++++++++++++++
 7 files changed, 365 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c

diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 5c2cc7e8c5d0..7759cff95b6f 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -11,3 +11,4 @@ fill_link_info/kprobe_multi_link_info  # bpf_program__attach_kprobe_mu
 fill_link_info/kretprobe_multi_link_info   # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 fill_link_info/kprobe_multi_invalid_ubuff  # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 missed/kprobe_recursion                    # missed_kprobe_recursion__attach unexpected error: -95 (errno 95)
+arena                                      # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 1a63996c0304..11f7b612f967 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -3,3 +3,4 @@
 exceptions            # JIT does not support calling kfunc bpf_throw (exceptions)
 get_stack_raw_tp      # user_stack corrupted user stack (no backchain userspace)
 stacktrace_build_id   # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
+arena                 # JIT does not support arena
diff --git a/tools/testing/selftests/bpf/bpf_arena_alloc.h b/tools/testing/selftests/bpf/bpf_arena_alloc.h
new file mode 100644
index 000000000000..0f4cb399b4c7
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_alloc.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include "bpf_arena_common.h"
+
+#ifndef __round_mask
+#define __round_mask(x, y) ((__typeof__(x))((y)-1))
+#endif
+#ifndef round_up
+#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
+#endif
+
+void __arena *cur_page;
+int cur_offset;
+
+/* Simple page_frag allocator */
+static inline void __arena* bpf_alloc(unsigned int size)
+{
+	__u64 __arena *obj_cnt;
+	void __arena *page = cur_page;
+	int offset;
+
+	size = round_up(size, 8);
+	if (size >= PAGE_SIZE - 8)
+		return NULL;
+	if (!page) {
+refill:
+		page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
+		if (!page)
+			return NULL;
+		cast_kern(page);
+		cur_page = page;
+		cur_offset = PAGE_SIZE - 8;
+		obj_cnt = page + PAGE_SIZE - 8;
+		*obj_cnt = 0;
+	} else {
+		cast_kern(page);
+		obj_cnt = page + PAGE_SIZE - 8;
+	}
+
+	offset = cur_offset - size;
+	if (offset < 0)
+		goto refill;
+
+	(*obj_cnt)++;
+	cur_offset = offset;
+	return page + offset;
+}
+
+static inline void bpf_free(void __arena *addr)
+{
+	__u64 __arena *obj_cnt;
+
+	addr = (void __arena *)(((long)addr) & ~(PAGE_SIZE - 1));
+	obj_cnt = addr + PAGE_SIZE - 8;
+	if (--(*obj_cnt) == 0)
+		bpf_arena_free_pages(&arena, addr, 1);
+}
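[Editorial sketch, not part of the patch: how the allocator above is used, in the spirit of the arena_list program later in this patch. `struct elem` is just an illustrative arena object; the arena map is assumed to be named `arena`.]

    struct elem { struct arena_list_node node; __u64 value; };

    struct elem __arena *n = bpf_alloc(sizeof(*n)); /* rounded up to 8 bytes, carved from the current page */
    if (n) {
        n->value = 1;
        bpf_free(n);    /* page goes back to the arena once its object count drops to 0 */
    }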
diff --git a/tools/testing/selftests/bpf/bpf_arena_common.h b/tools/testing/selftests/bpf/bpf_arena_common.h
new file mode 100644
index 000000000000..07849d502f40
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_common.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+
+#ifndef WRITE_ONCE
+#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val))
+#endif
+
+#ifndef NUMA_NO_NODE
+#define NUMA_NO_NODE (-1)
+#endif
+
+#ifndef arena_container_of
+#define arena_container_of(ptr, type, member)			\
+	({							\
+		void __arena *__mptr = (void __arena *)(ptr);	\
+		((type *)(__mptr - offsetof(type, member)));	\
+	})
+#endif
+
+#ifdef __BPF__ /* when compiled as bpf program */
+
+#ifndef PAGE_SIZE
+#define PAGE_SIZE __PAGE_SIZE
+/*
+ * for older kernels try sizeof(struct genradix_node)
+ * or flexible:
+ * static inline long __bpf_page_size(void) {
+ *   return bpf_core_enum_value(enum page_size_enum___l, __PAGE_SIZE___l) ?: sizeof(struct genradix_node);
+ * }
+ * but generated code is not great.
+ */
+#endif
+
+#if defined(__BPF_FEATURE_ARENA_CAST) && !defined(BPF_ARENA_FORCE_ASM)
+#define __arena __attribute__((address_space(1)))
+#define cast_kern(ptr) /* nop for bpf prog. emitted by LLVM */
+#define cast_user(ptr) /* nop for bpf prog. emitted by LLVM */
+#else
+#define __arena
+#define cast_kern(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_KERN, 1)
+#define cast_user(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_USER, 1)
+#endif
+
+void __arena* bpf_arena_alloc_pages(void *map, void __arena *addr, __u32 page_cnt,
+				    int node_id, __u64 flags) __ksym __weak;
+void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt) __ksym __weak;
+
+#else /* when compiled as user space code */
+
+#define __arena
+#define __arg_arena
+#define cast_kern(ptr) /* nop for user space */
+#define cast_user(ptr) /* nop for user space */
+__weak char arena[1];
+
+#ifndef offsetof
+#define offsetof(type, member) ((unsigned long)&((type *)0)->member)
+#endif
+
+static inline void __arena* bpf_arena_alloc_pages(void *map, void *addr, __u32 page_cnt,
+						  int node_id, __u64 flags)
+{
+	return NULL;
+}
+static inline void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt)
+{
+}
+
+#endif
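[Editorial sketch, not part of the patch: what a minimal arena program looks like with this header. The program body is illustrative only and assumes an arena map named `arena` in the same object; with __BPF_FEATURE_ARENA_CAST the casts are nops, under BPF_ARENA_FORCE_ASM they expand to the bpf_arena_cast() instruction.]

    SEC("syscall")
    int touch_arena(void *ctx)
    {
        __u64 __arena *p = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);

        if (!p)
            return 0;
        cast_kern(p);   /* nop when LLVM emits arena casts itself */
        *p = 42;        /* plain load/store through an __arena pointer */
        return 0;
    }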
diff --git a/tools/testing/selftests/bpf/bpf_arena_list.h b/tools/testing/selftests/bpf/bpf_arena_list.h
new file mode 100644
index 000000000000..9f34142b0f65
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_list.h
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include "bpf_arena_common.h"
+
+struct arena_list_node;
+
+typedef struct arena_list_node __arena arena_list_node_t;
+
+struct arena_list_node {
+	arena_list_node_t *next;
+	arena_list_node_t * __arena *pprev;
+};
+
+struct arena_list_head {
+	struct arena_list_node __arena *first;
+};
+typedef struct arena_list_head __arena arena_list_head_t;
+
+#define list_entry(ptr, type, member) arena_container_of(ptr, type, member)
+
+#define list_entry_safe(ptr, type, member)					\
+	({ typeof(*ptr) * ___ptr = (ptr);					\
+	   ___ptr ? ({ cast_kern(___ptr); list_entry(___ptr, type, member); }) : NULL;	\
+	})
+
+#ifndef __BPF__
+static inline void *bpf_iter_num_new(struct bpf_iter_num *, int, int) { return NULL; }
+static inline void bpf_iter_num_destroy(struct bpf_iter_num *) {}
+static inline bool bpf_iter_num_next(struct bpf_iter_num *) { return true; }
+#endif
+
+/* Safely walk link list of up to 1M elements. Deletion of elements is allowed. */
+#define list_for_each_entry(pos, head, member)					\
+	for (struct bpf_iter_num ___it __attribute__((aligned(8),		\
+						      cleanup(bpf_iter_num_destroy))),	\
+			* ___tmp = (						\
+				bpf_iter_num_new(&___it, 0, (1000000)),		\
+				pos = list_entry_safe((head)->first,		\
+						      typeof(*(pos)), member),	\
+				(void)bpf_iter_num_destroy, (void *)0);		\
+	     bpf_iter_num_next(&___it) && pos &&				\
+		({ ___tmp = (void *)pos->member.next; 1; });			\
+	     pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member))
+
+static inline void list_add_head(arena_list_node_t *n, arena_list_head_t *h)
+{
+	arena_list_node_t *first = h->first, * __arena *tmp;
+
+	cast_user(first);
+	cast_kern(n);
+	WRITE_ONCE(n->next, first);
+	cast_kern(first);
+	if (first) {
+		tmp = &n->next;
+		cast_user(tmp);
+		WRITE_ONCE(first->pprev, tmp);
+	}
+	cast_user(n);
+	WRITE_ONCE(h->first, n);
+
+	tmp = &h->first;
+	cast_user(tmp);
+	cast_kern(n);
+	WRITE_ONCE(n->pprev, tmp);
+}
+
+static inline void __list_del(arena_list_node_t *n)
+{
+	arena_list_node_t *next = n->next, *tmp;
+	arena_list_node_t * __arena *pprev = n->pprev;
+
+	cast_user(next);
+	cast_kern(pprev);
+	tmp = *pprev;
+	cast_kern(tmp);
+	WRITE_ONCE(tmp, next);
+	if (next) {
+		cast_user(pprev);
+		cast_kern(next);
+		WRITE_ONCE(next->pprev, pprev);
+	}
+}
+
+#define POISON_POINTER_DELTA 0
+
+#define LIST_POISON1 ((void __arena *) 0x100 + POISON_POINTER_DELTA)
+#define LIST_POISON2 ((void __arena *) 0x122 + POISON_POINTER_DELTA)
+
+static inline void list_del(arena_list_node_t *n)
+{
+	__list_del(n);
+	n->next = LIST_POISON1;
+	n->pprev = LIST_POISON2;
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_list.c b/tools/testing/selftests/bpf/prog_tests/arena_list.c
new file mode 100644
index 000000000000..ca3ce8abefc4
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/arena_list.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+
+#define PAGE_SIZE 4096
+
+#include "bpf_arena_list.h"
+#include "arena_list.skel.h"
+
+struct elem {
+	struct arena_list_node node;
+	__u64 value;
+};
+
+static int list_sum(struct arena_list_head *head)
+{
+	struct elem __arena *n;
+	int sum = 0;
+
+	list_for_each_entry(n, head, node)
+		sum += n->value;
+	return sum;
+}
+
+static void test_arena_list_add_del(int cnt)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	struct arena_list *skel;
+	int expected_sum = (u64)cnt * (cnt - 1) / 2;
+	int ret, sum;
+
+	skel = arena_list__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "arena_list__open_and_load"))
+		return;
+
+	skel->bss->cnt = cnt;
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_add), &opts);
+	ASSERT_OK(ret, "ret_add");
+	ASSERT_OK(opts.retval, "retval");
+	if (skel->bss->skip) {
+		printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__);
+		test__skip();
+		goto out;
+	}
+	sum = list_sum(skel->bss->list_head);
+	ASSERT_EQ(sum, expected_sum, "sum of list elems");
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_del), &opts);
+	ASSERT_OK(ret, "ret_del");
+	sum = list_sum(skel->bss->list_head);
+	ASSERT_EQ(sum, 0, "sum of list elems after del");
+	ASSERT_EQ(skel->bss->list_sum, expected_sum, "sum of list elems computed by prog");
+out:
+	arena_list__destroy(skel);
+}
+
+void test_arena_list(void)
+{
+	if (test__start_subtest("arena_list_1"))
+		test_arena_list_add_del(1);
+	if (test__start_subtest("arena_list_1000"))
+		test_arena_list_add_del(1000);
+}
diff --git a/tools/testing/selftests/bpf/progs/arena_list.c b/tools/testing/selftests/bpf/progs/arena_list.c
new file mode 100644
index 000000000000..1acdec9dadde
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_list.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+#include
+#include "bpf_experimental.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARENA);
+	__uint(map_flags, BPF_F_MMAPABLE);
+	__uint(max_entries, 1u << 24); /* max_entries * value_size == size of mmap() region */
+	__ulong(map_extra, 2ull << 44); /* start of mmap() region */
+	__type(key, __u64);
+	__type(value, __u64);
+} arena SEC(".maps");
+
+#include "bpf_arena_alloc.h"
+#include "bpf_arena_list.h"
+
+struct elem {
+	struct arena_list_node node;
+	__u64 value;
+};
+
+struct arena_list_head __arena *list_head;
+int list_sum;
+int cnt;
+bool skip = false;
+
+SEC("syscall")
+int arena_list_add(void *ctx)
+{
+#ifdef __BPF_FEATURE_ARENA_CAST
+	__u64 i;
+
+	list_head = bpf_alloc(sizeof(*list_head));
+
+	bpf_for(i, 0, cnt) {
+		struct elem __arena *n = bpf_alloc(sizeof(*n));
+
+		n->value = i;
+		list_add_head(&n->node, list_head);
+	}
+#else
+	skip = true;
+#endif
+	return 0;
+}
+
+SEC("syscall")
+int arena_list_del(void *ctx)
+{
+#ifdef __BPF_FEATURE_ARENA_CAST
+	struct elem __arena *n;
+	int sum = 0;
+
+	list_for_each_entry(n, list_head, node) {
+		sum += n->value;
+		list_del(&n->node);
+		bpf_free(n);
+	}
+	list_sum = sum;
+
+	/* triple free will not crash the kernel */
+	bpf_free(list_head);
+	bpf_free(list_head);
+	bpf_free(list_head);
+#else
+	skip = true;
+#endif
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
From patchwork Tue Feb 6 22:04:41 2024
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 13547834
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com,
    eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH bpf-next 16/16] selftests/bpf: Add bpf_arena_htab test.
Date: Tue, 6 Feb 2024 14:04:41 -0800
Message-Id: <20240206220441.38311-17-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240206220441.38311-1-alexei.starovoitov@gmail.com>
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

bpf_arena_htab.h - hash table implemented as a bpf program

Signed-off-by: Alexei Starovoitov
---
 tools/testing/selftests/bpf/bpf_arena_htab.h  | 100 ++++++++++++++++++
 .../selftests/bpf/prog_tests/arena_htab.c     |  88 +++++++++++++++
 .../testing/selftests/bpf/progs/arena_htab.c  |  48 +++++++++
 .../selftests/bpf/progs/arena_htab_asm.c      |   5 +
 4 files changed, 241 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c

diff --git a/tools/testing/selftests/bpf/bpf_arena_htab.h b/tools/testing/selftests/bpf/bpf_arena_htab.h
new file mode 100644
index 000000000000..acc01a876668
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_htab.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#pragma once
+#include
+#include "bpf_arena_alloc.h"
+#include "bpf_arena_list.h"
+
+struct htab_bucket {
+	struct arena_list_head head;
+};
+typedef struct htab_bucket __arena htab_bucket_t;
+
+struct htab {
+	htab_bucket_t *buckets;
+	int n_buckets;
+};
+typedef struct htab __arena htab_t;
+
+static inline htab_bucket_t *__select_bucket(htab_t *htab, __u32 hash)
+{
+	htab_bucket_t *b = htab->buckets;
+
+	cast_kern(b);
+	return &b[hash & (htab->n_buckets - 1)];
+}
+
+static inline arena_list_head_t *select_bucket(htab_t *htab, __u32 hash)
+{
+	return &__select_bucket(htab, hash)->head;
+}
+
+struct hashtab_elem {
+	int hash;
+	int key;
+	int value;
+	struct arena_list_node hash_node;
+};
+typedef struct hashtab_elem __arena hashtab_elem_t;
+
+static hashtab_elem_t *lookup_elem_raw(arena_list_head_t *head, __u32 hash, int key)
+{
+	hashtab_elem_t *l;
+
+	list_for_each_entry(l, head, hash_node)
+		if (l->hash == hash && l->key == key)
+			return l;
+
+	return NULL;
+}
+
+static int htab_hash(int key)
+{
+	return key;
+}
+
+__weak int htab_lookup_elem(htab_t *htab __arg_arena, int key)
+{
+	hashtab_elem_t *l_old;
+	arena_list_head_t *head;
+
+	cast_kern(htab);
+	head = select_bucket(htab, key);
+	l_old = lookup_elem_raw(head, htab_hash(key), key);
+	if (l_old)
+		return l_old->value;
+	return 0;
+}
+
+__weak int htab_update_elem(htab_t *htab __arg_arena, int key, int value)
+{
+	hashtab_elem_t *l_new = NULL, *l_old;
+	arena_list_head_t *head;
+
+	cast_kern(htab);
+	head = select_bucket(htab, key);
+	l_old = lookup_elem_raw(head, htab_hash(key), key);
+
+	l_new = bpf_alloc(sizeof(*l_new));
+	if (!l_new)
+		return -ENOMEM;
+	l_new->key = key;
+	l_new->hash = htab_hash(key);
+	l_new->value = value;
+
+	list_add_head(&l_new->hash_node, head);
+	if (l_old) {
+		list_del(&l_old->hash_node);
+		bpf_free(l_old);
+	}
+	return 0;
+}
+
+void htab_init(htab_t *htab)
+{
+	void __arena *buckets = bpf_arena_alloc_pages(&arena, NULL, 2, NUMA_NO_NODE, 0);
+
+	cast_user(buckets);
+	htab->buckets = buckets;
+	htab->n_buckets = 2 * PAGE_SIZE / sizeof(struct htab_bucket);
+}
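[Editorial note, not part of the patch: a worked sizing example for htab_init(), assuming a 4 KiB PAGE_SIZE and 8-byte arena pointers (so sizeof(struct htab_bucket) == 8, one list head per bucket).]

    /* two pages of buckets */
    n_buckets = 2 * PAGE_SIZE / sizeof(struct htab_bucket)
              = 2 * 4096 / 8
              = 1024;   /* power of two, so hash & (n_buckets - 1) in __select_bucket() is a valid mask */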
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_htab.c b/tools/testing/selftests/bpf/prog_tests/arena_htab.c
new file mode 100644
index 000000000000..0766702de846
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/arena_htab.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+
+#include "arena_htab_asm.skel.h"
+#include "arena_htab.skel.h"
+
+#define PAGE_SIZE 4096
+
+#include "bpf_arena_htab.h"
+
+static void test_arena_htab_common(struct htab *htab)
+{
+	int i;
+
+	printf("htab %p buckets %p n_buckets %d\n", htab, htab->buckets, htab->n_buckets);
+	ASSERT_OK_PTR(htab->buckets, "htab->buckets shouldn't be NULL");
+	for (i = 0; htab->buckets && i < 16; i += 4) {
+		/*
+		 * Walk htab buckets and link lists since all pointers are correct,
+		 * though they were written by bpf program.
+		 */
+		int val = htab_lookup_elem(htab, i);
+
+		ASSERT_EQ(i, val, "key == value");
+	}
+}
+
+static void test_arena_htab_llvm(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	struct arena_htab *skel;
+	struct htab *htab;
+	size_t arena_sz;
+	void *area;
+	int ret;
+
+	skel = arena_htab__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "arena_htab__open_and_load"))
+		return;
+
+	area = bpf_map__initial_value(skel->maps.arena, &arena_sz);
+	/* fault-in a page with pgoff == 0 as sanity check */
+	*(volatile int *)area = 0x55aa;
+
+	/* bpf prog will allocate more pages */
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_llvm), &opts);
+	ASSERT_OK(ret, "ret");
+	ASSERT_OK(opts.retval, "retval");
+	if (skel->bss->skip) {
+		printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__);
+		test__skip();
+		goto out;
+	}
+	htab = skel->bss->htab_for_user;
+	test_arena_htab_common(htab);
+out:
+	arena_htab__destroy(skel);
+}
+
+static void test_arena_htab_asm(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	struct arena_htab_asm *skel;
+	struct htab *htab;
+	int ret;
+
+	skel = arena_htab_asm__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "arena_htab_asm__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_asm), &opts);
+	ASSERT_OK(ret, "ret");
+	ASSERT_OK(opts.retval, "retval");
+	htab = skel->bss->htab_for_user;
+	test_arena_htab_common(htab);
+	arena_htab_asm__destroy(skel);
+}
+
+void test_arena_htab(void)
+{
+	if (test__start_subtest("arena_htab_llvm"))
+		test_arena_htab_llvm();
+	if (test__start_subtest("arena_htab_asm"))
+		test_arena_htab_asm();
+}
diff --git a/tools/testing/selftests/bpf/progs/arena_htab.c b/tools/testing/selftests/bpf/progs/arena_htab.c
new file mode 100644
index 000000000000..51a9eeb3df5a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_htab.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+#include
+#include "bpf_experimental.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARENA);
+	__uint(map_flags, BPF_F_MMAPABLE);
+	__uint(max_entries, 1u << 20); /* max_entries * value_size == size of mmap() region */
+	__type(key, __u64);
+	__type(value, __u64);
+} arena SEC(".maps");
+
+#include "bpf_arena_htab.h"
+
+void __arena *htab_for_user;
+bool skip = false;
+
+SEC("syscall")
+int arena_htab_llvm(void *ctx)
+{
+#if defined(__BPF_FEATURE_ARENA_CAST) || defined(BPF_ARENA_FORCE_ASM)
+	struct htab __arena *htab;
+	__u64 i;
+
+	htab = bpf_alloc(sizeof(*htab));
+	cast_kern(htab);
+	htab_init(htab);
+
+	/* first run. No old elems in the table */
+	bpf_for(i, 0, 1000)
+		htab_update_elem(htab, i, i);
+
+	/* should replace all elems with new ones */
+	bpf_for(i, 0, 1000)
+		htab_update_elem(htab, i, i);
+	cast_user(htab);
+	htab_for_user = htab;
+#else
+	skip = true;
+#endif
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/arena_htab_asm.c b/tools/testing/selftests/bpf/progs/arena_htab_asm.c
new file mode 100644
index 000000000000..6cd70ea12f0d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_htab_asm.c
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#define BPF_ARENA_FORCE_ASM
+#define arena_htab_llvm arena_htab_asm
+#include "arena_htab.c"