From patchwork Fri Feb 9 04:05:49 2024
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 01/20] bpf: Allow kfuncs return 'void *'
Date: Thu, 8 Feb 2024 20:05:49 -0800
Message-Id: <20240209040608.98927-2-alexei.starovoitov@gmail.com>

Recognize return of 'void *' from a kfunc as returning an unknown scalar.
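
As an illustration only (not part of this patch): a minimal sketch of what the
change permits on the BPF program side. The kfunc bpf_example_get_cookie()
below is hypothetical; the point is that a kfunc whose BTF return type is
'void *' is now accepted, and the verifier treats the returned value as an
unknown scalar, so the program may store and compare it but not dereference it
as a trusted pointer.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Hypothetical kfunc returning 'void *', declared as a ksym extern. */
extern void *bpf_example_get_cookie(void) __ksym;

SEC("xdp")
int use_cookie(struct xdp_md *ctx)
{
        void *p = bpf_example_get_cookie();

        /* 'p' is an unknown scalar to the verifier: comparing and storing it
         * is fine, dereferencing it directly is not.
         */
        return p ? XDP_PASS : XDP_DROP;
}

char _license[] SEC("license") = "GPL";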
Acked-by: Andrii Nakryiko
Signed-off-by: Alexei Starovoitov
Acked-by: Kumar Kartikeya Dwivedi
---
 kernel/bpf/verifier.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ddaf09db1175..d9c2dbb3939f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12353,6 +12353,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
                                 meta.func_name);
                         return -EFAULT;
                 }
+       } else if (btf_type_is_void(ptr_type)) {
+               /* kfunc returning 'void *' is equivalent to returning scalar */
+               mark_reg_unknown(env, regs, BPF_REG_0);
        } else if (!__btf_type_is_struct(ptr_type)) {
                if (!meta.r0_size) {
                        __u32 sz;

From patchwork Fri Feb 9 04:05:50 2024
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 02/20] bpf: Recognize '__map' suffix in kfunc arguments
Date: Thu, 8 Feb 2024 20:05:50 -0800
Message-Id: <20240209040608.98927-3-alexei.starovoitov@gmail.com>

Recognize a 'void *p__map' kfunc argument as 'struct bpf_map *p__map'.
This allows a kfunc to take a 'void *' argument for maps, since bpf programs
will call it as:

struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        ...
} arena SEC(".maps");

bpf_kfunc_with_map(... &arena ...);

Underneath, libbpf loads CONST_PTR_TO_MAP into the register via an ld_imm64
insn. If the kfunc were defined with 'struct bpf_map *' it would pass the
verifier, but the bpf program would have to write '(void *)&arena', which is
not clean.
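
To make the calling convention concrete, a hedged sketch of the BPF program
side follows. bpf_kfunc_with_map() is the illustrative name used in the commit
message (the arena kfuncs added later in this series follow the same '__map'
convention), and the exact map definition and section name are assumptions.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

/* Kernel-side prototype would be: void *bpf_kfunc_with_map(void *p__map, ...);
 * the '__map' suffix makes the verifier require a map pointer here.
 */
extern void *bpf_kfunc_with_map(void *p__map, __u32 page_cnt) __ksym;

struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        __uint(map_flags, BPF_F_MMAPABLE);
        __uint(max_entries, 1);
} arena SEC(".maps");

SEC("syscall")
int probe(void *ctx)
{
        /* &arena is loaded as CONST_PTR_TO_MAP via ld_imm64; no '(void *)&arena'
         * cast is needed even though the kfunc argument is 'void *'.
         */
        void *p = bpf_kfunc_with_map(&arena, 1);

        return p ? 0 : 1;
}

char _license[] SEC("license") = "GPL";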
Signed-off-by: Alexei Starovoitov
Acked-by: Kumar Kartikeya Dwivedi
---
 kernel/bpf/verifier.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d9c2dbb3939f..db569ce89fb1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10741,6 +10741,11 @@ static bool is_kfunc_arg_ignore(const struct btf *btf, const struct btf_param *a
        return __kfunc_param_match_suffix(btf, arg, "__ign");
 }
 
+static bool is_kfunc_arg_map(const struct btf *btf, const struct btf_param *arg)
+{
+       return __kfunc_param_match_suffix(btf, arg, "__map");
+}
+
 static bool is_kfunc_arg_alloc_obj(const struct btf *btf, const struct btf_param *arg)
 {
        return __kfunc_param_match_suffix(btf, arg, "__alloc");
@@ -11064,7 +11069,7 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
                return KF_ARG_PTR_TO_CONST_STR;
 
        if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) {
-               if (!btf_type_is_struct(ref_t)) {
+               if (!btf_type_is_struct(ref_t) && !btf_type_is_void(ref_t)) {
                        verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n",
                                meta->func_name, argno, btf_type_str(ref_t), ref_tname);
                        return -EINVAL;
@@ -11660,6 +11665,13 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
                if (kf_arg_type < 0)
                        return kf_arg_type;
 
+               if (is_kfunc_arg_map(btf, &args[i])) {
+                       /* If argument has '__map' suffix expect 'struct bpf_map *' */
+                       ref_id = *reg2btf_ids[CONST_PTR_TO_MAP];
+                       ref_t = btf_type_by_id(btf_vmlinux, ref_id);
+                       ref_tname = btf_name_by_offset(btf, ref_t->name_off);
+               }
+
                switch (kf_arg_type) {
                case KF_ARG_PTR_TO_NULL:
                        continue;

From patchwork Fri Feb 9 04:05:51 2024
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 03/20] bpf: Plumb get_unmapped_area() callback into bpf_map_ops
Date: Thu, 8 Feb 2024 20:05:51 -0800
Message-Id: <20240209040608.98927-4-alexei.starovoitov@gmail.com>

Subsequent patches introduce bpf_arena, which imposes special alignment
requirements on address selection.

Signed-off-by: Alexei Starovoitov
Acked-by: Kumar Kartikeya Dwivedi
---
 include/linux/bpf.h  |  3 +++
 kernel/bpf/syscall.c | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1ebbee1d648e..8b0dcb66eb33 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -139,6 +139,9 @@ struct bpf_map_ops {
        int (*map_mmap)(struct bpf_map *map, struct vm_area_struct *vma);
        __poll_t (*map_poll)(struct bpf_map *map, struct file *filp,
                             struct poll_table_struct *pts);
+       unsigned long (*map_get_unmapped_area)(struct file *filep, unsigned long addr,
+                                              unsigned long len, unsigned long pgoff,
+                                              unsigned long flags);
 
        /* Functions called by bpf_local_storage maps */
        int (*map_local_storage_charge)(struct bpf_local_storage_map *smap,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b2750b79ac80..8dd9814a0e14 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -937,6 +937,17 @@ static __poll_t bpf_map_poll(struct file *filp, struct poll_table_struct *pts)
        return EPOLLERR;
 }
 
+static unsigned long bpf_get_unmapped_area(struct file *filp, unsigned long addr,
+                                          unsigned long len, unsigned long pgoff,
+                                          unsigned long flags)
+{
+       struct bpf_map *map = filp->private_data;
+
+       if (map->ops->map_get_unmapped_area)
+               return map->ops->map_get_unmapped_area(filp, addr, len, pgoff, flags);
+       return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+
 const struct file_operations bpf_map_fops = {
 #ifdef CONFIG_PROC_FS
        .show_fdinfo    = bpf_map_show_fdinfo,
@@ -946,6 +957,7 @@ const struct file_operations bpf_map_fops = {
        .write          = bpf_dummy_write,
        .mmap           = bpf_map_mmap,
        .poll           = bpf_map_poll,
+       .get_unmapped_area = bpf_get_unmapped_area,
 };
 
 int bpf_map_new_fd(struct bpf_map *map, int flags)
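
For orientation, a hedged sketch of how a map implementation might use the new
hook. The function below is modeled on the arena_get_unmapped_area() that
appears later in this series; the name and the map_ops wiring are assumptions,
not part of this patch.

static unsigned long example_map_get_unmapped_area(struct file *filp,
                                                   unsigned long addr,
                                                   unsigned long len,
                                                   unsigned long pgoff,
                                                   unsigned long flags)
{
        unsigned long ret;

        /* Ask the mm for a window twice as large, then pick a start address
         * such that the whole mapping does not cross a 4GB boundary.
         */
        ret = current->mm->get_unmapped_area(filp, addr, len * 2, 0, flags);
        if (IS_ERR_VALUE(ret))
                return ret;
        if ((ret >> 32) == ((ret + len - 1) >> 32))
                return ret;
        return round_up(ret, 1ull << 32);
}

/* Hooked up via: .map_get_unmapped_area = example_map_get_unmapped_area, */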
From patchwork Fri Feb 9 04:05:52 2024
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 04/20] mm: Expose vmap_pages_range() to the rest of the kernel.
Date: Thu, 8 Feb 2024 20:05:52 -0800
Message-Id: <20240209040608.98927-5-alexei.starovoitov@gmail.com>

BPF would like to use the vmap API to implement a lazily-populated memory
space which can be shared by multiple userspace threads.

The vmap API is generally public and has functions to request and release
areas of kernel address space, as well as functions to map various types of
backing memory into that space. For example, there is the public
ioremap_page_range(), which is used to map device memory into addressable
kernel space.

The new BPF code needs the functionality of vmap_pages_range() in order to
incrementally map privately managed arrays of pages into its vmap area.
Indeed, this function used to be public, but became private when use cases
other than vmalloc happened to disappear. Make it public again for the new
external user.

The next commits will introduce bpf_arena, which is a sparsely populated
shared memory region between a bpf program and a user space process. It will
map privately-managed pages into an existing vm area. It is the same pattern
and layer of abstraction as ioremap_page_range().

Signed-off-by: Alexei Starovoitov
Acked-by: Kumar Kartikeya Dwivedi
---
 include/linux/vmalloc.h | 2 ++
 mm/vmalloc.c            | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index c720be70c8dd..bafb87c69e3d 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -233,6 +233,8 @@ static inline bool is_vm_area_hugepages(const void *addr)
 
 #ifdef CONFIG_MMU
 void vunmap_range(unsigned long addr, unsigned long end);
+int vmap_pages_range(unsigned long addr, unsigned long end,
+                    pgprot_t prot, struct page **pages, unsigned int page_shift);
 static inline void set_vm_flush_reset_perms(void *addr)
 {
        struct vm_struct *vm = find_vm_area(addr);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d12a17fc0c17..eae93d575d1b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -625,8 +625,8 @@ int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
  * RETURNS:
  * 0 on success, -errno on failure.
  */
-static int vmap_pages_range(unsigned long addr, unsigned long end,
-               pgprot_t prot, struct page **pages, unsigned int page_shift)
+int vmap_pages_range(unsigned long addr, unsigned long end,
+               pgprot_t prot, struct page **pages, unsigned int page_shift)
 {
        int err;
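
Not part of the patch, but to make the intended use concrete: a hedged sketch
of the pattern the commit message describes, mapping one privately managed
page into a previously reserved kernel VA range. The helper name and its
caller are assumptions; the arena fault handler in the next patch does
essentially this under its own locking.

#include <linux/vmalloc.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* 'area' is assumed to come from get_vm_area(size, VM_MAP). */
static int example_map_one_page(struct vm_struct *area, unsigned long offset)
{
        unsigned long addr = (unsigned long)area->addr + offset;
        struct page *page;
        int err;

        page = alloc_page(GFP_KERNEL | __GFP_ZERO);
        if (!page)
                return -ENOMEM;

        /* Map the privately managed page into the reserved kernel range. */
        err = vmap_pages_range(addr, addr + PAGE_SIZE, PAGE_KERNEL,
                               &page, PAGE_SHIFT);
        if (err)
                __free_page(page);
        return err;
}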
From patchwork Fri Feb 9 04:05:53 2024
From: Alexei Starovoitov
Subject: [PATCH v2 bpf-next 05/20] bpf: Introduce bpf_arena.
Date: Thu, 8 Feb 2024 20:05:53 -0800
Message-Id: <20240209040608.98927-6-alexei.starovoitov@gmail.com>

Introduce bpf_arena, which is a sparse shared memory region between the bpf
program and user space.

Use cases:

1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed
   anonymous region, like memcached or any key/value storage. The bpf
   program implements an in-kernel accelerator. An XDP prog can search for
   a key in bpf_arena and return a value without going to user space.
2. The bpf program builds arbitrary data structures in bpf_arena (hash
   tables, rb-trees, sparse arrays), while user space consumes them.
3. bpf_arena is a "heap" of memory from the bpf program's point of view.
   User space may mmap it, but the bpf program will not convert pointers
   to the user base at run-time, to improve bpf program speed.

Initially, the kernel vm_area and user vma are not populated. User space can
fault in pages within the range. While servicing a page fault, bpf_arena
logic will insert a new page into the kernel and user vmas. The bpf program
can allocate pages from that region via bpf_arena_alloc_pages(). This kernel
function will insert pages into the kernel vm_area. The subsequent fault-in
from user space will populate that page into the user vma. The
BPF_F_SEGV_ON_FAULT flag at arena creation time can be used to prevent
fault-in from user space. In such a case, if a page is not allocated by the
bpf program and not present in the kernel vm_area, the user process will
segfault. This is useful for use cases 2 and 3 above.

bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages
either at a specific address within the arena or anywhere in the range, with
the maple tree tracking allocations. bpf_arena_free_pages() is analogous to
munmap(): it frees pages and removes the range from the kernel vm_area and
from user process vmas.

bpf_arena can be used as a bpf program "heap" of up to 4GB. In that case the
speed of the bpf program is more important than ease of sharing with user
space. This is use case 3. Here the BPF_F_NO_USER_CONV flag is recommended.
It tells the verifier to treat the rX = bpf_arena_cast_user(rY) instruction
as a 32-bit move wX = wY, which improves bpf prog performance. Otherwise,
bpf_arena_cast_user is translated by the JIT to conditionally add the upper
32 bits of user vm_start (if the pointer is not NULL) to arena pointers
before they are stored into memory. This way, user space sees them as valid
64-bit pointers.

Diff https://github.com/llvm/llvm-project/pull/79902 taught the LLVM BPF
backend to generate the bpf_cast_kern() instruction before a dereference of
an arena pointer and the bpf_cast_user() instruction when an arena pointer
is formed. In a typical bpf program there will be very few bpf_cast_user().
From LLVM's point of view, arena pointers are tagged as
__attribute__((address_space(1))). Hence, clang provides helpful diagnostics
when pointers cross address spaces. Libbpf and the kernel support only
address_space == 1. All other address space identifiers are reserved.

rX = bpf_cast_kern(rY, addr_space) tells the verifier that
rX->type = PTR_TO_ARENA. Any further operations on a PTR_TO_ARENA register
have to be in the 32-bit domain. The verifier will mark loads/stores through
PTR_TO_ARENA with PROBE_MEM32. The JIT will generate them as
kern_vm_start + 32bit_addr memory accesses. The behavior is similar to
copy_from_kernel_nofault() except that no address checks are necessary: the
address is guaranteed to be in the 4GB range. If the page is not present,
the destination register is zeroed on read, and the operation is ignored on
write.

rX = bpf_cast_user(rY, addr_space) tells the verifier that rX->type =
unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the
verifier converts cast_user to mov32. Otherwise, the JIT will emit native
code equivalent to:

rX = (u32)rY;
if (rY)
        rX |= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */

After such a conversion, the pointer becomes a valid user pointer within the
bpf_arena range. The user process can access data structures created in
bpf_arena without any additional computations. For example, a linked list
built by a bpf program can be walked natively by user space.
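
Before the diff, a hedged usage sketch from the BPF side. The map definition,
section name, and extern declarations below are illustrative assumptions (the
selftests added later in this series provide their own wrappers), and
dereferencing the returned pointer from C additionally relies on the
address_space(1) support described above, which is out of scope here.

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_ARENA);
        __uint(map_flags, BPF_F_MMAPABLE);
        __uint(max_entries, 1000);      /* number of pages backing the arena */
} arena SEC(".maps");

extern void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, __u32 page_cnt,
                                   int node_id, __u64 flags) __ksym;
extern void bpf_arena_free_pages(void *p__map, void *ptr__ign, __u32 page_cnt) __ksym;

SEC("syscall")
int alloc_and_free(void *ctx)
{
        /* Allocate one page anywhere in the arena on any NUMA node.
         * The returned pointer is user-space visible unless the arena was
         * created with BPF_F_NO_USER_CONV.
         */
        void *p = bpf_arena_alloc_pages(&arena, NULL, 1, -1 /* NUMA_NO_NODE */, 0);

        if (!p)
                return 1;
        bpf_arena_free_pages(&arena, p, 1);
        return 0;
}

char _license[] SEC("license") = "GPL";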
Signed-off-by: Alexei Starovoitov Reviewed-by: Barret Rhoden --- include/linux/bpf.h | 5 +- include/linux/bpf_types.h | 1 + include/uapi/linux/bpf.h | 7 + kernel/bpf/Makefile | 3 + kernel/bpf/arena.c | 557 +++++++++++++++++++++++++++++++++ kernel/bpf/core.c | 11 + kernel/bpf/syscall.c | 3 + kernel/bpf/verifier.c | 1 + tools/include/uapi/linux/bpf.h | 7 + 9 files changed, 593 insertions(+), 2 deletions(-) create mode 100644 kernel/bpf/arena.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 8b0dcb66eb33..de557c6c42e0 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -37,6 +37,7 @@ struct perf_event; struct bpf_prog; struct bpf_prog_aux; struct bpf_map; +struct bpf_arena; struct sock; struct seq_file; struct btf; @@ -534,8 +535,8 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head, struct bpf_spin_lock *spin_lock); void bpf_rb_root_free(const struct btf_field *field, void *rb_root, struct bpf_spin_lock *spin_lock); - - +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena); +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena); int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size); struct bpf_offload_dev; diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 94baced5a1ad..9f2a6b83b49e 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -132,6 +132,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_STRUCT_OPS, bpf_struct_ops_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops) +BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops) BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint) BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d96708380e52..f6648851eae6 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -983,6 +983,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + BPF_MAP_TYPE_ARENA, __MAX_BPF_MAP_TYPE }; @@ -1370,6 +1371,12 @@ enum { /* BPF token FD is passed in a corresponding command's token_fd field */ BPF_F_TOKEN_FD = (1U << 16), + +/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */ + BPF_F_SEGV_ON_FAULT = (1U << 17), + +/* Do not translate kernel bpf_arena pointers to user pointers */ + BPF_F_NO_USER_CONV = (1U << 18), }; /* Flags for BPF_PROG_QUERY. */ diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 4ce95acfcaa7..368c5d86b5b7 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -15,6 +15,9 @@ obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_SYSCALL) += btf.o memalloc.o +ifeq ($(CONFIG_MMU)$(CONFIG_64BIT),yy) +obj-$(CONFIG_BPF_SYSCALL) += arena.o +endif obj-$(CONFIG_BPF_JIT) += dispatcher.o ifeq ($(CONFIG_NET),y) obj-$(CONFIG_BPF_SYSCALL) += devmap.o diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c new file mode 100644 index 000000000000..5c1014471740 --- /dev/null +++ b/kernel/bpf/arena.c @@ -0,0 +1,557 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#include +#include +#include +#include +#include +#include + +/* + * bpf_arena is a sparsely populated shared memory region between bpf program and + * user space process. 
+ * + * For example on x86-64 the values could be: + * user_vm_start 7f7d26200000 // picked by mmap() + * kern_vm_start ffffc90001e69000 // picked by get_vm_area() + * For user space all pointers within the arena are normal 8-byte addresses. + * In this example 7f7d26200000 is the address of the first page (pgoff=0). + * The bpf program will access it as: kern_vm_start + lower_32bit_of_user_ptr + * (u32)7f7d26200000 -> 26200000 + * hence + * ffffc90001e69000 + 26200000 == ffffc90028069000 is "pgoff=0" within 4Gb + * kernel memory region. + * + * BPF JITs generate the following code to access arena: + * mov eax, eax // eax has lower 32-bit of user pointer + * mov word ptr [rax + r12 + off], bx + * where r12 == kern_vm_start and off is s16. + * Hence allocate 4Gb + GUARD_SZ/2 on each side. + * + * Initially kernel vm_area and user vma are not populated. + * User space can fault-in any address which will insert the page + * into kernel and user vma. + * bpf program can allocate a page via bpf_arena_alloc_pages() kfunc + * which will insert it into kernel vm_area. + * The later fault-in from user space will populate that page into user vma. + */ + +/* number of bytes addressable by LDX/STX insn with 16-bit 'off' field */ +#define GUARD_SZ (1ull << sizeof(((struct bpf_insn *)0)->off) * 8) +#define KERN_VM_SZ ((1ull << 32) + GUARD_SZ) + +struct bpf_arena { + struct bpf_map map; + u64 user_vm_start; + u64 user_vm_end; + struct vm_struct *kern_vm; + struct maple_tree mt; + struct list_head vma_list; + struct mutex lock; +}; + +u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena) +{ + return arena ? (u64) (long) arena->kern_vm->addr + GUARD_SZ / 2 : 0; +} + +u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena) +{ + return arena ? arena->user_vm_start : 0; +} + +static long arena_map_peek_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static long arena_map_push_elem(struct bpf_map *map, void *value, u64 flags) +{ + return -EOPNOTSUPP; +} + +static long arena_map_pop_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static long arena_map_delete_elem(struct bpf_map *map, void *value) +{ + return -EOPNOTSUPP; +} + +static int arena_map_get_next_key(struct bpf_map *map, void *key, void *next_key) +{ + return -EOPNOTSUPP; +} + +static long compute_pgoff(struct bpf_arena *arena, long uaddr) +{ + return (u32)(uaddr - (u32)arena->user_vm_start) >> PAGE_SHIFT; +} + +static struct bpf_map *arena_map_alloc(union bpf_attr *attr) +{ + struct vm_struct *kern_vm; + int numa_node = bpf_map_attr_numa_node(attr); + struct bpf_arena *arena; + u64 vm_range; + int err = -ENOMEM; + + if (attr->key_size || attr->value_size || attr->max_entries == 0 || + /* BPF_F_MMAPABLE must be set */ + !(attr->map_flags & BPF_F_MMAPABLE) || + /* No unsupported flags present */ + (attr->map_flags & ~(BPF_F_SEGV_ON_FAULT | BPF_F_MMAPABLE | BPF_F_NO_USER_CONV))) + return ERR_PTR(-EINVAL); + + if (attr->map_extra & ~PAGE_MASK) + /* If non-zero the map_extra is an expected user VMA start address */ + return ERR_PTR(-EINVAL); + + vm_range = (u64)attr->max_entries * PAGE_SIZE; + if (vm_range > (1ull << 32)) + return ERR_PTR(-E2BIG); + + if ((attr->map_extra >> 32) != ((attr->map_extra + vm_range - 1) >> 32)) + /* user vma must not cross 32-bit boundary */ + return ERR_PTR(-ERANGE); + + kern_vm = get_vm_area(KERN_VM_SZ, VM_MAP | VM_USERMAP); + if (!kern_vm) + return ERR_PTR(-ENOMEM); + + arena = bpf_map_area_alloc(sizeof(*arena), numa_node); + if (!arena) + goto err; + + arena->kern_vm = 
kern_vm; + arena->user_vm_start = attr->map_extra; + if (arena->user_vm_start) + arena->user_vm_end = arena->user_vm_start + vm_range; + + INIT_LIST_HEAD(&arena->vma_list); + bpf_map_init_from_attr(&arena->map, attr); + mt_init_flags(&arena->mt, MT_FLAGS_ALLOC_RANGE); + mutex_init(&arena->lock); + + return &arena->map; +err: + free_vm_area(kern_vm); + return ERR_PTR(err); +} + +static int for_each_pte(pte_t *ptep, unsigned long addr, void *data) +{ + struct page *page; + pte_t pte; + + pte = ptep_get(ptep); + if (!pte_present(pte)) + return 0; + page = pte_page(pte); + /* + * We do not update pte here: + * 1. Nobody should be accessing bpf_arena's range outside of a kernel bug + * 2. TLB flushing is batched or deferred. Even if we clear pte, + * the TLB entries can stick around and continue to permit access to + * the freed page. So it all relies on 1. + */ + __free_page(page); + return 0; +} + +static void arena_map_free(struct bpf_map *map) +{ + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + /* + * Check that user vma-s are not around when bpf map is freed. + * mmap() holds vm_file which holds bpf_map refcnt. + * munmap() must have happened on vma followed by arena_vm_close() + * which would clear arena->vma_list. + */ + if (WARN_ON_ONCE(!list_empty(&arena->vma_list))) + return; + + /* + * free_vm_area() calls remove_vm_area() that calls free_unmap_vmap_area(). + * It unmaps everything from vmalloc area and clears pgtables. + * Call apply_to_existing_page_range() first to find populated ptes and + * free those pages. + */ + apply_to_existing_page_range(&init_mm, bpf_arena_get_kern_vm_start(arena), + KERN_VM_SZ - GUARD_SZ / 2, for_each_pte, NULL); + free_vm_area(arena->kern_vm); + mtree_destroy(&arena->mt); + bpf_map_area_free(arena); +} + +static void *arena_map_lookup_elem(struct bpf_map *map, void *key) +{ + return ERR_PTR(-EINVAL); +} + +static long arena_map_update_elem(struct bpf_map *map, void *key, + void *value, u64 flags) +{ + return -EOPNOTSUPP; +} + +static int arena_map_check_btf(const struct bpf_map *map, const struct btf *btf, + const struct btf_type *key_type, const struct btf_type *value_type) +{ + return 0; +} + +static u64 arena_map_mem_usage(const struct bpf_map *map) +{ + return 0; +} + +struct vma_list { + struct vm_area_struct *vma; + struct list_head head; +}; + +static int remember_vma(struct bpf_arena *arena, struct vm_area_struct *vma) +{ + struct vma_list *vml; + + vml = kmalloc(sizeof(*vml), GFP_KERNEL); + if (!vml) + return -ENOMEM; + vma->vm_private_data = vml; + vml->vma = vma; + list_add(&vml->head, &arena->vma_list); + return 0; +} + +static void arena_vm_close(struct vm_area_struct *vma) +{ + struct bpf_map *map = vma->vm_file->private_data; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + struct vma_list *vml; + + guard(mutex)(&arena->lock); + vml = vma->vm_private_data; + list_del(&vml->head); + vma->vm_private_data = NULL; + kfree(vml); +} + +#define MT_ENTRY ((void *)&arena_map_ops) /* unused. 
has to be valid pointer */ + +static vm_fault_t arena_vm_fault(struct vm_fault *vmf) +{ + struct bpf_map *map = vmf->vma->vm_file->private_data; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + struct page *page; + long kbase, kaddr; + int ret; + + kbase = bpf_arena_get_kern_vm_start(arena); + kaddr = kbase + (u32)(vmf->address & PAGE_MASK); + + guard(mutex)(&arena->lock); + page = vmalloc_to_page((void *)kaddr); + if (page) + /* already have a page vmap-ed */ + goto out; + + if (arena->map.map_flags & BPF_F_SEGV_ON_FAULT) + /* User space requested to segfault when page is not allocated by bpf prog */ + return VM_FAULT_SIGSEGV; + + ret = mtree_insert(&arena->mt, vmf->pgoff, MT_ENTRY, GFP_KERNEL); + if (ret) + return VM_FAULT_SIGSEGV; + + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) { + mtree_erase(&arena->mt, vmf->pgoff); + return VM_FAULT_SIGSEGV; + } + + ret = vmap_pages_range(kaddr, kaddr + PAGE_SIZE, PAGE_KERNEL, &page, PAGE_SHIFT); + if (ret) { + mtree_erase(&arena->mt, vmf->pgoff); + __free_page(page); + return VM_FAULT_SIGSEGV; + } +out: + page_ref_add(page, 1); + vmf->page = page; + return 0; +} + +static const struct vm_operations_struct arena_vm_ops = { + .close = arena_vm_close, + .fault = arena_vm_fault, +}; + +static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags) +{ + struct bpf_map *map = filp->private_data; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + long ret; + + if (pgoff) + return -EINVAL; + if (len > (1ull << 32)) + return -E2BIG; + + /* if user_vm_start was specified at arena creation time */ + if (arena->user_vm_start) { + if (len > arena->user_vm_end - arena->user_vm_start) + return -E2BIG; + if (len != arena->user_vm_end - arena->user_vm_start) + return -EINVAL; + if (addr != arena->user_vm_start) + return -EINVAL; + } + + ret = current->mm->get_unmapped_area(filp, addr, len * 2, 0, flags); + if (IS_ERR_VALUE(ret)) + return 0; + if ((ret >> 32) == ((ret + len - 1) >> 32)) + return ret; + if (WARN_ON_ONCE(arena->user_vm_start)) + /* checks at map creation time should prevent this */ + return -EFAULT; + return round_up(ret, 1ull << 32); +} + +static int arena_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) +{ + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + guard(mutex)(&arena->lock); + if (arena->user_vm_start && arena->user_vm_start != vma->vm_start) + /* + * If map_extra was not specified at arena creation time then + * 1st user process can do mmap(NULL, ...) to pick user_vm_start + * 2nd user process must pass the same addr to mmap(addr, MAP_FIXED..); + * or + * specify addr in map_extra and + * use the same addr later with mmap(addr, MAP_FIXED..); + */ + return -EBUSY; + + if (arena->user_vm_end && arena->user_vm_end != vma->vm_end) + /* all user processes must have the same size of mmap-ed region */ + return -EBUSY; + + /* Earlier checks should prevent this */ + if (WARN_ON_ONCE(vma->vm_end - vma->vm_start > (1ull << 32) || vma->vm_pgoff)) + return -EFAULT; + + if (remember_vma(arena, vma)) + return -ENOMEM; + + arena->user_vm_start = vma->vm_start; + arena->user_vm_end = vma->vm_end; + /* + * bpf_map_mmap() checks that it's being mmaped as VM_SHARED and + * clears VM_MAYEXEC. Set VM_DONTEXPAND as well to avoid + * potential change of user_vm_start. 
+ */ + vm_flags_set(vma, VM_DONTEXPAND); + vma->vm_ops = &arena_vm_ops; + return 0; +} + +static int arena_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off) +{ + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + if ((u64)off > arena->user_vm_end - arena->user_vm_start) + return -ERANGE; + *imm = (unsigned long)arena->user_vm_start; + return 0; +} + +BTF_ID_LIST_SINGLE(bpf_arena_map_btf_ids, struct, bpf_arena) +const struct bpf_map_ops arena_map_ops = { + .map_meta_equal = bpf_map_meta_equal, + .map_alloc = arena_map_alloc, + .map_free = arena_map_free, + .map_direct_value_addr = arena_map_direct_value_addr, + .map_mmap = arena_map_mmap, + .map_get_unmapped_area = arena_get_unmapped_area, + .map_get_next_key = arena_map_get_next_key, + .map_push_elem = arena_map_push_elem, + .map_peek_elem = arena_map_peek_elem, + .map_pop_elem = arena_map_pop_elem, + .map_lookup_elem = arena_map_lookup_elem, + .map_update_elem = arena_map_update_elem, + .map_delete_elem = arena_map_delete_elem, + .map_check_btf = arena_map_check_btf, + .map_mem_usage = arena_map_mem_usage, + .map_btf_id = &bpf_arena_map_btf_ids[0], +}; + +static u64 clear_lo32(u64 val) +{ + return val & ~(u64)~0U; +} + +/* + * Allocate pages and vmap them into kernel vmalloc area. + * Later the pages will be mmaped into user space vma. + */ +static long arena_alloc_pages(struct bpf_arena *arena, long uaddr, long page_cnt, int node_id) +{ + /* user_vm_end/start are fixed before bpf prog runs */ + long page_cnt_max = (arena->user_vm_end - arena->user_vm_start) >> PAGE_SHIFT; + u64 kern_vm_start = bpf_arena_get_kern_vm_start(arena); + long pgoff = 0, nr_pages = 0; + struct page **pages; + u32 uaddr32; + int ret, i; + + if (page_cnt > page_cnt_max) + return 0; + + if (uaddr) { + if (uaddr & ~PAGE_MASK) + return 0; + pgoff = compute_pgoff(arena, uaddr); + if (pgoff + page_cnt > page_cnt_max) + /* requested address will be outside of user VMA */ + return 0; + } + + /* zeroing is needed, since alloc_pages_bulk_array() only fills in non-zero entries */ + pages = kvcalloc(page_cnt, sizeof(struct page *), GFP_KERNEL); + if (!pages) + return 0; + + guard(mutex)(&arena->lock); + + if (uaddr) + ret = mtree_insert_range(&arena->mt, pgoff, pgoff + page_cnt - 1, + MT_ENTRY, GFP_KERNEL); + else + ret = mtree_alloc_range(&arena->mt, &pgoff, MT_ENTRY, + page_cnt, 0, page_cnt_max - 1, GFP_KERNEL); + if (ret) + goto out_free_pages; + + nr_pages = alloc_pages_bulk_array_node(GFP_KERNEL | __GFP_ZERO, node_id, page_cnt, pages); + if (nr_pages != page_cnt) + goto out; + + uaddr32 = (u32)(arena->user_vm_start + pgoff * PAGE_SIZE); + /* Earlier checks make sure that uaddr32 + page_cnt * PAGE_SIZE will not overflow 32-bit */ + ret = vmap_pages_range(kern_vm_start + uaddr32, + kern_vm_start + uaddr32 + page_cnt * PAGE_SIZE, + PAGE_KERNEL, pages, PAGE_SHIFT); + if (ret) + goto out; + kvfree(pages); + return clear_lo32(arena->user_vm_start) + uaddr32; +out: + mtree_erase(&arena->mt, pgoff); +out_free_pages: + if (pages) + for (i = 0; i < nr_pages; i++) + __free_page(pages[i]); + kvfree(pages); + return 0; +} + +/* + * If page is present in vmalloc area, unmap it from vmalloc area, + * unmap it from all user space vma-s, + * and free it. 
+ */ +static void zap_pages(struct bpf_arena *arena, long uaddr, long page_cnt) +{ + struct vma_list *vml; + + list_for_each_entry(vml, &arena->vma_list, head) + zap_page_range_single(vml->vma, uaddr, + PAGE_SIZE * page_cnt, NULL); +} + +static void arena_free_pages(struct bpf_arena *arena, long uaddr, long page_cnt) +{ + u64 full_uaddr, uaddr_end; + long kaddr, pgoff, i; + struct page *page; + + /* only aligned lower 32-bit are relevant */ + uaddr = (u32)uaddr; + uaddr &= PAGE_MASK; + full_uaddr = clear_lo32(arena->user_vm_start) + uaddr; + uaddr_end = min(arena->user_vm_end, full_uaddr + (page_cnt << PAGE_SHIFT)); + if (full_uaddr >= uaddr_end) + return; + + page_cnt = (uaddr_end - full_uaddr) >> PAGE_SHIFT; + + guard(mutex)(&arena->lock); + + pgoff = compute_pgoff(arena, uaddr); + /* clear range */ + mtree_store_range(&arena->mt, pgoff, pgoff + page_cnt - 1, NULL, GFP_KERNEL); + + if (page_cnt > 1) + /* bulk zap if multiple pages being freed */ + zap_pages(arena, full_uaddr, page_cnt); + + kaddr = bpf_arena_get_kern_vm_start(arena) + uaddr; + for (i = 0; i < page_cnt; i++, kaddr += PAGE_SIZE, full_uaddr += PAGE_SIZE) { + page = vmalloc_to_page((void *)kaddr); + if (!page) + continue; + if (page_cnt == 1 && page_mapped(page)) /* mapped by some user process */ + zap_pages(arena, full_uaddr, 1); + vunmap_range(kaddr, kaddr + PAGE_SIZE); + __free_page(page); + } +} + +__bpf_kfunc_start_defs(); + +__bpf_kfunc void *bpf_arena_alloc_pages(void *p__map, void *addr__ign, u32 page_cnt, + int node_id, u64 flags) +{ + struct bpf_map *map = p__map; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + if (map->map_type != BPF_MAP_TYPE_ARENA || flags || !page_cnt) + return NULL; + + return (void *)arena_alloc_pages(arena, (long)addr__ign, page_cnt, node_id); +} + +__bpf_kfunc void bpf_arena_free_pages(void *p__map, void *ptr__ign, u32 page_cnt) +{ + struct bpf_map *map = p__map; + struct bpf_arena *arena = container_of(map, struct bpf_arena, map); + + if (map->map_type != BPF_MAP_TYPE_ARENA || !page_cnt || !ptr__ign) + return; + arena_free_pages(arena, (long)ptr__ign, page_cnt); +} +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(arena_kfuncs) +BTF_ID_FLAGS(func, bpf_arena_alloc_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE) +BTF_ID_FLAGS(func, bpf_arena_free_pages, KF_TRUSTED_ARGS | KF_SLEEPABLE) +BTF_KFUNCS_END(arena_kfuncs) + +static const struct btf_kfunc_id_set common_kfunc_set = { + .owner = THIS_MODULE, + .set = &arena_kfuncs, +}; + +static int __init kfunc_init(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &common_kfunc_set); +} +late_initcall(kfunc_init); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 71c459a51d9e..2539d9bfe369 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2970,6 +2970,17 @@ void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, { } +/* for configs without MMU or 32-bit */ +__weak const struct bpf_map_ops arena_map_ops; +__weak u64 bpf_arena_get_user_vm_start(struct bpf_arena *arena) +{ + return 0; +} +__weak u64 bpf_arena_get_kern_vm_start(struct bpf_arena *arena) +{ + return 0; +} + #ifdef CONFIG_BPF_SYSCALL static int __init bpf_global_ma_init(void) { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 8dd9814a0e14..6b9efb3f79dd 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -164,6 +164,7 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, if (bpf_map_is_offloaded(map)) { return bpf_map_offload_update_elem(map, key, value, flags); } else if 
(map->map_type == BPF_MAP_TYPE_CPUMAP || + map->map_type == BPF_MAP_TYPE_ARENA || map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { return map->ops->map_update_elem(map, key, value, flags); } else if (map->map_type == BPF_MAP_TYPE_SOCKHASH || @@ -1172,6 +1173,7 @@ static int map_create(union bpf_attr *attr) } if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER && + attr->map_type != BPF_MAP_TYPE_ARENA && attr->map_extra != 0) return -EINVAL; @@ -1261,6 +1263,7 @@ static int map_create(union bpf_attr *attr) case BPF_MAP_TYPE_LRU_PERCPU_HASH: case BPF_MAP_TYPE_STRUCT_OPS: case BPF_MAP_TYPE_CPUMAP: + case BPF_MAP_TYPE_ARENA: if (!bpf_token_capable(token, CAP_BPF)) goto put_token; break; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index db569ce89fb1..3c77a3ab1192 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -18047,6 +18047,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, case BPF_MAP_TYPE_SK_STORAGE: case BPF_MAP_TYPE_TASK_STORAGE: case BPF_MAP_TYPE_CGRP_STORAGE: + case BPF_MAP_TYPE_ARENA: break; default: verbose(env, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d96708380e52..f6648851eae6 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -983,6 +983,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + BPF_MAP_TYPE_ARENA, __MAX_BPF_MAP_TYPE }; @@ -1370,6 +1371,12 @@ enum { /* BPF token FD is passed in a corresponding command's token_fd field */ BPF_F_TOKEN_FD = (1U << 16), + +/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */ + BPF_F_SEGV_ON_FAULT = (1U << 17), + +/* Do not translate kernel bpf_arena pointers to user pointers */ + BPF_F_NO_USER_CONV = (1U << 18), }; /* Flags for BPF_PROG_QUERY. 
*/ From patchwork Fri Feb 9 04:05:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550826 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 021875681 for ; Fri, 9 Feb 2024 04:06:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451601; cv=none; b=RHL44D9C/GFf3JvM1z21lmwRr1WUC2BJnCD1js23UFAmA9cRPrBD8jywPYQ1VnA08Ty/GlbkXybcjPLJVGk4OJbO5HPCg6hDx3O9rMibjkrolbjTJkHproejnT/Ev5V8a0Ztpo09J1APfpVOUL1ilwNnYU+bi1i66dd14FJJpBs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451601; c=relaxed/simple; bh=q/Oh4UGLAUx5nS2repzDRhenYLeGL9xgTdHnvBGVAfw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=SL6EuFF0pkfkAI4mCcSIka1b+rqjuyyWdjFKByB6nHOS+U2shqL+ozYwHpoY2inKtNuf85TFiHKhwJUfJ1nrxLIAJxRfSt9nrG1nrt/A1PvK7B87ip7BqKTLshZOg7ErPA3XA28zMKAYAjM0dBR6d+BwsZlMaZDmIYPU5w6Kdbs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=L5Z+MWUF; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="L5Z+MWUF" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-1d93f2c3701so3165485ad.3 for ; Thu, 08 Feb 2024 20:06:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451598; x=1708056398; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Dj+O15ERlc2oNccgk7u96NynUdtw7IljkhhXG+cbct8=; b=L5Z+MWUF+R9W0lgEVdctfML1z7k6SMZwGr4nOyYTpgzSBuRyKECiTUQQ2iZV71147v ZYNi2m3RvmGFFER5ZB2Rtio8AeJQIhQzOvsjDjSUhejojoSabHjkIN8qWdRzlA2uT8m9 tJ19V/Y5jF3VCXHzZLvuODgvj0EgJeT/6JHsyEhyEhqAbEjKXfLWms1ELC/uySULhrkY TzQ60E++fcaT1e3ECOgnDTdf+J8OGNFwsFKLTMlRiPji0bYCeY9Hbl7axUqWcHqMyLQN +xWquk8dtlNjx7SCD5WRG26h8lIV1tt8vPjHR6i4p5ihuuKeUM/agyXZhI/qSrTTXg5L Z60g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451598; x=1708056398; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Dj+O15ERlc2oNccgk7u96NynUdtw7IljkhhXG+cbct8=; b=FMy+1d2iZV1D0Nbe42fPciowrWtm2hpxCc826W4DSvmEsksgfWE0tRQToh+yRlH0Ef PBXY7fQv+AdWiZP6feXdRoB+Y8b7I4gBIUotLu17QXlVWCHbtMV0gGsvdIYqnVH/2woy 7HfpFFfB+p/MFPC/hIiGKp4n59VIjJLOX3jKhtaM3jpQUK5vDL6UiQivU2fihcGApQwr 7ZjQwsHoe4uO6nhgl8DM6jmuroCVgsTSxgdtNxlO33cY4sK4PJfildx8R1FQSwE5TivI fNmLKJAUDzRXR5gOyYaQgQC+Up51//As0B9+p+lzgmhEiwd5sJZhuf1KBMMaU4SsKUrj zAww== X-Gm-Message-State: AOJu0YxC/jFbT7UnYtzLx/QZhO0wrwjsls4MXYbHUpubXhb/2m8LI4AK rFN9e63gLyBKkMdA/s2L81DfFBgAbbD0JK4/VTKLTdf5/cqYSikHHwrHZM+V X-Google-Smtp-Source: 
AGHT+IH2IsgScbF5ztEZpq6NjR9/2tA7q8l7Ix0S1q4VHpCiUufQi2AhA2u+G2VsbGb/+uLf6rgQHA== X-Received: by 2002:a17:902:b68c:b0:1d8:fae3:2216 with SMTP id c12-20020a170902b68c00b001d8fae32216mr419187pls.35.1707451598204; Thu, 08 Feb 2024 20:06:38 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCVCvhcfuaPez5OC/0K6lsZftyKKLtOFjodjSkRdPq0l1nidkiVoJHU0Trfc2t9CHoxVoGuR75BnUBtDC7gLBQ5y1kSWyuyg0x40itdsIGSi0vL6t/hDpiz3VGWvzx1E7jyVPl9K9LQX+7fbCFhxUC3CdIANDI84yBGyKa04JykPcZxl05c6A0hQNL7dkYjfPxslz4jAHOWN1SA3/EFocm4/Y4mzyEFTltE88QxdJlxI/+lGIt1LRaqh1R6CISaCy56IDDTUokGPy6rscJPCxBOK6mSWq+iGAF9teb5RqL2aAEQNZwCC6Nscq4hnc7lq7PODCpjmijWqNsxvLhCBkxJ8QMIxLowrFG/g1w/GsVw1Akxvfy/DcA== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id lh16-20020a170903291000b001d9537cf238sm538602plb.295.2024.02.08.20.06.36 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:37 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 06/20] bpf: Disasm support for cast_kern/user instructions. Date: Thu, 8 Feb 2024 20:05:54 -0800 Message-Id: <20240209040608.98927-7-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov LLVM generates rX = bpf_cast_kern/_user(rY, address_space) instructions when pointers in non-zero address space are used by the bpf program. Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi --- include/uapi/linux/bpf.h | 5 +++++ kernel/bpf/disasm.c | 11 +++++++++++ tools/include/uapi/linux/bpf.h | 5 +++++ 3 files changed, 21 insertions(+) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index f6648851eae6..3de1581379d4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1313,6 +1313,11 @@ enum { */ #define BPF_PSEUDO_KFUNC_CALL 2 +enum bpf_arena_cast_kinds { + BPF_ARENA_CAST_KERN = 1, + BPF_ARENA_CAST_USER = 2, +}; + /* flags for BPF_MAP_UPDATE_ELEM command */ enum { BPF_ANY = 0, /* create new element or update existing */ diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c index 49940c26a227..37d9b37b34f7 100644 --- a/kernel/bpf/disasm.c +++ b/kernel/bpf/disasm.c @@ -166,6 +166,12 @@ static bool is_movsx(const struct bpf_insn *insn) (insn->off == 8 || insn->off == 16 || insn->off == 32); } +static bool is_arena_cast(const struct bpf_insn *insn) +{ + return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && + (insn->off == BPF_ARENA_CAST_KERN || insn->off == BPF_ARENA_CAST_USER); +} + void print_bpf_insn(const struct bpf_insn_cbs *cbs, const struct bpf_insn *insn, bool allow_ptr_leaks) @@ -184,6 +190,11 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs, insn->code, class == BPF_ALU ? 'w' : 'r', insn->dst_reg, class == BPF_ALU ? 'w' : 'r', insn->dst_reg); + } else if (is_arena_cast(insn)) { + verbose(cbs->private_data, "(%02x) r%d = cast_%s(r%d, %d)\n", + insn->code, insn->dst_reg, + insn->off == BPF_ARENA_CAST_KERN ? 
"kern" : "user", + insn->src_reg, insn->imm); } else if (BPF_SRC(insn->code) == BPF_X) { verbose(cbs->private_data, "(%02x) %c%d %s %s%c%d\n", insn->code, class == BPF_ALU ? 'w' : 'r', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index f6648851eae6..3de1581379d4 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1313,6 +1313,11 @@ enum { */ #define BPF_PSEUDO_KFUNC_CALL 2 +enum bpf_arena_cast_kinds { + BPF_ARENA_CAST_KERN = 1, + BPF_ARENA_CAST_USER = 2, +}; + /* flags for BPF_MAP_UPDATE_ELEM command */ enum { BPF_ANY = 0, /* create new element or update existing */ From patchwork Fri Feb 9 04:05:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550827 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f181.google.com (mail-pl1-f181.google.com [209.85.214.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 389025673 for ; Fri, 9 Feb 2024 04:06:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451606; cv=none; b=i8HJmOID2gT+hvnnnhX+Zc8Yxc+Z8y3A4jobJmuDABQq0p6HUMj1jrws6i2r0OyDgxFs489sT7mEmf9Ame9Ca42IiJGsahpLLK6IZe3uwoUjL+GIiw8urdPv6SxukkNvbEonuR6UhFetA9tksmc/PPwIRo/PtUeYanapJndF4jA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451606; c=relaxed/simple; bh=dB2//4PPm1Yp4eBFeVV1EAhzCGAbJX2UDbaryxiReM0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=F7JHX4np7gc/rNuxvFN6se81A1gwRIj3IMFcR0fQSjenvT09Kon/4oX0VzkN4I8xGj3gt6/QZ6lLUyJPCqx9RuqcvYOnHWyq9Zc8icLyQ+9sd0Y4Hmd9jhQ/VjT2EgVVuDlSWY2Bq7e148UoRS62jEWJnPgL0wIxJhowdLRdmNM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=QxZg5sr3; arc=none smtp.client-ip=209.85.214.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="QxZg5sr3" Received: by mail-pl1-f181.google.com with SMTP id d9443c01a7336-1d73066880eso4405815ad.3 for ; Thu, 08 Feb 2024 20:06:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451602; x=1708056402; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=r96UcfCBLUTrZkh0XKcSEOthU3NUKwaxo/e39vpFxwI=; b=QxZg5sr3lt5W9jQXvD3Q7QMXVjKJV8n6Xzw2QqUOssDnu3BwPE2JXS/aG78v2XGXJB W0ngbZGCyWNm0v7AIh2+wVfJ7MjhITVG1V06v0GAWlY1x3Nqt04yNuaB+KszgWPFSnFN Q9pu9WpX4EXBWJrLVAUH2KjPzfZZt7Px1t0pIzsLertmcpEB82cvNDfb6P5Uvcq9ynqL jZwiHqGgrD3h3Qn8GBHFTHlPZyO15mNC7YZM7IGAD6FUOlCHy6H5A2jS+cPz40riPqtt yIyFvFLra/llfp1NP7S6NpRawFwCIiEA6PwwirbiNRzFr9DZBc/yhUsIbnwgY663D2QC Gy/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451602; x=1708056402; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=r96UcfCBLUTrZkh0XKcSEOthU3NUKwaxo/e39vpFxwI=; b=nMtmZAkzYiOK/0utAoLqW7dcGl0y6YVqeDCRaW5gRg2WcsIiAgCATkCYh1wBHtOUVD rpCsfL2TfSnzUIYRuFotFgJeRyn+dlGiVG7yybZfTIjSHKZu6WdlGKb/TX1AMBmoIKcG XOE5+ixHKDoYeU1nZgyg6R4GezX2Pp6Tv2JfNDpQpoHCEKzOiXbGM02sZbemUpAApccM Ww8GCu2rKobpZyH27pcVT8o5+n7wYKEn1qy4RiiRGU8OTtg4qKRL1xgyzfACEQ/SHe2h NfmlkmmGBMwSLcCD1jHu8Db+1SQaVuTi1Kmhe1PcsuaQLGxWcIZuoQG/IP/krPNBglVu bq1g== X-Gm-Message-State: AOJu0YwmnceAh52iZTTgUDPy2JlWNLczeG3k1TtBicZkAEtN+D3jDltO zyaRXBdxIzX3YKT2qxjEKBh0+0LskSzbuFNBFz2XK5dYI7e+1oHdNqQ0ZBc3 X-Google-Smtp-Source: AGHT+IF1XX2MhSM/LO2CyghiHgp2M0a1axk082AxrJPPTkzUSm6fuDm951Y0QMtQRmbzz1hhWppIoA== X-Received: by 2002:a17:903:2287:b0:1d9:a29e:ff1a with SMTP id b7-20020a170903228700b001d9a29eff1amr536162plh.34.1707451602554; Thu, 08 Feb 2024 20:06:42 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCW7TFVefsK6YiLeFqnRQ+ngnAM3Iz9zc36ph1O0+RtguHwMKRjvVOYFFLDzlhm7GpceHe4QWZ/ukSU1k2THoo8Yh5tiJCf+N5NzpDvOH6PVcXVvr1DLsYcVrwwpgMureD9I4F9yvnBekkrpH4nQ5rAWXf+mvkPKcdk/4nhOKtazibWbS9zFESguuCqUyauUmA3uh3i7SXb3/y+uDn7yKkDIU3SU9QMGf6xHQZubzvwfv775vfmvAMkhvi34uCws6gTklKGRLS/S9Z1Xf7sGpXvGFH97phRnv3qgVxLHYnVzl5SX8QWnlFoVqxqpId9mbWPB5A7XDgRAeMZeixCaLHNbe6940Tr3mPQh4OCOoY2p4MV1oncWiw== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id kk7-20020a170903070700b001d8f99dbe4asm548739plb.4.2024.02.08.20.06.40 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:42 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 07/20] bpf: Add x86-64 JIT support for PROBE_MEM32 pseudo instructions. Date: Thu, 8 Feb 2024 20:05:55 -0800 Message-Id: <20240209040608.98927-8-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW] instructions. They are similar to PROBE_MEM instructions with the following differences: - PROBE_MEM has to check that the address is in the kernel range with src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE check - PROBE_MEM doesn't support store - PROBE_MEM32 relies on the verifier to clear upper 32-bit in the register - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in %r12 in the prologue) Due to bpf_arena constructions such %r12 + %reg + off16 access is guaranteed to be within arena virtual range, so no address check at run-time. - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When LDX faults the destination register is zeroed. 
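In C-like terms, the run-time semantics of a PROBE_MEM32 access can be sketched as follows (illustrative only; this is not the emitted code, just a restatement of the rules above using the register/field names from this patch):

        /*
         * Sketch of what a BPF_LDX | BPF_PROBE_MEM32 load computes at run time.
         * src_reg already has its upper 32 bits cleared by the verifier and
         * %r12 holds kern_vm_start, loaded in the prologue.
         */
        u64 addr = kern_vm_start + src_reg + insn->off;
        dst_reg = *(u64 *)addr;  /* a fault zeroes dst_reg and skips the load;
                                  * a faulting PROBE_MEM32 store is skipped (nop) */
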
Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi --- arch/x86/net/bpf_jit_comp.c | 183 +++++++++++++++++++++++++++++++++++- include/linux/bpf.h | 1 + include/linux/filter.h | 3 + 3 files changed, 186 insertions(+), 1 deletion(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index e1390d1e331b..883b7f604b9a 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -113,6 +113,7 @@ static int bpf_size_to_x86_bytes(int bpf_size) /* Pick a register outside of BPF range for JIT internal work */ #define AUX_REG (MAX_BPF_JIT_REG + 1) #define X86_REG_R9 (MAX_BPF_JIT_REG + 2) +#define X86_REG_R12 (MAX_BPF_JIT_REG + 3) /* * The following table maps BPF registers to x86-64 registers. @@ -139,6 +140,7 @@ static const int reg2hex[] = { [BPF_REG_AX] = 2, /* R10 temp register */ [AUX_REG] = 3, /* R11 temp register */ [X86_REG_R9] = 1, /* R9 register, 6th function argument */ + [X86_REG_R12] = 4, /* R12 callee saved */ }; static const int reg2pt_regs[] = { @@ -167,6 +169,7 @@ static bool is_ereg(u32 reg) BIT(BPF_REG_8) | BIT(BPF_REG_9) | BIT(X86_REG_R9) | + BIT(X86_REG_R12) | BIT(BPF_REG_AX)); } @@ -205,6 +208,17 @@ static u8 add_2mod(u8 byte, u32 r1, u32 r2) return byte; } +static u8 add_3mod(u8 byte, u32 r1, u32 r2, u32 index) +{ + if (is_ereg(r1)) + byte |= 1; + if (is_ereg(index)) + byte |= 2; + if (is_ereg(r2)) + byte |= 4; + return byte; +} + /* Encode 'dst_reg' register into x86-64 opcode 'byte' */ static u8 add_1reg(u8 byte, u32 dst_reg) { @@ -887,6 +901,18 @@ static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off) *pprog = prog; } +static void emit_insn_suffix_SIB(u8 **pprog, u32 ptr_reg, u32 val_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + if (is_imm8(off)) { + EMIT3(add_2reg(0x44, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off); + } else { + EMIT2_off32(add_2reg(0x84, BPF_REG_0, val_reg), add_2reg(0, ptr_reg, index_reg) /* SIB */, off); + } + *pprog = prog; +} + /* * Emit a REX byte if it will be necessary to address these registers */ @@ -968,6 +994,37 @@ static void emit_ldsx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) *pprog = prog; } +static void emit_ldx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* movzx rax, byte ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB6); + break; + case BPF_H: + /* movzx rax, word ptr [rax + r12 + off] */ + EMIT3(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x0F, 0xB7); + break; + case BPF_W: + /* mov eax, dword ptr [rax + r12 + off] */ + EMIT2(add_3mod(0x40, src_reg, dst_reg, index_reg), 0x8B); + break; + case BPF_DW: + /* mov rax, qword ptr [rax + r12 + off] */ + EMIT2(add_3mod(0x48, src_reg, dst_reg, index_reg), 0x8B); + break; + } + emit_insn_suffix_SIB(&prog, src_reg, dst_reg, index_reg, off); + *pprog = prog; +} + +static void emit_ldx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) +{ + emit_ldx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off); +} + /* STX: *(u8*)(dst_reg + off) = src_reg */ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) { @@ -1002,6 +1059,71 @@ static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) *pprog = prog; } +/* STX: *(u8*)(dst_reg + index_reg + off) = src_reg */ +static void emit_stx_index(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, u32 index_reg, int off) +{ + u8 *prog = *pprog; + + switch 
(size) { + case BPF_B: + /* mov byte ptr [rax + r12 + off], al */ + EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x88); + break; + case BPF_H: + /* mov word ptr [rax + r12 + off], ax */ + EMIT3(0x66, add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89); + break; + case BPF_W: + /* mov dword ptr [rax + r12 + 1], eax */ + EMIT2(add_3mod(0x40, dst_reg, src_reg, index_reg), 0x89); + break; + case BPF_DW: + /* mov qword ptr [rax + r12 + 1], rax */ + EMIT2(add_3mod(0x48, dst_reg, src_reg, index_reg), 0x89); + break; + } + emit_insn_suffix_SIB(&prog, dst_reg, src_reg, index_reg, off); + *pprog = prog; +} + +static void emit_stx_r12(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off) +{ + emit_stx_index(pprog, size, dst_reg, src_reg, X86_REG_R12, off); +} + +/* ST: *(u8*)(dst_reg + index_reg + off) = imm32 */ +static void emit_st_index(u8 **pprog, u32 size, u32 dst_reg, u32 index_reg, int off, int imm) +{ + u8 *prog = *pprog; + + switch (size) { + case BPF_B: + /* mov byte ptr [rax + r12 + off], imm8 */ + EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC6); + break; + case BPF_H: + /* mov word ptr [rax + r12 + off], imm16 */ + EMIT3(0x66, add_3mod(0x40, dst_reg, 0, index_reg), 0xC7); + break; + case BPF_W: + /* mov dword ptr [rax + r12 + 1], imm32 */ + EMIT2(add_3mod(0x40, dst_reg, 0, index_reg), 0xC7); + break; + case BPF_DW: + /* mov qword ptr [rax + r12 + 1], imm32 */ + EMIT2(add_3mod(0x48, dst_reg, 0, index_reg), 0xC7); + break; + } + emit_insn_suffix_SIB(&prog, dst_reg, 0, index_reg, off); + EMIT(imm, bpf_size_to_x86_bytes(size)); + *pprog = prog; +} + +static void emit_st_r12(u8 **pprog, u32 size, u32 dst_reg, int off, int imm) +{ + emit_st_index(pprog, size, dst_reg, X86_REG_R12, off, imm); +} + static int emit_atomic(u8 **pprog, u8 atomic_op, u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size) { @@ -1043,12 +1165,15 @@ static int emit_atomic(u8 **pprog, u8 atomic_op, return 0; } +#define DONT_CLEAR 1 + bool ex_handler_bpf(const struct exception_table_entry *x, struct pt_regs *regs) { u32 reg = x->fixup >> 8; /* jump over faulting load and clear dest register */ - *(unsigned long *)((void *)regs + reg) = 0; + if (reg != DONT_CLEAR) + *(unsigned long *)((void *)regs + reg) = 0; regs->ip += x->fixup & 0xff; return true; } @@ -1147,11 +1272,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image bool tail_call_seen = false; bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; + u64 arena_vm_start; int i, excnt = 0; int ilen, proglen = 0; u8 *prog = temp; int err; + arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); + detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1172,8 +1300,13 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image push_r12(&prog); push_callee_regs(&prog, all_callee_regs_used); } else { + if (arena_vm_start) + push_r12(&prog); push_callee_regs(&prog, callee_regs_used); } + if (arena_vm_start) + emit_mov_imm64(&prog, X86_REG_R12, + arena_vm_start >> 32, (u32) arena_vm_start); ilen = prog - temp; if (rw_image) @@ -1564,6 +1697,52 @@ st: if (is_imm8(insn->off)) emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); break; + case BPF_ST | BPF_PROBE_MEM32 | BPF_B: + case BPF_ST | BPF_PROBE_MEM32 | BPF_H: + case BPF_ST | BPF_PROBE_MEM32 | BPF_W: + case BPF_ST | BPF_PROBE_MEM32 | BPF_DW: + start_of_ldx = prog; + emit_st_r12(&prog, BPF_SIZE(insn->code), dst_reg, insn->off, insn->imm); + goto populate_extable; + + /* LDX: dst_reg = *(u8*)(src_reg + 
r12 + off) */ + case BPF_LDX | BPF_PROBE_MEM32 | BPF_B: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_H: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_W: + case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW: + case BPF_STX | BPF_PROBE_MEM32 | BPF_B: + case BPF_STX | BPF_PROBE_MEM32 | BPF_H: + case BPF_STX | BPF_PROBE_MEM32 | BPF_W: + case BPF_STX | BPF_PROBE_MEM32 | BPF_DW: + start_of_ldx = prog; + if (BPF_CLASS(insn->code) == BPF_LDX) + emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); + else + emit_stx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); +populate_extable: + { + struct exception_table_entry *ex; + u8 *_insn = image + proglen + (start_of_ldx - temp); + s64 delta; + + if (!bpf_prog->aux->extable) + break; + + ex = &bpf_prog->aux->extable[excnt++]; + + delta = _insn - (u8 *)&ex->insn; + /* switch ex to rw buffer for writes */ + ex = (void *)rw_image + ((void *)ex - (void *)image); + + ex->insn = delta; + + ex->data = EX_TYPE_BPF; + + ex->fixup = (prog - start_of_ldx) | + ((BPF_CLASS(insn->code) == BPF_LDX ? reg2pt_regs[dst_reg] : DONT_CLEAR) << 8); + } + break; + /* LDX: dst_reg = *(u8*)(src_reg + off) */ case BPF_LDX | BPF_MEM | BPF_B: case BPF_LDX | BPF_PROBE_MEM | BPF_B: @@ -2036,6 +2215,8 @@ st: if (is_imm8(insn->off)) pop_r12(&prog); } else { pop_callee_regs(&prog, callee_regs_used); + if (arena_vm_start) + pop_r12(&prog); } EMIT1(0xC9); /* leave */ emit_return(&prog, image + addrs[i - 1] + (prog - temp)); diff --git a/include/linux/bpf.h b/include/linux/bpf.h index de557c6c42e0..26419a57bf9f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1463,6 +1463,7 @@ struct bpf_prog_aux { bool xdp_has_frags; bool exception_cb; bool exception_boundary; + struct bpf_arena *arena; /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */ const struct btf_type *attach_func_proto; /* function name for valid attach_btf_id */ diff --git a/include/linux/filter.h b/include/linux/filter.h index fee070b9826e..cd76d43412d0 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -72,6 +72,9 @@ struct ctl_table_header; /* unused opcode to mark special ldsx instruction. Same as BPF_IND */ #define BPF_PROBE_MEMSX 0x40 +/* unused opcode to mark special load instruction. 
Same as BPF_MSH */ +#define BPF_PROBE_MEM32 0xa0 + /* unused opcode to mark call to interpreter with arguments */ #define BPF_CALL_ARGS 0xe0 From patchwork Fri Feb 9 04:05:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550828 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pg1-f171.google.com (mail-pg1-f171.google.com [209.85.215.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4122B5695 for ; Fri, 9 Feb 2024 04:06:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451609; cv=none; b=QwkyyoEOnofLzx0dcfQRShx0IPsofW08BRWFo2KEUY5wtI/BYhZC5i7rkOqQpGdFMUbFXz4QvgoCqFmxhdiBM4H7IZCKCue20qWo7evqLsX4eGYOQfJZ37n+oKVOY/Flz1mz+QiAHanHXpiy5Xh1cBKkOHh3VepCx80dwAC5oT0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451609; c=relaxed/simple; bh=J1UZVDn6uI5UtDGhH2/mRbk7fjIbvzhRqLYTqIkwRcE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=s9i27zVRk0HK95vQagoD36bvYuUFGrprlbTFHZlr5xJD7cQIHR/IjFL20ohXKu0pRNGOldyxqlK6Ua1ALnoOXcxhWMrb7WKvhiBZ4wjEe3Lr7xQlYZiui6ovcSpxLfm4AB0f0arvJwQxxio3gsMI3AX7SqzK72gL0P3CKeQzv3Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=clp5xwBJ; arc=none smtp.client-ip=209.85.215.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="clp5xwBJ" Received: by mail-pg1-f171.google.com with SMTP id 41be03b00d2f7-5dc20645871so378409a12.1 for ; Thu, 08 Feb 2024 20:06:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451607; x=1708056407; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GxmcZQXB6hILUuERStuFRZ0MPT4uFMATBLnu6rXBqLE=; b=clp5xwBJAUtOYMWoLVu0ggjk3ruhFZflB1j4SWHMUGJ7dlywXV3L9nlzROrsbN+bGX HWvvf8jx+0x2HXij6LZRhhCb+K4tHPrP+dHzagoqCIOl3RAgS48D/exucDlxKaOz6pP1 ziMSOrzVAadOy5gU/bFJoiHJqukRSEHMpLJY1J4vlox8waI3wEOOW+RgUkgMpPHwSt4Q hXBLgVtivG3SgmSSm4P0mXvPLQoHVQ42abVziICcNnM1UuOpxAMIQmWMFEe67Kw07X4I 99y/mcVRAc6qqLo/DU1WaIktPHrO2WzElbeq65O8QornM9X1qDMhpiJBgnM+WQAvd11j Juug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451607; x=1708056407; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GxmcZQXB6hILUuERStuFRZ0MPT4uFMATBLnu6rXBqLE=; b=wFOLCsvV+K3PysZ5VcaH2Q7bX6DaOIZYDUNpBWoRPOoBbFKSq4f5pT3iAO4fgxsg+t hFQWzzRhUIBFfZcfaLwtvHFZJ5UMCpEOUlTv/6vQZuzUVP0Cl8htsQUfDw/DLZIAHPLM +IMjNOcu7cPxcLxeb0vKodiBaS4HldZFgdUE1sPU59zzm52hKUG2JGANeF1SZ4ZqI8Rt F8rrwOF67oMaq8gCxwklIepPaB6SHdb09GWtWvC5VTYElZpubhDWbeWL1KoUXb0b+Arr TGxIe1mk0FRaWp/4VlHKjCY6/tC2cRMTMjm24ijFolCePXVMxBTb82RDH0SorMosoFo1 8j2Q== X-Gm-Message-State: 
AOJu0Yyc5viY84m3f60/3BZhfEK5r/Cq0SjISUXaCnSvMWTMv3fjk8Gx BiuWWvSoih6wbdQV7XEfodtpir7+1wgDcDEkUuS6OaeFmouYFHqJqS7vka6k X-Google-Smtp-Source: AGHT+IHuqaEsk6hY3MyoaXGt78Tnlu+oap9lHuWJS3aqvunCJMIGEmoL01w5jyIJ2MTY+5Q0V0YZ4g== X-Received: by 2002:a17:90a:c291:b0:296:fff3:cf33 with SMTP id f17-20020a17090ac29100b00296fff3cf33mr1016964pjt.8.1707451606688; Thu, 08 Feb 2024 20:06:46 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCWYw8ifD9te6TKOQruqSWHsfeyVWflUrl1SO2obE7M4KjqFDUY4tyAd7G+fW4weZ61H3tBeM7iE9QrtVUgUuqPkA3BqbcV/uqVtAYEMOdzBtNJ7moG+4HEo4Yrf8pUTvj0KhqQp2Ld60XtdpyNfIMydcJU0tnH+GgOO/Ir40yeps53JAqQY/ou32WGYozZ2TQUdJn/A04Fs+czxvy4plVwfCv1hwqMwa2KJUO7JBAlDbj2wEEWXbOZcNbE/o8UjskjgDrBXgRZh0EXhh2tx/MKhK2g1C+sdFMWpUb6Rk0/njLS/4TTplFl0/c9Po2lLk7OZzJFytqqZ5A6DCtSe9uqoRXyqYMZDfBGZ7eXuH9eKfEt9KVfxjg== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id gz21-20020a17090b0ed500b00296b90d93absm631254pjb.29.2024.02.08.20.06.44 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:46 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 08/20] bpf: Add x86-64 JIT support for bpf_cast_user instruction. Date: Thu, 8 Feb 2024 20:05:56 -0800 Message-Id: <20240209040608.98927-9-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov LLVM generates bpf_cast_kern and bpf_cast_user instructions while translating pointers with __attribute__((address_space(1))). rX = cast_kern(rY) is processed by the verifier and converted to normal 32-bit move: wX = wY bpf_cast_user has to be converted by JIT. 
rX = cast_user(rY) is aux_reg = upper_32_bits of arena->user_vm_start aux_reg <<= 32 wX = wY // clear upper 32 bits of dst register if (wX) // if not zero add upper bits of user_vm_start wX |= aux_reg JIT can do it more efficiently: mov dst_reg32, src_reg32 // 32-bit move shl dst_reg, 32 or dst_reg, user_vm_start rol dst_reg, 32 xor r11, r11 test dst_reg32, dst_reg32 // check if lower 32-bit are zero cmove r11, dst_reg // if so, set dst_reg to zero // Intel swapped src/dst register encoding in CMOVcc Signed-off-by: Alexei Starovoitov Acked-by: Eduard Zingerman --- arch/x86/net/bpf_jit_comp.c | 41 ++++++++++++++++++++++++++++++++++++- include/linux/filter.h | 1 + kernel/bpf/core.c | 5 +++++ 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 883b7f604b9a..a042ed57af7b 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1272,13 +1272,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image bool tail_call_seen = false; bool seen_exit = false; u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; - u64 arena_vm_start; + u64 arena_vm_start, user_vm_start; int i, excnt = 0; int ilen, proglen = 0; u8 *prog = temp; int err; arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); + user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena); detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1346,6 +1347,39 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image break; case BPF_ALU64 | BPF_MOV | BPF_X: + if (insn->off == BPF_ARENA_CAST_USER) { + if (dst_reg != src_reg) + /* 32-bit mov */ + emit_mov_reg(&prog, false, dst_reg, src_reg); + /* shl dst_reg, 32 */ + maybe_emit_1mod(&prog, dst_reg, true); + EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32); + + /* or dst_reg, user_vm_start */ + maybe_emit_1mod(&prog, dst_reg, true); + if (is_axreg(dst_reg)) + EMIT1_off32(0x0D, user_vm_start >> 32); + else + EMIT2_off32(0x81, add_1reg(0xC8, dst_reg), user_vm_start >> 32); + + /* rol dst_reg, 32 */ + maybe_emit_1mod(&prog, dst_reg, true); + EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32); + + /* xor r11, r11 */ + EMIT3(0x4D, 0x31, 0xDB); + + /* test dst_reg32, dst_reg32; check if lower 32-bit are zero */ + maybe_emit_mod(&prog, dst_reg, dst_reg, false); + EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg)); + + /* cmove r11, dst_reg; if so, set dst_reg to zero */ + /* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! 
*/ + maybe_emit_mod(&prog, AUX_REG, dst_reg, true); + EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg)); + break; + } + fallthrough; case BPF_ALU | BPF_MOV | BPF_X: if (insn->off == 0) emit_mov_reg(&prog, @@ -3424,6 +3458,11 @@ void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke, } } +bool bpf_jit_supports_arena(void) +{ + return true; +} + bool bpf_jit_supports_ptr_xchg(void) { return true; diff --git a/include/linux/filter.h b/include/linux/filter.h index cd76d43412d0..78ea63002531 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -959,6 +959,7 @@ bool bpf_jit_supports_kfunc_call(void); bool bpf_jit_supports_far_kfunc_call(void); bool bpf_jit_supports_exceptions(void); bool bpf_jit_supports_ptr_xchg(void); +bool bpf_jit_supports_arena(void); void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie); bool bpf_helper_changes_pkt_data(void *func); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 2539d9bfe369..2829077f0461 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2926,6 +2926,11 @@ bool __weak bpf_jit_supports_far_kfunc_call(void) return false; } +bool __weak bpf_jit_supports_arena(void) +{ + return false; +} + /* Return TRUE if the JIT backend satisfies the following two conditions: * 1) JIT backend supports atomic_xchg() on pointer-sized words. * 2) Under the specific arch, the implementation of xchg() is the same From patchwork Fri Feb 9 04:05:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550829 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 021CF5695 for ; Fri, 9 Feb 2024 04:06:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451613; cv=none; b=ubu/xocvtwndx9bAthW1802rhAIN/CiZcfmBomobzIW5kkTBi+Nre+mF7IinxOuXPbceCAQnMdftjhUIAPhNpQryuhzH3OZ+306oGGd66MWGhyDjZcsJfDzezfDH/uTQnFEs8+xKFt/LKo+8ISdja4STvrmCeG3+nxaESIcxXuM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451613; c=relaxed/simple; bh=RGmnC2qapNe7OSIJKKLlaKFWmQfQSfIt8eXZiTHNiaA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=MABchtpwe/y3aTlhMwlrR93WlLKAct1Me/03M1P/Le07k506VIBPXBvTPEB9RNFQ0m29cxUEcw+GaEyRTpn127I9brSF11VBqfz1kYfFJvTNE9fVgbvF2mdkTsxfSJJ6BUrCZZiDJJMNWq6acVaDCXh8U4YJ+Sh7J40iKAgqD1g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=dNn8BtD+; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dNn8BtD+" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-29041136f73so442993a91.0 for ; Thu, 08 Feb 2024 20:06:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451611; 
x=1708056411; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6orzNV0AjOboinNstixj6Czt+5FkF5rLiP2S91rGkng=; b=dNn8BtD+gM75vu0VYpdMrQM9sGKqVlyrmL4ZSVorhL5unJwMz7HZul2dJpapz3zkgG QobaLO+qZlt5b7FqTXE7iiJZSBAu7DXJflLkecA1IT1PcrP0nnCjdAbqDzMBxwM6XuC5 gRb4/MEKEa58QYphSe6U8bvZr+fJ7ejMilpCpV40faTb+GG3LusqPhrcOh0Jh00ZZ+fH kf5BGubg2QCBzVBE+ddQLxv2BHlNGZtn6esNRkdhg06al9N/LWDoL00K9cPTe0HjyqI3 53eq2DF9VRgKr/Zol3aGMom7dqwlauIcscg0z56XUqDsX11i0OGXDup9/Vfb7U9e9urF uSZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451611; x=1708056411; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6orzNV0AjOboinNstixj6Czt+5FkF5rLiP2S91rGkng=; b=OMS11ScJRQ3aBcueI7ptKIfnq3ROH2BgqQO+clKEkEqHSd+WQaHvenHY7HmD6btCgF FwFvPe0rv7hHix/5QMGWgV5URZjLNS6NRQgkC0rwBZkK6XqMe/e3/BjZU9hi7sCCSuj4 hRWwtL95H1X09UEOCdqFrjdEOdBOj3WNI974Romr2nqiBPrVoOwq0lyaNoKHCwiqksOa J7/j4RF3i1wVQvjNm331mARNc2QfUa189O4rg4saDFW/u2Y49IBmufyXEtGMK6tc+T0s wZGduSg7duZYmLVuxCGZ5VpEKPy3hSvntzKWoC98qMI4d43RwR6vDC+SRQB+tKa12FTu TlWw== X-Gm-Message-State: AOJu0YwK8PaZuErnPo37OnZMlGMuKVUNI16R94IXq7h+4m2PMmwTJmVT sYCUYHi9CVE0oyf6sf2j4L6O42bkWYXR8dLwuvKo+kxu8860uU9tg90POawy X-Google-Smtp-Source: AGHT+IELh8rBH0G9Lv9cd6+SlCU2HHOEdFHvhzYPN8nxJUwbCo29hXvQdLAj23YQAxaPQ6BGyTN6CQ== X-Received: by 2002:a17:90a:ec0a:b0:297:604:1ff7 with SMTP id l10-20020a17090aec0a00b0029706041ff7mr369707pjy.17.1707451610983; Thu, 08 Feb 2024 20:06:50 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCVbmgFXdBFBjoBt3ptKvyA1cp6LgeIz61rrJcLFv5JUSuH38/oThZV2vN01bpNPpzSAQQzUFFLy5t84VFxFzRvu6fnUge0QkzDc5TzAwavQYHF/tskPhpRwU9tHxSE9TlxkDOuMxPxVDmYlVq+ESWxWRJAkVDkalPzwjCAUzmE8TAZSi0v2qvrRKnnIQ4mIciNs6exu3NF5zyf5m09UcymKjg5roaQ1+4O9u3IaSB1OjQcxE34V8P4E4nnCwSMCt8W3ZXoaNOZ/HHM2MBZ9HBuzyVwifcAlMUuCLYuZbK7iGpu3QO/o8FG5FX0lHxI0Y69dTchHVCLiMculML1U95RXRRI+BFCuced/EbT6d+8/wKzNVMVjsQ== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id x15-20020a17090a8a8f00b00296e2434e7esm608017pjn.53.2024.02.08.20.06.49 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:50 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 09/20] bpf: Recognize cast_kern/user instructions in the verifier. Date: Thu, 8 Feb 2024 20:05:57 -0800 Message-Id: <20240209040608.98927-10-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov rX = bpf_cast_kern(rY, addr_space) tells the verifier that rX->type = PTR_TO_ARENA. Any further operations on PTR_TO_ARENA register have to be in 32-bit domain. The verifier will mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate them as kern_vm_start + 32bit_addr memory accesses. 
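For context, a minimal sketch of the kind of bpf program source that leads LLVM to emit these cast instructions (illustrative only: the __arena macro name, the arena map definition details and the kfunc prototype spelling below are assumptions of this sketch, not part of the patch):

        #include <linux/bpf.h>
        #include <bpf/bpf_helpers.h>

        /* Illustrative shorthand for the LLVM address space that triggers
         * cast_kern/cast_user generation; the macro name is an assumption. */
        #define __arena __attribute__((address_space(1)))

        /* kfunc added earlier in this series; prototype spelling is a sketch */
        void *bpf_arena_alloc_pages(void *map, void *addr, __u32 page_cnt,
                                    int node_id, __u64 flags) __ksym;

        struct {
                __uint(type, BPF_MAP_TYPE_ARENA);
                __uint(map_flags, BPF_F_MMAPABLE);
                __uint(max_entries, 1);  /* assumed: arena size in pages */
        } arena SEC(".maps");

        SEC("syscall")
        int touch_arena(void *ctx)
        {
                /* dereferencing an address_space(1) pointer makes the verifier
                 * see cast_kern and mark the access PROBE_MEM32 */
                int __arena *p = (int __arena *)bpf_arena_alloc_pages(&arena, NULL, 1, -1, 0);

                if (p)
                        *p = 1;
                return 0;
        }

        char _license[] SEC("license") = "GPL";

User space is then expected to mmap() the arena and see the same pages at their user_vm_start based addresses, which is what cast_user reconstructs on the bpf side, as described next.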
rX = bpf_cast_user(rY, addr_space) tells the verifier that rX->type = unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set then convert cast_user to mov32 as well. Otherwise JIT will convert it to: rX = (u32)rY; if (rX) rX |= arena->user_vm_start & ~(u64)~0U; Signed-off-by: Alexei Starovoitov --- include/linux/bpf.h | 1 + include/linux/bpf_verifier.h | 1 + kernel/bpf/log.c | 3 ++ kernel/bpf/verifier.c | 102 ++++++++++++++++++++++++++++++++--- 4 files changed, 100 insertions(+), 7 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 26419a57bf9f..70d5351427e6 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -889,6 +889,7 @@ enum bpf_reg_type { * an explicit null check is required for this struct. */ PTR_TO_MEM, /* reg points to valid memory region */ + PTR_TO_ARENA, PTR_TO_BUF, /* reg points to a read/write buffer */ PTR_TO_FUNC, /* reg points to a bpf program function */ CONST_PTR_TO_DYNPTR, /* reg points to a const struct bpf_dynptr */ diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 84365e6dd85d..43c95e3e2a3c 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -547,6 +547,7 @@ struct bpf_insn_aux_data { u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */ bool zext_dst; /* this insn zero extends dst reg */ + bool needs_zext; /* alu op needs to clear upper bits */ bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */ bool is_iter_next; /* bpf_iter__next() kfunc call */ bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */ diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 594a234f122b..677076c760ff 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -416,6 +416,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type) [PTR_TO_XDP_SOCK] = "xdp_sock", [PTR_TO_BTF_ID] = "ptr_", [PTR_TO_MEM] = "mem", + [PTR_TO_ARENA] = "arena", [PTR_TO_BUF] = "buf", [PTR_TO_FUNC] = "func", [PTR_TO_MAP_KEY] = "map_key", @@ -651,6 +652,8 @@ static void print_reg_state(struct bpf_verifier_env *env, } verbose(env, "%s", reg_type_str(env, t)); + if (t == PTR_TO_ARENA) + return; if (t == PTR_TO_STACK) { if (state->frameno != reg->frameno) verbose(env, "[%d]", reg->frameno); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3c77a3ab1192..5eeb9bf7e324 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4370,6 +4370,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_MEM: case PTR_TO_FUNC: case PTR_TO_MAP_KEY: + case PTR_TO_ARENA: return true; default: return false; @@ -5805,6 +5806,8 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, case PTR_TO_XDP_SOCK: pointer_desc = "xdp_sock "; break; + case PTR_TO_ARENA: + return 0; default: break; } @@ -6906,6 +6909,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn if (!err && value_regno >= 0 && (rdonly_mem || t == BPF_READ)) mark_reg_unknown(env, regs, value_regno); + } else if (reg->type == PTR_TO_ARENA) { + if (t == BPF_READ && value_regno >= 0) + mark_reg_unknown(env, regs, value_regno); } else { verbose(env, "R%d invalid mem access '%s'\n", regno, reg_type_str(env, reg->type)); @@ -8377,6 +8383,7 @@ static int check_func_arg_reg_off(struct bpf_verifier_env *env, case PTR_TO_MEM | MEM_RINGBUF: case PTR_TO_BUF: case PTR_TO_BUF | MEM_RDONLY: + case PTR_TO_ARENA: case SCALAR_VALUE: return 0; /* All the 
rest must be rejected, except PTR_TO_BTF_ID which allows @@ -13837,6 +13844,21 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, dst_reg = ®s[insn->dst_reg]; src_reg = NULL; + + if (dst_reg->type == PTR_TO_ARENA) { + struct bpf_insn_aux_data *aux = cur_aux(env); + + if (BPF_CLASS(insn->code) == BPF_ALU64) + /* + * 32-bit operations zero upper bits automatically. + * 64-bit operations need to be converted to 32. + */ + aux->needs_zext = true; + + /* Any arithmetic operations are allowed on arena pointers */ + return 0; + } + if (dst_reg->type != SCALAR_VALUE) ptr_reg = dst_reg; else @@ -13954,16 +13976,17 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) } else if (opcode == BPF_MOV) { if (BPF_SRC(insn->code) == BPF_X) { - if (insn->imm != 0) { - verbose(env, "BPF_MOV uses reserved fields\n"); - return -EINVAL; - } - if (BPF_CLASS(insn->code) == BPF_ALU) { - if (insn->off != 0 && insn->off != 8 && insn->off != 16) { + if ((insn->off != 0 && insn->off != 8 && insn->off != 16) || + insn->imm) { verbose(env, "BPF_MOV uses reserved fields\n"); return -EINVAL; } + } else if (insn->off == BPF_ARENA_CAST_KERN || insn->off == BPF_ARENA_CAST_USER) { + if (!insn->imm) { + verbose(env, "cast_kern/user insn must have non zero imm32\n"); + return -EINVAL; + } } else { if (insn->off != 0 && insn->off != 8 && insn->off != 16 && insn->off != 32) { @@ -13993,7 +14016,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) struct bpf_reg_state *dst_reg = regs + insn->dst_reg; if (BPF_CLASS(insn->code) == BPF_ALU64) { - if (insn->off == 0) { + if (insn->imm) { + /* off == BPF_ARENA_CAST_KERN || off == BPF_ARENA_CAST_USER */ + mark_reg_unknown(env, regs, insn->dst_reg); + if (insn->off == BPF_ARENA_CAST_KERN) + dst_reg->type = PTR_TO_ARENA; + } else if (insn->off == 0) { /* case: R1 = R2 * copy register state to dest reg */ @@ -14059,6 +14087,9 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn) dst_reg->subreg_def = env->insn_idx + 1; coerce_subreg_to_size_sx(dst_reg, insn->off >> 3); } + } else if (src_reg->type == PTR_TO_ARENA) { + mark_reg_unknown(env, regs, insn->dst_reg); + dst_reg->type = PTR_TO_ARENA; } else { mark_reg_unknown(env, regs, insn->dst_reg); @@ -15142,6 +15173,10 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) if (insn->src_reg == BPF_PSEUDO_MAP_VALUE || insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) { + if (map->map_type == BPF_MAP_TYPE_ARENA) { + __mark_reg_unknown(env, dst_reg); + return 0; + } dst_reg->type = PTR_TO_MAP_VALUE; dst_reg->off = aux->map_off; WARN_ON_ONCE(map->max_entries != 1); @@ -16519,6 +16554,8 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold, * the same stack frame, since fp-8 in foo != fp-8 in bar */ return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno; + case PTR_TO_ARENA: + return true; default: return regs_exact(rold, rcur, idmap); } @@ -18235,6 +18272,31 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env) fdput(f); return -EBUSY; } + if (map->map_type == BPF_MAP_TYPE_ARENA) { + if (env->prog->aux->arena) { + verbose(env, "Only one arena per program\n"); + fdput(f); + return -EBUSY; + } + if (!env->allow_ptr_leaks || !env->bpf_capable) { + verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n"); + fdput(f); + return -EPERM; + } + if (!env->prog->jit_requested) { + verbose(env, "JIT is required to use arena\n"); + return -EOPNOTSUPP; + } + if 
(!bpf_jit_supports_arena()) { + verbose(env, "JIT doesn't support arena\n"); + return -EOPNOTSUPP; + } + env->prog->aux->arena = (void *)map; + if (!bpf_arena_get_user_vm_start(env->prog->aux->arena)) { + verbose(env, "arena's user address must be set via map_extra or mmap()\n"); + return -EINVAL; + } + } fdput(f); next_insn: @@ -18799,6 +18861,18 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) insn->code == (BPF_ST | BPF_MEM | BPF_W) || insn->code == (BPF_ST | BPF_MEM | BPF_DW)) { type = BPF_WRITE; + } else if (insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && insn->imm) { + if (insn->off == BPF_ARENA_CAST_KERN || + (((struct bpf_map *)env->prog->aux->arena)->map_flags & BPF_F_NO_USER_CONV)) { + /* convert to 32-bit mov that clears upper 32-bit */ + insn->code = BPF_ALU | BPF_MOV | BPF_X; + /* clear off, so it's a normal 'wX = wY' from JIT pov */ + insn->off = 0; + } /* else insn->off == BPF_ARENA_CAST_USER should be handled by JIT */ + continue; + } else if (env->insn_aux_data[i + delta].needs_zext) { + /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ + insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); } else { continue; } @@ -18856,6 +18930,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) env->prog->aux->num_exentries++; } continue; + case PTR_TO_ARENA: + if (BPF_MODE(insn->code) == BPF_MEMSX) { + verbose(env, "sign extending loads from arena are not supported yet\n"); + return -EOPNOTSUPP; + } + insn->code = BPF_CLASS(insn->code) | BPF_PROBE_MEM32 | BPF_SIZE(insn->code); + env->prog->aux->num_exentries++; + continue; default: continue; } @@ -19041,13 +19123,19 @@ static int jit_subprogs(struct bpf_verifier_env *env) func[i]->aux->nr_linfo = prog->aux->nr_linfo; func[i]->aux->jited_linfo = prog->aux->jited_linfo; func[i]->aux->linfo_idx = env->subprog_info[i].linfo_idx; + func[i]->aux->arena = prog->aux->arena; num_exentries = 0; insn = func[i]->insnsi; for (j = 0; j < func[i]->len; j++, insn++) { if (BPF_CLASS(insn->code) == BPF_LDX && (BPF_MODE(insn->code) == BPF_PROBE_MEM || + BPF_MODE(insn->code) == BPF_PROBE_MEM32 || BPF_MODE(insn->code) == BPF_PROBE_MEMSX)) num_exentries++; + if ((BPF_CLASS(insn->code) == BPF_STX || + BPF_CLASS(insn->code) == BPF_ST) && + BPF_MODE(insn->code) == BPF_PROBE_MEM32) + num_exentries++; } func[i]->aux->num_exentries = num_exentries; func[i]->aux->tail_call_reachable = env->subprog_info[i].tail_call_reachable; From patchwork Fri Feb 9 04:05:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550830 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f170.google.com (mail-pl1-f170.google.com [209.85.214.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 194535667 for ; Fri, 9 Feb 2024 04:06:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451617; cv=none; b=kA3sI8oucVbLK0RRnLSCUKMrzfGEauRLjtejtbQcQrIEAgLxPAesa2o6eq92QhFPuiksQzld5zpfLr37LXHbiIyLA8IyCVj92Cg5tZqM3CY5W3mYC6xTxWfeugxIHqDT/oUcLqZe4H/RWHUM6elcq0sAROPSXIoaxBTRs0gg96Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451617; c=relaxed/simple; bh=Bg5aGjdIz8HulVZArC9AzXZ4d0bb4JFzjyEXMh1sB5w=; 
h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=sikYJ6f1NcLChCg81FkUMpqISBag+6eZe8uEfTZvxBsCQcS0sVayAJKsUfdGPXihQpnenmfDixPLlnfBop57vxcvLz9wisyBQFng0BGHDCQCBGR9rVgZ3w28lQh6nTTLu2B7yGBYEr4NRoSRL6FU277yrK9n8rZMfsMGS9293rk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Ire1UH1c; arc=none smtp.client-ip=209.85.214.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Ire1UH1c" Received: by mail-pl1-f170.google.com with SMTP id d9443c01a7336-1d7354ba334so4476085ad.1 for ; Thu, 08 Feb 2024 20:06:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451615; x=1708056415; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XZ9Snvkhjzxk5f60xE38IZQNIMkdSN5DBO4L+T7puE0=; b=Ire1UH1ciFgYSUnaAX3+hxFI1ATA37DVRT8gPeUm10r7cFwA6qcDKQvktBvSksinCP ryDM1vcsaWc2u5X66oLEGLVtVpuoeGEo6uca1LWWy1HnTSGvqxpLn0LTlxw+KZowbmH6 hO/4g6YMngxp4H/PaxE6qO00ArQQGr5vRWUnA8cFl837/5XImqAtXDCLC1IM+gkBxuyA xn2cHg0Paz/09+sRb69b6MAMMvxDjWWZXm/Bv1u3zcLBtErfkiR/RQQgXMHBqLd6GrKI iPM3vTd2rJlg8Xe1pQO68mmtqwObtugsp7HiXUXWWvEhaun7l+2Vu9UlRDNDdmpnUOWO 4g4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451615; x=1708056415; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XZ9Snvkhjzxk5f60xE38IZQNIMkdSN5DBO4L+T7puE0=; b=cGcx5Ib9Gr/33ADYCfYtu5cUgKrsbj6lzCRZsLVjM8VKl2ckb+nsTnf9bjLl2ob2IU 0F/O/TaQOYPfiL6snXHnTlwzJfDwvbCCVN8SKOugrO/bqXSp1bF6GzssM7MlLsp23Jul OjYJG/YueHKRUW3ZIVoBwOyVGvLtP1es+OALVDSzTNPnsMafDWmYlU+t1lJztDIZ24os EPlFj9pqSM4KUlP/q33PrvtGeipdboBJTwYZoh/zMY4axGwRS+eeyO76cwel/1Q87Hh8 161F/0yHOZwJgy/5Pa1xrGgJMLjhI7yeCOMcCCe0fhlsiYBGF36LXZKEwcayjLixjsdu fA3w== X-Gm-Message-State: AOJu0Yykly7J8kH38L1oonhnnsMRGzY28qgbMP1p/WEV20n+EXrJFFXb OHNa0V6IwCwP+Oux6UzJFQpg3zCBDjzNunHQHKHIon6JaB+8SMvvGpCPpNnT X-Google-Smtp-Source: AGHT+IGdKvGloD2x7T0TvljdUW1/xtPfwtVY4eu3Wz+SojJrERvkkNs762n7o/a8ViAyTbYVnq16Ig== X-Received: by 2002:a17:902:c403:b0:1d9:bbc2:87e7 with SMTP id k3-20020a170902c40300b001d9bbc287e7mr500885plk.36.1707451615176; Thu, 08 Feb 2024 20:06:55 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCU/x2cbpnz0BUBrwD3oynKGkbQWd49A9Q51EP6K/5fCnnF23ZUpv0qRDmHTlrR6yA0AtXPMi3FuK5K+KFjNWQZB5mJntsOxS8Pm1u0s8ziWoeQw2r37+Me/HIxl3XgxR4kF3y3DhvCArqcZd+TXsi4bpWVUbTSne3QN9pquK2NsflaDJazTaGUXoTDw5FqEjjU2zuYKegJHRBDw6WC7STY0O4c1TLAlrG4mfSz/8Qw8kYBzblb0HSXdBLKCCxn0P8d6aqm9/j/gIVQKJRkMiVlEC3V3y4n8PzaD9mBz4h5KNSVPY3TwerwIV4dK9wOn2J489jvaUMgPG6ksWBRa0Vjh8eFWxGtM4aKjqcZVj6h9pFRxYkRbAg== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id kw13-20020a170902f90d00b001d752c4f180sm560989plb.94.2024.02.08.20.06.53 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:54 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, 
eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 10/20] bpf: Recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA. Date: Thu, 8 Feb 2024 20:05:58 -0800 Message-Id: <20240209040608.98927-11-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov In global bpf functions recognize btf_decl_tag("arg:arena") as PTR_TO_ARENA. Note, when the verifier sees: __weak void foo(struct bar *p) it recognizes 'p' as PTR_TO_MEM and 'struct bar' has to be a struct with scalars. Hence the only way to use arena pointers in global functions is to tag them with "arg:arena". Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi --- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 19 +++++++++++++++---- kernel/bpf/verifier.c | 15 +++++++++++++++ 3 files changed, 31 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 70d5351427e6..46a92e41b9d5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -718,6 +718,7 @@ enum bpf_arg_type { * on eBPF program stack */ ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */ + ARG_PTR_TO_ARENA, ARG_CONST_SIZE, /* number of bytes accessed from memory */ ARG_CONST_SIZE_OR_ZERO, /* number of bytes accessed from memory or 0 */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 8e06d29961f1..857059c8d56c 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7053,10 +7053,11 @@ static int btf_get_ptr_to_btf_id(struct bpf_verifier_log *log, int arg_idx, } enum btf_arg_tag { - ARG_TAG_CTX = 0x1, - ARG_TAG_NONNULL = 0x2, - ARG_TAG_TRUSTED = 0x4, - ARG_TAG_NULLABLE = 0x8, + ARG_TAG_CTX = BIT_ULL(0), + ARG_TAG_NONNULL = BIT_ULL(1), + ARG_TAG_TRUSTED = BIT_ULL(2), + ARG_TAG_NULLABLE = BIT_ULL(3), + ARG_TAG_ARENA = BIT_ULL(4), }; /* Process BTF of a function to produce high-level expectation of function @@ -7168,6 +7169,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog) tags |= ARG_TAG_NONNULL; } else if (strcmp(tag, "nullable") == 0) { tags |= ARG_TAG_NULLABLE; + } else if (strcmp(tag, "arena") == 0) { + tags |= ARG_TAG_ARENA; } else { bpf_log(log, "arg#%d has unsupported set of tags\n", i); return -EOPNOTSUPP; @@ -7222,6 +7225,14 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog) sub->args[i].btf_id = kern_type_id; continue; } + if (tags & ARG_TAG_ARENA) { + if (tags & ~ARG_TAG_ARENA) { + bpf_log(log, "arg#%d arena cannot be combined with any other tags\n", i); + return -EINVAL; + } + sub->args[i].arg_type = ARG_PTR_TO_ARENA; + continue; + } if (is_global) { /* generic user data pointer */ u32 mem_size; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5eeb9bf7e324..fa49602194d5 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -9348,6 +9348,18 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog, bpf_log(log, "arg#%d is expected to be non-NULL\n", i); return -EINVAL; } + } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) { + /* + * Can pass any value and the kernel won't crash, but + * only 
PTR_TO_ARENA or SCALAR make sense. Everything + * else is a bug in the bpf program. Point it out to + * the user at the verification time instead of + * run-time debug nightmare. + */ + if (reg->type != PTR_TO_ARENA && reg->type != SCALAR_VALUE) { + bpf_log(log, "R%d is not a pointer to arena or scalar.\n", regno); + return -EINVAL; + } } else if (arg->arg_type == (ARG_PTR_TO_DYNPTR | MEM_RDONLY)) { ret = process_dynptr_func(env, regno, -1, arg->arg_type, 0); if (ret) @@ -20329,6 +20341,9 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog) reg->btf = bpf_get_btf_vmlinux(); /* can't fail at this point */ reg->btf_id = arg->btf_id; reg->id = ++env->id_gen; + } else if (base_type(arg->arg_type) == ARG_PTR_TO_ARENA) { + /* caller can pass either PTR_TO_ARENA or SCALAR */ + mark_reg_unknown(env, regs, i); } else { WARN_ONCE(1, "BUG: unhandled arg#%d type %d\n", i - BPF_REG_1, arg->arg_type); From patchwork Fri Feb 9 04:05:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550831 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f178.google.com (mail-pf1-f178.google.com [209.85.210.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6500F5667 for ; Fri, 9 Feb 2024 04:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451621; cv=none; b=j5TFlmjUpBmN4ZcMZ9hpJ7zxtVLh038v5yw40IaMlawO6gR1Ymb6MMrhItbpeMI+Uy8WCcqDvT0kMZI32359xpxta+zWnzh0VomF3y5T8CfUweLntal3GaiRLhzJDLrOXkKcyLHxTgDtMDz5UaW/UlJVsWwWxU9t3vrezLpgFpI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451621; c=relaxed/simple; bh=B9RYqjZPOWV6C0GMRY3SkdIHMqd4rfBrkE8wPVtghus=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JSfSYQnO94VUwaS0944dHRsA49D0k97KAyu5y6sXD1xbkj+F9zoxJt4jm3yAvnD78ITZx5xNsU63BepdSve7mJg+6ZOsVYBhA0v5n+2fIr84y0mA2y6ZyrC6qf/ZABTLXFceF5j4pPOiG/4UuhDh22xpAcYOIkgTGG4/byT6IRQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=i4uUpAQ1; arc=none smtp.client-ip=209.85.210.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="i4uUpAQ1" Received: by mail-pf1-f178.google.com with SMTP id d2e1a72fcca58-6e04fd5e05aso440758b3a.0 for ; Thu, 08 Feb 2024 20:07:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451619; x=1708056419; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kyKA4wCImIZZ5xJE04KmfmUkjYhSBJW7+tRHKBcOaBI=; b=i4uUpAQ1x7wpjggITqtMPKx/vCgmCHEe7EtF7/j3senHuBaTMDIktrUprZdzPgmoUu LkC2Uaf8Et/W3WZB4FBUimkdvofvsbL2aXJ9mCJc0Jej3SAlCcWxfQ4h3hct74CDnJJH 4WwcGSzzES6GqexNCX6NAiZloL/k1fSqBt15frLCjTde0u1IlnOHU/R4Jio5DV4j1vwg VtvkZLV1oT5n2A6EiMJc7Qa7nhoAs6BcFziIHBtJ3ebHaK6YZFO9O0kZ63MRdDzGxYo3 
fYIQbYee3Ij8om7CcFk3SM1tN/omGuShZWGdpgOAnxdLFpkiygqGQl4+4tN5cQenCdsF ndyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451619; x=1708056419; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kyKA4wCImIZZ5xJE04KmfmUkjYhSBJW7+tRHKBcOaBI=; b=rq7e35yY3ErOJE/Q5NqA+RQFFe5aUXWnFwO5BXhPTs1xtrMjL6yKBsCSx+/0v9Rzgn w74DBXtjrB1yd8rGIVcvitvmrNAP3IyBzxWHv9PajWBu3LEgvdNwJZWhwIWlEdtl4jtF UQvL+SCiyK8SKpVANH2v2F5/kZ9jmkE0mbx8Tq98fgAn+Gt4iw3A9R8LBfGfx7yzn/wd V9BpQOX2LUf+QDcEUzYmrULFZWomQkL5bqVoLXx8dB/Msu9VfvEWgsBV2Lk6IK6iqW9n I9jGlj3t+8XoJD5JQ0a8gunuJ3MLgWXTgmgdq9uiJ0yN6mOTwQpq5HCjNiP9RxbesEL1 vF0Q== X-Gm-Message-State: AOJu0Ywgm5QBwn12XUbZhgv6KS/yjqRYGCwciKMVuqDbZt8Z5VXizk+f LF1E/gg+JCAHvShtJwYpiCjArUJOyJJddBz5+eElacDe2AuNHlS1qji9R0Pj X-Google-Smtp-Source: AGHT+IEcq3u8IxQEchLTqYOdj8JCUTyhf++9XKhZV1VlMQZ/9+Av7nCVAWvySSMoNnMsdtUnppsIjA== X-Received: by 2002:a05:6a20:8c01:b0:19e:b477:33a4 with SMTP id j1-20020a056a208c0100b0019eb47733a4mr520269pzh.27.1707451619456; Thu, 08 Feb 2024 20:06:59 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCWQEeNvrqoRwizmtNe7laAnu501WQNUUrnNc5bjAfQWP+sx+w5e2Fh+IdN7e3m/thOyA1PXmPbUrKFF2iv85lnNbw22sei0R1dMfM2b8sjF/scMfelPDzWIRLb3Rbmeu0nynfi0QRD9uaXnafRn3hsJTcHF8/wZcK3Qf+kv3Ni8kHO8NrzDQeHD4uLmFd/P0i5q1NYeju6EtjYm1j4OJY8+q7gyU4OR7agn4uOilcwA8qebcGY49lcc99jxuByats2X9fUO/pyRnbvi0DCQwCENKvViTtlTpGJaicvrQlE97J3YVk3FS629SYdh7Ivcfb+pS7C/9gwvPVJnwZqDz03apakQb62uN7S8jWHLS7hJgoSz3YlDeQ== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id f19-20020a056a00229300b006dbda7bcf3csm589192pfe.83.2024.02.08.20.06.57 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:06:59 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 11/20] libbpf: Add __arg_arena to bpf_helpers.h Date: Thu, 8 Feb 2024 20:05:59 -0800 Message-Id: <20240209040608.98927-12-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Add __arg_arena to bpf_helpers.h Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi Acked-by: Andrii Nakryiko --- tools/lib/bpf/bpf_helpers.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 79eaa581be98..9c777c21da28 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -192,6 +192,7 @@ enum libbpf_tristate { #define __arg_nonnull __attribute((btf_decl_tag("arg:nonnull"))) #define __arg_nullable __attribute((btf_decl_tag("arg:nullable"))) #define __arg_trusted __attribute((btf_decl_tag("arg:trusted"))) +#define __arg_arena __attribute((btf_decl_tag("arg:arena"))) #ifndef ___bpf_concat #define ___bpf_concat(a, b) a ## b From patchwork Fri Feb 9 04:06:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
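[Editorial note] A minimal usage sketch tying the two patches above together (assuming the rest of this series is applied and a recent LLVM): a global, i.e. non-static, subprogram that takes an arena pointer has to tag the argument with __arg_arena so the verifier treats it as PTR_TO_ARENA rather than PTR_TO_MEM. The map definition and the function name below are illustrative only; __arena stands for the address_space(1) attribute provided by the selftest header added later in this series.

struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 1);			/* arena size in pages */
} arena SEC(".maps");

/* global subprog: without __arg_arena the verifier expects PTR_TO_MEM here
 * and would reject an arena pointer passed by the caller
 */
__weak int set_first_byte(char __arena *p __arg_arena)
{
	if (p)
		*p = 1;		/* plain store through an arena pointer */
	return 0;
}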
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550832 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f174.google.com (mail-pl1-f174.google.com [209.85.214.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1C6A5258 for ; Fri, 9 Feb 2024 04:07:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451626; cv=none; b=JeEQtQnoCI8fjV73wu7ZUnRaJAYtx7X2mk5zDNCGEBtluPIEvZonWiOxTzw6fbX3IwKiv0XnjL/eC82/Ddgj+DvrsXpSM4UN3Q5sSNK7AKFQ9XrKRGJS28BEqzIsBKr6xIMDWvUg81LmPH9sxXweFG6YYLHBNHw8tqBqATXmudo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451626; c=relaxed/simple; bh=SSUom18fY1/fkI/8Dau98DvaWPho4QqURg692OjHxHQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=c9Pab+uo9IbgM/zZu58H86WZND5E/lVpZ88QWFMZ3K7RTSi4Sb9cBCilPX7OkxFCxD7DHUGFTrKK87IZw2GPkRdIRRgVx+kNNvcmNpJxZhcgym555t2fqg6/iiEFPG3+lHbuQt7PvJTpr8NafZ2ISfdiuAFtemvZyV7wYEkHdEU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=PLxA2Air; arc=none smtp.client-ip=209.85.214.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="PLxA2Air" Received: by mail-pl1-f174.google.com with SMTP id d9443c01a7336-1da0cd9c0e5so4469735ad.0 for ; Thu, 08 Feb 2024 20:07:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451623; x=1708056423; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Nd9ldHvznK5bBX0WOOiLjO64Z4v1t4JC5wlS+JmAlpg=; b=PLxA2Airwazq64b9koUcxGI9iSN8jOD8fcxRaGkcqt92qXGttcAp3zQjonCroOd4x4 l2xnbRJWldcRcjbMa/ZdgbFgjXH5IaPY7j81poP3QMWu9bBg07K0FtGCv7XpgGJoqOHE H5AXThL7fLccLMLi9FPWUPFEr5YpWWlq0fErqE93SLQNBouy6eDmgyur3IIbW1Lu+08i jHYnUQ9RzCIssocZ7t39GmQzua3sB2lCH7vHq4kOtyFY+ymzx+A9YM13yypOHS6WdvX1 HDwoD+KUlI3hacciEAC5dY8jYOK9QLwuNRzh1RYP/i54Ke22X7h71DJSwPmWK1NQRy/f 4fPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451623; x=1708056423; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Nd9ldHvznK5bBX0WOOiLjO64Z4v1t4JC5wlS+JmAlpg=; b=BYJ0+bVUR4rWFnTbyG1eldcx5kshgY2tQhgaaBNWGqF0imIFy5X+J0VLC7QpTD6Iq0 Cpb74HwkJhwr4glNQPfuapf3GoUw6ljmxug/uks0NXUotARHkJf0OMn20NODtXVWIq3R WU/6Ojw4sV4m1QSo0Ypxp3WUdUTcwkP9snaAaKvC9JOJeGA0AeU/Ba6RoOmpzWcesb+w sCkolQoG//n/7yBZc8CnUn+wd37osXgZfJ91fkPW1JP9roWR+sfRQZJbqtx82kb88oXa kQ4ZttDgskbLgq4auvnJ2+JHv7w9mJYUN80jttp9aDWUjndIfLI+bnRhn1Sp995v4JIn T9JQ== X-Gm-Message-State: AOJu0YxWsEA15ZQ6dMOop7E0ZH8hn5g0iA5tb8yNEBJ88eUPk9kIJ1uz TfJBDfaR/3sFsxofRPJAN464AZVcLsihVZ834EbEXp71+aHF+9h0EaZxP9jQ X-Google-Smtp-Source: AGHT+IEoPNaoXNST3bY7EQ3Xy37Qb7fTssbofLabaCcz0TDvqlgviwuOkE+S47i6w80xgnTbu99/eA== X-Received: by 
2002:a17:902:ce91:b0:1d9:34ff:f807 with SMTP id f17-20020a170902ce9100b001d934fff807mr727831plg.31.1707451623582; Thu, 08 Feb 2024 20:07:03 -0800 (PST)
Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id g12-20020a170902f74c00b001d9ba3b2b33sm541375plw.163.2024.02.08.20.07.01 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:03 -0800 (PST)
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 12/20] libbpf: Add support for bpf_arena.
Date: Thu, 8 Feb 2024 20:06:00 -0800
Message-Id: <20240209040608.98927-13-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com>
References: <20240209040608.98927-1-alexei.starovoitov@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id: List-Subscribe: List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Alexei Starovoitov

mmap() the bpf_arena right after creation, since the kernel needs to remember the address returned from mmap. This is user_vm_start. LLVM will generate bpf_arena_cast_user() instructions where necessary and the JIT will add the upper 32 bits of user_vm_start to such pointers.

Fix up bpf_map_mmap_sz() to compute the mmap size as map->value_size * map->max_entries for arrays and PAGE_SIZE * map->max_entries for the arena.

Don't set BTF at arena creation time, since the arena map type doesn't support it.
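[Editorial note] For clarity, a rough user-space sketch of the size computation described above; this is not the libbpf code itself, just the arithmetic it performs (BPF_MAP_TYPE_ARENA is assumed to be available in the uapi headers from this series):

#include <unistd.h>
#include <linux/bpf.h>

/* illustrative only; libbpf's bpf_map_mmap_sz() is the authoritative version */
static size_t arena_or_array_mmap_sz(enum bpf_map_type type,
				     unsigned int value_sz, unsigned int max_entries)
{
	const long page_sz = sysconf(_SC_PAGE_SIZE);

	if (type == BPF_MAP_TYPE_ARENA)
		return (size_t)page_sz * max_entries;	/* max_entries counts pages */
	/* arrays: value_size * max_entries, rounded up to a whole page */
	return ((size_t)value_sz * max_entries + page_sz - 1) / page_sz * page_sz;
}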
Signed-off-by: Alexei Starovoitov --- tools/lib/bpf/libbpf.c | 43 ++++++++++++++++++++++++++++++----- tools/lib/bpf/libbpf_probes.c | 7 ++++++ 2 files changed, 44 insertions(+), 6 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 01f407591a92..4880d623098d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -185,6 +185,7 @@ static const char * const map_type_name[] = { [BPF_MAP_TYPE_BLOOM_FILTER] = "bloom_filter", [BPF_MAP_TYPE_USER_RINGBUF] = "user_ringbuf", [BPF_MAP_TYPE_CGRP_STORAGE] = "cgrp_storage", + [BPF_MAP_TYPE_ARENA] = "arena", }; static const char * const prog_type_name[] = { @@ -1577,7 +1578,7 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj) return map; } -static size_t bpf_map_mmap_sz(unsigned int value_sz, unsigned int max_entries) +static size_t __bpf_map_mmap_sz(unsigned int value_sz, unsigned int max_entries) { const long page_sz = sysconf(_SC_PAGE_SIZE); size_t map_sz; @@ -1587,6 +1588,20 @@ static size_t bpf_map_mmap_sz(unsigned int value_sz, unsigned int max_entries) return map_sz; } +static size_t bpf_map_mmap_sz(const struct bpf_map *map) +{ + const long page_sz = sysconf(_SC_PAGE_SIZE); + + switch (map->def.type) { + case BPF_MAP_TYPE_ARRAY: + return __bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); + case BPF_MAP_TYPE_ARENA: + return page_sz * map->def.max_entries; + default: + return 0; /* not supported */ + } +} + static int bpf_map_mmap_resize(struct bpf_map *map, size_t old_sz, size_t new_sz) { void *mmaped; @@ -1740,7 +1755,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n", map->name, map->sec_idx, map->sec_offset, def->map_flags); - mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); + mmap_sz = bpf_map_mmap_sz(map); map->mmaped = mmap(NULL, mmap_sz, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (map->mmaped == MAP_FAILED) { @@ -4852,6 +4867,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b case BPF_MAP_TYPE_SOCKHASH: case BPF_MAP_TYPE_QUEUE: case BPF_MAP_TYPE_STACK: + case BPF_MAP_TYPE_ARENA: create_attr.btf_fd = 0; create_attr.btf_key_type_id = 0; create_attr.btf_value_type_id = 0; @@ -4908,6 +4924,21 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b if (map->fd == map_fd) return 0; + if (def->type == BPF_MAP_TYPE_ARENA) { + map->mmaped = mmap((void *)map->map_extra, bpf_map_mmap_sz(map), + PROT_READ | PROT_WRITE, + map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED, + map_fd, 0); + if (map->mmaped == MAP_FAILED) { + err = -errno; + map->mmaped = NULL; + close(map_fd); + pr_warn("map '%s': failed to mmap bpf_arena: %d\n", + bpf_map__name(map), err); + return err; + } + } + /* Keep placeholder FD value but now point it to the BPF map object. * This way everything that relied on this map's FD (e.g., relocated * ldimm64 instructions) will stay valid and won't need adjustments. 
@@ -8582,7 +8613,7 @@ static void bpf_map__destroy(struct bpf_map *map) if (map->mmaped) { size_t mmap_sz; - mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); + mmap_sz = bpf_map_mmap_sz(map); munmap(map->mmaped, mmap_sz); map->mmaped = NULL; } @@ -9830,8 +9861,8 @@ int bpf_map__set_value_size(struct bpf_map *map, __u32 size) int err; size_t mmap_old_sz, mmap_new_sz; - mmap_old_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); - mmap_new_sz = bpf_map_mmap_sz(size, map->def.max_entries); + mmap_old_sz = bpf_map_mmap_sz(map); + mmap_new_sz = __bpf_map_mmap_sz(size, map->def.max_entries); err = bpf_map_mmap_resize(map, mmap_old_sz, mmap_new_sz); if (err) { pr_warn("map '%s': failed to resize memory-mapped region: %d\n", @@ -13356,7 +13387,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s) for (i = 0; i < s->map_cnt; i++) { struct bpf_map *map = *s->maps[i].map; - size_t mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); + size_t mmap_sz = bpf_map_mmap_sz(map); int prot, map_fd = map->fd; void **mmaped = s->maps[i].mmaped; diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c index ee9b1dbea9eb..302188122439 100644 --- a/tools/lib/bpf/libbpf_probes.c +++ b/tools/lib/bpf/libbpf_probes.c @@ -338,6 +338,13 @@ static int probe_map_create(enum bpf_map_type map_type) key_size = 0; max_entries = 1; break; + case BPF_MAP_TYPE_ARENA: + key_size = 0; + value_size = 0; + max_entries = 1; /* one page */ + opts.map_extra = 0; /* can mmap() at any address */ + opts.map_flags = BPF_F_MMAPABLE; + break; case BPF_MAP_TYPE_HASH: case BPF_MAP_TYPE_ARRAY: case BPF_MAP_TYPE_PROG_ARRAY: From patchwork Fri Feb 9 04:06:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550833 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-oo1-f42.google.com (mail-oo1-f42.google.com [209.85.161.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C0705227 for ; Fri, 9 Feb 2024 04:07:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451630; cv=none; b=n0cBn7C2GSXn89qhMAV9KsuHYRN3QlS6/nAJva7BM0SCAiFBQwqHlKh/aBB5fakJAS/FKXXrLcAYneSII3KJf/hSZ4JEZtoUhiZtecbrxHSy3mkCo5YuRUNAug6QdzwNyHBIZDuhOzMIMq1s40WsBvpdtf0pXNh7oPBVLKxdTqo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451630; c=relaxed/simple; bh=fZ6Y58uW7ogEd7gM1JqcAJ0v8UlMwmoELq7rWQLCEXA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Cxs80Gvho+kaJ7pUnEvqN68jXWQt3T+zPQSHUDC/2DHqTKhYxxRD3pwEshiPzhFza1ymYZPmWt2FJ5QUMqIW6uLuusWkSQlUsbTtXH4zhDLnS7JmbItDEfYfganVVHxatewf0xPNxznWRyN1wqAaZhugU2W4e8Qs3vDaCzoWiqY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JJNThMhJ; arc=none smtp.client-ip=209.85.161.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com 
header.b="JJNThMhJ" Received: by mail-oo1-f42.google.com with SMTP id 006d021491bc7-598699c0f1eso317144eaf.2 for ; Thu, 08 Feb 2024 20:07:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451628; x=1708056428; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=F0RFXFHjgEipVAjjo7rYUK8PW9TKsuaRbrDEibnvW4k=; b=JJNThMhJdUZEC1apPjwJ9s5S+feCPTlTNOf4PSk9HQBE5aYhIJ7MGRqnkrciJtzyWa hS5p0jSZ3spk7EFoEe5JA6fIEP0M6mzT77S36aL16zzu8B8BozkFjUncJ/vdnf7Dd2E2 Rz1wyLpcCRn3fjfn6pDGV0HdmX1h9YgBzUihwqAlm6f3t9PUfrfbqBdHo3Aebck9qX4v 8pySTEzJLNsXpRYdxM4yUlt+fsTr05T7TfpbY1kYF9iIbiS3Z+X2FPVIt9U0e4PNneeb vTPsnSAAkJFkJ9F8MqsMjlC0xR+n60UAyt72QJ5XhB2bf0vlQIQDSoLIpaI8k2ut8Vvg kiow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451628; x=1708056428; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=F0RFXFHjgEipVAjjo7rYUK8PW9TKsuaRbrDEibnvW4k=; b=aJMfxVfBeLqDgqnhYjkLf9T+sf6IE9nRx/W8u7xI4GD8Aa1GAjMjOft1hOZ4D6HXqt QY5U6K624su6Q2gjwRjVBZduFxW045dVY82qIjF4UZ3SRCy51eS/pXw30WceXkgd3zFG OCZqEz++mNL7JRdA2dP7DXjOEKOj1e78vuKqwir1i8escnNfyGioHV18iamBLUvKtPDx fggkUsdX87TFI3G1yrOngm4F6aIrOHRq1Bqai9nWnAsb9EdJ18K1TYav/vOiI6RQbyWR 40wd3l3AcnkDgmvqdGxAjOk/dIOqy66vvPR2gCVDCEa+7NPgGwX9zNIQFKaHWEMVkElZ ZYyg== X-Gm-Message-State: AOJu0YzBpiH8LsuoYXRHL7Ltjs/3EwIv4ZxCjcxMxMZO2kfbpTNkqR6d sS6nZVsx69ikoz8S3YQFRjWp8g6OMrLFFpVJsBsM5VxirSCnLnoZAVOzcmSy X-Google-Smtp-Source: AGHT+IH3ELoemyowyx9wDHV3307l0ycjd0Y77bmRJBAVEdimNMeIuNSIlIwPFXHELajuyUjjlQuC3w== X-Received: by 2002:a05:6358:5923:b0:178:fcd3:c316 with SMTP id g35-20020a056358592300b00178fcd3c316mr369964rwf.19.1707451627656; Thu, 08 Feb 2024 20:07:07 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCWoqc11kvj74nhgP0ItrhTalUSnkiKOdLRkeNXEUDkXxEiRPRf5d4bYuosF+6cOhfxzAMXVsl+t9+QOFHt5NPTVLFeWfXCYrBD7zGBhDwyKkAg83xUqbq16d4W438iTv/u4iqXlrxEfu5sU2ANBjPMqsLutqvxFcyLN7iF8AmK5czq/wJC1HhSd5mJgCo/vuHecDmQai1BM7uVYVxdm2xvH48dSiZe88KkqjcoFFxkEbZdmbNwrubW/g03c508mJ+Ep58KUOUKGi+0TXrw5efj3tEt/fIfNkCt6kmNdiEVmolW+mb3CmnCkJvU+Vf3d6lOpC4xDzA8R75HxOgkWVg0SLXn4HSflWlWByPP9oqC5Q5ltk7QVRA== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id r15-20020aa7988f000000b006e02da3a158sm610623pfl.17.2024.02.08.20.07.05 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:07 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 13/20] libbpf: Allow specifying 64-bit integers in map BTF. 
Date: Thu, 8 Feb 2024 20:06:01 -0800 Message-Id: <20240209040608.98927-14-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov __uint() macro that is used to specify map attributes like: __uint(type, BPF_MAP_TYPE_ARRAY); __uint(map_flags, BPF_F_MMAPABLE); is limited to 32-bit, since BTF_KIND_ARRAY has u32 "number of elements" field. Introduce __ulong() macro that allows specifying values bigger than 32-bit. In map definition "map_extra" is the only u64 field. Signed-off-by: Alexei Starovoitov Acked-by: Eduard Zingerman --- tools/lib/bpf/bpf_helpers.h | 5 +++++ tools/lib/bpf/libbpf.c | 44 ++++++++++++++++++++++++++++++++++--- 2 files changed, 46 insertions(+), 3 deletions(-) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 9c777c21da28..0aeac8ea7af2 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -13,6 +13,11 @@ #define __uint(name, val) int (*name)[val] #define __type(name, val) typeof(val) *name #define __array(name, val) typeof(val) *name[] +#ifndef __PASTE +#define ___PASTE(a,b) a##b +#define __PASTE(a,b) ___PASTE(a,b) +#endif +#define __ulong(name, val) enum { __PASTE(__unique_value, __COUNTER__) = val } name /* * Helper macro to place programs, maps, license in diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 4880d623098d..f8158e250327 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -2243,6 +2243,39 @@ static bool get_map_field_int(const char *map_name, const struct btf *btf, return true; } +static bool get_map_field_long(const char *map_name, const struct btf *btf, + const struct btf_member *m, __u64 *res) +{ + const struct btf_type *t = skip_mods_and_typedefs(btf, m->type, NULL); + const char *name = btf__name_by_offset(btf, m->name_off); + + if (btf_is_ptr(t)) + return false; + + if (!btf_is_enum(t) && !btf_is_enum64(t)) { + pr_warn("map '%s': attr '%s': expected enum or enum64, got %s.\n", + map_name, name, btf_kind_str(t)); + return false; + } + + if (btf_vlen(t) != 1) { + pr_warn("map '%s': attr '%s': invalid __ulong\n", + map_name, name); + return false; + } + + if (btf_is_enum(t)) { + const struct btf_enum *e = btf_enum(t); + + *res = e->val; + } else { + const struct btf_enum64 *e = btf_enum64(t); + + *res = btf_enum64_value(e); + } + return true; +} + static int pathname_concat(char *buf, size_t buf_sz, const char *path, const char *name) { int len; @@ -2476,10 +2509,15 @@ int parse_btf_map_def(const char *map_name, struct btf *btf, map_def->pinning = val; map_def->parts |= MAP_DEF_PINNING; } else if (strcmp(name, "map_extra") == 0) { - __u32 map_extra; + __u64 map_extra; - if (!get_map_field_int(map_name, btf, m, &map_extra)) - return -EINVAL; + if (!get_map_field_long(map_name, btf, m, &map_extra)) { + __u32 map_extra_u32; + + if (!get_map_field_int(map_name, btf, m, &map_extra_u32)) + return -EINVAL; + map_extra = map_extra_u32; + } map_def->map_extra = map_extra; map_def->parts |= MAP_DEF_MAP_EXTRA; } else { From patchwork Fri Feb 9 04:06:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550834 X-Patchwork-Delegate: bpf@iogearbox.net 
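[Editorial note] A hedged example of how __ulong() is expected to be used once the patch above lands: setting the 64-bit map_extra attribute of an arena map to a fixed user VM start address. The address and sizes below are arbitrary, for illustration only; __uint() would silently truncate a value that doesn't fit into 32 bits.

struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 100);		/* arena size in pages */
	__ulong(map_extra, 1ull << 44);		/* 64-bit value: requested user_vm_start */
} arena SEC(".maps");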
Received: from mail-pf1-f177.google.com (mail-pf1-f177.google.com [209.85.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DB46953B8 for ; Fri, 9 Feb 2024 04:07:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451634; cv=none; b=iRRjMvfcjENi5ps0PMo6CEBGl7FjvkGmlamGTtk02KpEPzitdgzyqVEPZtTtzHZnYCoGcwZKGM2CK526E243b99Gfa7oZpDmjKDTRU5iQIbPOaml8hzbKrFtWv/X7SolK8bZTcjbfFmfF4m7cxj4bjXAfgncElIAyFlZokYvKGo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451634; c=relaxed/simple; bh=wOQzsme/WJx8E+7K76KBZWEu4MRl7v3chIkCFixnjIs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=l7QNbM+MP8IGM1VKtkFLehN59adTJ3wslfo0yFGLU6EfBA0ew8YnPXd1CP8Z3YdNjjqg0q15lduJLR+BKU19JjeGUfwi28A5HVqwRK8dA7obBVWAxUt5Z4isui7DXN7yKbYQsEybmK1hsIED4bdbB9u15P67X/TOLq2Wj8j/9/I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=NAGacgMi; arc=none smtp.client-ip=209.85.210.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="NAGacgMi" Received: by mail-pf1-f177.google.com with SMTP id d2e1a72fcca58-6da202aa138so369927b3a.2 for ; Thu, 08 Feb 2024 20:07:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451632; x=1708056432; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kLcdghajJR/QGdT0y2Nv6AwpiR5Q3wh3RT5GphLP8Vw=; b=NAGacgMiy697GJH25gRxgG7eaM2er+6kVSIZEowEnivFNh7Lp+L/0v2C8g/052GB5M B7J4I6sWWZHQH1ODbDaBzB9S3TIhfJlAS431kfrOFzWJxNMlBJxxJEWiXjy8bxJOGR2I CHSPlsCg2XFJP9wZQvaLOiT0c37fX4IszDDwjg/pAjnOh818fnFnrd9SRsJBWXWfM1xP v38XR0S99h1OIPxqqB8vxqONM0c6gdQSyQq95QoKBUkRDDx44Hz05q1+YiHBbExU7dBt UyuRczQjhEPAwItnsljw5X0joxKuCZK7lPpK60LzX1geUGebCiuSdlxyt4ZdYupRRLko ipXg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451632; x=1708056432; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kLcdghajJR/QGdT0y2Nv6AwpiR5Q3wh3RT5GphLP8Vw=; b=QsI3h9R5Sz6zkuuyRPM+LlbOPg+PdkusBISRacuJlVp+8mHX1Wof+v4jcKXfexfkyq EDs5XGDbM/CwrzasoM3hbMDTxvvTVdtXKjg7oj9eiO4lUpWh7dvQ8tWgOt5cgwIqbBpL zn7xd7VJQiJjdJZwgYXlfcOOq4nNUahDnCsXzmSdkmRpsCCTAMmJ7OxW6Dair0fYERVT FlqD21R74EgAJt2nEHChwKN4+5c7+x7R6pgp9aga6oTJJgl/ftidXdOw+yXl+QMFJUqs 9cjwM3LQP9mVOEJCZiwQihlx52AG2opcSZFODfEF18Pm1qUFiWS2L/OXCza2zSHuCl7Z pF1g== X-Gm-Message-State: AOJu0YwiRHfm7jG65HGCj46KSs+qqPcDIVYpKqJURVGnva2VC7iwQKKC U3eQhm9bdAmh/YvzvIennujqXh52368A+goegV3W5qbVSwZIfPBDtW5mtHrD X-Google-Smtp-Source: AGHT+IFyT2HEc4ssBsDGMa4TA3yrF8aBqF74Rmstz83H0u4hHry+ndHS7WzzE7gccPrmnj/Cj736wQ== X-Received: by 2002:a05:6a20:4c96:b0:19a:4418:1e86 with SMTP id fq22-20020a056a204c9600b0019a44181e86mr530941pzb.58.1707451631872; Thu, 08 Feb 2024 20:07:11 -0800 (PST) 
Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id r1-20020a17090a438100b0029464b5fcdbsm715206pjg.42.2024.02.08.20.07.10 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:11 -0800 (PST)
From: Alexei Starovoitov
To: bpf@vger.kernel.org
Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 14/20] libbpf: Recognize __arena global variables.
Date: Thu, 8 Feb 2024 20:06:02 -0800
Message-Id: <20240209040608.98927-15-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-145)
In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com>
References: <20240209040608.98927-1-alexei.starovoitov@gmail.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org
List-Id: List-Subscribe: List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: bpf@iogearbox.net

From: Alexei Starovoitov

LLVM automatically places __arena variables into the ".arena.1" ELF section. When libbpf sees such a section it creates an internal LIBBPF_MAP_ARENA 'struct bpf_map' that is connected to the actual BPF_MAP_TYPE_ARENA 'struct bpf_map'. They share the same kernel-side bpf map and a single map_fd. Both are emitted into the skeleton: the real arena with the name given by the bpf program in SEC(".maps") and another one named "__arena_internal". All global variables from the ".arena.1" section are accessible from user space via skel->arena->name_of_var.

For bss/data/rodata the skeleton/libbpf perform the following sequence:
1. addr = mmap(MAP_ANONYMOUS)
2. user space optionally modifies global vars
3. map_fd = bpf_create_map()
4. bpf_update_map_elem(map_fd, addr) // to store values into the kernel
5. mmap(addr, MAP_FIXED, map_fd)
After step 5 user space sees the values it wrote at step 2 at the same addresses.

The arena doesn't support update_map_elem. Hence skeleton/libbpf do:
1. addr = mmap(MAP_ANONYMOUS)
2. user space optionally modifies global vars
3. map_fd = bpf_create_map(MAP_TYPE_ARENA)
4. real_addr = mmap(map->map_extra, MAP_SHARED | MAP_FIXED, map_fd)
5. memcpy(real_addr, addr) // this will fault-in and allocate pages
6. munmap(addr)

At the end, the look and feel of global data vs __arena global data is the same from the bpf program's point of view.

Another complication is:
struct {
  __uint(type, BPF_MAP_TYPE_ARENA);
} arena SEC(".maps");
int __arena foo;
int bar;

ptr1 = &foo;   // relocation against ".arena.1" section
ptr2 = &arena; // relocation against ".maps" section
ptr3 = &bar;   // relocation against ".bss" section

For the kernel, ptr1 and ptr2 point to the same arena's map_fd, while ptr3 points to a different global array's map_fd.
For the verifier:
ptr1->type == unknown_scalar
ptr2->type == const_ptr_to_map
ptr3->type == ptr_to_map_value
After the verifier, and for the JIT, all 3 pointers are normal ld_imm64 insns.
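[Editorial note] To make the flow above concrete, a small sketch of how a program and its loader might use this (the skeleton name 'example' and the variable name 'counter' are made up; skel->arena->counter is the accessor described above):

/* bpf program side */
struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 1);
} arena SEC(".maps");

int __arena counter;			/* LLVM emits this into ".arena.1" */

SEC("syscall")
int bump(void *ctx)
{
	counter++;			/* plain access from the bpf side */
	return 0;
}

/* user space, inside main() after example__open_and_load() and a prog run */
	printf("counter = %d\n", skel->arena->counter);
	skel->arena->counter = 0;	/* writes land in the same shared arena page */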
Signed-off-by: Alexei Starovoitov --- tools/bpf/bpftool/gen.c | 13 ++++- tools/lib/bpf/libbpf.c | 102 +++++++++++++++++++++++++++++++++++----- 2 files changed, 101 insertions(+), 14 deletions(-) diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index a9334c57e859..74fabbdbad2b 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -82,7 +82,7 @@ static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz) const char *name = bpf_map__name(map); int i, n; - if (!bpf_map__is_internal(map)) { + if (!bpf_map__is_internal(map) || bpf_map__type(map) == BPF_MAP_TYPE_ARENA) { snprintf(buf, buf_sz, "%s", name); return true; } @@ -106,6 +106,12 @@ static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz) static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" }; int i, n; + /* recognize hard coded LLVM section name */ + if (strcmp(sec_name, ".arena.1") == 0) { + /* this is the name to use in skeleton */ + strncpy(buf, "arena", buf_sz); + return true; + } for (i = 0, n = ARRAY_SIZE(pfxs); i < n; i++) { const char *pfx = pfxs[i]; @@ -239,6 +245,11 @@ static bool is_internal_mmapable_map(const struct bpf_map *map, char *buf, size_ if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE)) return false; + if (bpf_map__type(map) == BPF_MAP_TYPE_ARENA) { + strncpy(buf, "arena", sz); + return true; + } + if (!get_map_ident(map, buf, sz)) return false; diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f8158e250327..d5364280a06c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -498,6 +498,7 @@ struct bpf_struct_ops { #define KSYMS_SEC ".ksyms" #define STRUCT_OPS_SEC ".struct_ops" #define STRUCT_OPS_LINK_SEC ".struct_ops.link" +#define ARENA_SEC ".arena.1" enum libbpf_map_type { LIBBPF_MAP_UNSPEC, @@ -505,6 +506,7 @@ enum libbpf_map_type { LIBBPF_MAP_BSS, LIBBPF_MAP_RODATA, LIBBPF_MAP_KCONFIG, + LIBBPF_MAP_ARENA, }; struct bpf_map_def { @@ -547,6 +549,7 @@ struct bpf_map { bool reused; bool autocreate; __u64 map_extra; + struct bpf_map *arena; }; enum extern_type { @@ -613,6 +616,7 @@ enum sec_type { SEC_BSS, SEC_DATA, SEC_RODATA, + SEC_ARENA, }; struct elf_sec_desc { @@ -1718,10 +1722,34 @@ static int bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, const char *real_name, int sec_idx, void *data, size_t data_sz) { + const long page_sz = sysconf(_SC_PAGE_SIZE); + struct bpf_map *map, *arena = NULL; struct bpf_map_def *def; - struct bpf_map *map; size_t mmap_sz; - int err; + int err, i; + + if (type == LIBBPF_MAP_ARENA) { + for (i = 0; i < obj->nr_maps; i++) { + map = &obj->maps[i]; + if (map->def.type != BPF_MAP_TYPE_ARENA) + continue; + arena = map; + real_name = "__arena_internal"; + mmap_sz = bpf_map_mmap_sz(map); + if (roundup(data_sz, page_sz) > mmap_sz) { + pr_warn("Declared arena map size %zd is too small to hold" + "global __arena variables of size %zd\n", + mmap_sz, data_sz); + return -E2BIG; + } + break; + } + if (!arena) { + pr_warn("To use global __arena variables the arena map should" + "be declared explicitly in SEC(\".maps\")\n"); + return -ENOENT; + } + } map = bpf_object__add_map(obj); if (IS_ERR(map)) @@ -1732,6 +1760,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, map->sec_offset = 0; map->real_name = strdup(real_name); map->name = internal_map_name(obj, real_name); + map->arena = arena; if (!map->real_name || !map->name) { zfree(&map->real_name); zfree(&map->name); @@ -1739,18 +1768,32 @@ 
bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, } def = &map->def; - def->type = BPF_MAP_TYPE_ARRAY; - def->key_size = sizeof(int); - def->value_size = data_sz; - def->max_entries = 1; - def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG - ? BPF_F_RDONLY_PROG : 0; + if (type == LIBBPF_MAP_ARENA) { + /* bpf_object will contain two arena maps: + * LIBBPF_MAP_ARENA & BPF_MAP_TYPE_ARENA + * and + * LIBBPF_MAP_UNSPEC & BPF_MAP_TYPE_ARENA. + * The former map->arena will point to latter. + */ + def->type = BPF_MAP_TYPE_ARENA; + def->key_size = 0; + def->value_size = 0; + def->max_entries = roundup(data_sz, page_sz) / page_sz; + def->map_flags = BPF_F_MMAPABLE; + } else { + def->type = BPF_MAP_TYPE_ARRAY; + def->key_size = sizeof(int); + def->value_size = data_sz; + def->max_entries = 1; + def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG + ? BPF_F_RDONLY_PROG : 0; - /* failures are fine because of maps like .rodata.str1.1 */ - (void) map_fill_btf_type_info(obj, map); + /* failures are fine because of maps like .rodata.str1.1 */ + (void) map_fill_btf_type_info(obj, map); - if (map_is_mmapable(obj, map)) - def->map_flags |= BPF_F_MMAPABLE; + if (map_is_mmapable(obj, map)) + def->map_flags |= BPF_F_MMAPABLE; + } pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n", map->name, map->sec_idx, map->sec_offset, def->map_flags); @@ -1814,6 +1857,13 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj) NULL, sec_desc->data->d_size); break; + case SEC_ARENA: + sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx)); + err = bpf_object__init_internal_map(obj, LIBBPF_MAP_ARENA, + sec_name, sec_idx, + sec_desc->data->d_buf, + sec_desc->data->d_size); + break; default: /* skip */ break; @@ -3646,6 +3696,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) } else if (strcmp(name, STRUCT_OPS_LINK_SEC) == 0) { obj->efile.st_ops_link_data = data; obj->efile.st_ops_link_shndx = idx; + } else if (strcmp(name, ARENA_SEC) == 0) { + sec_desc->sec_type = SEC_ARENA; + sec_desc->shdr = sh; + sec_desc->data = data; } else { pr_info("elf: skipping unrecognized data section(%d) %s\n", idx, name); @@ -4148,6 +4202,7 @@ static bool bpf_object__shndx_is_data(const struct bpf_object *obj, case SEC_BSS: case SEC_DATA: case SEC_RODATA: + case SEC_ARENA: return true; default: return false; @@ -4173,6 +4228,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx) return LIBBPF_MAP_DATA; case SEC_RODATA: return LIBBPF_MAP_RODATA; + case SEC_ARENA: + return LIBBPF_MAP_ARENA; default: return LIBBPF_MAP_UNSPEC; } @@ -4326,7 +4383,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog, reloc_desc->type = RELO_DATA; reloc_desc->insn_idx = insn_idx; - reloc_desc->map_idx = map_idx; + reloc_desc->map_idx = map->arena ? 
map->arena - obj->maps : map_idx; reloc_desc->sym_off = sym->st_value; return 0; } @@ -4813,6 +4870,9 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map) bpf_gen__map_freeze(obj->gen_loader, map - obj->maps); return 0; } + if (map_type == LIBBPF_MAP_ARENA) + return 0; + err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0); if (err) { err = -errno; @@ -5119,6 +5179,15 @@ bpf_object__create_maps(struct bpf_object *obj) if (bpf_map__is_internal(map) && !kernel_supports(obj, FEAT_GLOBAL_DATA)) map->autocreate = false; + if (map->libbpf_type == LIBBPF_MAP_ARENA) { + size_t len = bpf_map_mmap_sz(map); + + memcpy(map->arena->mmaped, map->mmaped, len); + map->autocreate = false; + munmap(map->mmaped, len); + map->mmaped = NULL; + } + if (!map->autocreate) { pr_debug("map '%s': skipped auto-creating...\n", map->name); continue; @@ -9735,6 +9804,8 @@ static bool map_uses_real_name(const struct bpf_map *map) return true; if (map->libbpf_type == LIBBPF_MAP_RODATA && strcmp(map->real_name, RODATA_SEC) != 0) return true; + if (map->libbpf_type == LIBBPF_MAP_ARENA) + return true; return false; } @@ -13437,6 +13508,11 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s) continue; } + if (map->arena) { + *mmaped = map->arena->mmaped; + continue; + } + if (map->def.map_flags & BPF_F_RDONLY_PROG) prot = PROT_READ; else From patchwork Fri Feb 9 04:06:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550835 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pg1-f180.google.com (mail-pg1-f180.google.com [209.85.215.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EE64C5221 for ; Fri, 9 Feb 2024 04:07:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451638; cv=none; b=mKN8lMFg3PchFN30VSTHK8/W83G5t0SsOKNM+97qJkmxcg7Lo+KrElZXx2SWyB/WxqQX9TgNKHWS5SKQcIo/quwIkdLW7hFO3lo8gzyYYpcXPkGQnZlQ0ccEqRniC2GrHX3LXA46SC+KT5jMyynmGQGLoOM8hOL6Cyzq2q9HtXA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451638; c=relaxed/simple; bh=D/TrAhQWAcIDdh/oQnU+KCkH9OnP8LFJRj8yUjQYKhc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=dvEeELxnuCXhnL/UNhwj5m4ArBx7z7gzmipurcICV1kPV2HHSnttQTWw4tisHhDHR5AQBR/pvyaj/+MMgSS5njcaXd1sfNTbzQOFprw89kBbi1xQDLBmr6zd1KeHPBJiTu0ZaQP5oG/c3dEjzglEXiEoLeFN6wTvnisIjPaY1jE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=LE0RhHv2; arc=none smtp.client-ip=209.85.215.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="LE0RhHv2" Received: by mail-pg1-f180.google.com with SMTP id 41be03b00d2f7-5d8ddbac4fbso437704a12.0 for ; Thu, 08 Feb 2024 20:07:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451636; x=1708056436; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mCIDfn6Q/Sj02IoZMnsUeC+cwQq+R0K4Vy6avN3EcgA=; b=LE0RhHv2dns8ZgzUmerp0A8r/xNz02j/fdXW+6Awzqu4/+3D/2AkY+pYgvkBrq12al Pbjl/3vFIBG1n8zGAOgvoUx6fca0t2i6Wxlv+RcpI4kaiDZusMxlpHiiQt+e+HWSP1nU any7uvFtO4CMYbYrh8hF8F0vDZjPHqyXCoSTbgkz36zwZof7Be94XY/PwMHf7nO/sqCV t26MUkvMSd9ngVzAdhrcEYseN9h2NHJcoWfZzk86H/wWujW+/p5QNLJhJ6QOb2w2iVvK 5ElYXgzzxiCT0yOdQFZ50iFZg6UTFn1W7rc3LWYxYXT7HN13yNDiJo6MKAdBLFdxqPQu UwNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451636; x=1708056436; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mCIDfn6Q/Sj02IoZMnsUeC+cwQq+R0K4Vy6avN3EcgA=; b=lcQ7PKnvG/PXNfQZ9XoET63+4A/5eksPATfItjjFCq1QmiScg3Iilgkta2AOH0SmfZ QymIiDMQHMxXImT33YwDeti2p5V9h/jHiCwHLxC7np54EMlmpXcd+LTYpDdEF+dAc4e/ 1DgY5tPbQP0GmfYlf8FXBiT781VlRj2JbI1Ek8n3X/PpjHfgXnutR0zN+GL406e1u23i myWRke4t0m5R5qyjkKnl4ISo9QXms5kl6xV7w5NFP5kKA2I1ceCysWsDfoXs0xGj7Bf9 gff2FQc24BdaA04+HmexOilNly3uZPpQbVRc3CwwWZ269KoclpvR6xx9HSDNf9AKgEPC Ijkw== X-Gm-Message-State: AOJu0Yy9WtVYzFBbeSIXNvrL90OZST9/YPAm/OyjlCcZqXztkSvLH/1Z anJcZw/Ej0Gy9tynFBxcj5x65hwZEx94QnoF7+1qVUFTK5Su8Kbdc2ZQ07aE X-Google-Smtp-Source: AGHT+IEi6h+ceTfRMJ5dI0p3X7XJYh4ZzKQVqNqt1uF5ux0MvuhWLqAFaBeMLSjm/CCs2wr1Y3GeYQ== X-Received: by 2002:a05:6a21:6711:b0:19c:9c2e:7860 with SMTP id wh17-20020a056a21671100b0019c9c2e7860mr691997pzb.13.1707451635981; Thu, 08 Feb 2024 20:07:15 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCXL44286flKbPuiC5jJpQL4r9tnFLyhpDpeoUK86uHyxm1KmiZjhC6iWst+l0Yj3ohxbkOTv9OMTtkRZouflvj/p0qsmCfCCNN+RU3re5UYu0eZbWx8Sm3Trzrg2OIR5wSWop8s/7FQFUH7FL46mVTYTOUwdUDu4w4XDY8IA5NbPjR3F59YeYpxDdlZ6qaYXgatcmUWxkbTYM+4Q6SW/7jYYzybmfjOPjllVmPAAYXvZ0YIqXV+dVZ+aEPOeAxlLZHGJST60otvvVynWglcMCjg34Qy05PuwmX05Am2ndC0Gyx9LCDF3GGY+BLLY5ra/6Typc56SRoEhQjNtaZfxlsYSSewLkgZjqAGJ0j4Ju5PQ7l/N2P/Sg== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id t8-20020a170902bc4800b001d9a40e204bsm551470plz.21.2024.02.08.20.07.14 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:15 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 15/20] bpf: Tell bpf programs kernel's PAGE_SIZE Date: Thu, 8 Feb 2024 20:06:03 -0800 Message-Id: <20240209040608.98927-16-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov vmlinux BTF includes all kernel enums. Add __PAGE_SIZE = PAGE_SIZE enum, so that bpf programs that include vmlinux.h can easily access it. 
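[Editorial note] A small sketch of the intended use, assuming vmlinux.h was regenerated from a kernel that carries this patch (program and section names are arbitrary):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("syscall")
int report_page_size(void *ctx)
{
	/* __PAGE_SIZE is the enumerator added by page_size_enum above */
	bpf_printk("kernel PAGE_SIZE = %lu", (unsigned long)__PAGE_SIZE);
	return 0;
}

char _license[] SEC("license") = "GPL";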
Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi --- kernel/bpf/core.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 2829077f0461..3aa3f56a4310 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -88,13 +88,18 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns return NULL; } +/* tell bpf programs that include vmlinux.h kernel's PAGE_SIZE */ +enum page_size_enum { + __PAGE_SIZE = PAGE_SIZE +}; + struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags) { gfp_t gfp_flags = bpf_memcg_flags(GFP_KERNEL | __GFP_ZERO | gfp_extra_flags); struct bpf_prog_aux *aux; struct bpf_prog *fp; - size = round_up(size, PAGE_SIZE); + size = round_up(size, __PAGE_SIZE); fp = __vmalloc(size, gfp_flags); if (fp == NULL) return NULL; From patchwork Fri Feb 9 04:06:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550836 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f177.google.com (mail-pf1-f177.google.com [209.85.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C80F522A for ; Fri, 9 Feb 2024 04:07:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451642; cv=none; b=EcF+Eov0W9/4da2inIY5tzXoXvvLR4YzL+xDoXz/ijm4mY+xQlT22xEwHa3hzpIiLPhxc0DFS+TqJBiFiHrecGFcUdmpqyjxPW3yn/Iy7WD4Lis7sizm5272hhPJaBAGq0FmnFhw0xhdZDuh0u4ESWJKSsq3po5dc29H/Ak28c0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451642; c=relaxed/simple; bh=DpM4HzfqjDOit5k3hAkHFBsY8yXYvaZc7MIRe8CSyMA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=irs7NQ1T+KhoZpDrsxi1UPwdeFpGf2nOY9V7kSqurzzyJiqeS2IeKZomGHhA0hF8Jxx7At9No3YWZFhVMEoz2ffaWBLJ30E5Vjn+j9Qg9wC0qfE/7YadmsTtAITDTa96dFwVglcW1UbK2+kXaEEuSqWnAUkPuAxxnhp3Q4mKekI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=GlXB01ti; arc=none smtp.client-ip=209.85.210.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="GlXB01ti" Received: by mail-pf1-f177.google.com with SMTP id d2e1a72fcca58-6e062fa6e00so327922b3a.1 for ; Thu, 08 Feb 2024 20:07:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451640; x=1708056440; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MsJBrUUPtruHWtQsS6zgKVxqPHg8IzuE7qrpmwhQpng=; b=GlXB01tix9tjTIMq0N2b8N2vKme8MCWePo/Wam6tsJwcDyw9f5srNeZvWWdBmYvp8v 4KUEP4OQS1FV+tKVrD25r1FUcdgh7oCPxogtefyvcLHQc3F++mxc76Vyuoe3NcPFebPM xa6mM4Es4bGgwqggHcDlWrvWt7ibAtXUfCp7HUKfKdZcNRwV+GBrrJMcdhJOMZ4Elw6w Debk5ivea83F8X4orKpW1fo3r4xyPF6N1YLxGc8PEuQXpn8SUBcSMHNtLZ+gQCaxsvK5 
hxWZO0z4aKdhHJmrsarKGRovGf54mI8sWxGkRJ+I6Oq7ZErL2oDww7n7JReNUAW3hK3b CHtw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451640; x=1708056440; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MsJBrUUPtruHWtQsS6zgKVxqPHg8IzuE7qrpmwhQpng=; b=sa6WhMVaGpsYC3jDRkok8p6C4FnToWPbUmZvOT6DhZzJmQOS76ES1JmNiY5AaRRt7z YTopV0D58VHCwDOIO9lC+fhyCyk51x+Z+GbHQW5ZFBcziLEfo3x74qdnv60JyxG7ah8B Af73qZzpuuBeeM0FNzrCRTpeuFqxjvZB9/YfLD0G6PgWYDTde5Ehp1vWfQdzrx1F0uvm +naOWab4BAJ0abCrs8zoc3rC+Sz5i5xlQFOZcwUPQSLIcIqL8ZKmvIaMmb02ZiAVcpjo Dd3K4Bn3eW4BKGvLlpkdkajhlWzs105NpmzPtKAtmiAGeOclGEICcd/KjgufzvPg9b09 mRkw== X-Gm-Message-State: AOJu0Yx34xj0RE+qWf4ibYyKHzUPkqvbOIEfN4rH/9obdKL4gMVMhuso /VyOzneeOZvPQ/AZp/uCDBK0epo0iYPcW8eF5lzfxBCrfvjsZyQBImqbepHi X-Google-Smtp-Source: AGHT+IFFu/goz1FrmbpZ66r+D/WoNAbblyTq352/gs8sP6e/6QY7jw8AvLvp16EdF9oF+hm9bcXL7A== X-Received: by 2002:a17:902:ced1:b0:1d9:a2d3:8127 with SMTP id d17-20020a170902ced100b001d9a2d38127mr470046plg.52.1707451640097; Thu, 08 Feb 2024 20:07:20 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCWP7HPoDU+RIV0hbhhVZtaTKZ52ozQSAI2pmi/5/+M4R+kofDRBWlviHowAmZf4KAFWB00xVtEVtRwexebBtYOr5AunnGUHJHavtyq1fO76hzOsbjlXb6ahRaPTVniCLqPQVTDLcmGtA+rqFV9qPMYDXBfd1cJ61lPC4OPXYwCm56ZKxVkulgzpnfoB2DhozBGuIbcQ/lmUZzREPwYFIWjmTN+9/y2JuUZspdbHfxqeXh60l+d35X6YqgaPJBYzRA+h7ETU6QLl1X17oZ7YJWJ35m4Mn4X9nbYlY0JqtdjgCe50tzxXbXK82HqDvWSALZFykpj6UqRgzuzDJqpHvP6D7mmZ/eNM3hPd+94XK6JgFgmhKcw1oA== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id i19-20020a170902eb5300b001d8f3f91a23sm535557pli.258.2024.02.08.20.07.18 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:19 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 16/20] bpf: Add helper macro bpf_arena_cast() Date: Thu, 8 Feb 2024 20:06:04 -0800 Message-Id: <20240209040608.98927-17-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Introduce helper macro bpf_arena_cast() that emits: rX = rX instruction with off = BPF_ARENA_CAST_KERN or off = BPF_ARENA_CAST_USER and encodes address_space into imm32. It's useful with older LLVM that doesn't emit this insn automatically. 
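[Editorial note] A hedged usage sketch from inside a bpf program: with an older LLVM the __arena address-space attribute is a no-op, so arena pointers have to be cast by hand before dereferencing. The arena map, NUMA_NO_NODE, the BPF_ARENA_CAST_* constants and the bpf_arena_alloc_pages() kfunc are all assumed from elsewhere in this series.

	void __arena *page;

	page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
	if (!page)
		return 1;
	/* emits "rX = rX" with off = BPF_ARENA_CAST_KERN and imm32 = address space 1,
	 * so the JIT rebases the value into the kernel's view of the arena
	 */
	bpf_arena_cast(page, BPF_ARENA_CAST_KERN, 1);
	*(int __arena *)page = 42;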
Signed-off-by: Alexei Starovoitov Acked-by: Kumar Kartikeya Dwivedi --- .../testing/selftests/bpf/bpf_experimental.h | 41 +++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 0d749006d107..e73b7d48439f 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -331,6 +331,47 @@ l_true: \ asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var)) #endif +/* emit instruction: rX=rX .off = mode .imm32 = address_space */ +#ifndef bpf_arena_cast +#define bpf_arena_cast(var, mode, addr_space) \ + ({ \ + typeof(var) __var = var; \ + asm volatile(".byte 0xBF; \ + .ifc %[reg], r0; \ + .byte 0x00; \ + .endif; \ + .ifc %[reg], r1; \ + .byte 0x11; \ + .endif; \ + .ifc %[reg], r2; \ + .byte 0x22; \ + .endif; \ + .ifc %[reg], r3; \ + .byte 0x33; \ + .endif; \ + .ifc %[reg], r4; \ + .byte 0x44; \ + .endif; \ + .ifc %[reg], r5; \ + .byte 0x55; \ + .endif; \ + .ifc %[reg], r6; \ + .byte 0x66; \ + .endif; \ + .ifc %[reg], r7; \ + .byte 0x77; \ + .endif; \ + .ifc %[reg], r8; \ + .byte 0x88; \ + .endif; \ + .ifc %[reg], r9; \ + .byte 0x99; \ + .endif; \ + .short %[off]; .long %[as]" \ + :: [reg]"r"(__var), [off]"i"(mode), [as]"i"(addr_space)); __var; \ + }) +#endif + /* Description * Assert that a conditional expression is true. * Returns From patchwork Fri Feb 9 04:06:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550837 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pf1-f182.google.com (mail-pf1-f182.google.com [209.85.210.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22E835667 for ; Fri, 9 Feb 2024 04:07:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451647; cv=none; b=VPyaHoxqfu03R3VawlFRi8mTI6p6pgPd5It5+qbZiF5iXI92VOt8wUKZurFJTeS1SCoHlEzOCDa5SZfdVrapohGoNjNu5RruXGjy7tkYDbMuexps2fGcpMxyoRCNI7GPZkhetMc9SUt4bA9KbFSd0w/CxM3TMMIAy9CqJaHvn/A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451647; c=relaxed/simple; bh=YONaFYwtRZfCUHSPP/xLCZA0fjMcy/91+vEi9K1d3BI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hvcWZ2fYgu1bqvmsd8CIFHMLDunhLh1rZzzClzwl4OI/0ESRNqPD1eT0NzCwrdPBpknQ6PjNmOZTeu4TAvhyR4iGtdpCus/cn6J/zY1uz/lodV30UqhVcBjdS54VcAzpvUrkAtJpgoh1o1rxVintyrDJOBPOd3252A/oW92cMK4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=AVOG1qK+; arc=none smtp.client-ip=209.85.210.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="AVOG1qK+" Received: by mail-pf1-f182.google.com with SMTP id d2e1a72fcca58-6e04fd5e05aso440878b3a.0 for ; Thu, 08 Feb 2024 20:07:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451645; x=1708056445; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cygxMwXpUKU81RFmqoqePBosJeNZ00y1PK/p5gN4GLo=; b=AVOG1qK+yXhZIH57PzV+AQ3ucpbxtsqbikR211PxUN/ipyyli9UFyF9M9hn4SfR1Qx AJqHpmnNlXrWMmdijUCluGUX+NHHYTnAk3oLSjO4FgcKvkVZZKCod6D15nCM278fQOnk mBLahtxFV/eY1S+j5hDLv7Er7G8UTizCRFkB/J20rz9KeGNbuayitlhhGUoIrA0NLfpk owTQu/frMwdiGpfZnEQWP3zyCqp03QD3bkBEsLS8MVdFxhPt4Lc/RGCmeEHp0DA68hXx mkFUiIEOVIi9qtFYnQg4RPSKMQ/zZgh5Q0ZbaxsTMmjnpw9Wv5UxVxZiDpfe+P5XSONw wCJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707451645; x=1708056445; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cygxMwXpUKU81RFmqoqePBosJeNZ00y1PK/p5gN4GLo=; b=P/sUuGk7Qg06yylDuUf4H9pGnPOMHAfJEMAtQvbHjFa9Ly3EeM7EOS9ipRtBgTpvpp pnIV0KB/OiuuZ8JXX3VoRgDE48uAc2uHHYR2uPVfh6j15QpPXIhEa9PvWaONJ0M6lKSF AddtbZbp0Bvuaq4vnajs2YajsB+FvDDV8fcmj/LsYMJG51Q0/L8KJTh1WApiP6XyT+n7 vKQ7RWa2Uu84lH/lhYOV3FlPOiVxnbHA+73kj5kTUpnVxKcgWslMe8nVjZTLXAsSUcNU dYGiEFPSUQJCBbhn9Dqgg5sj2AASdjcBiEfdvqWMlATgTPKRFK3eDRQRId2fXNJUIHnM eOhg== X-Gm-Message-State: AOJu0Yw+zs4aL/QFu9nstTBd+1doevwPE4bl/QY0FzVczxs/FJ9gV0Gm FsDXZ+pGflu4EdOQakRMF4idtmIXxgyiPah88lgTqV9XuTyaqW12VTOcnTch X-Google-Smtp-Source: AGHT+IHaHqKC2gjT6eMMFa5TCSDFBnv1yNEkB/syWL6KEWt6hbk/NU6aBcqi1y0pyAXerAeLHX9DIA== X-Received: by 2002:a62:f901:0:b0:6da:bcea:4cd4 with SMTP id o1-20020a62f901000000b006dabcea4cd4mr650234pfh.16.1707451645004; Thu, 08 Feb 2024 20:07:25 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCXKa7AoYoSTpHf4QY1MuRhoeHy/otD5YbJxf7F1nq3Kf80HMJe7T0JPldiANHIWK0RPvaLH7ALpvryppvlFYP3mvzADeCO+6W0Df2OEfEyZCy1EthA0awtTR8laxGBdXgAA+oT4niKbJetncz2Fps4xueuExRoXrm/pX3Gcvt7rdxSEJ2rWTFSq+W25Wl3Drj8BYjUFe9Ck6okd8ulBbCOHQJIDRJ8dlA6f3o17pQoFudGpIsq00Zyk6sAn17YSRtSDazpAvv0earkDd4DsTwoFLj0GNPgNctSu+ObZYiRoMJJK+zmn/wkbO9+fovsxt3aeg00hIDWktHbfbs2uiBwomv1vCaKMYx0lfUcmtBHD7mvIUD0euQ== Received: from macbook-pro-49.dhcp.thefacebook.com ([2620:10d:c090:400::4:a894]) by smtp.gmail.com with ESMTPSA id v14-20020aa7850e000000b006e0825acbc3sm590230pfn.77.2024.02.08.20.07.23 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Thu, 08 Feb 2024 20:07:24 -0800 (PST) From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 17/20] selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages Date: Thu, 8 Feb 2024 20:06:05 -0800 Message-Id: <20240209040608.98927-18-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov Add unit tests for bpf_arena_alloc/free_pages() functionality and bpf_arena_common.h with a set of common helpers and macros that is used in this test and the following patches. Also modify test_loader that didn't support running bpf_prog_type_syscall programs. 
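[Editorial note] The full test file follows in the diff below; compressed, the shape of such a unit test is roughly the following (the __success/__retval annotations are the usual selftest macros from bpf_misc.h, and the page count is arbitrary):

#include "bpf_arena_common.h"

struct {
	__uint(type, BPF_MAP_TYPE_ARENA);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 2);		/* arena of two pages */
} arena SEC(".maps");

SEC("syscall")
__success __retval(0)
int basic_alloc(void *ctx)
{
	volatile int __arena *page;

	page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0);
	if (!page)
		return 1;
	*page = 1;
	if (*page != 1)
		return 2;
	bpf_arena_free_pages(&arena, (void __arena *)page, 1);
	return 0;
}

char _license[] SEC("license") = "GPL";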
Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/DENYLIST.aarch64 | 1 + tools/testing/selftests/bpf/DENYLIST.s390x | 1 + .../testing/selftests/bpf/bpf_arena_common.h | 70 ++++++++++++++ .../selftests/bpf/prog_tests/verifier.c | 2 + .../selftests/bpf/progs/verifier_arena.c | 91 +++++++++++++++++++ tools/testing/selftests/bpf/test_loader.c | 9 +- 6 files changed, 172 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/bpf_arena_common.h create mode 100644 tools/testing/selftests/bpf/progs/verifier_arena.c diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64 index 5c2cc7e8c5d0..8e70af386e52 100644 --- a/tools/testing/selftests/bpf/DENYLIST.aarch64 +++ b/tools/testing/selftests/bpf/DENYLIST.aarch64 @@ -11,3 +11,4 @@ fill_link_info/kprobe_multi_link_info # bpf_program__attach_kprobe_mu fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_multi_opts unexpected error: -95 fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95 missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95) +verifier_arena # JIT does not support arena diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x index 1a63996c0304..ded440277f6e 100644 --- a/tools/testing/selftests/bpf/DENYLIST.s390x +++ b/tools/testing/selftests/bpf/DENYLIST.s390x @@ -3,3 +3,4 @@ exceptions # JIT does not support calling kfunc bpf_throw (exceptions) get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace) stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?) +verifier_arena # JIT does not support arena diff --git a/tools/testing/selftests/bpf/bpf_arena_common.h b/tools/testing/selftests/bpf/bpf_arena_common.h new file mode 100644 index 000000000000..07849d502f40 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_arena_common.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */ +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#pragma once + +#ifndef WRITE_ONCE +#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val)) +#endif + +#ifndef NUMA_NO_NODE +#define NUMA_NO_NODE (-1) +#endif + +#ifndef arena_container_of +#define arena_container_of(ptr, type, member) \ + ({ \ + void __arena *__mptr = (void __arena *)(ptr); \ + ((type *)(__mptr - offsetof(type, member))); \ + }) +#endif + +#ifdef __BPF__ /* when compiled as bpf program */ + +#ifndef PAGE_SIZE +#define PAGE_SIZE __PAGE_SIZE +/* + * for older kernels try sizeof(struct genradix_node) + * or flexible: + * static inline long __bpf_page_size(void) { + * return bpf_core_enum_value(enum page_size_enum___l, __PAGE_SIZE___l) ?: sizeof(struct genradix_node); + * } + * but generated code is not great. + */ +#endif + +#if defined(__BPF_FEATURE_ARENA_CAST) && !defined(BPF_ARENA_FORCE_ASM) +#define __arena __attribute__((address_space(1))) +#define cast_kern(ptr) /* nop for bpf prog. emitted by LLVM */ +#define cast_user(ptr) /* nop for bpf prog. 
emitted by LLVM */ +#else +#define __arena +#define cast_kern(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_KERN, 1) +#define cast_user(ptr) bpf_arena_cast(ptr, BPF_ARENA_CAST_USER, 1) +#endif + +void __arena* bpf_arena_alloc_pages(void *map, void __arena *addr, __u32 page_cnt, + int node_id, __u64 flags) __ksym __weak; +void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt) __ksym __weak; + +#else /* when compiled as user space code */ + +#define __arena +#define __arg_arena +#define cast_kern(ptr) /* nop for user space */ +#define cast_user(ptr) /* nop for user space */ +__weak char arena[1]; + +#ifndef offsetof +#define offsetof(type, member) ((unsigned long)&((type *)0)->member) +#endif + +static inline void __arena* bpf_arena_alloc_pages(void *map, void *addr, __u32 page_cnt, + int node_id, __u64 flags) +{ + return NULL; +} +static inline void bpf_arena_free_pages(void *map, void __arena *ptr, __u32 page_cnt) +{ +} + +#endif diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 9c6072a19745..985273832f89 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -4,6 +4,7 @@ #include "cap_helpers.h" #include "verifier_and.skel.h" +#include "verifier_arena.skel.h" #include "verifier_array_access.skel.h" #include "verifier_basic_stack.skel.h" #include "verifier_bitfield_write.skel.h" @@ -118,6 +119,7 @@ static void run_tests_aux(const char *skel_name, #define RUN(skel) run_tests_aux(#skel, skel##__elf_bytes, NULL) void test_verifier_and(void) { RUN(verifier_and); } +void test_verifier_arena(void) { RUN(verifier_arena); } void test_verifier_basic_stack(void) { RUN(verifier_basic_stack); } void test_verifier_bitfield_write(void) { RUN(verifier_bitfield_write); } void test_verifier_bounds(void) { RUN(verifier_bounds); } diff --git a/tools/testing/selftests/bpf/progs/verifier_arena.c b/tools/testing/selftests/bpf/progs/verifier_arena.c new file mode 100644 index 000000000000..0e667132ef92 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_arena.c @@ -0,0 +1,91 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ + +#include +#include +#include +#include "bpf_misc.h" +#include "bpf_experimental.h" +#include "bpf_arena_common.h" + +struct { + __uint(type, BPF_MAP_TYPE_ARENA); + __uint(map_flags, BPF_F_MMAPABLE); + __uint(max_entries, 2); /* arena of two pages close to 32-bit boundary*/ + __ulong(map_extra, (1ull << 44) | (~0u - __PAGE_SIZE * 2 + 1)); /* start of mmap() region */ +} arena SEC(".maps"); + +SEC("syscall") +__success __retval(0) +int basic_alloc1(void *ctx) +{ + volatile int __arena *page1, *page2, *no_page, *page3; + + page1 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0); + if (!page1) + return 1; + *page1 = 1; + page2 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0); + if (!page2) + return 2; + *page2 = 2; + no_page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0); + if (no_page) + return 3; + if (*page1 != 1) + return 4; + if (*page2 != 2) + return 5; + bpf_arena_free_pages(&arena, (void __arena *)page2, 1); + if (*page1 != 1) + return 6; + if (*page2 != 0) /* use-after-free should return 0 */ + return 7; + page3 = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0); + if (!page3) + return 8; + *page3 = 3; + if (page2 != page3) + return 9; + if (*page1 != 1) + return 10; + return 0; +} + +SEC("syscall") +__success __retval(0) +int basic_alloc2(void *ctx) +{ + volatile char __arena *page1, *page2, *page3, *page4; + + page1 = bpf_arena_alloc_pages(&arena, NULL, 2, NUMA_NO_NODE, 0); + if (!page1) + return 1; + page2 = page1 + __PAGE_SIZE; + page3 = page1 + __PAGE_SIZE * 2; + page4 = page1 - __PAGE_SIZE; + *page1 = 1; + *page2 = 2; + *page3 = 3; + *page4 = 4; + if (*page1 != 1) + return 1; + if (*page2 != 2) + return 2; + if (*page3 != 0) + return 3; + if (*page4 != 0) + return 4; + bpf_arena_free_pages(&arena, (void __arena *)page1, 2); + if (*page1 != 0) + return 5; + if (*page2 != 0) + return 6; + if (*page3 != 0) + return 7; + if (*page4 != 0) + return 8; + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c index ba57601c2a4d..524c38e9cde4 100644 --- a/tools/testing/selftests/bpf/test_loader.c +++ b/tools/testing/selftests/bpf/test_loader.c @@ -501,7 +501,7 @@ static bool is_unpriv_capable_map(struct bpf_map *map) } } -static int do_prog_test_run(int fd_prog, int *retval) +static int do_prog_test_run(int fd_prog, int *retval, bool empty_opts) { __u8 tmp_out[TEST_DATA_LEN << 2] = {}; __u8 tmp_in[TEST_DATA_LEN] = {}; @@ -514,6 +514,10 @@ static int do_prog_test_run(int fd_prog, int *retval) .repeat = 1, ); + if (empty_opts) { + memset(&topts, 0, sizeof(struct bpf_test_run_opts)); + topts.sz = sizeof(struct bpf_test_run_opts); + } err = bpf_prog_test_run_opts(fd_prog, &topts); saved_errno = errno; @@ -649,7 +653,8 @@ void run_subtest(struct test_loader *tester, } } - do_prog_test_run(bpf_program__fd(tprog), &retval); + do_prog_test_run(bpf_program__fd(tprog), &retval, + bpf_program__type(tprog) == BPF_PROG_TYPE_SYSCALL ? 
true : false); if (retval != subspec->retval && subspec->retval != POINTER_VALUE) { PRINT_FAIL("Unexpected retval: %d != %d\n", retval, subspec->retval); goto tobj_cleanup;

From patchwork Fri Feb 9 04:06:06 2024
From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 18/20] selftests/bpf: Add bpf_arena_list test. Date: Thu, 8 Feb 2024 20:06:06 -0800 Message-Id: <20240209040608.98927-19-alexei.starovoitov@gmail.com> In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov

bpf_arena_alloc.h - implements a page_frag allocator as a bpf program. bpf_arena_list.h - a doubly linked list as a bpf program. Both are compiled as a bpf program and as native C code.
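Since the same headers are meant to compile both ways, user space can walk the arena-resident list directly through the map's mmap()-ed region; cast_kern()/cast_user() become no-ops there and the bpf_iter_num stubs keep list_for_each_entry() usable. A hedged sketch of that user-space side, modeled on the test below (struct elem mirrors the test; sum_elems() is an illustrative name, not part of the patch):

struct elem {
	struct arena_list_node node;
	__u64 value;
};

static __u64 sum_elems(struct arena_list_head *head)
{
	struct elem __arena *n;	/* __arena expands to nothing in user space */
	__u64 sum = 0;

	list_for_each_entry(n, head, node)
		sum += n->value;
	return sum;
}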
Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/DENYLIST.aarch64 | 1 + tools/testing/selftests/bpf/DENYLIST.s390x | 1 + tools/testing/selftests/bpf/bpf_arena_alloc.h | 58 +++++++++++ tools/testing/selftests/bpf/bpf_arena_list.h | 95 +++++++++++++++++++ .../selftests/bpf/prog_tests/arena_list.c | 68 +++++++++++++ .../testing/selftests/bpf/progs/arena_list.c | 76 +++++++++++++++ 6 files changed, 299 insertions(+) create mode 100644 tools/testing/selftests/bpf/bpf_arena_alloc.h create mode 100644 tools/testing/selftests/bpf/bpf_arena_list.h create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_list.c create mode 100644 tools/testing/selftests/bpf/progs/arena_list.c diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64 index 8e70af386e52..83a3d9bee59c 100644 --- a/tools/testing/selftests/bpf/DENYLIST.aarch64 +++ b/tools/testing/selftests/bpf/DENYLIST.aarch64 @@ -12,3 +12,4 @@ fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_mu fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95 missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95) verifier_arena # JIT does not support arena +arena # JIT does not support arena diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x index ded440277f6e..9293b88a327e 100644 --- a/tools/testing/selftests/bpf/DENYLIST.s390x +++ b/tools/testing/selftests/bpf/DENYLIST.s390x @@ -4,3 +4,4 @@ exceptions # JIT does not support calling kfunc bpf_throw (excepti get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace) stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?) verifier_arena # JIT does not support arena +arena # JIT does not support arena diff --git a/tools/testing/selftests/bpf/bpf_arena_alloc.h b/tools/testing/selftests/bpf/bpf_arena_alloc.h new file mode 100644 index 000000000000..0f4cb399b4c7 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_arena_alloc.h @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */ +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ +#pragma once +#include "bpf_arena_common.h" + +#ifndef __round_mask +#define __round_mask(x, y) ((__typeof__(x))((y)-1)) +#endif +#ifndef round_up +#define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1) +#endif + +void __arena *cur_page; +int cur_offset; + +/* Simple page_frag allocator */ +static inline void __arena* bpf_alloc(unsigned int size) +{ + __u64 __arena *obj_cnt; + void __arena *page = cur_page; + int offset; + + size = round_up(size, 8); + if (size >= PAGE_SIZE - 8) + return NULL; + if (!page) { +refill: + page = bpf_arena_alloc_pages(&arena, NULL, 1, NUMA_NO_NODE, 0); + if (!page) + return NULL; + cast_kern(page); + cur_page = page; + cur_offset = PAGE_SIZE - 8; + obj_cnt = page + PAGE_SIZE - 8; + *obj_cnt = 0; + } else { + cast_kern(page); + obj_cnt = page + PAGE_SIZE - 8; + } + + offset = cur_offset - size; + if (offset < 0) + goto refill; + + (*obj_cnt)++; + cur_offset = offset; + return page + offset; +} + +static inline void bpf_free(void __arena *addr) +{ + __u64 __arena *obj_cnt; + + addr = (void __arena *)(((long)addr) & ~(PAGE_SIZE - 1)); + obj_cnt = addr + PAGE_SIZE - 8; + if (--(*obj_cnt) == 0) + bpf_arena_free_pages(&arena, addr, 1); +} diff --git a/tools/testing/selftests/bpf/bpf_arena_list.h b/tools/testing/selftests/bpf/bpf_arena_list.h new file mode 100644 index 000000000000..31fd744dfb72 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_arena_list.h @@ -0,0 +1,95 @@ +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */ +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#pragma once +#include "bpf_arena_common.h" + +struct arena_list_node; + +typedef struct arena_list_node __arena arena_list_node_t; + +struct arena_list_node { + arena_list_node_t *next; + arena_list_node_t * __arena *pprev; +}; + +struct arena_list_head { + struct arena_list_node __arena *first; +}; +typedef struct arena_list_head __arena arena_list_head_t; + +#define list_entry(ptr, type, member) arena_container_of(ptr, type, member) + +#define list_entry_safe(ptr, type, member) \ + ({ typeof(*ptr) * ___ptr = (ptr); \ + ___ptr ? ({ cast_kern(___ptr); list_entry(___ptr, type, member); }) : NULL; \ + }) + +#ifndef __BPF__ +static inline void *bpf_iter_num_new(struct bpf_iter_num *it, int i, int j) { return NULL; } +static inline void bpf_iter_num_destroy(struct bpf_iter_num *it) {} +static inline bool bpf_iter_num_next(struct bpf_iter_num *it) { return true; } +#endif + +/* Safely walk link list of up to 1M elements. Deletion of elements is allowed. 
*/ +#define list_for_each_entry(pos, head, member) \ + for (struct bpf_iter_num ___it __attribute__((aligned(8), \ + cleanup(bpf_iter_num_destroy))), \ + * ___tmp = ( \ + bpf_iter_num_new(&___it, 0, (1000000)), \ + pos = list_entry_safe((head)->first, \ + typeof(*(pos)), member), \ + (void)bpf_iter_num_destroy, (void *)0); \ + bpf_iter_num_next(&___it) && pos && \ + ({ ___tmp = (void *)pos->member.next; 1; }); \ + pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member)) + +static inline void list_add_head(arena_list_node_t *n, arena_list_head_t *h) +{ + arena_list_node_t *first = h->first, * __arena *tmp; + + cast_user(first); + cast_kern(n); + WRITE_ONCE(n->next, first); + cast_kern(first); + if (first) { + tmp = &n->next; + cast_user(tmp); + WRITE_ONCE(first->pprev, tmp); + } + cast_user(n); + WRITE_ONCE(h->first, n); + + tmp = &h->first; + cast_user(tmp); + cast_kern(n); + WRITE_ONCE(n->pprev, tmp); +} + +static inline void __list_del(arena_list_node_t *n) +{ + arena_list_node_t *next = n->next, *tmp; + arena_list_node_t * __arena *pprev = n->pprev; + + cast_user(next); + cast_kern(pprev); + tmp = *pprev; + cast_kern(tmp); + WRITE_ONCE(tmp, next); + if (next) { + cast_user(pprev); + cast_kern(next); + WRITE_ONCE(next->pprev, pprev); + } +} + +#define POISON_POINTER_DELTA 0 + +#define LIST_POISON1 ((void __arena *) 0x100 + POISON_POINTER_DELTA) +#define LIST_POISON2 ((void __arena *) 0x122 + POISON_POINTER_DELTA) + +static inline void list_del(arena_list_node_t *n) +{ + __list_del(n); + n->next = LIST_POISON1; + n->pprev = LIST_POISON2; +} diff --git a/tools/testing/selftests/bpf/prog_tests/arena_list.c b/tools/testing/selftests/bpf/prog_tests/arena_list.c new file mode 100644 index 000000000000..e61886debab1 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/arena_list.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ +#include +#include +#include + +#define PAGE_SIZE 4096 + +#include "bpf_arena_list.h" +#include "arena_list.skel.h" + +struct elem { + struct arena_list_node node; + __u64 value; +}; + +static int list_sum(struct arena_list_head *head) +{ + struct elem __arena *n; + int sum = 0; + + list_for_each_entry(n, head, node) + sum += n->value; + return sum; +} + +static void test_arena_list_add_del(int cnt) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts); + struct arena_list *skel; + int expected_sum = (u64)cnt * (cnt - 1) / 2; + int ret, sum; + + skel = arena_list__open_and_load(); + if (!ASSERT_OK_PTR(skel, "arena_list__open_and_load")) + return; + + skel->bss->cnt = cnt; + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_add), &opts); + ASSERT_OK(ret, "ret_add"); + ASSERT_OK(opts.retval, "retval"); + if (skel->bss->skip) { + printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__); + test__skip(); + goto out; + } + sum = list_sum(skel->bss->list_head); + ASSERT_EQ(sum, expected_sum, "sum of elems"); + ASSERT_EQ(skel->arena->arena_sum, expected_sum, "__arena sum of elems"); + ASSERT_EQ(skel->arena->test_val, cnt + 1, "num of elems"); + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_list_del), &opts); + ASSERT_OK(ret, "ret_del"); + sum = list_sum(skel->bss->list_head); + ASSERT_EQ(sum, 0, "sum of list elems after del"); + ASSERT_EQ(skel->bss->list_sum, expected_sum, "sum of list elems computed by prog"); + ASSERT_EQ(skel->arena->arena_sum, expected_sum, "__arena sum of elems"); +out: + arena_list__destroy(skel); +} + +void test_arena_list(void) +{ + if (test__start_subtest("arena_list_1")) + test_arena_list_add_del(1); + if (test__start_subtest("arena_list_1000")) + test_arena_list_add_del(1000); +} diff --git a/tools/testing/selftests/bpf/progs/arena_list.c b/tools/testing/selftests/bpf/progs/arena_list.c new file mode 100644 index 000000000000..04ebcdd98f10 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/arena_list.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. 
*/ +#include +#include +#include +#include +#include "bpf_experimental.h" + +struct { + __uint(type, BPF_MAP_TYPE_ARENA); + __uint(map_flags, BPF_F_MMAPABLE); + __uint(max_entries, 1000); /* number of pages */ + __ulong(map_extra, 2ull << 44); /* start of mmap() region */ +} arena SEC(".maps"); + +#include "bpf_arena_alloc.h" +#include "bpf_arena_list.h" + +struct elem { + struct arena_list_node node; + __u64 value; +}; + +struct arena_list_head __arena *list_head; +int list_sum; +int cnt; +bool skip = false; + +long __arena arena_sum; +int __arena test_val = 1; +struct arena_list_head __arena global_head; + +SEC("syscall") +int arena_list_add(void *ctx) +{ +#ifdef __BPF_FEATURE_ARENA_CAST + __u64 i; + + list_head = &global_head; + + bpf_for(i, 0, cnt) { + struct elem __arena *n = bpf_alloc(sizeof(*n)); + + test_val++; + n->value = i; + arena_sum += i; + list_add_head(&n->node, list_head); + } +#else + skip = true; +#endif + return 0; +} + +SEC("syscall") +int arena_list_del(void *ctx) +{ +#ifdef __BPF_FEATURE_ARENA_CAST + struct elem __arena *n; + int sum = 0; + + arena_sum = 0; + list_for_each_entry(n, list_head, node) { + sum += n->value; + arena_sum += n->value; + list_del(&n->node); + bpf_free(n); + } + list_sum = sum; +#else + skip = true; +#endif + return 0; +} + +char _license[] SEC("license") = "GPL"; From patchwork Fri Feb 9 04:06:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexei Starovoitov X-Patchwork-Id: 13550839 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mail-pl1-f173.google.com (mail-pl1-f173.google.com [209.85.214.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B150A5663 for ; Fri, 9 Feb 2024 04:07:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451656; cv=none; b=lKkm3Xw2CcCL/R6w9t8RCGsgkNnQYYKIJoFL3O9qJ6WfcsW0Zso77HfFNhmr9n/8LF4nGSI6/BxYgopEtfvxZrMMiCKW/zrSANcWRnK5MBBgR1nwwZj5nl1pmB9jXOOCFYyncvUr3DqsCJQQpbPXUczNHcxN/GEF/A0Wvg0YtcE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707451656; c=relaxed/simple; bh=7F+cBTB+Lz+iwrFci5SzkqudP1jNJrXGzpkIb1Egtt4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=LYh+sTw3ci/N4kKyp3CCOL7SmmP9fJuZTwo/kB0DGpboQUlv937TNmDph2C+V/slwxqek8YazjsXZrU6W8/WBIiW5tSwGsEVqWuoeh5EHBLr3nqxCkXDV5GmiKlbxxAIXUgJ7HulWt3exHbhEw9UbTFM6jSsVZbjnRuTWg495EA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=LfgR3Fkj; arc=none smtp.client-ip=209.85.214.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="LfgR3Fkj" Received: by mail-pl1-f173.google.com with SMTP id d9443c01a7336-1d95d67ff45so3996405ad.2 for ; Thu, 08 Feb 2024 20:07:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707451653; x=1708056453; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 19/20] selftests/bpf: Add bpf_arena_htab test.
Date: Thu, 8 Feb 2024 20:06:07 -0800 Message-Id: <20240209040608.98927-20-alexei.starovoitov@gmail.com> X-Mailer: git-send-email 2.39.3 (Apple Git-145) In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov bpf_arena_htab.h - hash table implemented as bpf program Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/bpf_arena_htab.h | 100 ++++++++++++++++++ .../selftests/bpf/prog_tests/arena_htab.c | 88 +++++++++++++++ .../testing/selftests/bpf/progs/arena_htab.c | 46 ++++++++ .../selftests/bpf/progs/arena_htab_asm.c | 5 + 4 files changed, 239 insertions(+) create mode 100644 tools/testing/selftests/bpf/bpf_arena_htab.h create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_htab.c create mode 100644 tools/testing/selftests/bpf/progs/arena_htab.c create mode 100644 tools/testing/selftests/bpf/progs/arena_htab_asm.c diff --git a/tools/testing/selftests/bpf/bpf_arena_htab.h b/tools/testing/selftests/bpf/bpf_arena_htab.h new file mode 100644 index 000000000000..acc01a876668 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_arena_htab.h @@ -0,0 +1,100 @@ +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */ +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#pragma once +#include +#include "bpf_arena_alloc.h" +#include "bpf_arena_list.h" + +struct htab_bucket { + struct arena_list_head head; +}; +typedef struct htab_bucket __arena htab_bucket_t; + +struct htab { + htab_bucket_t *buckets; + int n_buckets; +}; +typedef struct htab __arena htab_t; + +static inline htab_bucket_t *__select_bucket(htab_t *htab, __u32 hash) +{ + htab_bucket_t *b = htab->buckets; + + cast_kern(b); + return &b[hash & (htab->n_buckets - 1)]; +} + +static inline arena_list_head_t *select_bucket(htab_t *htab, __u32 hash) +{ + return &__select_bucket(htab, hash)->head; +} + +struct hashtab_elem { + int hash; + int key; + int value; + struct arena_list_node hash_node; +}; +typedef struct hashtab_elem __arena hashtab_elem_t; + +static hashtab_elem_t *lookup_elem_raw(arena_list_head_t *head, __u32 hash, int key) +{ + hashtab_elem_t *l; + + list_for_each_entry(l, head, hash_node) + if (l->hash == hash && l->key == key) + return l; + + return NULL; +} + +static int htab_hash(int key) +{ + return key; +} + +__weak int htab_lookup_elem(htab_t *htab __arg_arena, int key) +{ + hashtab_elem_t *l_old; + arena_list_head_t *head; + + cast_kern(htab); + head = select_bucket(htab, key); + l_old = lookup_elem_raw(head, htab_hash(key), key); + if (l_old) + return l_old->value; + return 0; +} + +__weak int htab_update_elem(htab_t *htab __arg_arena, int key, int value) +{ + hashtab_elem_t *l_new = NULL, *l_old; + arena_list_head_t *head; + + cast_kern(htab); + head = select_bucket(htab, key); + l_old = lookup_elem_raw(head, htab_hash(key), key); + + l_new = bpf_alloc(sizeof(*l_new)); + if (!l_new) + return -ENOMEM; + l_new->key = key; + l_new->hash = htab_hash(key); + l_new->value = value; + + list_add_head(&l_new->hash_node, head); + if (l_old) { + list_del(&l_old->hash_node); + bpf_free(l_old); + } + return 0; +} + +void htab_init(htab_t *htab) +{ + void __arena *buckets = bpf_arena_alloc_pages(&arena, NULL, 2, NUMA_NO_NODE, 0); + + cast_user(buckets); + htab->buckets = buckets; + htab->n_buckets = 2 * PAGE_SIZE / sizeof(struct htab_bucket); +} 
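One detail worth calling out in the header above: __select_bucket() indexes with hash & (n_buckets - 1), so the bucket count must be a power of two. htab_init() satisfies that by sizing the table from two whole pages; with 4k pages and 8-byte buckets that works out to 2 * 4096 / 8 = 1024 buckets. A tiny hedged self-check one could drop into a user-space test that includes bpf_arena_htab.h (the assert is illustrative, not in the patch; PAGE_SIZE 4096 is an assumption):

#include <assert.h>

static void check_bucket_count(void)
{
	int n_buckets = 2 * 4096 / sizeof(struct htab_bucket);	/* 1024 with 8-byte buckets */

	/* hash & (n_buckets - 1) only works as a modulo when this holds */
	assert(n_buckets > 0 && (n_buckets & (n_buckets - 1)) == 0);
}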
diff --git a/tools/testing/selftests/bpf/prog_tests/arena_htab.c b/tools/testing/selftests/bpf/prog_tests/arena_htab.c new file mode 100644 index 000000000000..0766702de846 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/arena_htab.c @@ -0,0 +1,88 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#include +#include +#include + +#include "arena_htab_asm.skel.h" +#include "arena_htab.skel.h" + +#define PAGE_SIZE 4096 + +#include "bpf_arena_htab.h" + +static void test_arena_htab_common(struct htab *htab) +{ + int i; + + printf("htab %p buckets %p n_buckets %d\n", htab, htab->buckets, htab->n_buckets); + ASSERT_OK_PTR(htab->buckets, "htab->buckets shouldn't be NULL"); + for (i = 0; htab->buckets && i < 16; i += 4) { + /* + * Walk htab buckets and link lists since all pointers are correct, + * though they were written by bpf program. + */ + int val = htab_lookup_elem(htab, i); + + ASSERT_EQ(i, val, "key == value"); + } +} + +static void test_arena_htab_llvm(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts); + struct arena_htab *skel; + struct htab *htab; + size_t arena_sz; + void *area; + int ret; + + skel = arena_htab__open_and_load(); + if (!ASSERT_OK_PTR(skel, "arena_htab__open_and_load")) + return; + + area = bpf_map__initial_value(skel->maps.arena, &arena_sz); + /* fault-in a page with pgoff == 0 as sanity check */ + *(volatile int *)area = 0x55aa; + + /* bpf prog will allocate more pages */ + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_llvm), &opts); + ASSERT_OK(ret, "ret"); + ASSERT_OK(opts.retval, "retval"); + if (skel->bss->skip) { + printf("%s:SKIP:compiler doesn't support arena_cast\n", __func__); + test__skip(); + goto out; + } + htab = skel->bss->htab_for_user; + test_arena_htab_common(htab); +out: + arena_htab__destroy(skel); +} + +static void test_arena_htab_asm(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts); + struct arena_htab_asm *skel; + struct htab *htab; + int ret; + + skel = arena_htab_asm__open_and_load(); + if (!ASSERT_OK_PTR(skel, "arena_htab_asm__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.arena_htab_asm), &opts); + ASSERT_OK(ret, "ret"); + ASSERT_OK(opts.retval, "retval"); + htab = skel->bss->htab_for_user; + test_arena_htab_common(htab); + arena_htab_asm__destroy(skel); +} + +void test_arena_htab(void) +{ + if (test__start_subtest("arena_htab_llvm")) + test_arena_htab_llvm(); + if (test__start_subtest("arena_htab_asm")) + test_arena_htab_asm(); +} diff --git a/tools/testing/selftests/bpf/progs/arena_htab.c b/tools/testing/selftests/bpf/progs/arena_htab.c new file mode 100644 index 000000000000..441fc502312f --- /dev/null +++ b/tools/testing/selftests/bpf/progs/arena_htab.c @@ -0,0 +1,46 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#include +#include +#include +#include +#include "bpf_experimental.h" + +struct { + __uint(type, BPF_MAP_TYPE_ARENA); + __uint(map_flags, BPF_F_MMAPABLE); + __uint(max_entries, 100); /* number of pages */ +} arena SEC(".maps"); + +#include "bpf_arena_htab.h" + +void __arena *htab_for_user; +bool skip = false; + +SEC("syscall") +int arena_htab_llvm(void *ctx) +{ +#if defined(__BPF_FEATURE_ARENA_CAST) || defined(BPF_ARENA_FORCE_ASM) + struct htab __arena *htab; + __u64 i; + + htab = bpf_alloc(sizeof(*htab)); + cast_kern(htab); + htab_init(htab); + + /* first run. 
No old elems in the table */ + bpf_for(i, 0, 1000) + htab_update_elem(htab, i, i); + + /* should replace all elems with new ones */ + bpf_for(i, 0, 1000) + htab_update_elem(htab, i, i); + cast_user(htab); + htab_for_user = htab; +#else + skip = true; +#endif + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/arena_htab_asm.c b/tools/testing/selftests/bpf/progs/arena_htab_asm.c new file mode 100644 index 000000000000..6cd70ea12f0d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/arena_htab_asm.c @@ -0,0 +1,5 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ +#define BPF_ARENA_FORCE_ASM +#define arena_htab_llvm arena_htab_asm +#include "arena_htab.c"

From patchwork Fri Feb 9 04:06:08 2024
From: Alexei Starovoitov To: bpf@vger.kernel.org Cc: daniel@iogearbox.net, andrii@kernel.org, memxor@gmail.com, eddyz87@gmail.com, tj@kernel.org, brho@google.com, hannes@cmpxchg.org, lstoakes@gmail.com, akpm@linux-foundation.org, urezki@gmail.com, hch@infradead.org, linux-mm@kvack.org, kernel-team@fb.com Subject: [PATCH v2 bpf-next 20/20] selftests/bpf: Convert simple page_frag allocator to per-cpu. Date: Thu, 8 Feb 2024 20:06:08 -0800 Message-Id: <20240209040608.98927-21-alexei.starovoitov@gmail.com> In-Reply-To: <20240209040608.98927-1-alexei.starovoitov@gmail.com> References: <20240209040608.98927-1-alexei.starovoitov@gmail.com> X-Patchwork-Delegate: bpf@iogearbox.net From: Alexei Starovoitov

Convert simple page_frag allocator to per-cpu page_frag to further stress test a combination of __arena global and static variables and alloc/free from arena.
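The conversion follows the usual per-cpu pattern: one page_frag cursor per possible CPU, indexed by bpf_get_smp_processor_id(), so programs running on different CPUs do not contend on a single cursor. A stripped-down user-space analogue of that layout, just to illustrate the idea (MAX_CPUS and my_cpu() are assumptions for the sketch; the bpf side below bounds the arrays with sizeof(struct cpumask) * 8 and uses the bpf helper instead):

#define _GNU_SOURCE
#include <sched.h>

#define MAX_CPUS 4096			/* assumed upper bound on possible CPUs */

static void *frag_page[MAX_CPUS];	/* current page, one slot per CPU */
static int frag_offset[MAX_CPUS];	/* bump-down offset within that page */

static int my_cpu(void)
{
	int cpu = sched_getcpu();	/* the bpf program uses bpf_get_smp_processor_id() */

	return cpu < 0 ? 0 : cpu;
}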
Signed-off-by: Alexei Starovoitov --- tools/testing/selftests/bpf/bpf_arena_alloc.h | 23 +++++++++++++------ 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/bpf/bpf_arena_alloc.h b/tools/testing/selftests/bpf/bpf_arena_alloc.h index 0f4cb399b4c7..c27678299e0c 100644 --- a/tools/testing/selftests/bpf/bpf_arena_alloc.h +++ b/tools/testing/selftests/bpf/bpf_arena_alloc.h @@ -10,14 +10,19 @@ #define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1) #endif -void __arena *cur_page; -int cur_offset; +#ifdef __BPF__ +#define NR_CPUS (sizeof(struct cpumask) * 8) + +static void __arena * __arena page_frag_cur_page[NR_CPUS]; +static int __arena page_frag_cur_offset[NR_CPUS]; /* Simple page_frag allocator */ static inline void __arena* bpf_alloc(unsigned int size) { __u64 __arena *obj_cnt; - void __arena *page = cur_page; + __u32 cpu = bpf_get_smp_processor_id(); + void __arena *page = page_frag_cur_page[cpu]; + int __arena *cur_offset = &page_frag_cur_offset[cpu]; int offset; size = round_up(size, 8); @@ -29,8 +34,8 @@ static inline void __arena* bpf_alloc(unsigned int size) if (!page) return NULL; cast_kern(page); - cur_page = page; - cur_offset = PAGE_SIZE - 8; + page_frag_cur_page[cpu] = page; + *cur_offset = PAGE_SIZE - 8; obj_cnt = page + PAGE_SIZE - 8; *obj_cnt = 0; } else { @@ -38,12 +43,12 @@ static inline void __arena* bpf_alloc(unsigned int size) obj_cnt = page + PAGE_SIZE - 8; } - offset = cur_offset - size; + offset = *cur_offset - size; if (offset < 0) goto refill; (*obj_cnt)++; - cur_offset = offset; + *cur_offset = offset; return page + offset; } @@ -56,3 +61,7 @@ static inline void bpf_free(void __arena *addr) if (--(*obj_cnt) == 0) bpf_arena_free_pages(&arena, addr, 1); } +#else +static inline void __arena* bpf_alloc(unsigned int size) { return NULL; } +static inline void bpf_free(void __arena *addr) {} +#endif
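For completeness, a hedged sketch of how the per-cpu allocator ends up being exercised from a bpf program, in the spirit of the arena tests above (the program name is made up; it assumes an arena map named 'arena' plus bpf_arena_alloc.h and bpf_experimental.h are included, as in the selftests):

SEC("syscall")
int arena_alloc_smoke(void *ctx)
{
#ifdef __BPF_FEATURE_ARENA_CAST
	int __arena *a, *b;

	a = bpf_alloc(sizeof(*a));	/* both objects land in this CPU's current page */
	b = bpf_alloc(sizeof(*b));
	if (!a || !b)
		return 1;
	*a = 1;
	*b = 2;
	bpf_free(a);			/* page obj_cnt drops 2 -> 1, page stays */
	bpf_free(b);			/* obj_cnt hits 0, the page is returned to the arena */
	if (*a != 0 || *b != 0)		/* freed arena pages read back as 0, as in verifier_arena.c */
		return 2;
#endif
	return 0;
}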