From patchwork Thu Aug 18 22:42:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12948145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3910C00140 for ; Thu, 18 Aug 2022 23:18:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1F64F8D0003; Thu, 18 Aug 2022 19:18:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1A67A8D0002; Thu, 18 Aug 2022 19:18:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 06E4D8D0003; Thu, 18 Aug 2022 19:18:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id EAB718D0002 for ; Thu, 18 Aug 2022 19:18:18 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id BD60CA47B8 for ; Thu, 18 Aug 2022 23:18:18 +0000 (UTC) X-FDA: 79814279076.30.A398515 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by imf09.hostedemail.com (Postfix) with ESMTP id EEB25141244 for ; Thu, 18 Aug 2022 23:03:19 +0000 (UTC) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 27IKhaBH006309 for ; Thu, 18 Aug 2022 15:54:19 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0001303.ppops.net (PPS) with ESMTPS id 3j1p0rv3sy-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 Aug 2022 15:54:19 -0700 Received: from twshared5413.23.frc3.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Thu, 18 Aug 2022 15:54:14 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 2C564BBF66C3; Thu, 18 Aug 2022 15:42:39 -0700 (PDT) From: Song Liu To: , CC: , , , , , , , , Song Liu Subject: [RFC 2/5] bpf: use vmalloc_exec Date: Thu, 18 Aug 2022 15:42:15 -0700 Message-ID: <20220818224218.2399791-3-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220818224218.2399791-1-song@kernel.org> References: <20220818224218.2399791-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: pAHNoqkl8i4JOyIYPjmoBDKfqPth5cai X-Proofpoint-GUID: pAHNoqkl8i4JOyIYPjmoBDKfqPth5cai X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-08-18_16,2022-08-18_01,2022-06-22_01 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1660863800; a=rsa-sha256; cv=none; b=SwploYS5AtIt9EV77RCKk30Y67c/I+8A7aRlgFvxCd2oQzhb8aeGZXXdNvY6aUKOx5TQhh C+fVgn9RZKYM8KNSFu6w74A9O91zGa0BaLLzGu8wpe8G1SXSspJ9VpYJohloeediPsdl0+ fvknPpO8qzFH6UwDU3cUAVxG2HNa1qk= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=none); spf=pass (imf09.hostedemail.com: domain of "prvs=9229e92ca3=songliubraving@fb.com" designates 67.231.153.30 as permitted sender) smtp.mailfrom="prvs=9229e92ca3=songliubraving@fb.com" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1660863800; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uwWHJ4ZcDX4SP5HmDGOK5+ApqhqPFsZopKdG5F16ah0=; b=HPHoxoVG1KjLcsv54Thm0XYp9CjkPE04+e2f6UuHLeEqfWeVV53jr4bswvrMZuajgFtsPI HUMiHg+u2mPSiF/Qh7vFdQcO1e3tJUW7fqPVUa9z2HUdLIcx+5aItLygExIdL+bR18YfMP YOgQjB4neBT8QrrCBT1+DeDIM53Mrf4= Authentication-Results: imf09.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=none); spf=pass (imf09.hostedemail.com: domain of "prvs=9229e92ca3=songliubraving@fb.com" designates 67.231.153.30 as permitted sender) smtp.mailfrom="prvs=9229e92ca3=songliubraving@fb.com" X-Rspam-User: X-Stat-Signature: 4ofwhhwxkiuqrju3hw6qc7s4kk19m3ub X-Rspamd-Queue-Id: EEB25141244 X-Rspamd-Server: rspam03 X-HE-Tag: 1660863799-697768 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Use vmalloc_exec and vfree_exec instead of bpf_prog_pack_[alloc|free]. --- kernel/bpf/core.c | 155 +++------------------------------------------- 1 file changed, 10 insertions(+), 145 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index c1e10d088dbb..834cce7e1ef2 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -806,144 +806,6 @@ int bpf_jit_add_poke_descriptor(struct bpf_prog *prog, return slot; } -/* - * BPF program pack allocator. - * - * Most BPF programs are pretty small. Allocating a hole page for each - * program is sometime a waste. Many small bpf program also adds pressure - * to instruction TLB. To solve this issue, we introduce a BPF program pack - * allocator. The prog_pack allocator uses HPAGE_PMD_SIZE page (2MB on x86) - * to host BPF programs. - */ -#define BPF_PROG_CHUNK_SHIFT 6 -#define BPF_PROG_CHUNK_SIZE (1 << BPF_PROG_CHUNK_SHIFT) -#define BPF_PROG_CHUNK_MASK (~(BPF_PROG_CHUNK_SIZE - 1)) - -struct bpf_prog_pack { - struct list_head list; - void *ptr; - unsigned long bitmap[]; -}; - -#define BPF_PROG_SIZE_TO_NBITS(size) (round_up(size, BPF_PROG_CHUNK_SIZE) / BPF_PROG_CHUNK_SIZE) - -static DEFINE_MUTEX(pack_mutex); -static LIST_HEAD(pack_list); - -/* PMD_SIZE is not available in some special config, e.g. ARCH=arm with - * CONFIG_MMU=n. Use PAGE_SIZE in these cases. - */ -#ifdef PMD_SIZE -#define BPF_PROG_PACK_SIZE (PMD_SIZE * num_possible_nodes()) -#else -#define BPF_PROG_PACK_SIZE PAGE_SIZE -#endif - -#define BPF_PROG_CHUNK_COUNT (BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE) - -static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_insns) -{ - struct bpf_prog_pack *pack; - - pack = kzalloc(struct_size(pack, bitmap, BITS_TO_LONGS(BPF_PROG_CHUNK_COUNT)), - GFP_KERNEL); - if (!pack) - return NULL; - pack->ptr = module_alloc(BPF_PROG_PACK_SIZE); - if (!pack->ptr) { - kfree(pack); - return NULL; - } - bpf_fill_ill_insns(pack->ptr, BPF_PROG_PACK_SIZE); - bitmap_zero(pack->bitmap, BPF_PROG_PACK_SIZE / BPF_PROG_CHUNK_SIZE); - list_add_tail(&pack->list, &pack_list); - - set_vm_flush_reset_perms(pack->ptr); - set_memory_ro((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE); - set_memory_x((unsigned long)pack->ptr, BPF_PROG_PACK_SIZE / PAGE_SIZE); - return pack; -} - -static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns) -{ - unsigned int nbits = BPF_PROG_SIZE_TO_NBITS(size); - struct bpf_prog_pack *pack; - unsigned long pos; - void *ptr = NULL; - - mutex_lock(&pack_mutex); - if (size > BPF_PROG_PACK_SIZE) { - size = round_up(size, PAGE_SIZE); - ptr = module_alloc(size); - if (ptr) { - bpf_fill_ill_insns(ptr, size); - set_vm_flush_reset_perms(ptr); - set_memory_ro((unsigned long)ptr, size / PAGE_SIZE); - set_memory_x((unsigned long)ptr, size / PAGE_SIZE); - } - goto out; - } - list_for_each_entry(pack, &pack_list, list) { - pos = bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0, - nbits, 0); - if (pos < BPF_PROG_CHUNK_COUNT) - goto found_free_area; - } - - pack = alloc_new_pack(bpf_fill_ill_insns); - if (!pack) - goto out; - - pos = 0; - -found_free_area: - bitmap_set(pack->bitmap, pos, nbits); - ptr = (void *)(pack->ptr) + (pos << BPF_PROG_CHUNK_SHIFT); - -out: - mutex_unlock(&pack_mutex); - return ptr; -} - -static void bpf_prog_pack_free(struct bpf_binary_header *hdr) -{ - struct bpf_prog_pack *pack = NULL, *tmp; - unsigned int nbits; - unsigned long pos; - - mutex_lock(&pack_mutex); - if (hdr->size > BPF_PROG_PACK_SIZE) { - module_memfree(hdr); - goto out; - } - - list_for_each_entry(tmp, &pack_list, list) { - if ((void *)hdr >= tmp->ptr && (tmp->ptr + BPF_PROG_PACK_SIZE) > (void *)hdr) { - pack = tmp; - break; - } - } - - if (WARN_ONCE(!pack, "bpf_prog_pack bug\n")) - goto out; - - nbits = BPF_PROG_SIZE_TO_NBITS(hdr->size); - pos = ((unsigned long)hdr - (unsigned long)pack->ptr) >> BPF_PROG_CHUNK_SHIFT; - - WARN_ONCE(bpf_arch_text_invalidate(hdr, hdr->size), - "bpf_prog_pack bug: missing bpf_arch_text_invalidate?\n"); - - bitmap_clear(pack->bitmap, pos, nbits); - if (bitmap_find_next_zero_area(pack->bitmap, BPF_PROG_CHUNK_COUNT, 0, - BPF_PROG_CHUNK_COUNT, 0) == 0) { - list_del(&pack->list); - module_memfree(pack->ptr); - kfree(pack); - } -out: - mutex_unlock(&pack_mutex); -} - static atomic_long_t bpf_jit_current; /* Can be overridden by an arch's JIT compiler if it has a custom, @@ -1043,6 +905,9 @@ void bpf_jit_binary_free(struct bpf_binary_header *hdr) bpf_jit_uncharge_modmem(size); } +#define BPF_PROG_EXEC_ALIGN 64 +#define BPF_PROG_EXEC_MASK (~(BPF_PROG_EXEC_ALIGN - 1)) + /* Allocate jit binary from bpf_prog_pack allocator. * Since the allocated memory is RO+X, the JIT engine cannot write directly * to the memory. To solve this problem, a RW buffer is also allocated at @@ -1065,11 +930,11 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr, alignment > BPF_IMAGE_ALIGNMENT); /* add 16 bytes for a random section of illegal instructions */ - size = round_up(proglen + sizeof(*ro_header) + 16, BPF_PROG_CHUNK_SIZE); + size = round_up(proglen + sizeof(*ro_header) + 16, BPF_PROG_EXEC_ALIGN); if (bpf_jit_charge_modmem(size)) return NULL; - ro_header = bpf_prog_pack_alloc(size, bpf_fill_ill_insns); + ro_header = vmalloc_exec(size, BPF_PROG_EXEC_ALIGN); if (!ro_header) { bpf_jit_uncharge_modmem(size); return NULL; @@ -1078,7 +943,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr, *rw_header = kvmalloc(size, GFP_KERNEL); if (!*rw_header) { bpf_arch_text_copy(&ro_header->size, &size, sizeof(size)); - bpf_prog_pack_free(ro_header); + vfree_exec(ro_header); bpf_jit_uncharge_modmem(size); return NULL; } @@ -1088,7 +953,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr, (*rw_header)->size = size; hole = min_t(unsigned int, size - (proglen + sizeof(*ro_header)), - BPF_PROG_CHUNK_SIZE - sizeof(*ro_header)); + BPF_PROG_EXEC_ALIGN - sizeof(*ro_header)); start = (get_random_int() % hole) & ~(alignment - 1); *image_ptr = &ro_header->image[start]; @@ -1109,7 +974,7 @@ int bpf_jit_binary_pack_finalize(struct bpf_prog *prog, kvfree(rw_header); if (IS_ERR(ptr)) { - bpf_prog_pack_free(ro_header); + vfree_exec(ro_header); return PTR_ERR(ptr); } return 0; @@ -1130,7 +995,7 @@ void bpf_jit_binary_pack_free(struct bpf_binary_header *ro_header, { u32 size = ro_header->size; - bpf_prog_pack_free(ro_header); + vfree_exec(ro_header); kvfree(rw_header); bpf_jit_uncharge_modmem(size); } @@ -1141,7 +1006,7 @@ bpf_jit_binary_pack_hdr(const struct bpf_prog *fp) unsigned long real_start = (unsigned long)fp->bpf_func; unsigned long addr; - addr = real_start & BPF_PROG_CHUNK_MASK; + addr = real_start & BPF_PROG_EXEC_MASK; return (void *)addr; }