From patchwork Fri May 20 03:15:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856258 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B78A0C433FE for ; Fri, 20 May 2022 03:18:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345146AbiETDSx convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:18:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345149AbiETDSv (ORCPT ); Thu, 19 May 2022 23:18:51 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D7A914A25B for ; Thu, 19 May 2022 20:18:50 -0700 (PDT) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 24K0467o029826 for ; Thu, 19 May 2022 20:18:50 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3g604bgqw6-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:49 -0700 Received: from twshared14818.18.frc3.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:45 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 468207D8C2E6; Thu, 19 May 2022 20:16:00 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 1/8] bpf: fill new bpf_prog_pack with illegal instructions Date: Thu, 19 May 2022 20:15:41 -0700 Message-ID: <20220520031548.338934-2-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: psI6CLzSrsEjqOwHeizjsG3iP_CqRiWs X-Proofpoint-GUID: psI6CLzSrsEjqOwHeizjsG3iP_CqRiWs X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net bpf_prog_pack enables sharing huge pages among multiple BPF programs. These pages are marked as executable before the JIT engine fill it with BPF programs. To make these pages safe, fill the hole bpf_prog_pack with illegal instructions before making it executable. Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator") Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]") Signed-off-by: Song Liu --- kernel/bpf/core.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 9cc91f0f3115..2d0c9d4696ad 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -873,7 +873,7 @@ static size_t select_bpf_prog_pack_size(void) return size; } -static struct bpf_prog_pack *alloc_new_pack(void) +static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_insns) { struct bpf_prog_pack *pack; @@ -886,6 +886,7 @@ static struct bpf_prog_pack *alloc_new_pack(void) kfree(pack); return NULL; } + bpf_fill_ill_insns(pack->ptr, bpf_prog_pack_size); bitmap_zero(pack->bitmap, bpf_prog_pack_size / BPF_PROG_CHUNK_SIZE); list_add_tail(&pack->list, &pack_list); @@ -895,7 +896,7 @@ static struct bpf_prog_pack *alloc_new_pack(void) return pack; } -static void *bpf_prog_pack_alloc(u32 size) +static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns) { unsigned int nbits = BPF_PROG_SIZE_TO_NBITS(size); struct bpf_prog_pack *pack; @@ -910,6 +911,7 @@ static void *bpf_prog_pack_alloc(u32 size) size = round_up(size, PAGE_SIZE); ptr = module_alloc(size); if (ptr) { + bpf_fill_ill_insns(ptr, size); set_vm_flush_reset_perms(ptr); set_memory_ro((unsigned long)ptr, size / PAGE_SIZE); set_memory_x((unsigned long)ptr, size / PAGE_SIZE); @@ -923,7 +925,7 @@ static void *bpf_prog_pack_alloc(u32 size) goto found_free_area; } - pack = alloc_new_pack(); + pack = alloc_new_pack(bpf_fill_ill_insns); if (!pack) goto out; @@ -1102,7 +1104,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr, if (bpf_jit_charge_modmem(size)) return NULL; - ro_header = bpf_prog_pack_alloc(size); + ro_header = bpf_prog_pack_alloc(size, bpf_fill_ill_insns); if (!ro_header) { bpf_jit_uncharge_modmem(size); return NULL; From patchwork Fri May 20 03:15:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856260 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48FD7C433FE for ; Fri, 20 May 2022 03:18:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345155AbiETDS4 convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:18:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53718 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345148AbiETDSy (ORCPT ); Thu, 19 May 2022 23:18:54 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8689F5D5EA for ; Thu, 19 May 2022 20:18:53 -0700 (PDT) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 24K0467u029826 for ; Thu, 19 May 2022 20:18:53 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3g604bgqw6-12 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:53 -0700 Received: from twshared14818.18.frc3.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:46 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 6709E7D8C2E9; Thu, 19 May 2022 20:16:01 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 2/8] x86/alternative: introduce text_poke_set Date: Thu, 19 May 2022 20:15:42 -0700 Message-ID: <20220520031548.338934-3-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: cFM28PAH73mfcui5jVeD4qsg2_aH4IaA X-Proofpoint-GUID: cFM28PAH73mfcui5jVeD4qsg2_aH4IaA X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Introduce a memset like API for text_poke. This will be used to fill the unused RX memory with illegal instructions. Suggested-by: Peter Zijlstra (Intel) Acked-by: Peter Zijlstra (Intel) Signed-off-by: Song Liu --- arch/x86/include/asm/text-patching.h | 1 + arch/x86/kernel/alternative.c | 67 +++++++++++++++++++++++----- 2 files changed, 58 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h index d20ab0921480..1cc15528ce29 100644 --- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h @@ -45,6 +45,7 @@ extern void *text_poke(void *addr, const void *opcode, size_t len); extern void text_poke_sync(void); extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len); extern void *text_poke_copy(void *addr, const void *opcode, size_t len); +extern void *text_poke_set(void *addr, int c, size_t len); extern int poke_int3_handler(struct pt_regs *regs); extern void text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate); diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index d374cb3cf024..7563b5bc8328 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -994,7 +994,21 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state) __ro_after_init struct mm_struct *poking_mm; __ro_after_init unsigned long poking_addr; -static void *__text_poke(void *addr, const void *opcode, size_t len) +static void text_poke_memcpy(void *dst, const void *src, size_t len) +{ + memcpy(dst, src, len); +} + +static void text_poke_memset(void *dst, const void *src, size_t len) +{ + int c = *(const int *)src; + + memset(dst, c, len); +} + +typedef void text_poke_f(void *dst, const void *src, size_t len); + +static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t len) { bool cross_page_boundary = offset_in_page(addr) + len > PAGE_SIZE; struct page *pages[2] = {NULL}; @@ -1059,7 +1073,7 @@ static void *__text_poke(void *addr, const void *opcode, size_t len) prev = use_temporary_mm(poking_mm); kasan_disable_current(); - memcpy((u8 *)poking_addr + offset_in_page(addr), opcode, len); + func((u8 *)poking_addr + offset_in_page(addr), src, len); kasan_enable_current(); /* @@ -1087,11 +1101,13 @@ static void *__text_poke(void *addr, const void *opcode, size_t len) (cross_page_boundary ? 2 : 1) * PAGE_SIZE, PAGE_SHIFT, false); - /* - * If the text does not match what we just wrote then something is - * fundamentally screwy; there's nothing we can really do about that. - */ - BUG_ON(memcmp(addr, opcode, len)); + if (func == text_poke_memcpy) { + /* + * If the text does not match what we just wrote then something is + * fundamentally screwy; there's nothing we can really do about that. + */ + BUG_ON(memcmp(addr, src, len)); + } local_irq_restore(flags); pte_unmap_unlock(ptep, ptl); @@ -1118,7 +1134,7 @@ void *text_poke(void *addr, const void *opcode, size_t len) { lockdep_assert_held(&text_mutex); - return __text_poke(addr, opcode, len); + return __text_poke(text_poke_memcpy, addr, opcode, len); } /** @@ -1137,7 +1153,7 @@ void *text_poke(void *addr, const void *opcode, size_t len) */ void *text_poke_kgdb(void *addr, const void *opcode, size_t len) { - return __text_poke(addr, opcode, len); + return __text_poke(text_poke_memcpy, addr, opcode, len); } /** @@ -1167,7 +1183,38 @@ void *text_poke_copy(void *addr, const void *opcode, size_t len) s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched); - __text_poke((void *)ptr, opcode + patched, s); + __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s); + patched += s; + } + mutex_unlock(&text_mutex); + return addr; +} + +/** + * text_poke_set - memset into (an unused part of) RX memory + * @addr: address to modify + * @c: the byte to fill the area with + * @len: length to copy, could be more than 2x PAGE_SIZE + * + * This is useful to overwrite unused regions of RX memory with illegal + * instructions. + */ +void *text_poke_set(void *addr, int c, size_t len) +{ + unsigned long start = (unsigned long)addr; + size_t patched = 0; + + if (WARN_ON_ONCE(core_kernel_text(start))) + return NULL; + + mutex_lock(&text_mutex); + while (patched < len) { + unsigned long ptr = start + patched; + size_t s; + + s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched); + + __text_poke(text_poke_memset, (void *)ptr, (void *)&c, s); patched += s; } mutex_unlock(&text_mutex); From patchwork Fri May 20 03:15:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856257 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 266EDC43217 for ; Fri, 20 May 2022 03:18:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345142AbiETDSx convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:18:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345155AbiETDSv (ORCPT ); Thu, 19 May 2022 23:18:51 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8ED6514A26A for ; Thu, 19 May 2022 20:18:50 -0700 (PDT) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 24JLCXpd020396 for ; Thu, 19 May 2022 20:18:49 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0001303.ppops.net (PPS) with ESMTPS id 3g5wkr9mn5-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:49 -0700 Received: from snc-exhub201.TheFacebook.com (2620:10d:c085:21d::7) by snc-exhub204.TheFacebook.com (2620:10d:c085:21d::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:48 -0700 Received: from twshared8307.18.frc3.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:47 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id B65E77D8C2F0; Thu, 19 May 2022 20:16:02 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 3/8] bpf: introduce bpf_arch_text_invalidate for bpf_prog_pack Date: Thu, 19 May 2022 20:15:43 -0700 Message-ID: <20220520031548.338934-4-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: fBvxR8GaWdouGJ99gFXkN1G4ncZzim7v X-Proofpoint-ORIG-GUID: fBvxR8GaWdouGJ99gFXkN1G4ncZzim7v X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Introduce bpf_arch_text_invalidate and use it to fill unused part of the bpf_prog_pack with illegal instructions when a BPF program is freed. Signed-off-by: Song Liu --- arch/x86/net/bpf_jit_comp.c | 5 +++++ include/linux/bpf.h | 1 + kernel/bpf/core.c | 8 ++++++++ 3 files changed, 14 insertions(+) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index a2b6d197c226..f298b18a9a3d 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -228,6 +228,11 @@ static void jit_fill_hole(void *area, unsigned int size) memset(area, 0xcc, size); } +int bpf_arch_text_invalidate(void *dst, size_t len) +{ + return IS_ERR_OR_NULL(text_poke_set(dst, 0xcc, len)); +} + struct jit_context { int cleanup_addr; /* Epilogue code offset */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c107392b0ba7..f6dfa416f892 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2364,6 +2364,7 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, void *addr1, void *addr2); void *bpf_arch_text_copy(void *dst, void *src, size_t len); +int bpf_arch_text_invalidate(void *dst, size_t len); struct btf_id_set; bool btf_id_set_contains(const struct btf_id_set *set, u32 id); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 2d0c9d4696ad..cacd8684c3c4 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -968,6 +968,9 @@ static void bpf_prog_pack_free(struct bpf_binary_header *hdr) nbits = BPF_PROG_SIZE_TO_NBITS(hdr->size); pos = ((unsigned long)hdr - (unsigned long)pack_ptr) >> BPF_PROG_CHUNK_SHIFT; + WARN_ONCE(bpf_arch_text_invalidate(hdr, hdr->size), + "bpf_prog_pack bug: missing bpf_arch_text_invalidate?\n"); + bitmap_clear(pack->bitmap, pos, nbits); if (bitmap_find_next_zero_area(pack->bitmap, bpf_prog_chunk_count(), 0, bpf_prog_chunk_count(), 0) == 0) { @@ -2740,6 +2743,11 @@ void * __weak bpf_arch_text_copy(void *dst, void *src, size_t len) return ERR_PTR(-ENOTSUPP); } +int __weak bpf_arch_text_invalidate(void *dst, size_t len) +{ + return -ENOTSUPP; +} + DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key); EXPORT_SYMBOL(bpf_stats_enabled_key); From patchwork Fri May 20 03:15:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856259 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5171FC433F5 for ; Fri, 20 May 2022 03:18:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344546AbiETDS4 convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:18:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345145AbiETDSy (ORCPT ); Thu, 19 May 2022 23:18:54 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAC0062BE3 for ; Thu, 19 May 2022 20:18:53 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 24JKmXCl032187 for ; Thu, 19 May 2022 20:18:53 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3g5h5d6vrr-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:52 -0700 Received: from twshared0725.22.frc3.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:50 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 2C0A27D8C2F7; Thu, 19 May 2022 20:16:05 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 5/8] bpf: use module_alloc_huge for bpf_prog_pack Date: Thu, 19 May 2022 20:15:45 -0700 Message-ID: <20220520031548.338934-6-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> X-FB-Internal: Safe X-Proofpoint-GUID: 8yUOQx5CaYt6h08goPeYw1sHxUZOsjOk X-Proofpoint-ORIG-GUID: 8yUOQx5CaYt6h08goPeYw1sHxUZOsjOk X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Use module_alloc_huge for bpf_prog_pack so that BPF programs sit on PMD_SIZE pages. This benefits system performance by reducing iTLB miss rate. Benchmark of a real web service workload shows this change gives another ~0.2% performance boost on top of PAGE_SIZE bpf_prog_pack (which improve system throughput by ~0.5%). Also, remove set_vm_flush_reset_perms() from alloc_new_pack() and use set_memory_[nx|rw] in bpf_prog_pack_free(). This is because VM_FLUSH_RESET_PERMS does not work with huge pages yet. [1] [1] https://lore.kernel.org/bpf/aeeeaf0b7ec63fdba55d4834d2f524d8bf05b71b.camel@intel.com/ Suggested-by: Rick Edgecombe Signed-off-by: Song Liu --- kernel/bpf/core.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index cacd8684c3c4..b64d91fcb0ba 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -857,7 +857,7 @@ static size_t select_bpf_prog_pack_size(void) void *ptr; size = BPF_HPAGE_SIZE * num_online_nodes(); - ptr = module_alloc(size); + ptr = module_alloc_huge(size); /* Test whether we can get huge pages. If not just use PAGE_SIZE * packs. @@ -881,7 +881,7 @@ static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_ins GFP_KERNEL); if (!pack) return NULL; - pack->ptr = module_alloc(bpf_prog_pack_size); + pack->ptr = module_alloc_huge(bpf_prog_pack_size); if (!pack->ptr) { kfree(pack); return NULL; @@ -890,7 +890,6 @@ static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_ins bitmap_zero(pack->bitmap, bpf_prog_pack_size / BPF_PROG_CHUNK_SIZE); list_add_tail(&pack->list, &pack_list); - set_vm_flush_reset_perms(pack->ptr); set_memory_ro((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE); set_memory_x((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE); return pack; @@ -909,10 +908,9 @@ static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insn if (size > bpf_prog_pack_size) { size = round_up(size, PAGE_SIZE); - ptr = module_alloc(size); + ptr = module_alloc_huge(size); if (ptr) { bpf_fill_ill_insns(ptr, size); - set_vm_flush_reset_perms(ptr); set_memory_ro((unsigned long)ptr, size / PAGE_SIZE); set_memory_x((unsigned long)ptr, size / PAGE_SIZE); } @@ -949,6 +947,8 @@ static void bpf_prog_pack_free(struct bpf_binary_header *hdr) mutex_lock(&pack_mutex); if (hdr->size > bpf_prog_pack_size) { + set_memory_nx((unsigned long)hdr, hdr->size / PAGE_SIZE); + set_memory_rw((unsigned long)hdr, hdr->size / PAGE_SIZE); module_memfree(hdr); goto out; } @@ -975,6 +975,8 @@ static void bpf_prog_pack_free(struct bpf_binary_header *hdr) if (bitmap_find_next_zero_area(pack->bitmap, bpf_prog_chunk_count(), 0, bpf_prog_chunk_count(), 0) == 0) { list_del(&pack->list); + set_memory_nx((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE); + set_memory_rw((unsigned long)pack->ptr, bpf_prog_pack_size / PAGE_SIZE); module_memfree(pack->ptr); kfree(pack); } From patchwork Fri May 20 03:15:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856255 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DC9FC433F5 for ; Fri, 20 May 2022 03:18:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345135AbiETDSo convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:18:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53480 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345127AbiETDSn (ORCPT ); Thu, 19 May 2022 23:18:43 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E5735EBF6 for ; Thu, 19 May 2022 20:18:42 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 24JKmYqw030912 for ; Thu, 19 May 2022 20:18:41 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3g59tbru88-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:41 -0700 Received: from twshared14818.18.frc3.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:40 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id A61D77D8C2FA; Thu, 19 May 2022 20:16:06 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 6/8] vmalloc: WARN for set_vm_flush_reset_perms() on huge pages Date: Thu, 19 May 2022 20:15:46 -0700 Message-ID: <20220520031548.338934-7-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: 907j3rj0GTapVPnEH80N8XSJTq-HM3dZ X-Proofpoint-GUID: 907j3rj0GTapVPnEH80N8XSJTq-HM3dZ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net VM_FLUSH_RESET_PERMS is not yet ready for huge pages, add a WARN to catch misuse soon. Suggested-by: Rick Edgecombe Signed-off-by: Song Liu --- include/linux/vmalloc.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index b159c2789961..5e0d0a60d9d5 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -238,6 +238,7 @@ static inline void set_vm_flush_reset_perms(void *addr) { struct vm_struct *vm = find_vm_area(addr); + WARN_ON_ONCE(is_vm_area_hugepages(addr)); if (vm) vm->flags |= VM_FLUSH_RESET_PERMS; } From patchwork Fri May 20 03:15:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856261 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45955C433F5 for ; Fri, 20 May 2022 03:19:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345148AbiETDTA convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:19:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345169AbiETDS6 (ORCPT ); Thu, 19 May 2022 23:18:58 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5CE014A25E for ; Thu, 19 May 2022 20:18:57 -0700 (PDT) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 24JKmWte017172 for ; Thu, 19 May 2022 20:18:57 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3g5pj4w87v-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:57 -0700 Received: from twshared11660.23.frc3.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:56 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 363FC7D8C302; Thu, 19 May 2022 20:16:08 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 7/8] vmalloc: introduce huge_vmalloc_supported Date: Thu, 19 May 2022 20:15:47 -0700 Message-ID: <20220520031548.338934-8-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: IROkRLn5KQCyONhPZ9J8DCUw0Zx-k2bC X-Proofpoint-ORIG-GUID: IROkRLn5KQCyONhPZ9J8DCUw0Zx-k2bC X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net huge_vmalloc_supported() exposes vmap_allow_huge so that users of vmalloc APIs could know whether vmalloc will return huge pages. Suggested-by: Rick Edgecombe Signed-off-by: Song Liu --- include/linux/vmalloc.h | 6 ++++++ mm/vmalloc.c | 5 +++++ 2 files changed, 11 insertions(+) diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 5e0d0a60d9d5..22e81c1813bd 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -242,11 +242,17 @@ static inline void set_vm_flush_reset_perms(void *addr) if (vm) vm->flags |= VM_FLUSH_RESET_PERMS; } +bool huge_vmalloc_supported(void); #else static inline void set_vm_flush_reset_perms(void *addr) { } + +static inline bool huge_vmalloc_supported(void) +{ + return false; +} #endif /* for /proc/kcore */ diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 07da85ae825b..d3b11317b025 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -72,6 +72,11 @@ early_param("nohugevmalloc", set_nohugevmalloc); static const bool vmap_allow_huge = false; #endif /* CONFIG_HAVE_ARCH_HUGE_VMALLOC */ +bool huge_vmalloc_supported(void) +{ + return vmap_allow_huge; +} + bool is_vmalloc_addr(const void *x) { unsigned long addr = (unsigned long)kasan_reset_tag(x); From patchwork Fri May 20 03:15:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12856262 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C13EAC433F5 for ; Fri, 20 May 2022 03:19:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345168AbiETDTB convert rfc822-to-8bit (ORCPT ); Thu, 19 May 2022 23:19:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53876 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345145AbiETDTA (ORCPT ); Thu, 19 May 2022 23:19:00 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8059668F80 for ; Thu, 19 May 2022 20:18:59 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.5/8.17.1.5) with ESMTP id 24JKmcJn032487 for ; Thu, 19 May 2022 20:18:58 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3g5h5d6vry-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 19 May 2022 20:18:58 -0700 Received: from twshared8307.18.frc3.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Thu, 19 May 2022 20:18:56 -0700 Received: by devbig932.frc1.facebook.com (Postfix, from userid 4523) id 19BBB7D8C309; Thu, 19 May 2022 20:16:10 -0700 (PDT) From: Song Liu To: , , CC: , , , , , , , Song Liu Subject: [PATCH v3 bpf-next 8/8] bpf: simplify select_bpf_prog_pack_size Date: Thu, 19 May 2022 20:15:48 -0700 Message-ID: <20220520031548.338934-9-song@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220520031548.338934-1-song@kernel.org> References: <20220520031548.338934-1-song@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: DStbXLmgz3PdCLCozD3JEjNuffvXUJ3L X-Proofpoint-ORIG-GUID: DStbXLmgz3PdCLCozD3JEjNuffvXUJ3L X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-20_01,2022-05-19_03,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Use huge_vmalloc_supported to simplify select_bpf_prog_pack_size, so that we don't allocate some huge pages and free them immediately. Suggested-by: Rick Edgecombe Signed-off-by: Song Liu --- kernel/bpf/core.c | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index b64d91fcb0ba..b5dcc8f182b3 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -854,22 +854,15 @@ static LIST_HEAD(pack_list); static size_t select_bpf_prog_pack_size(void) { size_t size; - void *ptr; - - size = BPF_HPAGE_SIZE * num_online_nodes(); - ptr = module_alloc_huge(size); - /* Test whether we can get huge pages. If not just use PAGE_SIZE - * packs. - */ - if (!ptr || !is_vm_area_hugepages(ptr)) { + if (huge_vmalloc_supported()) { + size = BPF_HPAGE_SIZE * num_online_nodes(); + bpf_prog_pack_mask = BPF_HPAGE_MASK; + } else { size = PAGE_SIZE; bpf_prog_pack_mask = PAGE_MASK; - } else { - bpf_prog_pack_mask = BPF_HPAGE_MASK; } - vfree(ptr); return size; }