From patchwork Sat Nov 10 01:38:04 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10676799
From: Rick Edgecombe
To: akpm@linux-foundation.org, willy@infradead.org, tglx@linutronix.de,
 mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v9 1/4] vmalloc: Add __vmalloc_node_try_addr function
Date: Fri, 9 Nov 2018 17:38:04 -0800
Message-Id: <20181110013807.24903-2-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181110013807.24903-1-rick.p.edgecombe@intel.com>
References: <20181110013807.24903-1-rick.p.edgecombe@intel.com>

Create the __vmalloc_node_try_addr function, which tries to allocate at a
specific address without triggering any lazy purge and retry. For the
randomized allocator that uses this function, failing to allocate at a
specific address is a lot more common. Skipping the lazy purge and retry
lets the function fail faster when an allocation won't fit at the requested
address. It is intended for a case where lazy free areas are unlikely, so
the purge and retry would just be extra work done every time.

For the randomized module loader, the average allocation time in ns for
different numbers of modules was:

Modules    Vmalloc optimization    No Vmalloc optimization
1000       1433                    1993
2000       2295                    3681
3000       4424                    7450
4000       7746                    13824
5000       12721                   21852
6000       19724                   33926
7000       27638                   47427
8000       37745                   64443

To support this behavior, a try_purge argument was plugged into several of
the static helpers.

This also changes the logic in __get_vm_area_node to be faster in cases
where allocations fail due to no space, which is a lot more common when
trying specific addresses: the vm_struct is now allocated only after
alloc_vmap_area has succeeded.
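The purge/no-purge control flow described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the VMAP_MAY_PURGE and VMAP_NO_PURGE values and the "purged = try_purge & VMAP_NO_PURGE" initialization are taken from this patch, while toy_slot, toy_purge() and toy_alloc() are hypothetical names that stand in for the real vmap area bookkeeping and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in this patch. */
#define VMAP_MAY_PURGE	0x2
#define VMAP_NO_PURGE	0x1

/* Hypothetical toy model: one address slot instead of the vmap area tree. */
enum toy_slot { TOY_FREE, TOY_LAZY_FREE, TOY_BUSY };

/* Counts how often the (expensive) purge step ran. */
static int toy_purge_calls;

/* Hypothetical stand-in for the lazy purge: reclaim a lazily freed slot. */
static void toy_purge(enum toy_slot *slot)
{
	toy_purge_calls++;
	if (*slot == TOY_LAZY_FREE)
		*slot = TOY_FREE;
}

/* Returns true if the slot was claimed, false if the allocation failed. */
static bool toy_alloc(enum toy_slot *slot, int try_purge)
{
	/* Same trick as the patch: NO_PURGE starts out as "already purged". */
	int purged = try_purge & VMAP_NO_PURGE;

	assert(slot != NULL);
retry:
	if (*slot == TOY_FREE) {
		*slot = TOY_BUSY;
		return true;
	}
	if (!purged) {
		/* Only reached when the caller allowed a purge. */
		toy_purge(slot);
		purged = 1;
		goto retry;
	}
	return false;
}
```

In this model, a call with VMAP_MAY_PURGE on a lazily freed slot purges once and then succeeds, whereas VMAP_NO_PURGE fails immediately without running toy_purge(). That early failure is the behavior the try-address path relies on when it probes many candidate addresses.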
Signed-off-by: Rick Edgecombe
---
 include/linux/vmalloc.h |   3 +
 mm/vmalloc.c            | 128 +++++++++++++++++++++++++++++----------
 2 files changed, 95 insertions(+), 36 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 398e9c95cd61..6eaa89612372 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,9 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
 			const void *caller);
+extern void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, const void *caller);
 #ifndef CONFIG_MMU
 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 97d4b25d0373..b8b34d319c85 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -326,6 +326,9 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define VM_LAZY_FREE	0x02
 #define VM_VM_AREA	0x04

+#define VMAP_MAY_PURGE	0x2
+#define VMAP_NO_PURGE	0x1
+
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
@@ -402,12 +405,12 @@ static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
 static struct vmap_area *alloc_vmap_area(unsigned long size,
 				unsigned long align,
 				unsigned long vstart, unsigned long vend,
-				int node, gfp_t gfp_mask)
+				int node, gfp_t gfp_mask, int try_purge)
 {
 	struct vmap_area *va;
 	struct rb_node *n;
 	unsigned long addr;
-	int purged = 0;
+	int purged = try_purge & VMAP_NO_PURGE;
 	struct vmap_area *first;

 	BUG_ON(!size);
@@ -860,7 +863,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)

 	va = alloc_vmap_area(VMAP_BLOCK_SIZE, VMAP_BLOCK_SIZE,
 					VMALLOC_START, VMALLOC_END,
-					node, gfp_mask);
+					node, gfp_mask, VMAP_MAY_PURGE);
 	if (IS_ERR(va)) {
 		kfree(vb);
 		return ERR_CAST(va);
@@ -1170,8 +1173,9 @@ void *vm_map_ram(struct page **pages, unsigned int count, int node, pgprot_t pro
 		addr = (unsigned long)mem;
 	} else {
 		struct vmap_area *va;
-		va = alloc_vmap_area(size, PAGE_SIZE,
-				VMALLOC_START, VMALLOC_END, node, GFP_KERNEL);
+		va = alloc_vmap_area(size, PAGE_SIZE, VMALLOC_START,
+					VMALLOC_END, node, GFP_KERNEL,
+					VMAP_MAY_PURGE);
 		if (IS_ERR(va))
 			return NULL;

@@ -1372,7 +1376,8 @@ static void clear_vm_uninitialized_flag(struct vm_struct *vm)

 static struct vm_struct *__get_vm_area_node(unsigned long size,
 		unsigned long align, unsigned long flags, unsigned long start,
-		unsigned long end, int node, gfp_t gfp_mask, const void *caller)
+		unsigned long end, int node, gfp_t gfp_mask, int try_purge,
+		const void *caller)
 {
 	struct vmap_area *va;
 	struct vm_struct *area;
@@ -1386,16 +1391,17 @@ static struct vm_struct *__get_vm_area_node(unsigned long size,
 		align = 1ul << clamp_t(int, get_count_order_long(size),
 				       PAGE_SHIFT, IOREMAP_MAX_ORDER);

-	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
-	if (unlikely(!area))
-		return NULL;
-
 	if (!(flags & VM_NO_GUARD))
 		size += PAGE_SIZE;

-	va = alloc_vmap_area(size, align, start, end, node, gfp_mask);
-	if (IS_ERR(va)) {
-		kfree(area);
+	va = alloc_vmap_area(size, align, start, end, node, gfp_mask,
+				try_purge);
+	if (IS_ERR(va))
+		return NULL;
+
+	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!area)) {
+		free_vmap_area(va);
 		return NULL;
 	}

@@ -1408,7 +1414,8 @@ struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
 				unsigned long start, unsigned long end)
 {
 	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
-				  GFP_KERNEL, __builtin_return_address(0));
+				  GFP_KERNEL, VMAP_MAY_PURGE,
+				  __builtin_return_address(0));
 }
 EXPORT_SYMBOL_GPL(__get_vm_area);

@@ -1417,7 +1424,7 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
 				       const void *caller)
 {
 	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
-				  GFP_KERNEL, caller);
+				  GFP_KERNEL, VMAP_MAY_PURGE, caller);
 }

 /**
@@ -1432,7 +1439,7 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
 struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
 {
 	return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
-				  NUMA_NO_NODE, GFP_KERNEL,
+				  NUMA_NO_NODE, GFP_KERNEL, VMAP_MAY_PURGE,
 				  __builtin_return_address(0));
 }

@@ -1440,7 +1447,8 @@ struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
 				const void *caller)
 {
 	return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
-				  NUMA_NO_NODE, GFP_KERNEL, caller);
+				  NUMA_NO_NODE, GFP_KERNEL, VMAP_MAY_PURGE,
+				  caller);
 }

 /**
@@ -1713,26 +1721,10 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return NULL;
 }

-/**
- *	__vmalloc_node_range  -  allocate virtually contiguous memory
- *	@size:		allocation size
- *	@align:		desired alignment
- *	@start:		vm area range start
- *	@end:		vm area range end
- *	@gfp_mask:	flags for the page level allocator
- *	@prot:		protection mask for the allocated pages
- *	@vm_flags:	additional vm area flags (e.g. %VM_NO_GUARD)
- *	@node:		node to use for allocation or NUMA_NO_NODE
- *	@caller:	caller's return address
- *
- *	Allocate enough pages to cover @size from the page level
- *	allocator with @gfp_mask flags.  Map them into contiguous
- *	kernel virtual space, using a pagetable protection of @prot.
- */
-void *__vmalloc_node_range(unsigned long size, unsigned long align,
+static void *__vmalloc_node_range_opts(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
-			const void *caller)
+			int try_purge, const void *caller)
 {
 	struct vm_struct *area;
 	void *addr;
@@ -1743,7 +1735,8 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 		goto fail;

 	area = __get_vm_area_node(size, align, VM_ALLOC | VM_UNINITIALIZED |
-				vm_flags, start, end, node, gfp_mask, caller);
+				vm_flags, start, end, node, gfp_mask,
+				try_purge, caller);
 	if (!area)
 		goto fail;

@@ -1768,6 +1761,69 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 	return NULL;
 }

+/**
+ *	__vmalloc_node_range  -  allocate virtually contiguous memory
+ *	@size:		allocation size
+ *	@align:		desired alignment
+ *	@start:		vm area range start
+ *	@end:		vm area range end
+ *	@gfp_mask:	flags for the page level allocator
+ *	@prot:		protection mask for the allocated pages
+ *	@vm_flags:	additional vm area flags (e.g. %VM_NO_GUARD)
+ *	@node:		node to use for allocation or NUMA_NO_NODE
+ *	@caller:	caller's return address
+ *
+ *	Allocate enough pages to cover @size from the page level
+ *	allocator with @gfp_mask flags.  Map them into contiguous
+ *	kernel virtual space, using a pagetable protection of @prot.
+ */
+void *__vmalloc_node_range(unsigned long size, unsigned long align,
+			unsigned long start, unsigned long end, gfp_t gfp_mask,
+			pgprot_t prot, unsigned long vm_flags, int node,
+			const void *caller)
+{
+	return __vmalloc_node_range_opts(size, align, start, end, gfp_mask,
+					prot, vm_flags, node, VMAP_MAY_PURGE,
+					caller);
+}
+
+/**
+ *	__vmalloc_node_try_addr  -  try to alloc at a specific address
+ *	@addr:		address to try
+ *	@size:		size to try
+ *	@gfp_mask:	flags for the page level allocator
+ *	@prot:		protection mask for the allocated pages
+ *	@vm_flags:	additional vm area flags (e.g. %VM_NO_GUARD)
+ *	@node:		node to use for allocation or NUMA_NO_NODE
+ *	@caller:	caller's return address
+ *
+ *	Try to allocate at the specific address. If it succeeds the address is
+ *	returned. If it fails NULL is returned.  It will not try to purge lazy
+ *	free vmap areas in order to fit.
+ */
+void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask,	pgprot_t prot, unsigned long vm_flags,
+			int node, const void *caller)
+{
+	unsigned long addr_end;
+	unsigned long vsize = PAGE_ALIGN(size);
+
+	if (!vsize || (vsize >> PAGE_SHIFT) > totalram_pages)
+		return NULL;
+
+	if (!(vm_flags & VM_NO_GUARD))
+		vsize += PAGE_SIZE;
+
+	addr_end = addr + vsize;
+
+	if (addr > addr_end)
+		return NULL;
+
+	return __vmalloc_node_range_opts(size, 1, addr, addr_end,
+				gfp_mask | __GFP_NOWARN, prot, vm_flags, node,
+				VMAP_NO_PURGE, caller);
+}
+
 /**
  *	__vmalloc_node  -  allocate virtually contiguous memory
  *	@size:		allocation size

From patchwork Sat Nov 10 01:38:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10676801
From: Rick Edgecombe
To: akpm@linux-foundation.org, willy@infradead.org, tglx@linutronix.de,
 mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v9 2/4] x86/modules: Increase randomization for modules
Date: Fri, 9 Nov 2018 17:38:05 -0800
Message-Id: <20181110013807.24903-3-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181110013807.24903-1-rick.p.edgecombe@intel.com>
References: <20181110013807.24903-1-rick.p.edgecombe@intel.com>

This changes the behavior of the KASLR logic for allocating memory for the
text sections of loadable modules. It randomizes the location of each module
text section with about 17 bits of entropy in typical use. This is enabled
on X86_64 only. For 32 bit, the behavior is unchanged.

It also refactors the existing code around module randomization somewhat.
There are now three different behaviors for x86 module_alloc depending on
config: RANDOMIZE_BASE=n, RANDOMIZE_BASE=y with ARCH=x86_64, and
RANDOMIZE_BASE=y with ARCH=i386. The refactor tries to show clearly what
those behaviors are without having three separate versions or threading the
behaviors through a bunch of little spots.

It is not enabled on 32 bit yet because the module space is much smaller and
simulations haven't been run to see how it performs.

The new algorithm breaks the module space in two: a random area and a backup
area. It first tries to allocate at a number of randomly located starting
pages (up to 10000 tries) inside the random area. If this fails, it
allocates in the backup area. The backup area base is offset in the same way
as the current algorithm offsets the base area, with 1024 possible
locations.

Because boot_params is defined with different types in different places,
placing the config helpers in modules.h or kaslr.h caused conflicts
elsewhere, so they are placed in a new file, kaslr_modules.h, instead.

Signed-off-by: Rick Edgecombe
---
 arch/x86/Kconfig                        |   3 +
 arch/x86/include/asm/kaslr_modules.h    |  38 ++++++++
 arch/x86/include/asm/pgtable_64_types.h |   7 ++
 arch/x86/kernel/module.c                | 111 +++++++++++++++++++-----
 4 files changed, 136 insertions(+), 23 deletions(-)
 create mode 100644 arch/x86/include/asm/kaslr_modules.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ba7e3464ee92..db93cde0528a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2144,6 +2144,9 @@ config RANDOMIZE_BASE

 	  If unsure, say Y.

+config RANDOMIZE_FINE_MODULE
+	def_bool y if RANDOMIZE_BASE && X86_64 && !UML
+
 # Relocation on x86 needs some additional build support
 config X86_NEED_RELOCS
 	def_bool y
diff --git a/arch/x86/include/asm/kaslr_modules.h b/arch/x86/include/asm/kaslr_modules.h
new file mode 100644
index 000000000000..1da6eced4b47
--- /dev/null
+++ b/arch/x86/include/asm/kaslr_modules.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_KASLR_MODULES_H_
+#define _ASM_KASLR_MODULES_H_
+
+#ifdef CONFIG_RANDOMIZE_BASE
+/* kaslr_enabled is not always defined */
+static inline int kaslr_mod_randomize_base(void)
+{
+	return kaslr_enabled();
+}
+#else
+static inline int kaslr_mod_randomize_base(void)
+{
+	return 0;
+}
+#endif /* CONFIG_RANDOMIZE_BASE */
+
+#ifdef CONFIG_RANDOMIZE_FINE_MODULE
+/* kaslr_enabled is not always defined */
+static inline int kaslr_mod_randomize_each_module(void)
+{
+	return kaslr_enabled();
+}
+
+static inline unsigned long get_modules_rand_len(void)
+{
+	return MODULES_RAND_LEN;
+}
+#else
+static inline int kaslr_mod_randomize_each_module(void)
+{
+	return 0;
+}
+
+unsigned long get_modules_rand_len(void);
+#endif /* CONFIG_RANDOMIZE_FINE_MODULE */
+
+#endif
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 04edd2d58211..5e26369ab86c 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -143,6 +143,13 @@ extern unsigned int ptrs_per_p4d;
 #define MODULES_END		_AC(0xffffffffff000000, UL)
 #define MODULES_LEN		(MODULES_END - MODULES_VADDR)

+/*
+ * Dedicate the first part of the module space to a randomized area when KASLR
+ * is in use.  Leave the remaining part for a fallback if we are unable to
+ * allocate in the random area.
+ */
+#define MODULES_RAND_LEN	PAGE_ALIGN((MODULES_LEN/3)*2)
+
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index b052e883dd8c..35cb912ed1f8 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include

 #if 0
 #define DEBUGP(fmt, ...)				\
@@ -48,34 +49,96 @@ do {							\
 } while (0)
 #endif

-#ifdef CONFIG_RANDOMIZE_BASE
 static unsigned long module_load_offset;
+static const unsigned long NO_TRY_RAND = 10000;

 /* Mutex protects the module_load_offset. */
 static DEFINE_MUTEX(module_kaslr_mutex);

 static unsigned long int get_module_load_offset(void)
 {
-	if (kaslr_enabled()) {
-		mutex_lock(&module_kaslr_mutex);
-		/*
-		 * Calculate the module_load_offset the first time this
-		 * code is called. Once calculated it stays the same until
-		 * reboot.
-		 */
-		if (module_load_offset == 0)
-			module_load_offset =
-				(get_random_int() % 1024 + 1) * PAGE_SIZE;
-		mutex_unlock(&module_kaslr_mutex);
-	}
+	mutex_lock(&module_kaslr_mutex);
+	/*
+	 * Calculate the module_load_offset the first time this
+	 * code is called. Once calculated it stays the same until
+	 * reboot.
+	 */
+	if (module_load_offset == 0)
+		module_load_offset = (get_random_int() % 1024 + 1) * PAGE_SIZE;
+	mutex_unlock(&module_kaslr_mutex);
+
 	return module_load_offset;
 }
-#else
-static unsigned long int get_module_load_offset(void)
+
+static unsigned long get_module_vmalloc_start(void)
 {
-	return 0;
+	unsigned long addr = MODULES_VADDR;
+
+	if (kaslr_mod_randomize_base())
+		addr += get_module_load_offset();
+
+	if (kaslr_mod_randomize_each_module())
+		addr += get_modules_rand_len();
+
+	return addr;
+}
+
+static void *try_module_alloc(unsigned long addr, unsigned long size)
+{
+	const unsigned long vm_flags = 0;
+
+	return __vmalloc_node_try_addr(addr, size, GFP_KERNEL, PAGE_KERNEL_EXEC,
+					vm_flags, NUMA_NO_NODE,
+					__builtin_return_address(0));
+}
+
+/*
+ * Find a random address to try that won't obviously not fit. Random areas are
+ * allowed to overflow into the backup area
+ */
+static unsigned long get_rand_module_addr(unsigned long size)
+{
+	unsigned long nr_max_pos = (MODULES_LEN - size) / MODULE_ALIGN + 1;
+	unsigned long nr_rnd_pos = get_modules_rand_len() / MODULE_ALIGN;
+	unsigned long nr_pos = min(nr_max_pos, nr_rnd_pos);
+
+	unsigned long module_position_nr = get_random_long() % nr_pos;
+	unsigned long offset = module_position_nr * MODULE_ALIGN;
+
+	return MODULES_VADDR + offset;
+}
+
+/*
+ * Try to allocate in the random area at 10000 random addresses. If these
+ * fail, return NULL.
+ */
+static void *try_module_randomize_each(unsigned long size)
+{
+	void *p = NULL;
+	unsigned int i;
+
+	/* This will have a guard page */
+	unsigned long va_size = PAGE_ALIGN(size) + PAGE_SIZE;
+
+	if (!kaslr_mod_randomize_each_module())
+		return NULL;
+
+	/* Make sure there is at least one address that might fit. */
+	if (va_size < PAGE_ALIGN(size) || va_size > MODULES_LEN)
+		return NULL;
+
+	/* Try to find a spot that doesn't need a lazy purge */
+	for (i = 0; i < NO_TRY_RAND; i++) {
+		unsigned long addr = get_rand_module_addr(va_size);
+
+		p = try_module_alloc(addr, size);
+
+		if (p)
+			return p;
+	}
+
+	return NULL;
 }
-#endif

 void *module_alloc(unsigned long size)
 {
@@ -84,16 +147,18 @@ void *module_alloc(unsigned long size)
 	if (PAGE_ALIGN(size) > MODULES_LEN)
 		return NULL;

-	p = __vmalloc_node_range(size, MODULE_ALIGN,
-				    MODULES_VADDR + get_module_load_offset(),
-				    MODULES_END, GFP_KERNEL,
-				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
-				    __builtin_return_address(0));
+	p = try_module_randomize_each(size);
+
+	if (!p)
+		p = __vmalloc_node_range(size, MODULE_ALIGN,
+				get_module_vmalloc_start(), MODULES_END,
+				GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
+				NUMA_NO_NODE, __builtin_return_address(0));
+
 	if (p && (kasan_module_alloc(p, size) < 0)) {
 		vfree(p);
 		return NULL;
 	}
-
 	return p;
 }


From patchwork Sat Nov 10 01:38:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10676803
From: Rick Edgecombe
To: akpm@linux-foundation.org, willy@infradead.org, tglx@linutronix.de,
 mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v9 3/4] vmalloc: Add debugfs modfraginfo
Date: Fri, 9 Nov 2018 17:38:06 -0800
Message-Id: <20181110013807.24903-4-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181110013807.24903-1-rick.p.edgecombe@intel.com>
References: <20181110013807.24903-1-rick.p.edgecombe@intel.com>

Add debugfs file "modfraginfo" for providing info on module space
fragmentation. This can be used for determining if loadable module
randomization is causing any problems for extreme module loading situations,
like huge numbers of modules or extremely large modules.

Sample output when KASLR is enabled and X86_64 is configured:
Largest free space: 897912 kB
Total free space: 1025424 kB
Allocations in backup area: 0

Sample output when just X86_64:
Largest free space: 897912 kB
Total free space: 1025424 kB

Signed-off-by: Rick Edgecombe
Reviewed-by: Kees Cook
---
 mm/vmalloc.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b8b34d319c85..63894cb50873 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -36,6 +37,12 @@
 #include
 #include
 
+#ifdef CONFIG_X86
+#include
+#include
+#include
+#endif
+
 #include "internal.h"
 
 struct vfree_deferred {
@@ -2415,7 +2422,6 @@ void free_vm_area(struct vm_struct *area)
 }
 EXPORT_SYMBOL_GPL(free_vm_area);
 
-#ifdef CONFIG_SMP
 static struct vmap_area *node_to_va(struct rb_node *n)
 {
 	return rb_entry_safe(n, struct vmap_area, rb_node);
@@ -2463,6 +2469,7 @@ static bool pvm_find_next_prev(unsigned long end,
 	return true;
 }
 
+#ifdef CONFIG_SMP
 /**
  * pvm_determine_end - find the highest aligned address between two vmap_areas
  * @pnext: in/out arg for the next vmap_area
@@ -2804,7 +2811,96 @@ static int __init proc_vmalloc_init(void)
 	proc_create_seq("vmallocinfo", 0400, NULL, &vmalloc_op);
 	return 0;
 }
-module_init(proc_vmalloc_init);
+#elif defined(CONFIG_DEBUG_FS)
+static int __init proc_vmalloc_init(void)
+{
+	return 0;
+}
+#endif
+
+#if defined(CONFIG_DEBUG_FS) && defined(CONFIG_RANDOMIZE_FINE_MODULE)
+static inline unsigned long is_in_backup(unsigned long addr)
+{
+	return addr >= MODULES_VADDR + get_modules_rand_len();
+}
+
+static int modulefraginfo_debug_show(struct seq_file *m, void *v)
+{
+	unsigned long last_end = MODULES_VADDR;
+	unsigned long total_free = 0;
+	unsigned long largest_free = 0;
+	unsigned long backup_cnt = 0;
+	unsigned long gap;
+	struct vmap_area *prev, *cur = NULL;
+
+	spin_lock(&vmap_area_lock);
+
+	if (!pvm_find_next_prev(MODULES_VADDR, &cur, &prev) || !cur)
+		goto done;
+
+	for (; cur->va_end <= MODULES_END; cur = list_next_entry(cur, list)) {
+		/* Don't count areas that are marked to be lazily freed */
+		if (!(cur->flags & VM_LAZY_FREE)) {
+			if (kaslr_mod_randomize_each_module())
+				backup_cnt += is_in_backup(cur->va_start);
+			gap = cur->va_start - last_end;
+			if (gap > largest_free)
+				largest_free = gap;
+			total_free += gap;
+			last_end = cur->va_end;
+		}
+
+		if (list_is_last(&cur->list, &vmap_area_list))
+			break;
+	}
+
+done:
+	gap = (MODULES_END - last_end);
+	if (gap > largest_free)
+		largest_free = gap;
+	total_free += gap;
+	spin_unlock(&vmap_area_lock);
+
+	seq_printf(m, "\tLargest free space:\t%lu kB\n", largest_free / 1024);
+	seq_printf(m, "\t Total free space:\t%lu kB\n", total_free / 1024);
+
+	if (kaslr_mod_randomize_each_module())
+		seq_printf(m, "Allocations in backup area:\t%lu\n", backup_cnt);
+
+	return 0;
+}
+
+static int proc_module_frag_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, modulefraginfo_debug_show, NULL);
+}
+
+static const struct file_operations debug_module_frag_operations = {
+	.open = proc_module_frag_debug_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+static void __init debug_modfrag_init(void)
+{
+	debugfs_create_file("modfraginfo", 0400, NULL, NULL,
+			    &debug_module_frag_operations);
+}
+#elif defined(CONFIG_DEBUG_FS) || defined(CONFIG_PROC_FS)
+static void __init debug_modfrag_init(void)
+{
+}
 
 #endif
 
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_PROC_FS)
+static int __init info_vmalloc_init(void)
+{
+	proc_vmalloc_init();
+	debug_modfrag_init();
+	return 0;
+}
+
+module_init(info_vmalloc_init);
+#endif

From patchwork Sat Nov 10 01:38:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10676805
From: Rick Edgecombe
To: akpm@linux-foundation.org, willy@infradead.org, tglx@linutronix.de,
 mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com,
 daniel@iogearbox.net, jannh@google.com, keescook@chromium.org
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe
Subject: [PATCH v9 4/4] Kselftest for module text allocation benchmarking
Date: Fri, 9 Nov 2018 17:38:07 -0800
Message-Id: <20181110013807.24903-5-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181110013807.24903-1-rick.p.edgecombe@intel.com>
References: <20181110013807.24903-1-rick.p.edgecombe@intel.com>

This adds a test module in lib/, and a script in kselftest that does
benchmarking on the allocation of memory in the module space. Performance here
would have some small impact on kernel module insertions, BPF JIT insertions
and kprobes. In the case of KASLR features for the module space, this module
can be used to measure the allocation performance of different configurations.
This module needs to be compiled into the kernel because module_alloc is not
exported.

With some modification to the code, as explained in the comments, it can be
enabled to measure TLB flushes as well.

There are two tests in the module. One allocates until failure in order to
test module capacity and the other times allocating space in the module area.
They both use module sizes that roughly approximate the distribution of
in-tree X86_64 modules.

You can control the number of modules used in the tests like this:
echo m1000>/sys/kernel/debug/mod_alloc_test

Run the test for module capacity like:
echo t1>/sys/kernel/debug/mod_alloc_test

The other test will measure the allocation time, and for CONFIG_X86_64 and
CONFIG_RANDOMIZE_BASE, also give data on how often the "backup area" is used.

Run the test for allocation time and backup area usage like:
echo t2>/sys/kernel/debug/mod_alloc_test

The output will be something like this:
num all(ns) last(ns)
1000 1083 1099
Last module in backup count = 0
Total modules in backup = 0
>1 module in backup count = 0

To run a suite of allocation time tests for a collection of module numbers
you can run:
tools/testing/selftests/bpf/test_mod_alloc.sh

Signed-off-by: Rick Edgecombe
---
 lib/Kconfig.debug                             |   9 +
 lib/Makefile                                  |   1 +
 lib/test_mod_alloc.c                          | 375 ++++++++++++++++++
 tools/testing/selftests/bpf/test_mod_alloc.sh |  29 ++
 4 files changed, 414 insertions(+)
 create mode 100644 lib/test_mod_alloc.c
 create mode 100755 tools/testing/selftests/bpf/test_mod_alloc.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1af29b8224fd..b590b2bb312f 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1886,6 +1886,15 @@ config TEST_BPF
 
 	  If unsure, say N.
 
+config TEST_MOD_ALLOC
+	bool "Tests for module allocator/vmalloc"
+	help
+	  This builds the "test_mod_alloc" module that performs performance
+	  tests on the module text section allocator. The module uses X86_64
+	  module text sizes for simulations.
+
+	  If unsure, say N.
+
 config FIND_BIT_BENCHMARK
 	tristate "Test find_bit functions"
 	help
diff --git a/lib/Makefile b/lib/Makefile
index db06d1237898..c447e07931b0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -60,6 +60,7 @@ UBSAN_SANITIZE_test_ubsan.o := y
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LIST_SORT) += test_list_sort.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
+obj-$(CONFIG_TEST_MOD_ALLOC) += test_mod_alloc.o
 obj-$(CONFIG_TEST_OVERFLOW) += test_overflow.o
 obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_SORT) += test_sort.o
diff --git a/lib/test_mod_alloc.c b/lib/test_mod_alloc.c
new file mode 100644
index 000000000000..3a6fb7999df4
--- /dev/null
+++ b/lib/test_mod_alloc.c
@@ -0,0 +1,375 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * This module can be used to test allocation time and randomness.
+ *
+ * To interact with this module, mount debugfs, for example:
+ * mount -t debugfs none /sys/kernel/debug/
+ *
+ * Then write to the file:
+ * /sys/kernel/debug/mod_alloc_test
+ *
+ * There are two tests:
+ * Test 1: Allocate until failure
+ * Test 2: Run 1000 iterations of a test that simulates loading modules with
+ * x86_64 module sizes.
+ *
+ * Configure the number (ex:1000) of modules to use per test in the tests:
+ * echo m1000 > /sys/kernel/debug/mod_alloc_test
+ *
+ * To run test (ex: Test 2):
+ * echo t2 > /sys/kernel/debug/mod_alloc_test
+ *
+ * For test 1 it will print the results of each test. For test 2 it will print
+ * out statistics for example:
+ * New module count: 1000
+ * Starting 10000 iterations of 1000 modules
+ * num all(ns) last(ns)
+ * 1000 1984 2112
+ * Last module in backup count = 0
+ * Total modules in backup = 188
+ * >1 module in backup count = 7
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct mod { int filesize; int coresize; int initsize; };
+
+/* ==== Begin optional logging ==== */
+/*
+ * Note: In order to get an accurate count for the tlb flushes triggered in
+ * vmalloc, create a counter in vmalloc.c with this method signature and export
+ * it. Then replace the below with:
+ *
+ * extern unsigned long get_tlb_flushes_vmalloc(void);
+ */
+static unsigned long get_tlb_flushes_vmalloc(void)
+{
+	return 0;
+}
+
+/* ==== End optional logging ==== */
+
+
+#define MAX_ALLOC_CNT 20000
+#define ITERS 1000
+
+struct vm_alloc {
+	void *core;
+	unsigned long core_size;
+	void *init;
+};
+
+static struct vm_alloc *allocs_vm;
+static long mod_cnt;
+static DEFINE_MUTEX(test_mod_alloc_mutex);
+
+const static int core_hist[10] = {1, 5, 21, 46, 141, 245, 597, 2224, 1875, 0};
+const static int init_hist[10] = {0, 0, 0, 0, 10, 19, 70, 914, 3906, 236};
+const static int file_hist[10] = {6, 20, 55, 86, 286, 551, 918, 2024, 1028,
+					181};
+
+const static int bins[10] = {5000000, 2000000, 1000000, 500000, 200000, 100000,
+					50000, 20000, 10000, 5000};
+/*
+ * Rough approximation of the X86_64 module size distribution.
+ */
+static int get_mod_rand_size(const int *hist)
+{
+	int area_under = get_random_int() % 5155;
+	int i;
+	int last_bin = bins[0] + 1;
+	int sum = 0;
+
+	for (i = 0; i <= 9; i++) {
+		sum += hist[i];
+		if (area_under <= sum)
+			return bins[i]
+				+ (get_random_int() % (last_bin - bins[i]));
+		last_bin = bins[i];
+	}
+	return 4096;
+}
+
+static struct mod get_rand_module(void)
+{
+	struct mod ret;
+
+	ret.coresize = get_mod_rand_size(core_hist);
+	ret.initsize = get_mod_rand_size(init_hist);
+	ret.filesize = get_mod_rand_size(file_hist);
+	return ret;
+}
+
+static void do_test_alloc_fail(void)
+{
+	struct vm_alloc *cur_alloc;
+	struct mod cur_mod;
+	void *file;
+	int mod_n, free_mod_n;
+	unsigned long fail = 0;
+	int iter;
+
+	for (iter = 0; iter < ITERS; iter++) {
+		pr_info("Running iteration: %d\n", iter);
+		memset(allocs_vm, 0, mod_cnt * sizeof(struct vm_alloc));
+		vm_unmap_aliases();
+		for (mod_n = 0; mod_n < mod_cnt; mod_n++) {
+			cur_mod = get_rand_module();
+			cur_alloc = &allocs_vm[mod_n];
+
+			/* Allocate */
+			file = vmalloc(cur_mod.filesize);
+			cur_alloc->core = module_alloc(cur_mod.coresize);
+			cur_alloc->init = module_alloc(cur_mod.initsize);
+
+			/* Clean up everything except core */
+			if (!cur_alloc->core || !cur_alloc->init) {
+				fail++;
+				vfree(file);
+				if (cur_alloc->init) {
+					module_memfree(cur_alloc->init);
+					vm_unmap_aliases();
+				}
+				break;
+			}
+			module_memfree(cur_alloc->init);
+			vm_unmap_aliases();
+			vfree(file);
+		}
+
+		/* Clean up core sizes */
+		for (free_mod_n = 0; free_mod_n < mod_n; free_mod_n++) {
+			cur_alloc = &allocs_vm[free_mod_n];
+			if (cur_alloc->core)
+				module_memfree(cur_alloc->core);
+		}
+	}
+	pr_info("Failures(%ld modules): %lu\n", mod_cnt, fail);
+}
+
+#ifdef CONFIG_RANDOMIZE_FINE_MODULE
+static int is_in_backup(void *addr)
+{
+	return (unsigned long)addr >= MODULES_VADDR + MODULES_RAND_LEN;
+}
+#else
+static int is_in_backup(void *addr)
+{
+	return 0;
+}
+#endif
+
+static void do_test_last_perf(void)
+{
+	struct vm_alloc *cur_alloc;
+	struct mod cur_mod;
+	void *file;
+	int mod_n, mon_n_free;
+	unsigned long fail = 0;
+	int iter;
+	ktime_t start, diff;
+	ktime_t total_last = 0;
+	ktime_t total_all = 0;
+
+	/*
+	 * The number of last core allocations for each iteration that were
+	 * allocated in the backup area.
+	 */
+	int last_in_bk = 0;
+
+	/*
+	 * The total number of core allocations that were in the backup area for
+	 * all iterations.
+	 */
+	int total_in_bk = 0;
+
+	/* The number of iterations where the count was more than 1 */
+	int cnt_more_than_1 = 0;
+
+	/*
+	 * The number of core allocations that were in the backup area for the
+	 * current iteration.
+	 */
+	int cur_in_bk = 0;
+
+	unsigned long before_tlbs;
+	unsigned long tlb_cnt_total;
+	unsigned long tlb_cur;
+	unsigned long total_tlbs = 0;
+
+	pr_info("Starting %d iterations of %ld modules\n", ITERS, mod_cnt);
+
+	for (iter = 0; iter < ITERS; iter++) {
+		vm_unmap_aliases();
+		before_tlbs = get_tlb_flushes_vmalloc();
+		memset(allocs_vm, 0, mod_cnt * sizeof(struct vm_alloc));
+		tlb_cnt_total = 0;
+		cur_in_bk = 0;
+		for (mod_n = 0; mod_n < mod_cnt; mod_n++) {
+			/* allocate how the module allocator allocates */
+
+			cur_mod = get_rand_module();
+			cur_alloc = &allocs_vm[mod_n];
+			file = vmalloc(cur_mod.filesize);
+
+			tlb_cur = get_tlb_flushes_vmalloc();
+
+			start = ktime_get();
+			cur_alloc->core = module_alloc(cur_mod.coresize);
+			diff = ktime_get() - start;
+
+			cur_alloc->init = module_alloc(cur_mod.initsize);
+
+			/* Collect metrics */
+			if (is_in_backup(cur_alloc->core)) {
+				cur_in_bk++;
+				if (mod_n == mod_cnt - 1)
+					last_in_bk++;
+			}
+			total_all += diff;
+
+			if (mod_n == mod_cnt - 1)
+				total_last += diff;
+
+			tlb_cnt_total += get_tlb_flushes_vmalloc() - tlb_cur;
+
+			/* If there is a failure, quit. init/core freed later */
+			if (!cur_alloc->core || !cur_alloc->init) {
+				fail++;
+				vfree(file);
+				break;
+			}
+			/* Init sections do not last long so free here */
+			module_memfree(cur_alloc->init);
+			vm_unmap_aliases();
+			cur_alloc->init = NULL;
+			vfree(file);
+		}
+
+		/* Collect per iteration metrics */
+		total_in_bk += cur_in_bk;
+		if (cur_in_bk > 1)
+			cnt_more_than_1++;
+		total_tlbs += get_tlb_flushes_vmalloc() - before_tlbs;
+
+		/* Free everything still allocated from this iteration */
+		for (mon_n_free = 0; mon_n_free < mod_cnt; mon_n_free++) {
+			cur_alloc = &allocs_vm[mon_n_free];
+			module_memfree(cur_alloc->init);
+			module_memfree(cur_alloc->core);
+		}
+	}
+
+	if (fail)
+		pr_info("There was an alloc failure, results invalid!\n");
+
+	pr_info("num\t\tall(ns)\t\tlast(ns)");
+	pr_info("%ld\t\t%llu\t\t%llu\n", mod_cnt,
+		div64_s64(total_all, ITERS * mod_cnt),
+		div64_s64(total_last, ITERS));
+
+	if (IS_ENABLED(CONFIG_RANDOMIZE_FINE_MODULE)) {
+		pr_info("Last module in backup count = %d\n", last_in_bk);
+		pr_info("Total modules in backup = %d\n", total_in_bk);
+		pr_info(">1 module in backup count = %d\n", cnt_more_than_1);
+	}
+	/*
+	 * This will usually hide info when the instrumentation is not in place.
+	 */
+	if (tlb_cnt_total)
+		pr_info("TLB Flushes: %lu\n", tlb_cnt_total);
+}
+
+static void do_test(int test)
+{
+	switch (test) {
+	case 1:
+		do_test_alloc_fail();
+		break;
+	case 2:
+		do_test_last_perf();
+		break;
+	default:
+		pr_info("Unknown test\n");
+	}
+}
+
+static ssize_t device_file_write(struct file *filp, const char __user *user_buf,
+				 size_t count, loff_t *offp)
+{
+	char buf[100];
+	long input_num;
+
+	if (count >= sizeof(buf) - 1) {
+		pr_info("Command too long\n");
+		return count;
+	}
+
+	if (!mutex_trylock(&test_mod_alloc_mutex)) {
+		pr_info("test_mod_alloc busy\n");
+		return count;
+	}
+
+	if (copy_from_user(buf, user_buf, count))
+		goto error;
+
+	buf[count] = 0;
+
+	if (kstrtol(buf+1, 10, &input_num))
+		goto error;
+
+	switch (buf[0]) {
+	case 'm':
+		if (input_num > 0 && input_num <= MAX_ALLOC_CNT) {
+			pr_info("New module count: %ld\n", input_num);
+			mod_cnt = input_num;
+			if (allocs_vm)
+				vfree(allocs_vm);
+			allocs_vm = vmalloc(sizeof(struct vm_alloc) * mod_cnt);
+		} else
+			pr_info("more than %d not supported\n", MAX_ALLOC_CNT);
+		break;
+	case 't':
+		if (!mod_cnt) {
+			pr_info("Set module count first\n");
+			break;
+		}
+
+		do_test(input_num);
+		break;
+	default:
+		pr_info("Unknown command\n");
+	}
+	goto done;
+error:
+	pr_info("Could not process input\n");
+done:
+	mutex_unlock(&test_mod_alloc_mutex);
+	return count;
+}
+
+static const char *dv_name = "mod_alloc_test";
+const static struct file_operations test_mod_alloc_fops = {
+	.owner = THIS_MODULE,
+	.write = device_file_write,
+};
+
+static int __init mod_alloc_test_init(void)
+{
+	debugfs_create_file(dv_name, 0400, NULL, NULL, &test_mod_alloc_fops);
+
+	return 0;
+}
+
+MODULE_LICENSE("GPL");
+
+module_init(mod_alloc_test_init);
diff --git a/tools/testing/selftests/bpf/test_mod_alloc.sh b/tools/testing/selftests/bpf/test_mod_alloc.sh
new file mode 100755
index 000000000000..e9aea570de78
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_mod_alloc.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+UNMOUNT_DEBUG_FS=0
+if ! mount | grep -q debugfs; then
+	if mount -t debugfs none /sys/kernel/debug/; then
+		UNMOUNT_DEBUG_FS=1
+	else
+		echo "Could not mount debug fs."
+		exit 1
+	fi
+fi
+
+if [ ! -e /sys/kernel/debug/mod_alloc_test ]; then
+	echo "Test module not found, did you build kernel with TEST_MOD_ALLOC?"
+	exit 1
+fi
+
+echo "Beginning module_alloc performance tests."
+
+for i in `seq 1000 1000 8000`; do
+	echo m$i>/sys/kernel/debug/mod_alloc_test
+	echo t2>/sys/kernel/debug/mod_alloc_test
+done
+
+echo "Module_alloc performance tests ended."
+
+if [ $UNMOUNT_DEBUG_FS -eq 1 ]; then
+	umount /sys/kernel/debug/
+fi
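
The bin selection done by get_mod_rand_size() in lib/test_mod_alloc.c can be
tried outside the kernel. What follows is a minimal userspace sketch and not
part of the patch: pick_bin() and size_in_bin() are hypothetical helper names,
the random values are passed in as parameters instead of coming from
get_random_int(), and an out-of-range draw returns -1 where the kernel code
falls back to 4096. Only the core_hist and bins tables are copied from the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define NBINS 10

/* Histogram counts and bin lower bounds copied from lib/test_mod_alloc.c. */
static const int core_hist[NBINS] = {1, 5, 21, 46, 141, 245, 597, 2224, 1875, 0};
static const int bins[NBINS] = {5000000, 2000000, 1000000, 500000, 200000,
				100000, 50000, 20000, 10000, 5000};

/*
 * Hypothetical helper: return the index of the first bin whose cumulative
 * count reaches area_under, or -1 if area_under exceeds the histogram total.
 */
static int pick_bin(const int *hist, int area_under)
{
	int sum = 0;
	int i;

	assert(area_under >= 0);
	for (i = 0; i < NBINS; i++) {
		sum += hist[i];
		if (area_under <= sum)
			return i;
	}
	return -1;
}

/*
 * Hypothetical helper: map a bin index and a raw random value to a size in
 * [bins[i], upper), where upper is the lower bound of the next larger bin,
 * or bins[0] + 1 for the largest bin.
 */
static int size_in_bin(int i, unsigned int rnd)
{
	int upper = (i == 0) ? bins[0] + 1 : bins[i - 1];

	return bins[i] + (int)(rnd % (unsigned int)(upper - bins[i]));
}
```

For example, pick_bin(core_hist, 2) selects bin 1 (2000000 to 4999999 bytes),
because the first bin only covers cumulative counts 0 and 1.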