From patchwork Wed Sep 12 22:55:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Edgecombe, Rick P"
X-Patchwork-Id: 10598431
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kernel-hardening@lists.openwall.com, daniel@iogearbox.net, jannh@google.com,
 keescook@chromium.org,
 alexei.starovoitov@gmail.com
Cc: kristen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com,
 Rick Edgecombe <rick.p.edgecombe@intel.com>
Subject: [PATCH v5 1/4] vmalloc: Add __vmalloc_node_try_addr function
Date: Wed, 12 Sep 2018 15:55:37 -0700
Message-Id: <1536792940-8294-2-git-send-email-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>
References: <1536792940-8294-1-git-send-email-rick.p.edgecombe@intel.com>

Create the __vmalloc_node_try_addr function, which tries to allocate at a
specific address and supports caller-specified behavior for whether any
lazy purging happens if there is a collision.

This new function draws from the __vmalloc_node_range implementation.
Attempts to merge the two into a single allocator resulted in logic that
was difficult to follow, so they are left separate.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 include/linux/vmalloc.h |   3 +
 mm/vmalloc.c            | 177 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 179 insertions(+), 1 deletion(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 398e9c9..c7712c8 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,9 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
 			const void *caller);
+extern void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller);
 #ifndef CONFIG_MMU
 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a728fc4..1954458 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1709,6 +1709,181 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return NULL;
 }
 
+static bool pvm_find_next_prev(unsigned long end,
+			       struct vmap_area **pnext,
+			       struct vmap_area **pprev);
+
+/* Try to allocate a region of KVA of the specified address and size. */
+static struct vmap_area *try_alloc_vmap_area(unsigned long addr,
+				unsigned long size, int node, gfp_t gfp_mask,
+				int try_purge)
+{
+	struct vmap_area *va;
+	struct vmap_area *cur_va = NULL;
+	struct vmap_area *first_before = NULL;
+	int need_purge = 0;
+	int blocked = 0;
+	int purged = 0;
+	unsigned long addr_end;
+
+	WARN_ON(!size);
+	WARN_ON(offset_in_page(size));
+
+	addr_end = addr + size;
+	if (addr > addr_end)
+		return ERR_PTR(-EOVERFLOW);
+
+	might_sleep();
+
+	va = kmalloc_node(sizeof(struct vmap_area),
+			gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!va))
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Only scan the relevant parts containing pointers to other objects
+	 * to avoid false negatives.
+	 */
+	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
+
+retry:
+	spin_lock(&vmap_area_lock);
+
+	pvm_find_next_prev(addr, &cur_va, &first_before);
+
+	if (!cur_va)
+		goto found;
+
+	/*
+	 * If the VA that starts before the target address extends past it,
+	 * start the check from that VA in order to cover the case where it
+	 * overlaps the start of the allocation.
+	 */
+	if (first_before && addr < first_before->va_end)
+		cur_va = first_before;
+
+	/* Linearly search through to make sure there is a hole */
+	while (cur_va->va_start < addr_end) {
+		if (cur_va->va_end > addr) {
+			if (cur_va->flags & VM_LAZY_FREE) {
+				need_purge = 1;
+			} else {
+				blocked = 1;
+				break;
+			}
+		}
+
+		if (list_is_last(&cur_va->list, &vmap_area_list))
+			break;
+
+		cur_va = list_next_entry(cur_va, list);
+	}
+
+	/*
+	 * If a non-lazy free va blocks the allocation, or
+	 * we are not supposed to purge, but we need to, the
+	 * allocation fails.
+	 */
+	if (blocked || (need_purge && !try_purge))
+		goto fail;
+
+	if (try_purge && need_purge) {
+		/* if purged once before, give up */
+		if (purged)
+			goto fail;
+
+		/*
+		 * If the va blocking the allocation is set to
+		 * be purged then purge all vmap_areas that are
+		 * set to be purged since this will flush the TLBs
+		 * anyway.
+		 */
+		spin_unlock(&vmap_area_lock);
+		purge_vmap_area_lazy();
+		need_purge = 0;
+		purged = 1;
+		goto retry;
+	}
+
+found:
+	va->va_start = addr;
+	va->va_end = addr_end;
+	va->flags = 0;
+	__insert_vmap_area(va);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+fail:
+	spin_unlock(&vmap_area_lock);
+	kfree(va);
+	if (need_purge && !blocked)
+		return ERR_PTR(-EUCLEAN);
+	return ERR_PTR(-EBUSY);
+}
+
+/**
+ * __vmalloc_node_try_addr - try to alloc at a specific address
+ * @addr:		address to try
+ * @size:		size to try
+ * @gfp_mask:		flags for the page level allocator
+ * @prot:		protection mask for the allocated pages
+ * @vm_flags:		additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node:		node to use for allocation or NUMA_NO_NODE
+ * @try_purge:		try to purge if needed to fulfill an allocation
+ * @caller:		caller's return address
+ *
+ * Try to allocate at the specific address. If it succeeds, the address is
+ * returned. If it fails, an EBUSY ERR_PTR is returned. If try_purge is
+ * zero, an EUCLEAN ERR_PTR is returned when the allocation could have
+ * succeeded had purging been allowed. It may trigger TLB flushes if a
+ * purge is needed and try_purge is set.
+ */
+void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller)
+{
+	struct vmap_area *va;
+	struct vm_struct *area;
+	void *alloc_addr;
+	unsigned long real_size = size;
+
+	size = PAGE_ALIGN(size);
+	if (!size || (size >> PAGE_SHIFT) > totalram_pages)
+		return NULL;
+
+	WARN_ON(in_interrupt());
+
+	if (!(vm_flags & VM_NO_GUARD))
+		size += PAGE_SIZE;
+
+	va = try_alloc_vmap_area(addr, size, node, gfp_mask, try_purge);
+	if (IS_ERR(va))
+		goto fail;
+
+	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!area)) {
+		warn_alloc(gfp_mask, NULL, "kmalloc: allocation failure");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	setup_vmalloc_vm(area, va, vm_flags, caller);
+
+	alloc_addr = __vmalloc_area_node(area, gfp_mask, prot, node);
+	if (!alloc_addr) {
+		warn_alloc(gfp_mask, NULL,
+			"vmalloc: allocation failure: %lu bytes", real_size);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	clear_vm_uninitialized_flag(area);
+
+	kmemleak_vmalloc(area, real_size, gfp_mask);
+
+	return alloc_addr;
+fail:
+	return va;
+}
+
 /**
  *	__vmalloc_node_range  -  allocate virtually contiguous memory
  *	@size:		allocation size
@@ -2355,7 +2530,6 @@ void free_vm_area(struct vm_struct *area)
 }
 EXPORT_SYMBOL_GPL(free_vm_area);
 
-#ifdef CONFIG_SMP
 static struct vmap_area *node_to_va(struct rb_node *n)
 {
 	return rb_entry_safe(n, struct vmap_area, rb_node);
@@ -2403,6 +2577,7 @@ static bool pvm_find_next_prev(unsigned long end,
 	return true;
 }
 
+#ifdef CONFIG_SMP
 /**
  * pvm_determine_end - find the highest aligned address between two vmap_areas
  * @pnext: in/out arg for the next vmap_area
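
Reviewer's note: the hole check in try_alloc_vmap_area can be easier to reason about outside the kernel. The following is a minimal userspace sketch of that decision logic, not kernel code and not part of the patch. It assumes a sorted, non-overlapping array in place of vmap_area_list; struct area, check_hole and the lazy_free field are hypothetical names introduced only for illustration, and locking and the purge-and-retry step are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* EUCLEAN is Linux-specific; fall back to its Linux value elsewhere. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/* Hypothetical stand-in for struct vmap_area; not the kernel type. */
struct area {
	unsigned long start;	/* first address of the area */
	unsigned long end;	/* one past the last address */
	bool lazy_free;		/* models the VM_LAZY_FREE flag */
};

/*
 * Decide whether [addr, addr + size) is free among areas[], which must be
 * sorted by start address and must not overlap each other.
 *
 * Returns 0 if the range is free, -EBUSY if a live area overlaps it,
 * -EUCLEAN if only lazily freed areas overlap it (a purge could help),
 * and -EOVERFLOW if addr + size wraps around.
 */
static int check_hole(const struct area *areas, size_t n,
		      unsigned long addr, unsigned long size)
{
	unsigned long addr_end = addr + size;
	bool need_purge = false;
	size_t i;

	assert(size != 0);
	if (addr > addr_end)
		return -EOVERFLOW;

	/* Walk areas that start below the end of the requested range. */
	for (i = 0; i < n && areas[i].start < addr_end; i++) {
		if (areas[i].end <= addr)
			continue;	/* entirely below the range */
		if (!areas[i].lazy_free)
			return -EBUSY;	/* a live mapping blocks the range */
		need_purge = true;	/* only a lazily freed area so far */
	}

	return need_purge ? -EUCLEAN : 0;
}
```

With one live area and one lazily freed area in the table, a range touching the live area yields -EBUSY and a range touching only the lazily freed one yields -EUCLEAN, which mirrors the two error pointers described in the kernel-doc comment.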