From patchwork Fri Jun 24 17:36:43 2022
X-Patchwork-Submitter: James Houghton <jthoughton@google.com>
X-Patchwork-Id: 12894938
Date: Fri, 24 Jun 2022 17:36:43 +0000
In-Reply-To: <20220624173656.2033256-1-jthoughton@google.com>
Message-Id: <20220624173656.2033256-14-jthoughton@google.com>
References: <20220624173656.2033256-1-jthoughton@google.com>
X-Mailer: git-send-email 2.37.0.rc0.161.g10f37bed90-goog
Subject: [RFC PATCH 13/26] hugetlb: add huge_pte_alloc_high_granularity
From: James Houghton <jthoughton@google.com>
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 Jue Wang, Manish Mishra, "Dr . David Alan Gilbert",
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
This function is to be used to do a HugeTLB page table walk where we
may need to split a leaf-level huge PTE into a new page table level.

Consider the case where we want to install a 4K PTE inside an empty
1G page:
 1. We walk to the PUD and notice that it is pte_none.
 2. We split the PUD by calling `hugetlb_split_to_shift`, creating a
    standard PUD that points to PMDs that are all pte_none.
 3. We continue the PT walk to find the PMD. We split it just like we
    split the PUD.
 4. We find the PTE and give it back to the caller.

To avoid concurrent splitting operations on the same page table entry,
we require that the mapping rwsem is held for writing while splitting
and for reading when doing a high-granularity PT walk.

Signed-off-by: James Houghton <jthoughton@google.com>
---
 include/linux/hugetlb.h | 23 ++++++++++++++
 mm/hugetlb.c            | 67 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 605aa19d8572..321f5745d87f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1176,14 +1176,37 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 }
 #endif /* CONFIG_HUGETLB_PAGE */
+enum split_mode {
+	HUGETLB_SPLIT_NEVER   = 0,
+	HUGETLB_SPLIT_NONE    = 1 << 0,
+	HUGETLB_SPLIT_PRESENT = 1 << 1,
+	HUGETLB_SPLIT_ALWAYS  = HUGETLB_SPLIT_NONE | HUGETLB_SPLIT_PRESENT,
+};
 #ifdef CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING
 /* If HugeTLB high-granularity mappings are enabled for this VMA.
  */
 bool hugetlb_hgm_enabled(struct vm_area_struct *vma);
+int huge_pte_alloc_high_granularity(struct hugetlb_pte *hpte,
+				    struct mm_struct *mm,
+				    struct vm_area_struct *vma,
+				    unsigned long addr,
+				    unsigned int desired_shift,
+				    enum split_mode mode,
+				    bool write_locked);
 #else
 static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma)
 {
 	return false;
 }
+static inline int huge_pte_alloc_high_granularity(struct hugetlb_pte *hpte,
+						  struct mm_struct *mm,
+						  struct vm_area_struct *vma,
+						  unsigned long addr,
+						  unsigned int desired_shift,
+						  enum split_mode mode,
+						  bool write_locked)
+{
+	return -EINVAL;
+}
 #endif

 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index eaffe7b4f67c..6e0c5fbfe32c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7166,6 +7166,73 @@ static int hugetlb_split_to_shift(struct mm_struct *mm, struct vm_area_struct *v
 	tlb_finish_mmu(&tlb);
 	return ret;
 }
+
+/*
+ * Similar to huge_pte_alloc except that this can be used to create or walk
+ * high-granularity mappings. It will automatically split existing HugeTLB PTEs
+ * if required by @mode. The resulting HugeTLB PTE will be returned in @hpte.
+ *
+ * There are four options for @mode:
+ *  - HUGETLB_SPLIT_NEVER   - Never split.
+ *  - HUGETLB_SPLIT_NONE    - Split empty PTEs.
+ *  - HUGETLB_SPLIT_PRESENT - Split present PTEs.
+ *  - HUGETLB_SPLIT_ALWAYS  - Split both empty and present PTEs.
+ */
+int huge_pte_alloc_high_granularity(struct hugetlb_pte *hpte,
+				    struct mm_struct *mm,
+				    struct vm_area_struct *vma,
+				    unsigned long addr,
+				    unsigned int desired_shift,
+				    enum split_mode mode,
+				    bool write_locked)
+{
+	struct address_space *mapping = vma->vm_file->f_mapping;
+	bool has_write_lock = write_locked;
+	unsigned long desired_sz = 1UL << desired_shift;
+	int ret;
+
+	BUG_ON(!hpte);
+
+	if (has_write_lock)
+		i_mmap_assert_write_locked(mapping);
+	else
+		i_mmap_assert_locked(mapping);
+
+retry:
+	ret = 0;
+	hugetlb_pte_init(hpte);
+
+	ret = hugetlb_walk_to(mm, hpte, addr, desired_sz,
+			      !(mode & HUGETLB_SPLIT_NONE));
+	if (ret || hugetlb_pte_size(hpte) == desired_sz)
+		goto out;
+
+	if (
+		((mode & HUGETLB_SPLIT_NONE) && hugetlb_pte_none(hpte)) ||
+		((mode & HUGETLB_SPLIT_PRESENT) &&
+			hugetlb_pte_present_leaf(hpte))
+	   ) {
+		if (!has_write_lock) {
+			i_mmap_unlock_read(mapping);
+			i_mmap_lock_write(mapping);
+			has_write_lock = true;
+			goto retry;
+		}
+		ret = hugetlb_split_to_shift(mm, vma, hpte, addr,
+					     desired_shift);
+	}
+
+out:
+	if (has_write_lock && !write_locked) {
+		/* Drop the write lock. */
+		i_mmap_unlock_write(mapping);
+		i_mmap_lock_read(mapping);
+		has_write_lock = false;
+		goto retry;
+	}
+
+	return ret;
+}
 #endif /* CONFIG_HUGETLB_HIGH_GRANULARITY_MAPPING */

 /*