From patchwork Mon Oct 21 04:22:13 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843547
Date: Sun, 20 Oct 2024 22:22:13 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-2-yuzhao@google.com>
Subject: [PATCH v1 1/6] mm/hugetlb_vmemmap: batch update PTEs
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Convert vmemmap_remap_walk->remap_pte to ->remap_pte_range so that
vmemmap remap walks can batch update PTEs.

The goal of this conversion is to allow architectures to implement
their own optimizations where possible, e.g., stopping remote CPUs only
once per batch when updating the vmemmap on arm64. It is not intended
to change the remap workflow, nor should it by itself have any side
effects on performance.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/hugetlb_vmemmap.c | 163 ++++++++++++++++++++++++-------------------
 1 file changed, 91 insertions(+), 72 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 57b7f591eee8..46befab48d41 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -22,7 +22,7 @@
 /**
  * struct vmemmap_remap_walk - walk vmemmap page table
  *
- * @remap_pte:		called for each lowest-level entry (PTE).
+ * @remap_pte_range:	called on a range of PTEs.
  * @nr_walked:		the number of walked pte.
  * @reuse_page:		the page which is reused for the tail vmemmap pages.
  * @reuse_addr:		the virtual address of the @reuse_page page.
@@ -32,8 +32,8 @@
  * operations.
  */
 struct vmemmap_remap_walk {
-	void			(*remap_pte)(pte_t *pte, unsigned long addr,
-					     struct vmemmap_remap_walk *walk);
+	void			(*remap_pte_range)(pte_t *pte, unsigned long start,
+						   unsigned long end, struct vmemmap_remap_walk *walk);
 	unsigned long		nr_walked;
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
@@ -101,10 +101,6 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct page *head;
 	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
-	/* Only splitting, not remapping the vmemmap pages. */
-	if (!vmemmap_walk->remap_pte)
-		walk->action = ACTION_CONTINUE;
-
 	spin_lock(&init_mm.page_table_lock);
 	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
 	/*
@@ -129,33 +125,36 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 			ret = -ENOTSUPP;
 	}
 	spin_unlock(&init_mm.page_table_lock);
-	if (!head || ret)
+	if (ret)
 		return ret;
 
-	return vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
-}
+	if (head) {
+		ret = vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
+		if (ret)
+			return ret;
+	}
 
-static int vmemmap_pte_entry(pte_t *pte, unsigned long addr,
-			     unsigned long next, struct mm_walk *walk)
-{
-	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
+	if (vmemmap_walk->remap_pte_range) {
+		pte_t *pte = pte_offset_kernel(pmd, addr);
 
-	/*
-	 * The reuse_page is found 'first' in page table walking before
-	 * starting remapping.
-	 */
-	if (!vmemmap_walk->reuse_page)
-		vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
-	else
-		vmemmap_walk->remap_pte(pte, addr, vmemmap_walk);
-	vmemmap_walk->nr_walked++;
+		vmemmap_walk->nr_walked += (next - addr) / PAGE_SIZE;
+		/*
+		 * The reuse_page is found 'first' in page table walking before
+		 * starting remapping.
+		 */
+		if (!vmemmap_walk->reuse_page) {
+			vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
+			pte++;
+			addr += PAGE_SIZE;
+		}
+		vmemmap_walk->remap_pte_range(pte, addr, next, vmemmap_walk);
+	}
 
 	return 0;
 }
 
 static const struct mm_walk_ops vmemmap_remap_ops = {
 	.pmd_entry	= vmemmap_pmd_entry,
-	.pte_entry	= vmemmap_pte_entry,
 };
 
 static int vmemmap_remap_range(unsigned long start, unsigned long end,
@@ -172,7 +171,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 	if (ret)
 		return ret;
 
-	if (walk->remap_pte && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
+	if (walk->remap_pte_range && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_kernel_range(start, end);
 
 	return 0;
@@ -204,33 +203,45 @@ static void free_vmemmap_page_list(struct list_head *list)
 		free_vmemmap_page(page);
 }
 
-static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
-			      struct vmemmap_remap_walk *walk)
+static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				    struct vmemmap_remap_walk *walk)
 {
-	/*
-	 * Remap the tail pages as read-only to catch illegal write operation
-	 * to the tail pages.
-	 */
-	pgprot_t pgprot = PAGE_KERNEL_RO;
-	struct page *page = pte_page(ptep_get(pte));
-	pte_t entry;
-
-	/* Remapping the head page requires r/w */
-	if (unlikely(addr == walk->reuse_addr)) {
-		pgprot = PAGE_KERNEL;
-		list_del(&walk->reuse_page->lru);
+	int i;
+	struct page *page;
+	int nr_pages = (end - start) / PAGE_SIZE;
+
+	for (i = 0; i < nr_pages; i++) {
+		page = pte_page(ptep_get(pte + i));
+
+		list_add(&page->lru, walk->vmemmap_pages);
+	}
+
+	page = walk->reuse_page;
+
+	if (start == walk->reuse_addr) {
+		list_del(&page->lru);
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
 		/*
-		 * Makes sure that preceding stores to the page contents from
-		 * vmemmap_remap_free() become visible before the set_pte_at()
-		 * write.
+		 * Makes sure that preceding stores to the page contents become
+		 * visible before set_pte_at().
 		 */
 		smp_wmb();
 	}
 
-	entry = mk_pte(walk->reuse_page, pgprot);
-	list_add(&page->lru, walk->vmemmap_pages);
-	set_pte_at(&init_mm, addr, pte, entry);
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		/*
+		 * The head page must be mapped read-write; the tail pages are
+		 * mapped read-only to catch illegal modifications.
+		 */
+		if (!i && start == walk->reuse_addr)
+			val = mk_pte(page, PAGE_KERNEL);
+		else
+			val = mk_pte(page, PAGE_KERNEL_RO);
+
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /*
@@ -252,27 +263,39 @@ static inline void reset_struct_pages(struct page *start)
 	memcpy(start, from, sizeof(*from) * NR_RESET_STRUCT_PAGE);
 }
 
-static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
-				struct vmemmap_remap_walk *walk)
+static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				      struct vmemmap_remap_walk *walk)
 {
-	pgprot_t pgprot = PAGE_KERNEL;
+	int i;
 	struct page *page;
-	void *to;
-
-	BUG_ON(pte_page(ptep_get(pte)) != walk->reuse_page);
+	int nr_pages = (end - start) / PAGE_SIZE;
 
 	page = list_first_entry(walk->vmemmap_pages, struct page, lru);
-	list_del(&page->lru);
-	to = page_to_virt(page);
-	copy_page(to, (void *)walk->reuse_addr);
-	reset_struct_pages(to);
+
+	for (i = 0; i < nr_pages; i++) {
+		BUG_ON(pte_page(ptep_get(pte + i)) != walk->reuse_page);
+
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
+		reset_struct_pages(page_to_virt(page));
+
+		page = list_next_entry(page, lru);
+	}
 
 	/*
 	 * Makes sure that preceding stores to the page contents become visible
-	 * before the set_pte_at() write.
+	 * before set_pte_at().
 	 */
 	smp_wmb();
-	set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot));
+
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		page = list_first_entry(walk->vmemmap_pages, struct page, lru);
+		list_del(&page->lru);
+
+		val = mk_pte(page, PAGE_KERNEL);
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /**
@@ -290,7 +313,6 @@ static int vmemmap_remap_split(unsigned long start, unsigned long end,
 			       unsigned long reuse)
 {
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= NULL,
 		.flags		= VMEMMAP_SPLIT_NO_TLB_FLUSH,
 	};
 
@@ -322,10 +344,10 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 {
 	int ret;
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_remap_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range = vmemmap_remap_pte_range,
+		.reuse_addr	= reuse,
+		.vmemmap_pages	= vmemmap_pages,
+		.flags		= flags,
 	};
 	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
@@ -340,8 +362,6 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 	 */
 	walk.reuse_page = alloc_pages_node(nid, gfp_mask, 0);
 	if (walk.reuse_page) {
-		copy_page(page_to_virt(walk.reuse_page),
-			  (void *)walk.reuse_addr);
 		list_add(&walk.reuse_page->lru, vmemmap_pages);
 		memmap_pages_add(1);
 	}
@@ -371,10 +391,9 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 	 * They will be restored in the following call.
 	 */
 	walk = (struct vmemmap_remap_walk) {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= 0,
+		.remap_pte_range = vmemmap_restore_pte_range,
+		.reuse_addr	= reuse,
+		.vmemmap_pages	= vmemmap_pages,
 	};
 
 	vmemmap_remap_range(reuse, end, &walk);
@@ -425,10 +444,10 @@ static int vmemmap_remap_alloc(unsigned long start, unsigned long end,
 {
 	LIST_HEAD(vmemmap_pages);
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= &vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range = vmemmap_restore_pte_range,
+		.reuse_addr	= reuse,
+		.vmemmap_pages	= &vmemmap_pages,
+		.flags		= flags,
 	};
 
 	/* See the comment in the vmemmap_remap_free(). */
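The shape of the new hook is easiest to see in isolation. Below is a
minimal, hypothetical sketch of a ->remap_pte_range() callback (not code
from this patch) that maps every PTE in the batch to the reuse page
read-only. The point of the range interface is that one invocation now
covers a whole PTE range, so an architecture can bracket the loop with a
single expensive operation, e.g., stopping remote CPUs, instead of
paying that cost once per PTE:

  static void example_remap_pte_range(pte_t *pte, unsigned long start,
				      unsigned long end,
				      struct vmemmap_remap_walk *walk)
  {
	unsigned long addr;

	/* One call covers [start, end); batch-level setup would go here. */
	for (addr = start; addr != end; addr += PAGE_SIZE, pte++)
		set_pte_at(&init_mm, addr, pte,
			   mk_pte(walk->reuse_page, PAGE_KERNEL_RO));
	/* ... and batch-level teardown here. */
  }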
From patchwork Mon Oct 21 04:22:14 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843548
Date: Sun, 20 Oct 2024 22:22:14 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-3-yuzhao@google.com>
Subject: [PATCH v1 2/6] mm/hugetlb_vmemmap: add arch-independent helpers
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Add architecture-independent helpers to allow individual architectures
to work around their own limitations when updating vmemmap.
Specifically, the current remap workflow requires break-before-make
(BBM) on arm64. By overriding the default helpers later in this series,
arm64 will be able to support the current HVO implementation.
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/mm_types.h |  7 +++
 mm/hugetlb_vmemmap.c     | 99 ++++++++++++++++++++++++++++++++++------
 2 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..0f3ae6e173f6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1499,4 +1499,11 @@ enum {
 /* See also internal only FOLL flags in mm/internal.h */
 };
 
+/* Skip the TLB flush when we split the PMD */
+#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
+/* Skip the TLB flush when we remap the PTE */
+#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
+/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
+#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
+
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 46befab48d41..e50a196399f5 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -38,16 +38,56 @@ struct vmemmap_remap_walk {
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
 	struct list_head	*vmemmap_pages;
-
-/* Skip the TLB flush when we split the PMD */
-#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
-/* Skip the TLB flush when we remap the PTE */
-#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
-/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
-#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
 	unsigned long		flags;
 };
 
+#ifndef VMEMMAP_ARCH_TLB_FLUSH_FLAGS
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	0
+#endif
+
+#ifndef vmemmap_update_supported
+static bool vmemmap_update_supported(void)
+{
+	return true;
+}
+#endif
+
+#ifndef vmemmap_update_lock
+static void vmemmap_update_lock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_unlock
+static void vmemmap_update_unlock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_start
+static void vmemmap_update_pte_range_start(pte_t *pte, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_end
+static void vmemmap_update_pte_range_end(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_start
+static void vmemmap_update_pmd_range_start(pmd_t *pmd, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_end
+static void vmemmap_update_pmd_range_end(void)
+{
+}
+#endif
+
 static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 			     struct vmemmap_remap_walk *walk)
 {
@@ -83,7 +123,9 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
+		vmemmap_update_pmd_range_start(pmd, start, start + PMD_SIZE);
 		pmd_populate_kernel(&init_mm, pmd, pgtable);
+		vmemmap_update_pmd_range_end();
 		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
 			flush_tlb_kernel_range(start, start + PMD_SIZE);
 	} else {
@@ -164,10 +206,12 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 
 	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
+	vmemmap_update_lock();
 	mmap_read_lock(&init_mm);
 	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
 				    NULL, walk);
 	mmap_read_unlock(&init_mm);
+	vmemmap_update_unlock();
 	if (ret)
 		return ret;
 
@@ -228,6 +272,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned lo
 		smp_wmb();
 	}
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -242,6 +288,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned lo
 
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /*
@@ -287,6 +335,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned
 	 */
 	smp_wmb();
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -296,6 +346,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned
 		val = mk_pte(page, PAGE_KERNEL);
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /**
@@ -513,7 +565,8 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
  */
 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 {
-	return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU);
+	return __hugetlb_vmemmap_restore_folio(h, folio,
+					       VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 }
 
 /**
@@ -553,7 +606,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			list_move(&folio->lru, non_hvo_folios);
 	}
 
-	if (restored)
+	if (restored && !(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_all();
 	if (!ret)
 		ret = restored;
@@ -641,7 +694,8 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 {
 	LIST_HEAD(vmemmap_pages);
 
-	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, VMEMMAP_SYNCHRONIZE_RCU);
+	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+					 VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
@@ -683,7 +737,8 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_SPLIT_NO_TLB_FLUSH))
+		flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
@@ -701,24 +756,35 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		 * allowing more vmemmap remaps to occur.
 		 */
 		if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
-			flush_tlb_all();
+			if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+				flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
 			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
 		}
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+		flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+static int hugetlb_vmemmap_sysctl(const struct ctl_table *ctl, int write,
+				  void *buffer, size_t *lenp, loff_t *ppos)
+{
+	if (!vmemmap_update_supported())
+		return -ENODEV;
+
+	return proc_dobool(ctl, write, buffer, lenp, ppos);
+}
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
 		.data		= &vmemmap_optimize_enabled,
 		.maxlen		= sizeof(vmemmap_optimize_enabled),
 		.mode		= 0644,
-		.proc_handler	= proc_dobool,
+		.proc_handler	= hugetlb_vmemmap_sysctl,
 	},
 };
 
@@ -729,6 +795,11 @@ static int __init hugetlb_vmemmap_init(void)
 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);
 
+	if (READ_ONCE(vmemmap_optimize_enabled) && !vmemmap_update_supported()) {
+		pr_warn("HugeTLB: disabling HVO due to missing support.\n");
+		WRITE_ONCE(vmemmap_optimize_enabled, false);
+	}
+
 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
 			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
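The defaults above use the common #ifndef override pattern: an
architecture that needs these hooks defines both the function and a
same-named macro in one of its headers, which compiles the no-op
defaults out. As a sketch, an override of vmemmap_update_lock() would
look like the following; this is essentially what the arm64 patch later
in this series does in <asm/pgalloc.h>:

  #define vmemmap_update_lock vmemmap_update_lock
  static inline void vmemmap_update_lock(void)
  {
	/* e.g., prevent CPUs from going offline during a vmemmap update */
	cpus_read_lock();
  }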
From patchwork Mon Oct 21 04:22:15 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843549

Date: Sun, 20 Oct 2024 22:22:15 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-4-yuzhao@google.com>
Subject: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

GICv3 and later support SGI broadcast, i.e., the routing mode that
delivers an SGI to all PEs in the system excluding the local CPU. Using
this mode avoids looping through all the remote CPUs when broadcasting
SGIs, which matters especially on systems with 200+ CPUs.
The performance improvement can be measured with the rest of this
series booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":

  cd /sys/kernel/mm/hugepages/
  echo 600 >hugepages-1048576kB/nr_hugepages
  echo 2048kB >hugepages-1048576kB/demote_size
  perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"

           gic_ipi_send_mask()   bash sys time
  Before:        38.14%            0m10.513s
  After:          0.20%            0m5.132s

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index ce87205e3e82..42c39385e1b9 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
 	gic_write_sgi1r(val);
 }
 
+static void gic_broadcast_sgi(unsigned int irq)
+{
+	u64 val;
+
+	val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);
+
+	pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
+	gic_write_sgi1r(val);
+}
+
 static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 {
 	int cpu;
+	cpumask_t broadcast;
 
 	if (WARN_ON(d->hwirq >= 16))
 		return;
@@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 	 */
 	dsb(ishst);
 
+	cpumask_copy(&broadcast, cpu_present_mask);
+	cpumask_clear_cpu(smp_processor_id(), &broadcast);
+	if (cpumask_equal(&broadcast, mask)) {
+		gic_broadcast_sgi(d->hwirq);
+		goto done;
+	}
+
 	for_each_cpu(cpu, mask) {
 		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(gic_cpu_to_affinity(cpu));
 		u16 tlist;
@@ -1414,7 +1432,7 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 		tlist = gic_compute_target_list(&cpu, mask, cluster_id);
 		gic_send_sgi(cluster_id, tlist, d->hwirq);
 	}
-
+done:
 	/* Force the above writes to ICC_SGI1R_EL1 to be executed */
 	isb();
 }
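The broadcast relies on the Interrupt Routing Mode (IRM) field of
ICC_SGI1R_EL1: with IRM = 0, the SGI is delivered to the PEs named by
the affinity and target-list fields, one cluster per register write;
with IRM = 1, a single write delivers the SGI to all PEs except the
writing one. A simplified sketch of the encoding used by
gic_broadcast_sgi() above, assuming irq holds the SGI number (d->hwirq)
and omitting the fields the targeted path in gic_send_sgi() fills in:

  u64 val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) |	/* IRM = 1: everyone but self */
	    ((u64)irq << ICC_SGI1R_SGI_ID_SHIFT);	/* SGI INTID, 0..15 */

  gic_write_sgi1r(val);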
From patchwork Mon Oct 21 04:22:16 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843550

Date: Sun, 20 Oct 2024 22:22:16 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-5-yuzhao@google.com>
Subject: [PATCH v1 4/6] arm64: broadcast IPIs to pause remote CPUs
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Broadcast pseudo-NMI IPIs to pause remote CPUs for a short period of
time, and then reliably resume them when the local CPU exits critical
sections that
preclude the execution of remote CPUs. A typical example of such
critical sections is BBM on kernel PTEs.

HugeTLB Vmemmap Optimization (HVO) on arm64 was disabled by commit
060a2c92d1b6 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
due to the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Supporting BBM on kernel PTEs is one of the approaches that can make
HVO theoretically safe on arm64.

Note that it is still possible for the paused CPUs to perform
speculative translations. Such translations would cause spurious kernel
PFs, which should be properly handled by
is_spurious_el1_translation_fault().

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/arm64/include/asm/smp.h |  3 ++
 arch/arm64/kernel/smp.c      | 92 +++++++++++++++++++++++++++++++++---
 2 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2510eec026f7..cffb0cfed961 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -133,6 +133,9 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
+void pause_remote_cpus(void);
+void resume_remote_cpus(void);
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..68829c6de1b1 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -85,7 +85,12 @@ static int ipi_irq_base __ro_after_init;
 static int nr_ipi __ro_after_init = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
-static bool crash_stop;
+enum {
+	SEND_STOP = BIT(0),
+	CRASH_STOP = BIT(1),
+};
+
+static unsigned long stop_in_progress;
 
 static void ipi_setup(int cpu);
 
@@ -917,6 +922,79 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs
 #endif
 }
 
+static DEFINE_SPINLOCK(cpu_pause_lock);
+static cpumask_t paused_cpus;
+static cpumask_t resumed_cpus;
+
+static void pause_local_cpu(void)
+{
+	int cpu = smp_processor_id();
+
+	cpumask_clear_cpu(cpu, &resumed_cpus);
+	/*
+	 * Paired with pause_remote_cpus() to confirm that this CPU not only
+	 * will be paused but also can be reliably resumed.
+	 */
+	smp_wmb();
+	cpumask_set_cpu(cpu, &paused_cpus);
+	/* paused_cpus must be set before waiting on resumed_cpus. */
+	barrier();
+	while (!cpumask_test_cpu(cpu, &resumed_cpus))
+		cpu_relax();
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	cpumask_clear_cpu(cpu, &paused_cpus);
+}
+
+void pause_remote_cpus(void)
+{
+	cpumask_t cpus_to_pause;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_pause, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_pause);
+
+	spin_lock(&cpu_pause_lock);
+
+	WARN_ON_ONCE(!cpumask_empty(&paused_cpus));
+
+	smp_cross_call(&cpus_to_pause, IPI_CPU_STOP_NMI);
+
+	while (!cpumask_equal(&cpus_to_pause, &paused_cpus))
+		cpu_relax();
+	/*
+	 * Paired with pause_local_cpu() to confirm that all CPUs not only will
+	 * be paused but also can be reliably resumed.
+	 */
+	smp_rmb();
+	WARN_ON_ONCE(cpumask_intersects(&cpus_to_pause, &resumed_cpus));
+
+	spin_unlock(&cpu_pause_lock);
+}
+
+void resume_remote_cpus(void)
+{
+	cpumask_t cpus_to_resume;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_resume, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_resume);
+
+	spin_lock(&cpu_pause_lock);
+
+	cpumask_setall(&resumed_cpus);
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	while (cpumask_intersects(&cpus_to_resume, &paused_cpus))
+		cpu_relax();
+
+	spin_unlock(&cpu_pause_lock);
+}
+
 static void arm64_backtrace_ipi(cpumask_t *mask)
 {
 	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
@@ -970,7 +1048,9 @@ static void do_handle_IPI(int ipinr)
 
 	case IPI_CPU_STOP:
 	case IPI_CPU_STOP_NMI:
-		if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
+		if (!test_bit(SEND_STOP, &stop_in_progress)) {
+			pause_local_cpu();
+		} else if (test_bit(CRASH_STOP, &stop_in_progress)) {
 			ipi_cpu_crash_stop(cpu, get_irq_regs());
 			unreachable();
 		} else {
@@ -1142,7 +1222,6 @@ static inline unsigned int num_other_online_cpus(void)
 
 void smp_send_stop(void)
 {
-	static unsigned long stop_in_progress;
 	cpumask_t mask;
 	unsigned long timeout;
 
@@ -1154,7 +1233,7 @@ void smp_send_stop(void)
 		goto skip_ipi;
 
 	/* Only proceed if this is the first CPU to reach this code */
-	if (test_and_set_bit(0, &stop_in_progress))
+	if (test_and_set_bit(SEND_STOP, &stop_in_progress))
 		return;
 
 	/*
@@ -1230,12 +1309,11 @@ void crash_smp_send_stop(void)
 	 * This function can be called twice in panic path, but obviously
 	 * we execute this only once.
 	 *
-	 * We use this same boolean to tell whether the IPI we send was a
+	 * We use the CRASH_STOP bit to tell whether the IPI we send was a
 	 * stop or a "crash stop".
 	 */
-	if (crash_stop)
+	if (test_and_set_bit(CRASH_STOP, &stop_in_progress))
 		return;
-	crash_stop = 1;
 
 	smp_send_stop();
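A sketch of the intended calling pattern for the new pair, mirroring
how the next patch uses it (hypothetical caller; the lockdep assertions
above require cpus_read_lock() to be held and preemption to be
disabled, which disabling local IRQs satisfies):

  cpus_read_lock();
  local_irq_disable();

  pause_remote_cpus();	/* all other online CPUs spin in NMI context */
  /* critical section remote CPUs must not observe,
   * e.g., BBM on kernel PTEs */
  resume_remote_cpus();

  local_irq_enable();
  cpus_read_unlock();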
From patchwork Mon Oct 21 04:22:17 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843551

Date: Sun, 20 Oct 2024 22:22:17 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-6-yuzhao@google.com>
Subject: [PATCH v1 5/6] arm64: pause remote CPUs to update vmemmap
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Pause remote CPUs so that the local CPU can follow the proper BBM
sequence to safely update the vmemmap mapping `struct page` areas.

While updating the vmemmap, it is guaranteed that neither the local CPU
nor the remote ones will access the `struct page` area being updated,
and therefore they should not trigger (non-spurious) kernel PFs.
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/arm64/include/asm/pgalloc.h | 69 ++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..f50f79f57c1e 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #define __HAVE_ARCH_PGD_FREE
 #define __HAVE_ARCH_PUD_FREE
@@ -137,4 +138,72 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
 }
 
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	(VMEMMAP_SPLIT_NO_TLB_FLUSH | VMEMMAP_REMAP_NO_TLB_FLUSH)
+
+#define vmemmap_update_supported vmemmap_update_supported
+static inline bool vmemmap_update_supported(void)
+{
+	return system_uses_irq_prio_masking();
+}
+
+#define vmemmap_update_lock vmemmap_update_lock
+static inline void vmemmap_update_lock(void)
+{
+	cpus_read_lock();
+}
+
+#define vmemmap_update_unlock vmemmap_update_unlock
+static inline void vmemmap_update_unlock(void)
+{
+	cpus_read_unlock();
+}
+
+#define vmemmap_update_pte_range_start vmemmap_update_pte_range_start
+static inline void vmemmap_update_pte_range_start(pte_t *pte,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PAGE_SIZE, pte++)
+		pte_clear(&init_mm, addr, pte);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pte_range_end vmemmap_update_pte_range_end
+static inline void vmemmap_update_pte_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#define vmemmap_update_pmd_range_start vmemmap_update_pmd_range_start
+static inline void vmemmap_update_pmd_range_start(pmd_t *pmd,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PMD_SIZE, pmd++)
+		pmd_clear(pmd);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pmd_range_end vmemmap_update_pmd_range_end
+static inline void vmemmap_update_pmd_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
+
 #endif
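Combined with the generic code from patch 2, the hooks above produce
the BBM sequence once per batch. Spelled out as a sketch (addr, pte and
new_pte are placeholders; remote CPUs stay paused and local IRQs stay
masked across all three steps):

  pte_clear(&init_mm, addr, pte);		/* 1. break: invalidate old PTEs */
  flush_tlb_kernel_range(start, end);		/* 2. evict them from the TLBs   */
  set_pte_at(&init_mm, addr, pte, new_pte);	/* 3. make: the generic code
						 *    installs the new PTEs between
						 *    the _start()/_end() hooks   */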
From patchwork Mon Oct 21 04:22:18 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13843553

Date: Sun, 20 Oct 2024 22:22:18 -0600
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
Message-ID: <20241021042218.746659-7-yuzhao@google.com>
Subject: [PATCH v1 6/6] arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao
To use HVO, make sure the kernel is booted with pseudo-NMI enabled via
"irqchip.gicv3_pseudo_nmi=1", as well as "hugetlb_free_vmemmap=on"
unless HVO is enabled by default. Note that HVO checks for the
pseudo-NMI capability and is disabled at runtime if that capability
turns out to be unsupported.

Successfully enabling HVO should produce the following:

  # dmesg | grep NMI
  GICv3: Pseudo-NMIs enabled using ...

  # sysctl vm.hugetlb_optimize_vmemmap
  vm.hugetlb_optimize_vmemmap = 1

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd9df6dcc593..e93745f819d9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -109,6 +109,7 @@ config ARM64
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
+	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_EXECMEM_LATE if EXECMEM
 	select ARCH_WANTS_NO_INSTR