From patchwork Thu Jun 22 14:42:07 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13289260
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Ard Biesheuvel, Marc Zyngier,
 Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
 Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
 Vincenzo Frascino, Andrew Morton, Anshuman Khandual, Matthew Wilcox,
 Yu Zhao, Mark Rutland
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v1 12/14] arm64/mm: Add ptep_get_and_clear_full() to optimize
 process teardown
Date: Thu, 22 Jun 2023 15:42:07 +0100
Message-Id: <20230622144210.2623299-13-ryan.roberts@arm.com>
In-Reply-To: <20230622144210.2623299-1-ryan.roberts@arm.com>
References: <20230622144210.2623299-1-ryan.roberts@arm.com>

ptep_get_and_clear_full() adds a 'full' parameter which is not present
for the fallback ptep_get_and_clear() function. 'full' is set to 1 when
a full address space teardown is in progress. We use this information to
optimize arm64_sys_exit_group() by avoiding unfolding (and therefore
tlbi) contiguous ranges.
Instead we just clear the PTE but allow all the contiguous neighbours to
keep their contig bit set, because we know we are about to clear the
rest too.

When compiling the kernel, the cost of arm64_sys_exit_group() without
this optimization grew to 32x what it was before PTE_CONT support was
wired up. With this optimization in place, we are back down to the
original cost.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 18 ++++++++-
 arch/arm64/mm/contpte.c          | 70 ++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 17ea534bc5b0..5963da651da7 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1128,6 +1128,8 @@ extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte);
 extern pte_t contpte_ptep_get_lockless(pte_t *orig_ptep);
 extern void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte, unsigned int nr);
+extern pte_t contpte_ptep_get_and_clear_full(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep);
 extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep);
 extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
@@ -1252,12 +1254,24 @@ static inline void pte_clear(struct mm_struct *mm,
 	__pte_clear(mm, addr, ptep);
 }
 
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep, int full)
+{
+	pte_t orig_pte = __ptep_get(ptep);
+
+	if (!pte_present(orig_pte) || !pte_cont(orig_pte) || !full) {
+		contpte_try_unfold(mm, addr, ptep, orig_pte);
+		return __ptep_get_and_clear(mm, addr, ptep);
+	} else
+		return contpte_ptep_get_and_clear_full(mm, addr, ptep);
+}
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				unsigned long addr, pte_t *ptep)
 {
-	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
-	return __ptep_get_and_clear(mm, addr, ptep);
+	return ptep_get_and_clear_full(mm, addr, ptep, 0);
 }
 
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index e8e4a298fd53..0b585d1c4c94 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -241,6 +241,76 @@ void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 	} while (addr != end);
 }
 
+pte_t contpte_ptep_get_and_clear_full(struct mm_struct *mm,
+					unsigned long addr, pte_t *ptep)
+{
+	/*
+	 * When doing a full address space teardown, we can avoid unfolding the
+	 * contiguous range, and therefore avoid the associated tlbi. Instead,
+	 * just clear the pte. The caller is promising to call us for every pte,
+	 * so every pte in the range will be cleared by the time the tlbi is
+	 * issued.
+	 *
+	 * However, this approach will leave the ptes in an inconsistent state
+	 * until ptep_get_and_clear_full() has been called for every pte in the
+	 * range. This could cause ptep_get() to fail to return the correct
+	 * access/dirty bits, if ptep_get() calls are interleaved with
+	 * ptep_get_and_clear_full() (which they are). Solve this by copying the
+	 * access/dirty bits to every pte in the range so that ptep_get() still
+	 * sees them if we have already cleared the pte that the hw chose to
+	 * update. Note that a full teardown will only happen when the process
+	 * is exiting, so we do not expect any more accesses and therefore no
+	 * more access/dirty bit updates, so there is no race here.
+	 */
+
+	pte_t *orig_ptep = ptep;
+	pte_t pte;
+	bool flags_propagated = false;
+	bool dirty = false;
+	bool young = false;
+	int i;
+
+	/* First, gather access and dirty bits. */
+	ptep = contpte_align_down(orig_ptep);
+	for (i = 0; i < CONT_PTES; i++, ptep++) {
+		pte = __ptep_get(ptep);
+
+		/*
+		 * If we find a zeroed PTE, contpte_ptep_get_and_clear_full()
+		 * must have already been called for it, so we have already
+		 * propagated the flags to the other ptes.
+		 */
+		if (pte_val(pte) == 0) {
+			flags_propagated = true;
+			break;
+		}
+
+		if (pte_dirty(pte))
+			dirty = true;
+
+		if (pte_young(pte))
+			young = true;
+	}
+
+	/* Now copy the access and dirty bits into each pte in the range. */
+	if (!flags_propagated) {
+		ptep = contpte_align_down(orig_ptep);
+		for (i = 0; i < CONT_PTES; i++, ptep++) {
+			pte = __ptep_get(ptep);
+
+			if (dirty)
+				pte = pte_mkdirty(pte);
+
+			if (young)
+				pte = pte_mkyoung(pte);
+
+			__set_pte(ptep, pte);
+		}
+	}
+
+	return __ptep_get_and_clear(mm, addr, orig_ptep);
+}
+
 int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
 					unsigned long addr, pte_t *ptep)
 {
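
For readers without the kernel tree to hand, the gather-and-propagate
scheme described in the comment in contpte_ptep_get_and_clear_full() can
be modelled in plain user-space C. This is only a sketch under stated
assumptions: model_pte_t, the MODEL_* bit positions, the block size of 16
and model_get_and_clear_full() are all hypothetical stand-ins invented
for illustration, and none of them is the real arm64 PTE layout or a
kernel API. The model merges the access/dirty flags into every entry of
the block before clearing the requested one, so a later read of any
remaining entry still sees them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel definitions. The bit positions and
 * block size are illustrative only, not the real arm64 PTE layout.
 */
#define MODEL_CONT_PTES	16
#define MODEL_PTE_VALID	(1u << 0)
#define MODEL_PTE_YOUNG	(1u << 1)
#define MODEL_PTE_DIRTY	(1u << 2)
#define MODEL_PTE_CONT	(1u << 3)

typedef uint32_t model_pte_t;

/*
 * Clear entry idx of a contiguous block during a simulated full teardown and
 * return its previous value. The remaining entries keep their contig bit;
 * only the access/dirty flags are merged across the block first.
 */
static model_pte_t model_get_and_clear_full(model_pte_t *block, size_t idx)
{
	bool flags_propagated = false;
	bool dirty = false;
	bool young = false;
	model_pte_t old;
	size_t i;

	assert(idx < MODEL_CONT_PTES);

	/* Gather: a zeroed entry means an earlier call already merged the flags. */
	for (i = 0; i < MODEL_CONT_PTES; i++) {
		if (block[i] == 0) {
			flags_propagated = true;
			break;
		}
		if (block[i] & MODEL_PTE_DIRTY)
			dirty = true;
		if (block[i] & MODEL_PTE_YOUNG)
			young = true;
	}

	/* Propagate: write the merged flags back into every entry. */
	if (!flags_propagated) {
		for (i = 0; i < MODEL_CONT_PTES; i++) {
			if (dirty)
				block[i] |= MODEL_PTE_DIRTY;
			if (young)
				block[i] |= MODEL_PTE_YOUNG;
		}
	}

	old = block[idx];
	block[idx] = 0;
	return old;
}
```

In this model, if only entry 5 of a block starts out dirty, clearing
entry 0 first still returns a value with the dirty flag set, and every
later call takes the short path because it finds the zeroed entry 0
during the gather loop.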