From patchwork Fri Nov 24 13:26:14 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13467662
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Andrew Morton, Linus Torvalds,
    Ryan Roberts, Matthew Wilcox, Hugh Dickins, Yin Fengwei, Yang Shi,
    Ying Huang, Zi Yan, Peter Zijlstra, Ingo Molnar, Will Deacon,
    Waiman Long, "Paul E. McKenney"
Subject: [PATCH WIP v1 09/20] mm: improve folio_mapped_shared() for partially-mappable folios using rmap IDs
Date: Fri, 24 Nov 2023 14:26:14 +0100
Message-ID: <20231124132626.235350-10-david@redhat.com>
In-Reply-To: <20231124132626.235350-1-david@redhat.com>
References: <20231124132626.235350-1-david@redhat.com>
MIME-Version: 1.0

Let's make folio_mapped_shared() precise by using our rmap ID magic to
identify if a single MM is responsible for all mappings.

If there is a lot of concurrent (un)map activity, we could theoretically
spin for quite a while. But we're only looking at the rmap values in case
we didn't already identify the folio as "obviously shared". In most cases,
there should only be one or a handful of page tables involved.
For current THPs with ~512 .. 2048 subpages, we really shouldn't see a
lot of concurrent updates that keep us spinning for a long time. Anyhow,
if this ever becomes a problem, it can be optimized later if there is
real demand.

Signed-off-by: David Hildenbrand
---
 include/linux/mm.h   | 21 ++++++++++++---
 include/linux/rmap.h |  2 ++
 mm/rmap_id.c         | 63 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 765e688690f1..1081a8faa1a3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2113,6 +2113,17 @@ static inline size_t folio_size(struct folio *folio)
 	return PAGE_SIZE << folio_order(folio);
 }
 
+#ifdef CONFIG_RMAP_ID
+bool __folio_large_mapped_shared(struct folio *folio, struct mm_struct *mm);
+#else
+static inline bool __folio_large_mapped_shared(struct folio *folio,
+		struct mm_struct *mm)
+{
+	/* ... guess based on the mapcount of the first page of the folio. */
+	return atomic_read(&folio->page._mapcount) > 0;
+}
+#endif
+
 /**
  * folio_mapped_shared - Report if a folio is certainly mapped by
  *			 multiple entities in their page tables
@@ -2141,8 +2152,11 @@ static inline size_t folio_size(struct folio *folio)
  * PMD-mapped PMD-sized THP), the result will be exactly correct.
  *
  * For all other (partially-mappable) folios, such as PTE-mapped THP, the
- * return value is partially fuzzy: true is not fuzzy, because it means
- * "certainly mapped shared", but false means "maybe mapped exclusively".
+ * return value is partially fuzzy without CONFIG_RMAP_ID: true is not fuzzy,
+ * because it means "certainly mapped shared", but false means
+ * "maybe mapped exclusively".
+ *
+ * With CONFIG_RMAP_ID, the result will be exactly correct.
  *
  * Note that this function only considers *current* page table mappings
  * tracked via rmap -- that properly adjusts the folio mapcount(s) -- and
@@ -2177,8 +2191,7 @@ static inline bool folio_mapped_shared(struct folio *folio,
 	 */
 	if (total_mapcount > folio_nr_pages(folio))
 		return true;
-	/* ... guess based on the mapcount of the first page of the folio. */
-	return atomic_read(&folio->page._mapcount) > 0;
+	return __folio_large_mapped_shared(folio, mm);
 }
 
 #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 19c9dc3216df..a73e146d82d1 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -253,6 +253,8 @@ void __folio_set_large_rmap_val(struct folio *folio, int count,
 		struct mm_struct *mm);
 void __folio_add_large_rmap_val(struct folio *folio, int count,
 		struct mm_struct *mm);
+bool __folio_has_large_matching_rmap_val(struct folio *folio, int count,
+		struct mm_struct *mm);
 #else
 static inline void __folio_prep_large_rmap(struct folio *folio)
 {
diff --git a/mm/rmap_id.c b/mm/rmap_id.c
index e66b0f5aea2d..85a61c830f19 100644
--- a/mm/rmap_id.c
+++ b/mm/rmap_id.c
@@ -322,6 +322,69 @@ void __folio_add_large_rmap_val(struct folio *folio, int count,
 	}
 }
 
+bool __folio_has_large_matching_rmap_val(struct folio *folio, int count,
+		struct mm_struct *mm)
+{
+	const unsigned int order = folio_order(folio);
+	unsigned long diff = 0;
+
+	switch (order) {
+#if MAX_ORDER >= RMAP_SUBID_6_MIN_ORDER
+	case RMAP_SUBID_6_MIN_ORDER ... RMAP_SUBID_6_MAX_ORDER:
+		diff |= atomic_long_read(&folio->_rmap_val0) ^ (get_rmap_subid_6(mm, 0) * count);
+		diff |= atomic_long_read(&folio->_rmap_val1) ^ (get_rmap_subid_6(mm, 1) * count);
+		diff |= atomic_long_read(&folio->_rmap_val2) ^ (get_rmap_subid_6(mm, 2) * count);
+		diff |= atomic_long_read(&folio->_rmap_val3) ^ (get_rmap_subid_6(mm, 3) * count);
+		diff |= atomic_long_read(&folio->_rmap_val4) ^ (get_rmap_subid_6(mm, 4) * count);
+		diff |= atomic_long_read(&folio->_rmap_val5) ^ (get_rmap_subid_6(mm, 5) * count);
+		break;
+#endif
+#if MAX_ORDER >= RMAP_SUBID_5_MIN_ORDER
+	case RMAP_SUBID_5_MIN_ORDER ... RMAP_SUBID_5_MAX_ORDER:
+		diff |= atomic_long_read(&folio->_rmap_val0) ^ (get_rmap_subid_5(mm, 0) * count);
+		diff |= atomic_long_read(&folio->_rmap_val1) ^ (get_rmap_subid_5(mm, 1) * count);
+		diff |= atomic_long_read(&folio->_rmap_val2) ^ (get_rmap_subid_5(mm, 2) * count);
+		diff |= atomic_long_read(&folio->_rmap_val3) ^ (get_rmap_subid_5(mm, 3) * count);
+		diff |= atomic_long_read(&folio->_rmap_val4) ^ (get_rmap_subid_5(mm, 4) * count);
+		break;
+#endif
+	default:
+		diff |= atomic_long_read(&folio->_rmap_val0) ^ (get_rmap_subid_4(mm, 0) * count);
+		diff |= atomic_long_read(&folio->_rmap_val1) ^ (get_rmap_subid_4(mm, 1) * count);
+		diff |= atomic_long_read(&folio->_rmap_val2) ^ (get_rmap_subid_4(mm, 2) * count);
+		diff |= atomic_long_read(&folio->_rmap_val3) ^ (get_rmap_subid_4(mm, 3) * count);
+		break;
+	}
+	return !diff;
+}
+
+bool __folio_large_mapped_shared(struct folio *folio, struct mm_struct *mm)
+{
+	unsigned long start;
+	bool exclusive;
+	int mapcount;
+
+	VM_WARN_ON_ONCE(!folio_test_large_rmappable(folio));
+	VM_WARN_ON_ONCE(folio_test_hugetlb(folio));
+
+	/*
+	 * Livelocking here is unlikely, as the caller already handles the
+	 * "obviously shared" cases. If ever an issue and there is too much
+	 * concurrent (un)mapping happening (using different page tables), we
+	 * could stop earlier and just return "shared".
+	 */
+	do {
+		start = raw_read_atomic_seqcount_begin(&folio->_rmap_atomic_seqcount);
+		mapcount = folio_mapcount(folio);
+		if (unlikely(mapcount > folio_nr_pages(folio)))
+			return true;
+		exclusive = __folio_has_large_matching_rmap_val(folio, mapcount, mm);
+	} while (raw_read_atomic_seqcount_retry(&folio->_rmap_atomic_seqcount,
+						start));
+
+	return !exclusive;
+}
+
 int alloc_rmap_id(void)
 {
 	int id;
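
The check above boils down to a simple invariant: every mapping added by an
MM adds that MM's sub-ID values to the folio's accumulators, so if a single
MM owns all mappings, each accumulator must equal sub-ID * mapcount, and the
reader re-reads under a sequence counter whenever an update raced with it.
The following is a minimal, self-contained userspace sketch of that idea;
all names (mm_model, folio_model, model_add_mappings(), ...) and the toy
sub-ID values are made up for illustration and are not part of this patch,
and unlike the real rmap sub-IDs the toy values are not constructed to rule
out collisions between different combinations of MMs.

/*
 * Toy userspace model of the rmap-ID exclusivity check (illustration only;
 * names and sub-ID values are invented and do not match the kernel code).
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_VALS 4

struct mm_model {
	/* Derived from a unique per-MM rmap ID in the real scheme. */
	uint64_t subid[NUM_VALS];
};

struct folio_model {
	unsigned int seq;	/* even: stable, odd: update in progress */
	int mapcount;
	uint64_t rmap_val[NUM_VALS];
};

static void model_add_mappings(struct folio_model *f,
			       const struct mm_model *mm, int count)
{
	f->seq++;		/* open the write section */
	f->mapcount += count;
	for (int i = 0; i < NUM_VALS; i++)
		f->rmap_val[i] += (uint64_t)count * mm->subid[i];
	f->seq++;		/* close the write section */
}

static void model_remove_mappings(struct folio_model *f,
				  const struct mm_model *mm, int count)
{
	f->seq++;
	f->mapcount -= count;
	for (int i = 0; i < NUM_VALS; i++)
		f->rmap_val[i] -= (uint64_t)count * mm->subid[i];
	f->seq++;
}

/* True if all current mappings of @f appear to belong to @mm. */
static bool model_mapped_exclusively(const struct folio_model *f,
				     const struct mm_model *mm)
{
	unsigned int start;
	uint64_t diff;

	do {
		/* Wait until no update is in progress, then snapshot. */
		while ((start = f->seq) & 1)
			;
		diff = 0;
		for (int i = 0; i < NUM_VALS; i++)
			diff |= f->rmap_val[i] ^
				((uint64_t)f->mapcount * mm->subid[i]);
	} while (f->seq != start);	/* retry if an update raced with us */

	return !diff;
}

int main(void)
{
	const struct mm_model mm_a = { .subid = { 1, 10, 100, 1000 } };
	const struct mm_model mm_b = { .subid = { 3, 30, 300, 3000 } };
	struct folio_model folio = { 0 };

	model_add_mappings(&folio, &mm_a, 4);	/* mm_a maps 4 subpages */
	printf("exclusive to mm_a? %d\n", model_mapped_exclusively(&folio, &mm_a));

	model_add_mappings(&folio, &mm_b, 1);	/* now mm_b maps one, too */
	printf("exclusive to mm_a? %d\n", model_mapped_exclusively(&folio, &mm_a));

	model_remove_mappings(&folio, &mm_b, 1);
	printf("exclusive to mm_a? %d\n", model_mapped_exclusively(&folio, &mm_a));
	return 0;
}

The kernel code performs the same comparison per _rmap_valX under
folio->_rmap_atomic_seqcount and, in addition, returns "shared" right away
once the mapcount exceeds the number of subpages.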