From patchwork Tue Aug 15 21:06:59 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13354322
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Matthew Wilcox, Yang Shi, Mike Kravetz,
 David Hildenbrand, "Kirill A. Shutemov", Hugh Dickins, Andrew Morton
Subject: [PATCH] mm: Wire up tail page poisoning over ->mappings
Date: Tue, 15 Aug 2023 17:06:59 -0400
Message-ID: <20230815210659.430010-1-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
MIME-Version: 1.0
Tail pages have a sanity check on the ->mapping field, though not on all of
them: for now, only upon index>2.  That is because we reused the ->mapping
fields of the tail pages with index=1,2 for other things.

Define a macro for "max index of tail pages that got the ->mapping field
reused" on top of the folio definition, because when we grow the folio tail
pages we'd want to bump this along with it.  Then wire everything up using
that macro.

Don't try to poison the ->mapping field in prep_compound_tail() for tail
pages <=TAIL_MAPPING_REUSED_MAX, because that is wrong.  For example, the
1st tail page already reuses the ->mapping field as _nr_pages_mapped.  It
hasn't blown up so far only because we luckily always prepare the tail
pages before preparing the head, so prep_compound_head() updates
folio->_nr_pages_mapped and thereby voids the poisoning.  This change makes
it always safe, even e.g. if we prep the head first.

Clean up free_tail_page_prepare() along the way, so its ->mapping poisoning
check also leverages the new macro.

Signed-off-by: Peter Xu
---

I split this out from another rfc series.  Removed the RFC tag because it
wasn't for this patch but for the documentation updates; I'll post the rfc
part alone.

Comments welcome, thanks.
---
 include/linux/mm_types.h | 11 +++++++++++
 mm/huge_memory.c         |  6 +++---
 mm/internal.h            |  3 ++-
 mm/page_alloc.c          | 28 +++++++++++-----------------
 4 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 291c05cacd48..81456fa5fda5 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -248,6 +248,17 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
 	return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page);
 }
 
+/*
+ * This macro defines the maximum tail pages (of a folio) that can have the
+ * page->mapping field reused.
+ *
+ * When a tail page's mapping field is reused, it'll be exempted from
+ * ->mapping poisoning and checks.  Also see the macro TAIL_MAPPING.
+ *
+ * When growing the folio struct, please consider growing this too.
+ */
+#define TAIL_MAPPING_REUSED_MAX  (2)
+
 /**
  * struct folio - Represents a contiguous set of bytes.
  * @flags: Identical to the page flags.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0b709d2c46c6..72f244e16dcb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2444,9 +2444,9 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_dirty) |
 			 LRU_GEN_MASK | LRU_REFS_MASK));
 
-	/* ->mapping in first and second tail page is replaced by other uses */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
+	/* ->mapping in tail pages <=TAIL_MAPPING_REUSED_MAX is reused */
+	VM_BUG_ON_PAGE(tail > TAIL_MAPPING_REUSED_MAX &&
+		       page_tail->mapping != TAIL_MAPPING, page_tail);
 	page_tail->mapping = head->mapping;
 	page_tail->index = head->index + tail;
 
diff --git a/mm/internal.h b/mm/internal.h
index 02a6fd36f46e..a75df0bd1f89 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -428,7 +428,8 @@ static inline void prep_compound_tail(struct page *head, int tail_idx)
 {
 	struct page *p = head + tail_idx;
 
-	p->mapping = TAIL_MAPPING;
+	if (tail_idx > TAIL_MAPPING_REUSED_MAX)
+		p->mapping = TAIL_MAPPING;
 	set_compound_head(p, head);
 	set_page_private(p, 0);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 96b7c1a7d1f2..7ab7869f3c7f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -990,7 +990,7 @@ static inline bool is_check_pages_enabled(void)
 static int free_tail_page_prepare(struct page *head_page, struct page *page)
 {
 	struct folio *folio = (struct folio *)head_page;
-	int ret = 1;
+	int ret = 1, index = page - head_page;
 
 	/*
 	 * We rely page->lru.next never has bit 0 set, unless the page
@@ -1002,9 +1002,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 		ret = 0;
 		goto out;
 	}
-	switch (page - head_page) {
-	case 1:
-		/* the first tail page: these may be in place of ->mapping */
+
+	/* Sanity check the first tail page */
+	if (index == 1) {
 		if (unlikely(folio_entire_mapcount(folio))) {
 			bad_page(page, "nonzero entire_mapcount");
 			goto out;
@@ -1017,20 +1017,14 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 			bad_page(page, "nonzero pincount");
 			goto out;
 		}
-		break;
-	case 2:
-		/*
-		 * the second tail page: ->mapping is
-		 * deferred_list.next -- ignore value.
-		 */
-		break;
-	default:
-		if (page->mapping != TAIL_MAPPING) {
-			bad_page(page, "corrupted mapping in tail page");
-			goto out;
-		}
-		break;
 	}
+
+	/* Sanity check the rest of the tail pages over ->mapping */
+	if (index > TAIL_MAPPING_REUSED_MAX && page->mapping != TAIL_MAPPING) {
+		bad_page(page, "corrupted mapping in tail page");
+		goto out;
+	}
+
 	if (unlikely(!PageTail(page))) {
 		bad_page(page, "PageTail not set");
 		goto out;