From patchwork Thu Aug 25 14:32:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12954809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEB6CC64990 for ; Thu, 25 Aug 2022 14:33:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DD5396B0073; Thu, 25 Aug 2022 10:33:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D84DD940007; Thu, 25 Aug 2022 10:33:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C25306B0075; Thu, 25 Aug 2022 10:33:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id B10A46B0073 for ; Thu, 25 Aug 2022 10:33:19 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 7ABCA121CB4 for ; Thu, 25 Aug 2022 14:33:19 +0000 (UTC) X-FDA: 79838357718.28.BDF392C Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf26.hostedemail.com (Postfix) with ESMTP id A3A9514000B for ; Thu, 25 Aug 2022 14:33:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661437997; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HAu4rC9CyeyNudYPR+y9tR4vVDLBYqDeDne8QlTGLbM=; b=eX6J7vGC97jvonQRvB7DQVVbDYyZ3ipVlvUW74CtHdyMCSjI32aETRvkRDBGET8Pa5cg8r sGp3oWdIaWELcVCLu9zAdLHzFdSaKQk02gDmVHcdVA0JtzQVpieDMkoZ2LVU84NRF3m+6O S+TZ6lLA5YehNg4HYqxVHxCNjUpHQQE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-170-ryT9HQ2ANxCKwMP8lBcWpw-1; Thu, 25 Aug 2022 10:33:15 -0400 X-MC-Unique: ryT9HQ2ANxCKwMP8lBcWpw-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 45CA218A01D5; Thu, 25 Aug 2022 14:33:03 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.192.29]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7CC50492CA5; Thu, 25 Aug 2022 14:32:59 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, gregkh@linuxfoundation.org, David Hildenbrand , Mike Kravetz , Peter Feiner , "Kirill A . Shutemov" , Cyrill Gorcunov , Pavel Emelyanov , Jamie Liu , Hugh Dickins , Naoya Horiguchi , Bjorn Helgaas , Muchun Song , Peter Xu , stable@vger.kernel.org Subject: [PATCH 4.9-stable -- 5.19-stable] mm/hugetlb: fix hugetlb not supporting softdirty tracking Date: Thu, 25 Aug 2022 16:32:58 +0200 Message-Id: <20220825143258.36151-1-david@redhat.com> In-Reply-To: <1661424546448@kroah.com> References: <1661424546448@kroah.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.85 on 10.11.54.9 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661437997; a=rsa-sha256; cv=none; b=CycNZB3ppRVCvww4StQifGtWfaBZtOFE6UuFwtv5VYpAf9dNxMcixCxT7GtzPpcmBYai/y iGBF9HWUB+qaJE27bNV5OExrda1uuI/r2DUy6MaE5sZXP0aJNMr4+yIGF21MkxBJ+hPTYy vtt8FX9OpBOXZECRdlvcYujE8KKrUVg= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=eX6J7vGC; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661437997; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=HAu4rC9CyeyNudYPR+y9tR4vVDLBYqDeDne8QlTGLbM=; b=5vivnfLxUepbEpPEk5kmpqsxRQQ4gVVYfVb4ATwuWbxZpgId8yxEIlI8VB0rTbceeskxdd U26ON+XLVhpGvgnw4Ip2RtjZ0BaFQi8mhzHYHVwWmSxwlUXB/Cmm8Ukx7o5cFcEXGPMsXH 0HG8/4LX6zXX8Xt3wbz+npEsFR7QvGI= Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=eX6J7vGC; spf=pass (imf26.hostedemail.com: domain of david@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: kakuy1ttg96bcnfsyfpw65kaccjpun9g X-Rspamd-Queue-Id: A3A9514000B X-HE-Tag: 1661437997-947870 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: commit f96f7a40874d7c746680c0b9f57cef2262ae551f upstream. Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2. I observed that hugetlb does not support/expect write-faults in shared mappings that would have to map the R/O-mapped page writable -- and I found two case where we could currently get such faults and would erroneously map an anon page into a shared mapping. Reproducers part of the patches. I propose to backport both fixes to stable trees. The first fix needs a small adjustment. This patch (of 2): Staring at hugetlb_wp(), one might wonder where all the logic for shared mappings is when stumbling over a write-protected page in a shared mapping. In fact, there is none, and so far we thought we could get away with that because e.g., mprotect() should always do the right thing and map all pages directly writable. Looks like we were wrong: -------------------------------------------------------------------------- #include #include #include #include #include #include #include #define HUGETLB_SIZE (2 * 1024 * 1024u) static void clear_softdirty(void) { int fd = open("/proc/self/clear_refs", O_WRONLY); const char *ctrl = "4"; int ret; if (fd < 0) { fprintf(stderr, "open(clear_refs) failed\n"); exit(1); } ret = write(fd, ctrl, strlen(ctrl)); if (ret != strlen(ctrl)) { fprintf(stderr, "write(clear_refs) failed\n"); exit(1); } close(fd); } int main(int argc, char **argv) { char *map; int fd; fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT); if (!fd) { fprintf(stderr, "open() failed\n"); return -errno; } if (ftruncate(fd, HUGETLB_SIZE)) { fprintf(stderr, "ftruncate() failed\n"); return -errno; } map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } *map = 0; if (mprotect(map, HUGETLB_SIZE, PROT_READ)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } clear_softdirty(); if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } *map = 0; return 0; } -------------------------------------------------------------------------- Above test fails with SIGBUS when there is only a single free hugetlb page. # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test Bus error (core dumped) And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics: # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test # cat /proc/meminfo | grep HugePages_ HugePages_Total: 2 HugePages_Free: 1 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Reason in this particular case is that vma_wants_writenotify() will return "true", removing VM_SHARED in vma_set_page_prot() to map pages write-protected. Let's teach vma_wants_writenotify() that hugetlb does not support softdirty tracking. Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Signed-off-by: David Hildenbrand Reviewed-by: Mike Kravetz Cc: Peter Feiner Cc: Kirill A. Shutemov Cc: Cyrill Gorcunov Cc: Pavel Emelyanov Cc: Jamie Liu Cc: Hugh Dickins Cc: Naoya Horiguchi Cc: Bjorn Helgaas Cc: Muchun Song Cc: Peter Xu Cc: [3.18+] Signed-off-by: Andrew Morton Signed-off-by: David Hildenbrand --- mm/mmap.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 7c59ec73acc3..3b284b091bb7 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1693,8 +1693,12 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) pgprot_val(vm_pgprot_modify(vm_page_prot, vm_flags))) return 0; - /* Do we need to track softdirty? */ - if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY)) + /* + * Do we need to track softdirty? hugetlb does not support softdirty + * tracking yet. + */ + if (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && !(vm_flags & VM_SOFTDIRTY) && + !is_vm_hugetlb_page(vma)) return 1; /* Specialty mapping? */