From patchwork Wed Mar 6 00:15:10 2024
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13583145
Date: Wed, 6 Mar 2024 00:15:10 +0000
Message-ID: <20240306001511.932348-1-jthoughton@google.com>
Subject: [PATCH] mm: Add an explicit smp_wmb() to UFFDIO_CONTINUE
From: James Houghton
To: Peter Xu , Axel Rasmussen , Andrew Morton
Cc: Muchun Song , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton

Users of UFFDIO_CONTINUE may reasonably assume that a write memory
barrier is included as part of UFFDIO_CONTINUE. That is, a user may
(mistakenly) believe that all writes it has done to a page that it is
now UFFDIO_CONTINUE'ing are guaranteed to be visible to anyone
subsequently reading the page through the newly mapped virtual memory
region.
Include only a single smp_wmb() for each UFFDIO_CONTINUE, as that is all
that is necessary. While we're at it, optimize the smp_wmb() that is
already incidentally present for the HugeTLB case.

Documentation doesn't specify if the kernel does a wmb(), so it's not
wrong not to include it. But by not including it, we are making it easy
for a user to have a very hard-to-detect bug. Include it now to be safe.

A user that decides to update the contents of the page in one thread and
UFFDIO_CONTINUE that page in another must already take additional steps
to synchronize properly.

Signed-off-by: James Houghton
---

I'm not sure if this patch should be merged. I think it's the right
thing to do, as it is very easy for a user to get this wrong. (I have
been using UFFDIO_CONTINUE for >2 years and only realized this problem
recently.) Given that it's not a "bug" strictly speaking, even if this
patch is a good idea, I'm unsure if it needs to be backported. This
quirk has existed since minor fault support was added for shmem[1].

I've tried to see if I can legitimately get a user to read stale data,
and a few attempts with this test[2] have been unsuccessful.

[1]: commit 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE
     for shmem")
[2]: https://gist.github.com/48ca/38d0665b0f1a6319a56507dc73a173f9

 mm/hugetlb.c     | 15 +++++++++------
 mm/userfaultfd.c | 18 ++++++++++++++++++
 2 files changed, 27 insertions(+), 6 deletions(-)

base-commit: a7f399ae964e1d2a11d88d863a1d64392678ccaf

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bb17e5c22759..533bf6b2d94d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6779,12 +6779,15 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 		}
 	}
 
-	/*
-	 * The memory barrier inside __folio_mark_uptodate makes sure that
-	 * preceding stores to the page contents become visible before
-	 * the set_pte_at() write.
-	 */
-	__folio_mark_uptodate(folio);
+	if (!is_continue) {
+		/*
+		 * The memory barrier inside __folio_mark_uptodate makes sure
+		 * that preceding stores to the page contents become visible
+		 * before the set_pte_at() write.
+		 */
+		__folio_mark_uptodate(folio);
+	} else
+		WARN_ON_ONCE(!folio_test_uptodate(folio));
 
 	/* Add shared, newly allocated pages to the page cache. */
 	if (vm_shared && !is_continue) {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 503ea77c81aa..d515b640ca48 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -531,6 +531,10 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
 		goto out_unlock;
 	}
 
+	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
+		/* See the comment in mfill_atomic. */
+		smp_wmb();
+
 	while (src_addr < src_start + len) {
 		BUG_ON(dst_addr >= dst_start + len);
 
@@ -743,6 +747,20 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
 		goto out_unlock;
 
+	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
+		/*
+		 * A caller might reasonably assume that UFFDIO_CONTINUE
+		 * contains a wmb() to ensure that any writes to the
+		 * about-to-be-mapped page by the thread doing the
+		 * UFFDIO_CONTINUE are guaranteed to be visible to subsequent
+		 * reads of the page through the newly mapped address.
+		 *
+		 * For MFILL_ATOMIC_COPY, the wmb() is done for each COPYed
+		 * page. We can do the wmb() now for CONTINUE as the user has
+		 * already prepared the page contents.
+		 */
+		smp_wmb();
+
 	while (src_addr < src_start + len) {
 		pmd_t dst_pmdval;