From patchwork Tue Nov 19 09:07:40 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13879517
From: Qi Zheng
To: pasha.tatashin@soleen.com, tongtiangen@huawei.com, jannh@google.com,
 lorenzo.stoakes@oracle.com, david@redhat.com, ryan.roberts@arm.com,
 peterx@redhat.com, jgg@ziepe.ca, muchun.song@linux.dev,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Qi Zheng
Subject: [PATCH] mm: pgtable: make ptep_clear() non-atomic
Date: Tue, 19 Nov 2024 17:07:40 +0800
Message-Id: <20241119090740.65768-1-zhengqi.arch@bytedance.com>

The generic ptep_get_and_clear() implementation is just a simple
combination of ptep_get() and pte_clear(). But on some architectures
(such as x86 and arm64), the hardware modifies the A/D bits of the page
table entry, so ptep_get_and_clear() needs to be overridden and
implemented as an atomic operation to avoid racing with those hardware
updates, which has a performance cost.

Commit d283d422c6c4 ("x86: mm: add x86_64 support for page table
check") added ptep_clear() on x86 and made it call ptep_get_and_clear()
when CONFIG_PAGE_TABLE_CHECK is enabled. The page table check feature
does not actually care about the A/D bits, so only ptep_get() +
pte_clear() should be called. But considering that the page table check
is a debug option, this should not have much of an impact.

But then commit de8c8e52836d ("mm: page_table_check: add hooks to
public helpers") changed ptep_clear() to unconditionally call
ptep_get_and_clear(), so that the CONFIG_PAGE_TABLE_CHECK check can be
put into the page table check stubs (in
include/linux/page_table_check.h). This also causes a performance loss
for kernels without CONFIG_PAGE_TABLE_CHECK enabled, which doesn't make
sense.

Currently ptep_clear() is only used in debug code and in khugepaged
collapse paths, which are fairly expensive, so the cost of an extra
atomic RMW operation does not matter. But it may be used in other paths
in the future. After all, for a present pte entry, we need to call
ptep_clear() instead of pte_clear() to ensure that PAGE_TABLE_CHECK
works properly.

So to be more precise, just call ptep_get() and pte_clear() in
ptep_clear().

Signed-off-by: Qi Zheng
Reviewed-by: Pasha Tatashin
Reviewed-by: Jann Horn
Reviewed-by: Muchun Song
---
 include/linux/pgtable.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index adef9d6e9b1ba..e59decd22e1cb 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -533,7 +533,10 @@ static inline void clear_young_dirty_ptes(struct vm_area_struct *vma,
 static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep)
 {
-	ptep_get_and_clear(mm, addr, ptep);
+	pte_t pte = ptep_get(ptep);
+
+	pte_clear(mm, addr, ptep);
+	page_table_check_pte_clear(mm, pte);
 }
 
 #ifdef CONFIG_GUP_GET_PXX_LOW_HIGH