From patchwork Fri Mar 11 19:07:44 2022
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12778538
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Nadav Amit
Subject: [RESEND PATCH v3 0/5] mm/mprotect: avoid unnecessary TLB flushes
Date: Fri, 11 Mar 2022 11:07:44 -0800
Message-Id: <20220311190749.338281-1-namit@vmware.com>

This patch-set is intended to remove unnecessary TLB flushes during
mprotect() syscalls. Once this patch-set makes it through, similar and
further optimizations for MADV_COLD and userfaultfd would be possible.

Sorry for the time it took me to get to v3.

Basically, there are 3 optimizations in this patch-set:

1. Use TLB batching infrastructure to batch flushes across VMAs and do
   better/fewer flushes. This would also be handy for later userfaultfd
   enhancements.
2. Avoid TLB flushes on permission promotion. This optimization is the
   one that provides most of the performance benefits. Note that the
   previous batching infrastructure changes are needed for that to
   happen.
3. Avoid TLB flushes on change_huge_pmd() that are only needed to
   prevent the A/D bits from changing.

Andrew asked for some benchmark numbers. I do not have an easy
deterministic macrobenchmark in which it is easy to show benefit. I
therefore ran a microbenchmark: a loop that does the following on
anonymous memory, just as a sanity check to see that time is saved by
avoiding TLB flushes. The loop goes:

	mprotect(p, PAGE_SIZE, PROT_READ)
	mprotect(p, PAGE_SIZE, PROT_READ|PROT_WRITE)
	*p = 0; // make the page writable

The test was run in a KVM guest with 1 or 2 threads (the second thread
was busy-looping). I measured the time (cycles) of each operation:

			1 thread		2 threads
			mmots	+patch		mmots	+patch
PROT_READ		3494	2725 (-22%)	8630	7788 (-10%)
PROT_READ|WRITE		3952	2724 (-31%)	9075	2865 (-68%)

[ mmots = v5.17-rc6-mmots-2022-03-06-20-38 ]

The exact numbers are really meaningless, but the benefit is clear.
There are 2 interesting results though.

(1) PROT_READ is cheaper, while one can expect it not to be affected.
This is presumably due to a TLB miss that is saved.

(2) Without memory access (*p = 0), the speedup of the patch is even
greater. In that scenario mprotect(PROT_READ) also avoids the TLB flush.
As a result both operations on the patched kernel take roughly 1500
cycles (with either 1 or 2 threads), whereas on mmots their cost is as
high as presented in the table.

---

v2 -> v3:
* Fix orders of patches (order could lead to breakage)
* Better comments
* Clearer KNL detection [Dave]
* Assertion on PF error-code [Dave]
* Comments, code, function names improvements [PeterZ]
* Flush on access-bit clearing on PMD changes to follow the way
  flushing on x86 is done today in the kernel.

v1 -> v2:
* Wrong detection of permission demotion [Andrea]
* Better comments [Andrea]
* Handle THP [Andrea]
* Batching across VMAs [Peter Xu]
* Avoid open-coding PTE analysis
* Fix wrong use of the mmu_gather()

Nadav Amit (5):
  x86: Detection of Knights Landing A/D leak
  x86/mm: check exec permissions on fault
  mm/mprotect: use mmu_gather
  mm/mprotect: do not flush on permission promotion
  mm: avoid unnecessary flush on change_huge_pmd()

 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/pgtable.h       |  5 ++
 arch/x86/include/asm/pgtable_types.h |  2 +
 arch/x86/include/asm/tlbflush.h      | 82 ++++++++++++++++++++++++
 arch/x86/kernel/cpu/intel.c          |  5 ++
 arch/x86/mm/fault.c                  | 22 ++++++-
 arch/x86/mm/pgtable.c                | 10 +++
 fs/exec.c                            |  6 +-
 include/asm-generic/tlb.h            | 14 +++++
 include/linux/huge_mm.h              |  5 +-
 include/linux/mm.h                   |  5 +-
 include/linux/pgtable.h              | 20 ++++++
 mm/huge_memory.c                     | 19 ++++--
 mm/mprotect.c                        | 94 +++++++++++++++-------------
 mm/pgtable-generic.c                 |  8 +++
 mm/userfaultfd.c                     |  6 +-
 16 files changed, 248 insertions(+), 56 deletions(-)
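
For anyone who wants to reproduce the sanity check, the microbenchmark
loop described in the cover letter can be sketched in user space roughly
as follows. This is not the program used for the numbers above: the
helper names bench_mprotect() and now_ns() are made up for illustration,
and it measures nanoseconds via clock_gettime() rather than cycles, so
only the relative difference between kernels is comparable. The second,
busy-looping thread is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Monotonic time in nanoseconds (assumption: the cover letter used cycles). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: run the mprotect() loop `iters` times on a single
 * anonymous page and accumulate the time spent in each of the two calls.
 * Returns 0 on success, -1 on bad arguments or a failed syscall.
 */
static int bench_mprotect(long iters, uint64_t *ro_ns, uint64_t *rw_ns)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	volatile char *p;
	void *map;
	int rc = 0;

	assert(ro_ns && rw_ns);
	*ro_ns = 0;
	*rw_ns = 0;
	if (iters <= 0)
		return -1;

	map = mmap(NULL, page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	p = map;
	*p = 0;		/* fault the page in before measuring */

	for (long i = 0; i < iters; i++) {
		uint64_t t0, t1, t2;

		t0 = now_ns();
		if (mprotect(map, page, PROT_READ)) {
			rc = -1;
			break;
		}
		t1 = now_ns();
		if (mprotect(map, page, PROT_READ | PROT_WRITE)) {
			rc = -1;
			break;
		}
		t2 = now_ns();

		*p = 0;	/* write access, as in the loop in the cover letter */

		*ro_ns += t1 - t0;	/* time in mprotect(PROT_READ) */
		*rw_ns += t2 - t1;	/* time in mprotect(PROT_READ|PROT_WRITE) */
	}

	munmap(map, page);
	return rc;
}
```

Calling bench_mprotect() with a large iteration count and dividing each
total by that count gives a per-call average for each operation.
Dropping the write to *p corresponds to the variant discussed in
result (2).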