From patchwork Thu Feb 25 07:29:05 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12103557
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
 Ingo Molnar, Borislav Petkov, Nadav Amit, Sean Christopherson,
 Andrew Morton, x86@kernel.org
Subject: [RFC 1/6] vdso/extable: fix calculation of base
Date: Wed, 24 Feb 2021 23:29:05 -0800
Message-Id: <20210225072910.2811795-2-namit@vmware.com>
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>

From: Nadav Amit

Apparently, the assembler resolves __ex_table as the location at which
the pushsection directive was issued. Therefore, when there is more than
a single entry in the vDSO exception table, the calculated base and
fixup addresses are wrong.

Fix the calculation of the expected fault IP and the new IP by advancing
the base after each entry.
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 arch/x86/entry/vdso/extable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index afcf5b65beef..c81e78636220 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -32,7 +32,7 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 	nr_entries = image->extable_len / (sizeof(*extable));
 	extable = image->extable;
 
-	for (i = 0; i < nr_entries; i++) {
+	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {
 		if (regs->ip == base + extable[i].insn) {
 			regs->ip = base + extable[i].fixup;
 			regs->di = trapnr;

From patchwork Thu Feb 25 07:29:06 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12103559
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
 Ingo Molnar, Borislav Petkov, Nadav Amit, Sean Christopherson,
 Andrew Morton, x86@kernel.org
Subject: [RFC 2/6] x86/vdso: add mask and flags to extable
Date: Wed, 24 Feb 2021 23:29:06 -0800
Message-Id: <20210225072910.2811795-3-namit@vmware.com>
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>
From: Nadav Amit

Add a "mask" field to vDSO exception tables that says which exceptions
should be handled, and a "flags" field that provides additional
information about the exception.

The existing preprocessor macro _ASM_VDSO_EXTABLE_HANDLE for assembly is
not easy to use, as it requires the user to stringify the expanded C
macro. Remove _ASM_VDSO_EXTABLE_HANDLE and use a scheme similar to
ALTERNATIVE, using assembly macros directly in assembly without wrapping
them in C macros.

Move the test for the exceptions vsgx supports out of the generic C code
and into vsgx-specific assembly, by setting the supported exceptions in
the mask.

Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 arch/x86/entry/vdso/extable.c |  9 +--------
 arch/x86/entry/vdso/extable.h | 21 +++++++++++++--------
 arch/x86/entry/vdso/vsgx.S    |  9 +++++++--
 3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index c81e78636220..93fb37bd32ad 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -7,6 +7,7 @@
 
 struct vdso_exception_table_entry {
 	int insn, fixup;
+	unsigned int mask, flags;
 };
 
 bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
@@ -17,14 +18,6 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 	unsigned int nr_entries, i;
 	unsigned long base;
 
-	/*
-	 * Do not attempt to fixup #DB or #BP. It's impossible to identify
-	 * whether or not a #DB/#BP originated from within an SGX enclave and
-	 * SGX enclaves are currently the only use case for vDSO fixup.
-	 */
-	if (trapnr == X86_TRAP_DB || trapnr == X86_TRAP_BP)
-		return false;
-
 	if (!current->mm->context.vdso)
 		return false;

diff --git a/arch/x86/entry/vdso/extable.h b/arch/x86/entry/vdso/extable.h
index b56f6b012941..7ca8a0776805 100644
--- a/arch/x86/entry/vdso/extable.h
+++ b/arch/x86/entry/vdso/extable.h
@@ -2,26 +2,31 @@
 #ifndef __VDSO_EXTABLE_H
 #define __VDSO_EXTABLE_H
 
+#include
+
+#define ASM_VDSO_ASYNC_FLAGS	(1 << 0)
+
 /*
  * Inject exception fixup for vDSO code. Unlike normal exception fixup,
  * vDSO uses a dedicated handler the addresses are relative to the overall
  * exception table, not each individual entry.
  */
 #ifdef __ASSEMBLY__
-#define _ASM_VDSO_EXTABLE_HANDLE(from, to)	\
-	ASM_VDSO_EXTABLE_HANDLE from to
-
-.macro ASM_VDSO_EXTABLE_HANDLE from:req to:req
+.macro ASM_VDSO_EXTABLE_HANDLE from:req to:req mask:req flags:req
 	.pushsection __ex_table, "a"
 	.long (\from) - __ex_table
 	.long (\to) - __ex_table
+	.long (\mask)
+	.long (\flags)
 	.popsection
 .endm
 #else
-#define _ASM_VDSO_EXTABLE_HANDLE(from, to)		\
-	".pushsection __ex_table, \"a\"\n"		\
-	".long (" #from ") - __ex_table\n"		\
-	".long (" #to ") - __ex_table\n"		\
+#define ASM_VDSO_EXTABLE_HANDLE(from, to, mask, flags)	\
+	".pushsection __ex_table, \"a\"\n"		\
+	".long (" #from ") - __ex_table\n"		\
+	".long (" #to ") - __ex_table\n"		\
+	".long (" #mask ")\n"				\
+	".long (" #flags ")\n"				\
 	".popsection\n"
 #endif

diff --git a/arch/x86/entry/vdso/vsgx.S b/arch/x86/entry/vdso/vsgx.S
index 86a0e94f68df..c588255af480 100644
--- a/arch/x86/entry/vdso/vsgx.S
+++ b/arch/x86/entry/vdso/vsgx.S
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 
 #include "extable.h"
@@ -146,6 +147,10 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave)
 
 	.cfi_endproc
 
-_ASM_VDSO_EXTABLE_HANDLE(.Lenclu_eenter_eresume, .Lhandle_exception)
-
+/*
+ * Do not attempt to fixup #DB or #BP. It's impossible to identify
+ * whether or not a #DB/#BP originated from within an SGX enclave.
+ */
+ASM_VDSO_EXTABLE_HANDLE	.Lenclu_eenter_eresume, .Lhandle_exception, \
+			~((1 << X86_TRAP_DB) | (1 << X86_TRAP_BP)), 0

From patchwork Thu Feb 25 07:29:07 2021
X-Patchwork-Id: 12103561
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
 Ingo Molnar, Borislav Petkov, Nadav Amit, Sean Christopherson,
 Andrew Morton, x86@kernel.org
Subject: [RFC 3/6] x86/vdso: introduce page_prefetch()
Date: Wed, 24 Feb 2021 23:29:07 -0800
Message-Id: <20210225072910.2811795-4-namit@vmware.com>
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>

From: Nadav Amit

Introduce a new vDSO function, page_prefetch(), to be used when certain
memory that might be paged out is expected to be used soon. The function
prefetches the page if needed. It returns zero if the page is accessible
after the call and -1 otherwise. page_prefetch() is intended to be very
lightweight both when the page is already present and when it is being
prefetched.

The implementation leverages the new vDSO exception-table mechanism.
page_prefetch() accesses the page for read and has a corresponding vDSO
exception-table entry indicating that a #PF might occur and that in such
a case the page should be brought in asynchronously. If a #PF indeed
occurs, the page-fault handler sets the FAULT_FLAG_RETRY_NOWAIT flag.
If the page fault was not resolved, the page-fault handler does not
retry; instead it jumps to the new IP marked in the exception table, and
the vDSO code returns the corresponding return value.

Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 arch/x86/Kconfig                |  1 +
 arch/x86/entry/vdso/Makefile    |  1 +
 arch/x86/entry/vdso/extable.c   | 59 +++++++++++++++++++++++++--------
 arch/x86/entry/vdso/vdso.lds.S  |  1 +
 arch/x86/entry/vdso/vprefetch.S | 39 ++++++++++++++++++++++
 arch/x86/include/asm/vdso.h     | 38 +++++++++++++++++++--
 arch/x86/mm/fault.c             | 11 ++++--
 lib/vdso/Kconfig                |  5 +++
 8 files changed, 136 insertions(+), 19 deletions(-)
 create mode 100644 arch/x86/entry/vdso/vprefetch.S

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 21f851179ff0..86a4c265e8af 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -136,6 +136,7 @@ config X86
 	select GENERIC_TIME_VSYSCALL
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_VDSO_TIME_NS
+	select GENERIC_VDSO_PREFETCH
 	select GUP_GET_PTE_LOW_HIGH if X86_PAE
 	select HARDIRQS_SW_RESEND
 	select HARDLOCKUP_CHECK_TIMESTAMP if X86_64

diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 02e3e42f380b..e32ca1375b84 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -28,6 +28,7 @@ vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o
 vobjs32-y := vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o
 vobjs32-y += vdso32/vclock_gettime.o
 vobjs-$(CONFIG_X86_SGX)	+= vsgx.o
+vobjs-$(CONFIG_GENERIC_VDSO_PREFETCH) += vprefetch.o
 
 # files to link into kernel
 obj-y += vma.o extable.o

diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index 93fb37bd32ad..e821887112ce 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -4,36 +4,67 @@
 #include
 #include
 #include
+#include "extable.h"
 
 struct vdso_exception_table_entry {
 	int insn, fixup;
 	unsigned int mask, flags;
 };
 
-bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
-			  unsigned long error_code, unsigned long fault_addr)
+static unsigned long
+get_vdso_exception_table_entry(const struct pt_regs *regs, int trapnr,
+			       unsigned int *flags)
 {
 	const struct vdso_image *image = current->mm->context.vdso_image;
 	const struct vdso_exception_table_entry *extable;
 	unsigned int nr_entries, i;
 	unsigned long base;
+	unsigned long ip = regs->ip;
+	unsigned long vdso_base = (unsigned long)current->mm->context.vdso;
 
-	if (!current->mm->context.vdso)
-		return false;
-
-	base = (unsigned long)current->mm->context.vdso + image->extable_base;
+	base = vdso_base + image->extable_base;
 	nr_entries = image->extable_len / (sizeof(*extable));
 	extable = image->extable;
 
 	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {
-		if (regs->ip == base + extable[i].insn) {
-			regs->ip = base + extable[i].fixup;
-			regs->di = trapnr;
-			regs->si = error_code;
-			regs->dx = fault_addr;
-			return true;
-		}
+		if (ip != base + extable[i].insn)
+			continue;
+
+		if (!((1u << trapnr) & extable[i].mask))
+			continue;
+
+		/* found */
+		if (flags)
+			*flags = extable[i].flags;
+		return base + extable[i].fixup;
 	}
-	return false;
+	return 0;
+}
+
+bool __fixup_vdso_exception(struct pt_regs *regs, int trapnr,
+			    unsigned long error_code, unsigned long fault_addr)
+{
+	unsigned long new_ip;
+
+	new_ip = get_vdso_exception_table_entry(regs, trapnr, NULL);
+	if (!new_ip)
+		return false;
+
+	instruction_pointer_set(regs, new_ip);
+	regs->di = trapnr;
+	regs->si = error_code;
+	regs->dx = fault_addr;
+	return true;
+}
+
+__attribute_const__ bool __is_async_vdso_exception(struct pt_regs *regs,
+						   int trapnr)
+{
+	unsigned long new_ip;
+	unsigned int flags;
+
+	new_ip = get_vdso_exception_table_entry(regs, trapnr, &flags);
+
+	return new_ip && (flags & ASM_VDSO_ASYNC_FLAGS);
 }

diff --git a/arch/x86/entry/vdso/vdso.lds.S b/arch/x86/entry/vdso/vdso.lds.S
index 4bf48462fca7..fd4ba24571c8 100644
--- a/arch/x86/entry/vdso/vdso.lds.S
+++ b/arch/x86/entry/vdso/vdso.lds.S
@@ -28,6 +28,7 @@ VERSION {
 		clock_getres;
 		__vdso_clock_getres;
 		__vdso_sgx_enter_enclave;
+		__vdso_prefetch_page;
 	local: *;
 	};
 }

diff --git a/arch/x86/entry/vdso/vprefetch.S b/arch/x86/entry/vdso/vprefetch.S
new file mode 100644
index 000000000000..a0fcafb7d546
--- /dev/null
+++ b/arch/x86/entry/vdso/vprefetch.S
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include
+#include
+#include
+#include
+
+#include "extable.h"
+
+.code64
+.section .text, "ax"
+
+SYM_FUNC_START(__vdso_prefetch_page)
+	/* Prolog */
+	.cfi_startproc
+	push	%rbp
+	.cfi_adjust_cfa_offset 8
+	.cfi_rel_offset %rbp, 0
+	mov	%rsp, %rbp
+	.cfi_def_cfa_register %rbp
+
+	xor	%rax, %rax
+.Laccess_page:
+	movb	(%rdi), %dil
+.Lout:
+
+	/* Epilog */
+	pop	%rbp
+	.cfi_def_cfa %rsp, 8
+	ret
+
+.Lhandle_exception:
+	mov	$-1ll, %rax
+	jmp	.Lout
+	.cfi_endproc
+ASM_VDSO_EXTABLE_HANDLE	.Laccess_page, .Lhandle_exception, \
+			(1 << X86_TRAP_PF), ASM_VDSO_ASYNC_FLAGS

diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -15,6 +15,7 @@
 #include
+#include
 
 struct vdso_image {
 	void *data;
@@ -49,9 +50,40 @@ extern void __init init_vdso_image(const struct vdso_image *image);
 
 extern int map_vdso_once(const struct vdso_image *image, unsigned long addr);
 
-extern bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
-				 unsigned long error_code,
-				 unsigned long fault_addr);
+extern bool __fixup_vdso_exception(struct pt_regs *regs, int trapnr,
+				   unsigned long error_code,
+				   unsigned long fault_addr);
+
+extern __attribute_const__ bool __is_async_vdso_exception(struct pt_regs *regs,
+							  int trapnr);
+
+static inline bool is_exception_in_vdso(struct pt_regs *regs)
+{
+	const struct vdso_image *image = current->mm->context.vdso_image;
+	unsigned long vdso_base = (unsigned long)current->mm->context.vdso;
+
+	return regs->ip >= vdso_base && regs->ip < vdso_base + image->size &&
+	       vdso_base != 0;
+}
+
+static inline bool is_async_vdso_exception(struct pt_regs *regs, int trapnr)
+{
+	if (!is_exception_in_vdso(regs))
+		return false;
+
+	return __is_async_vdso_exception(regs, trapnr);
+}
+
+static inline bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
+					unsigned long error_code,
+					unsigned long fault_addr)
+{
+	if (is_exception_in_vdso(regs))
+		return __fixup_vdso_exception(regs, trapnr, error_code,
+					      fault_addr);
+	return false;
+}
+
 #endif /* __ASSEMBLER__ */
 #endif /* _ASM_X86_VDSO_H */

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f1f1b5a0956a..87d8ae46510c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1289,6 +1289,10 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (user_mode(regs)) {
 		local_irq_enable();
 		flags |= FAULT_FLAG_USER;
+		if (IS_ENABLED(CONFIG_GENERIC_VDSO_PREFETCH) &&
+		    is_async_vdso_exception(regs, X86_TRAP_PF))
+			flags |= FAULT_FLAG_ALLOW_RETRY |
+				 FAULT_FLAG_RETRY_NOWAIT;
 	} else {
 		if (regs->flags & X86_EFLAGS_IF)
 			local_irq_enable();
@@ -1407,8 +1411,11 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 */
 	if (unlikely((fault & VM_FAULT_RETRY) &&
		     (flags & FAULT_FLAG_ALLOW_RETRY))) {
-		flags |= FAULT_FLAG_TRIED;
-		goto retry;
+		if (!(flags & FAULT_FLAG_RETRY_NOWAIT)) {
+			flags |= FAULT_FLAG_TRIED;
+			goto retry;
+		}
+		fixup_vdso_exception(regs, X86_TRAP_PF, hw_error_code, address);
 	}
 
 	mmap_read_unlock(mm);

diff --git a/lib/vdso/Kconfig b/lib/vdso/Kconfig
index d883ac299508..a64d2b08b6f4 100644
--- a/lib/vdso/Kconfig
+++ b/lib/vdso/Kconfig
@@ -30,4 +30,9 @@ config GENERIC_VDSO_TIME_NS
 	  Selected by architectures which support time namespaces in the
 	  VDSO
 
+config GENERIC_VDSO_PREFETCH
+	bool
+	help
+	  Selected by architectures which support page prefetch VDSO
+
 endif

From patchwork Thu Feb 25 07:29:08 2021
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 12103563
From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Borislav Petkov,
Nadav Amit , Sean Christopherson , Andrew Morton , x86@kernel.org
Subject: [RFC 4/6] mm/swap_state: respect FAULT_FLAG_RETRY_NOWAIT
Date: Wed, 24 Feb 2021 23:29:08 -0800
Message-Id: <20210225072910.2811795-5-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

Certain use cases (e.g., prefetch_page()) may want to avoid polling while a page is brought in from swap. Yet, swap_cluster_readahead() and swap_vma_readahead() do not respect FAULT_FLAG_RETRY_NOWAIT.

Add support for FAULT_FLAG_RETRY_NOWAIT by not polling in these cases.

Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 mm/memory.c     | 15 +++++++++++++--
 mm/shmem.c      |  1 +
 mm/swap_state.c | 12 +++++++++---
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..13b9cf36268f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3326,12 +3326,23 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	}

 	if (!page) {
+		/*
+		 * Back out if we failed to bring the page in while we
+		 * tried to avoid I/O.
+		 */
+		if (fault_flag_allow_retry_first(vmf->flags) &&
+		    (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
+			ret = VM_FAULT_RETRY;
+			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+			goto out;
+		}
+
 		/*
 		 * Back out if somebody else faulted in this pte
 		 * while we released the pte lock.
 		 */
-		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-				vmf->address, &vmf->ptl);
+		vmf->pte = pte_offset_map_lock(vma->vm_mm,
+				vmf->pmd, vmf->address, &vmf->ptl);
 		if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 			ret = VM_FAULT_OOM;
 		delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
diff --git a/mm/shmem.c b/mm/shmem.c
index 7c6b6d8f6c39..b108e9ba9e89 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1525,6 +1525,7 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 	shmem_pseudo_vma_init(&pvma, info, index);
 	vmf.vma = &pvma;
 	vmf.address = 0;
+	vmf.flags = 0;
 	page = swap_cluster_readahead(swap, gfp, &vmf);
 	shmem_pseudo_vma_destroy(&pvma);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 751c1ef2fe0e..1e930f7ff8b3 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -656,10 +656,13 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	unsigned long mask;
 	struct swap_info_struct *si = swp_swap_info(entry);
 	struct blk_plug plug;
-	bool do_poll = true, page_allocated;
+	bool page_allocated, do_poll;
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address;

+	do_poll = !fault_flag_allow_retry_first(vmf->flags) ||
+		!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT);
+
 	mask = swapin_nr_pages(offset) - 1;
 	if (!mask)
 		goto skip;
@@ -838,7 +841,7 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 	pte_t *pte, pentry;
 	swp_entry_t entry;
 	unsigned int i;
-	bool page_allocated;
+	bool page_allocated, do_poll;
 	struct vma_swap_readahead ra_info = {
 		.win = 1,
 	};
@@ -873,9 +876,12 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 	}
 	blk_finish_plug(&plug);
 	lru_add_drain();
 skip:
+	do_poll =
(!fault_flag_allow_retry_first(vmf->flags) ||
+		!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) && ra_info.win == 1;
 	return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
-			ra_info.win == 1);
+			do_poll);
 }

 /**

From patchwork Thu Feb 25 07:29:09 2021
From: Nadav Amit
X-Google-Original-From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins , Andy Lutomirski , Thomas Gleixner , Peter Zijlstra , Ingo Molnar , Borislav Petkov , Nadav Amit , Sean Christopherson , Andrew Morton , x86@kernel.org
Subject: [RFC 5/6] mm: use lightweight reclaim on FAULT_FLAG_RETRY_NOWAIT
Date: Wed, 24 Feb 2021 23:29:09 -0800
Message-Id: <20210225072910.2811795-6-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

When FAULT_FLAG_RETRY_NOWAIT is set, the caller arguably wants only lightweight reclaim, since a long reclamation would not respect the "NOWAIT" semantic. Honor the request in swap and file-backed page faults accordingly during the first try.
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 mm/memory.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 13b9cf36268f..70899c92a9e6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2679,18 +2679,31 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 	return ret;
 }

-static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma)
+static gfp_t massage_page_gfp_mask(gfp_t gfp_mask, unsigned long vmf_flags)
 {
-	struct file *vm_file = vma->vm_file;
+	if (fault_flag_allow_retry_first(vmf_flags) &&
+	    (vmf_flags & FAULT_FLAG_RETRY_NOWAIT))
+		gfp_mask |= __GFP_NORETRY | __GFP_NOWARN;

-	if (vm_file)
-		return mapping_gfp_mask(vm_file->f_mapping) | __GFP_FS | __GFP_IO;
+	return gfp_mask;
+}
+
+static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma,
+				  unsigned long flags)
+{
+	struct file *vm_file = vma->vm_file;
+	gfp_t gfp_mask;

 	/*
 	 * Special mappings (e.g. VDSO) do not have any file so fake
 	 * a default GFP_KERNEL for them.
 	 */
-	return GFP_KERNEL;
+	if (!vm_file)
+		return GFP_KERNEL;
+
+	gfp_mask = mapping_gfp_mask(vm_file->f_mapping) | __GFP_FS | __GFP_IO;
+
+	return massage_page_gfp_mask(gfp_mask, flags);
 }

 /*
@@ -3253,6 +3266,7 @@ EXPORT_SYMBOL(unmap_mapping_range);
  */
 vm_fault_t do_swap_page(struct vm_fault *vmf)
 {
+	gfp_t gfp_mask = massage_page_gfp_mask(GFP_HIGHUSER_MOVABLE, vmf->flags);
 	struct vm_area_struct *vma = vmf->vma;
 	struct page *page = NULL, *swapcache;
 	swp_entry_t entry;
@@ -3293,8 +3307,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
 	    __swap_count(entry) == 1) {
 		/* skip swapcache */
-		page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
-					vmf->address);
+		page = alloc_page_vma(gfp_mask, vma, vmf->address);
 		if (page) {
 			int err;
@@ -3320,8 +3333,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			swap_readpage(page, true);
 		}
 	} else {
-		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
-					vmf);
+		page = swapin_readahead(entry, gfp_mask, vmf);
 		swapcache = page;
 	}
@@ -4452,7 +4464,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 		.address = address & PAGE_MASK,
 		.flags = flags,
 		.pgoff = linear_page_index(vma, address),
-		.gfp_mask = __get_fault_gfp_mask(vma),
+		.gfp_mask = __get_fault_gfp_mask(vma, flags),
 	};
 	unsigned int dirty = flags & FAULT_FLAG_WRITE;
 	struct mm_struct *mm = vma->vm_mm;

From patchwork Thu Feb 25 07:29:10 2021
From: Nadav Amit
X-Google-Original-From: Nadav Amit
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins , Andy Lutomirski , Thomas Gleixner , Peter Zijlstra , Ingo Molnar , Borislav Petkov , Nadav Amit , Sean Christopherson , Andrew Morton , x86@kernel.org
Subject: [PATCH 6/6] testing/selftest: test vDSO prefetch_page()
Date: Wed, 24 Feb 2021 23:29:10 -0800
Message-Id: <20210225072910.2811795-7-namit@vmware.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210225072910.2811795-1-namit@vmware.com>
References: <20210225072910.2811795-1-namit@vmware.com>
MIME-Version: 1.0

From: Nadav Amit

Test prefetch_page() on an invalid pointer, a file-backed mapping, and anonymous memory. Partial checks are also done with the mincore syscall to ensure the output of prefetch_page() is consistent with mincore (taking into account the different semantics of the two).

The tests are not foolproof, as they rely on the behavior of the page cache and the page-reclamation mechanism to produce a major page fault. They should, however, be robust in the sense that a test is skipped if its preconditions cannot be established. An open question remains on how much memory the anonymous-memory test should access to force the eviction of a page and trigger a refault.
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Sean Christopherson
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 tools/testing/selftests/vDSO/Makefile         |   2 +
 .../selftests/vDSO/vdso_test_prefetch_page.c  | 265 ++++++++++++++++++
 2 files changed, 267 insertions(+)
 create mode 100644 tools/testing/selftests/vDSO/vdso_test_prefetch_page.c

diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index d53a4d8008f9..dcd1ede8c0f7 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -11,6 +11,7 @@ ifeq ($(ARCH),$(filter $(ARCH),x86 x86_64))
 TEST_GEN_PROGS += $(OUTPUT)/vdso_standalone_test_x86
 endif
 TEST_GEN_PROGS += $(OUTPUT)/vdso_test_correctness
+TEST_GEN_PROGS += $(OUTPUT)/vdso_test_prefetch_page

 CFLAGS := -std=gnu99
 CFLAGS_vdso_standalone_test_x86 := -nostdlib -fno-asynchronous-unwind-tables -fno-stack-protector
@@ -33,3 +34,4 @@ $(OUTPUT)/vdso_test_correctness: vdso_test_correctness.c
 	vdso_test_correctness.c \
 	-o $@ \
 	$(LDFLAGS_vdso_test_correctness)
+$(OUTPUT)/vdso_test_prefetch_page: vdso_test_prefetch_page.c parse_vdso.c
diff --git a/tools/testing/selftests/vDSO/vdso_test_prefetch_page.c b/tools/testing/selftests/vDSO/vdso_test_prefetch_page.c
new file mode 100644
index 000000000000..35928c3f36ca
--- /dev/null
+++ b/tools/testing/selftests/vDSO/vdso_test_prefetch_page.c
@@ -0,0 +1,265 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vdso_test_prefetch_page.c: Test vDSO's prefetch_page()
+ */
+
+#define _GNU_SOURCE
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "../kselftest.h"
+#include "parse_vdso.h"
+
+const char *version = "LINUX_2.6";
+const char *name = "__vdso_prefetch_page";
+
+struct getcpu_cache;
+typedef long (*prefetch_page_t)(const void *p);
+
+#define MEM_SIZE_K	(9500000ull)
+#define PAGE_SIZE	(4096ull)
+
+#define SKIP_MINCORE_BEFORE	(1 << 0)
+#define SKIP_MINCORE_AFTER	(1 << 1)
+
+static prefetch_page_t prefetch_page;
+
+static const void *ptr_align(const void *p)
+{
+	return (const void *)((unsigned long)p & ~(PAGE_SIZE - 1));
+}
+
+static int __test_prefetch(const void *p, bool expected_no_io,
+			   const char *test_name, unsigned int skip_mincore)
+{
+	bool no_io;
+	char vec;
+	long r;
+
+	p = ptr_align(p);
+
+	/*
+	 * First, run a sanity check using mincore() to see if the page is in
+	 * memory when we expect it not to be. We can only trust mincore to
+	 * tell us when a page is already in memory when it should not be.
+	 */
+	if (!(skip_mincore & SKIP_MINCORE_BEFORE)) {
+		if (mincore((void *)p, PAGE_SIZE, &vec)) {
+			printf("[SKIP]\t%s: mincore failed: %s\n", test_name,
+			       strerror(errno));
+			return 0;
+		}
+
+		no_io = vec & 1;
+		if (!skip_mincore && no_io && !expected_no_io) {
+			printf("[SKIP]\t%s: unexpected page state: %s\n",
+			       test_name,
+			       no_io ? "in memory" : "not in memory");
+			return 0;
+		}
+	}
+
+	/*
+	 * Check that we got the expected result from prefetch_page().
+	 */
+	r = prefetch_page(p);
+
+	no_io = r == 0;
+	if (no_io != expected_no_io) {
+		printf("[FAIL]\t%s: prefetch_page() returned %ld\n",
+		       test_name, r);
+		return KSFT_FAIL;
+	}
+
+	if (skip_mincore & SKIP_MINCORE_AFTER)
+		return 0;
+
+	/*
+	 * Check again using mincore that the page state is as expected.
+	 * A bit racy. Skip the test if mincore fails.
+	 */
+	if (mincore((void *)p, PAGE_SIZE, &vec)) {
+		printf("[SKIP]\t%s: mincore failed: %s\n", test_name,
+		       strerror(errno));
+		return 0;
+	}
+
+	no_io = vec & 1;
+	if (no_io != expected_no_io) {
+		printf("[FAIL]\t%s: mincore reported page is %s\n",
+		       test_name, no_io ?
"in memory" : "not in memory");
+		return KSFT_FAIL;
+	}
+
+	return 0;
+}
+
+#define test_prefetch(p, expected_no_io, test_name, skip_mincore)	\
+	do {								\
+		long _r = __test_prefetch(p, expected_no_io,		\
+					  test_name, skip_mincore);	\
+									\
+		if (_r)							\
+			return _r;					\
+	} while (0)
+
+static void wait_for_io_completion(const void *p)
+{
+	char vec;
+	int i;
+
+	p = ptr_align(p);
+
+	vec = 0;
+
+	/* Wait for up to 5 seconds, probing the page to bring it in */
+	for (i = 0; i < 5000; i++) {
+		if (mincore((void *)p, PAGE_SIZE, &vec) == 0 && (vec & 1))
+			break;
+		prefetch_page(p);
+		usleep(1000);
+	}
+}
+
+int main(int argc, char **argv)
+{
+	unsigned long sysinfo_ehdr;
+	long ret, i;
+	int fd, drop_fd;
+	char *p, vec;
+
+	printf("[RUN]\tTesting vdso_prefetch_page\n");
+
+	sysinfo_ehdr = getauxval(AT_SYSINFO_EHDR);
+	if (!sysinfo_ehdr) {
+		printf("[SKIP]\tAT_SYSINFO_EHDR is not present!\n");
+		return KSFT_SKIP;
+	}
+
+	vdso_init_from_sysinfo_ehdr(sysinfo_ehdr);
+
+	prefetch_page = (prefetch_page_t)vdso_sym(version, name);
+	if (!prefetch_page) {
+		printf("[SKIP]\tCould not find %s in vdso\n", name);
+		return KSFT_SKIP;
+	}
+
+	test_prefetch(NULL, false, "NULL access",
+		      SKIP_MINCORE_BEFORE|SKIP_MINCORE_AFTER);
+
+	test_prefetch(name, true, "present", 0);
+
+	p = mmap(0, PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0);
+	if (p == MAP_FAILED) {
+		perror("mmap anon");
+		return KSFT_FAIL;
+	}
+
+	/*
+	 * Mincore would not tell us that no I/O is needed to retrieve the page,
+	 * so tell test_prefetch() to skip it.
+	 */
+	test_prefetch(p, true, "anon prefetch", SKIP_MINCORE_BEFORE);
+
+	/* Drop the caches before testing file mmap */
+	drop_fd = open("/proc/sys/vm/drop_caches", O_WRONLY);
+	if (drop_fd < 0) {
+		perror("open /proc/sys/vm/drop_caches");
+		return KSFT_FAIL;
+	}
+
+	sync();
+	ret = write(drop_fd, "3", 1);
+	if (ret != 1) {
+		perror("write to /proc/sys/vm/drop_caches");
+		return KSFT_FAIL;
+	}
+
+	/* close, which would also flush */
+	ret = close(drop_fd);
+	if (ret) {
+		perror("close /proc/sys/vm/drop_caches");
+		return KSFT_FAIL;
+	}
+
+	/* Using /etc/hosts as a file that should always exist */
+	fd = open("/etc/hosts", O_RDONLY);
+	if (fd < 0) {
+		perror("open /etc/hosts");
+		return KSFT_FAIL;
+	}
+
+	p = mmap(0, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
+	if (p == MAP_FAILED) {
+		perror("mmap file");
+		return KSFT_FAIL;
+	}
+
+	test_prefetch(p, false, "Major-fault (io) file prefetch", 0);
+
+	wait_for_io_completion(p);
+
+	test_prefetch(p, true, "Minor-fault (cached) file prefetch", 0);
+
+	munmap(p, PAGE_SIZE);
+
+	/*
+	 * Lock everything that is currently mapped to avoid unrelated
+	 * page faults while we create memory pressure below.
+	 */
+	mlockall(MCL_CURRENT);
+
+	p = mmap(0, 1024 * MEM_SIZE_K, PROT_READ|PROT_WRITE,
+		 MAP_PRIVATE|MAP_ANON, -1, 0);
+	if (p == MAP_FAILED) {
+		perror("mmap anon");
+		return KSFT_FAIL;
+	}
+
+	/*
+	 * Write a random value to try to prevent KSM from deduplicating
+	 * this page.
+	 */
+	*(volatile unsigned long *)p = 0x43454659;
+	ret = madvise(p, PAGE_SIZE, MADV_PAGEOUT);
+	if (ret != 0) {
+		perror("madvise(MADV_PAGEOUT)");
+		return KSFT_FAIL;
+	}
+
+	/* Wait to allow the page-out to complete */
+	usleep(2000000);
+
+	/* Cause some memory pressure */
+	for (i = PAGE_SIZE; i < MEM_SIZE_K * 1024; i += PAGE_SIZE)
+		*(volatile unsigned long *)((unsigned long)p + i) = i + 1;
+
+	/* Check if we managed to evict the page */
+	ret = mincore(p, PAGE_SIZE, &vec);
+	if (ret != 0) {
+		perror("mincore");
+		return KSFT_FAIL;
+	}
+
+	test_prefetch(p, false, "Major-fault (io) anon prefetch", 0);
+	wait_for_io_completion(p);
+
+	test_prefetch(p, true, "Minor-fault (cached) anon prefetch", 0);
+
+	printf("[PASS]\tvdso_prefetch_page\n");
+	return 0;
+}