From patchwork Fri Mar 11 00:25:03 2022
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 12777168
Date: Fri, 11 Mar 2022 00:25:03 +0000
In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com>
Message-Id: <20220311002528.2230172-2-dmatlack@google.com>
References: <20220311002528.2230172-1-dmatlack@google.com>
Subject: [PATCH v2 01/26] KVM: x86/mmu: Optimize MMU page cache lookup for all direct SPs
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson, Andrew Jones, Ben Gardon, Peter Xu, maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)", "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)", "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)", Peter Feiner, David Matlack
X-Mailing-List: kvm@vger.kernel.org

Commit fb58a9c345f6 ("KVM: x86/mmu: Optimize MMU page cache lookup for
fully direct MMUs") skipped the unsync checks and write flood clearing
for fully direct MMUs. We can extend this further and skip the checks
for all direct shadow pages. Direct shadow pages are never marked
unsync, nor do they have a non-zero write-flooding count.

Checking sp->role.direct also generates better code than checking
direct_map because, due to register pressure, direct_map has to get
shoved onto the stack and then pulled back off.

No functional change intended.

Reviewed-by: Sean Christopherson
Signed-off-by: David Matlack
Reviewed-by: Peter Xu
---
 arch/x86/kvm/mmu/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

base-commit: ce41d078aaa9cf15cbbb4a42878cc6160d76525e

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3b8da8b0745e..3ad67f70e51c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2034,7 +2034,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     int direct,
 					     unsigned int access)
 {
-	bool direct_mmu = vcpu->arch.mmu->direct_map;
 	union kvm_mmu_page_role role;
 	struct hlist_head *sp_list;
 	unsigned quadrant;
@@ -2075,7 +2074,8 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 			continue;
 		}
 
-		if (direct_mmu)
+		/* unsync and write-flooding only apply to indirect SPs. */
+		if (sp->role.direct)
 			goto trace_get_page;
 
 		if (sp->unsync) {

From patchwork Fri Mar 11 00:25:04 2022
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 12777169
Date: Fri, 11 Mar 2022 00:25:04 +0000
In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com>
Message-Id: <20220311002528.2230172-3-dmatlack@google.com>
References: <20220311002528.2230172-1-dmatlack@google.com>
Subject: [PATCH v2 02/26] KVM: x86/mmu: Use a bool for direct
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson, Andrew Jones, Ben Gardon, Peter Xu, maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)", "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)", "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)", Peter Feiner, David Matlack
X-Mailing-List: kvm@vger.kernel.org

The parameter "direct" can either be true or false, and all of the
callers pass in a bool variable or true/false literal, so just use the
type bool.

No functional change intended.
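Both of the changes above revolve around the sp->role.direct bit. To make the register-pressure argument in patch 01 concrete, here is an abridged sketch of the role bitfield and the two checks. The layout is illustrative only; the real union kvm_mmu_page_role in arch/x86/include/asm/kvm_host.h has many more fields.

	/* Abridged for illustration; not the full kernel definition. */
	union kvm_mmu_page_role {
		u32 word;
		struct {
			unsigned level:4;
			unsigned has_4_byte_gpte:1;
			unsigned quadrant:2;
			unsigned direct:1;	/* the bit tested by patch 01 */
			unsigned access:3;
			/* ... */
		};
	};

	/*
	 * Old check: vcpu->arch.mmu->direct_map is loaded up front and must
	 * stay live (or be spilled to the stack) across the whole hash-walk
	 * loop:
	 *
	 *	bool direct_mmu = vcpu->arch.mmu->direct_map;
	 *	...
	 *	if (direct_mmu)
	 *		goto trace_get_page;
	 *
	 * New check: reads a bit out of sp->role, which the loop already
	 * loads for the role comparison, so nothing extra needs a register:
	 *
	 *	if (sp->role.direct)
	 *		goto trace_get_page;
	 */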
Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3ad67f70e51c..146df73a982e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1706,7 +1706,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct) +static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, bool direct) { struct kvm_mmu_page *sp; @@ -2031,7 +2031,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gaddr, unsigned level, - int direct, + bool direct, unsigned int access) { union kvm_mmu_page_role role; From patchwork Fri Mar 11 00:25:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777171 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EE67C433FE for ; Fri, 11 Mar 2022 00:25:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245520AbiCKA0k (ORCPT ); Thu, 10 Mar 2022 19:26:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345136AbiCKA0j (ORCPT ); Thu, 10 Mar 2022 19:26:39 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 03C111A1C71 for ; Thu, 10 Mar 2022 16:25:37 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id p15-20020a17090a748f00b001bf3ba2ae95so4229701pjk.9 for ; Thu, 10 Mar 2022 16:25:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=hQ1Yj2jpLZU8m0KHZ3b2ZXi72STV++YRQqlTTShJ2JQ=; b=dUwKeuepnbyw5I0D0zMuN7US6l4wxLHljA5AApQQ7F8dXrKm4tCACM2Ugps7u8EbNv svxWhyuNtv4ZPIQtNMet7fhETJw6LR+X7042fh/hfghQ3hS09G7ORdIa66X94PRknWNe WK9BK6Rvd5d3JqbdXSUu3+i8v4WCdWIECzAvF96kUgv4gf++6omtsET5AXA6x5yqpYgI Tx8LI9oPouldhO9zpRH8GBq7bLkogNVqyx+Ou4bVr1nF61MdAK4Ix1sg8d9fdsmyAtt0 EltlXSsxXMf72xinEOHYBHxQQQ7yXKNtvWah99RgqxrSPGbmumNksNzAiNgfuIXaV7qV yJfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=hQ1Yj2jpLZU8m0KHZ3b2ZXi72STV++YRQqlTTShJ2JQ=; b=zUbhLEYQjMZ4GH0Bk7DvansETYM3t9KCcT+/5q9OjLHoqP6G7fVoKbQHbl/+Oub1JD qnrdpqER1B6HiuM6gS54LqQkL2HfnANEYIj/pBhIj5nS9HIAqL5Z7PaINn+AwIr+evkp bljwwqWzIwxLqzf2cUs1jr9Z6sh3BZIL5i1Sq0TDxNt5Q+j6yh8pthgTFfOGFDS4N3eA 2QhG7o2yUt2jXj5HosQbl3Gdg+LC/9emTM8GcUf7Bxmdk1JLokpq6muXQJgTzwa2wZDM hxhSYwaY36Y3+Vqna+4sTk8M0E3IiU/bTGV1sk0bD+rDqhHyFOLVaC7EMwPh3DbcFbm0 2Krw== X-Gm-Message-State: AOAM5338cMRHtm7+hNNMoprGimO+nhlaz0uFiSYP8iYvjolC3/0DJARh TWjCMU2GLgr8DhJQb0mMs1l2IuXiahdylQ== X-Google-Smtp-Source: ABdhPJxLj1e/WCXwjYQhN24wNkEVjpXHDZLELDbVrBV5t0YYirawE+RJcKOjpyzusMz/lduTyqsGweKfuUl38w== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:4d81:b0:1bf:8ce4:4f51 with SMTP id 
oj1-20020a17090b4d8100b001bf8ce44f51mr322491pjb.0.1646958335898; Thu, 10 Mar 2022 16:25:35 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:05 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-4-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 03/26] KVM: x86/mmu: Derive shadow MMU page role from parent From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Instead of computing the shadow page role from scratch for every new page, we can derive most of the information from the parent shadow page. This avoids redundant calculations and reduces the number of parameters to kvm_mmu_get_page(). Preemptively split out the role calculation to a separate function for use in a following commit. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 91 ++++++++++++++++++++++++---------- arch/x86/kvm/mmu/paging_tmpl.h | 9 ++-- 2 files changed, 71 insertions(+), 29 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 146df73a982e..23c2004c6435 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2027,30 +2027,14 @@ static void clear_sp_write_flooding_count(u64 *spte) __clear_sp_write_flooding_count(sptep_to_sp(spte)); } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, - gfn_t gfn, - gva_t gaddr, - unsigned level, - bool direct, - unsigned int access) +static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, + union kvm_mmu_page_role role) { - union kvm_mmu_page_role role; struct hlist_head *sp_list; - unsigned quadrant; struct kvm_mmu_page *sp; int collisions = 0; LIST_HEAD(invalid_list); - role = vcpu->arch.mmu->mmu_role.base; - role.level = level; - role.direct = direct; - role.access = access; - if (role.has_4_byte_gpte) { - quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level)); - quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1; - role.quadrant = quadrant; - } - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; for_each_valid_sp(vcpu->kvm, sp, sp_list) { if (sp->gfn != gfn) { @@ -2068,7 +2052,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, * Unsync pages must not be left as is, because the new * upper-level page will be write-protected. 
*/ - if (level > PG_LEVEL_4K && sp->unsync) + if (role.level > PG_LEVEL_4K && sp->unsync) kvm_mmu_prepare_zap_page(vcpu->kvm, sp, &invalid_list); continue; @@ -2107,14 +2091,14 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, ++vcpu->kvm->stat.mmu_cache_miss; - sp = kvm_mmu_alloc_page(vcpu, direct); + sp = kvm_mmu_alloc_page(vcpu, role.direct); sp->gfn = gfn; sp->role = role; hlist_add_head(&sp->hash_link, sp_list); - if (!direct) { + if (!role.direct) { account_shadowed(vcpu->kvm, sp); - if (level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) + if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1); } trace_kvm_mmu_get_page(sp, true); @@ -2126,6 +2110,51 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, return sp; } +static union kvm_mmu_page_role kvm_mmu_child_role(u64 *sptep, bool direct, u32 access) +{ + struct kvm_mmu_page *parent_sp = sptep_to_sp(sptep); + union kvm_mmu_page_role role; + + role = parent_sp->role; + role.level--; + role.access = access; + role.direct = direct; + + /* + * If the guest has 4-byte PTEs then that means it's using 32-bit, + * 2-level, non-PAE paging. KVM shadows such guests using 4 PAE page + * directories, each mapping 1/4 of the guest's linear address space + * (1GiB). The shadow pages for those 4 page directories are + * pre-allocated and assigned a separate quadrant in their role. + * + * Since we are allocating a child shadow page and there are only 2 + * levels, this must be a PG_LEVEL_4K shadow page. Here the quadrant + * will either be 0 or 1 because it maps 1/2 of the address space mapped + * by the guest's PG_LEVEL_4K page table (or 4MiB huge page) that it + * is shadowing. In this case, the quadrant can be derived by the index + * of the SPTE that points to the new child shadow page in the page + * directory (parent_sp). Specifically, every 2 SPTEs in parent_sp + * shadow one half of a guest's page table (or 4MiB huge page) so the + * quadrant is just the parity of the index of the SPTE. 
+ */ + if (role.has_4_byte_gpte) { + BUG_ON(role.level != PG_LEVEL_4K); + role.quadrant = (sptep - parent_sp->spt) % 2; + } + + return role; +} + +static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, + u64 *sptep, gfn_t gfn, + bool direct, u32 access) +{ + union kvm_mmu_page_role role; + + role = kvm_mmu_child_role(sptep, direct, access); + return kvm_mmu_get_page(vcpu, gfn, role); +} + static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator, struct kvm_vcpu *vcpu, hpa_t root, u64 addr) @@ -2930,8 +2959,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) if (is_shadow_present_pte(*it.sptep)) continue; - sp = kvm_mmu_get_page(vcpu, base_gfn, it.addr, - it.level - 1, true, ACC_ALL); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, ACC_ALL); link_shadow_page(vcpu, it.sptep, sp); if (fault->is_tdp && fault->huge_page_disallowed && @@ -3316,9 +3344,22 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn) static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, u8 level, bool direct) { + union kvm_mmu_page_role role; struct kvm_mmu_page *sp; + unsigned int quadrant; + + role = vcpu->arch.mmu->mmu_role.base; + role.level = level; + role.direct = direct; + role.access = ACC_ALL; + + if (role.has_4_byte_gpte) { + quadrant = gva >> (PAGE_SHIFT + (PT64_PT_BITS * level)); + quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1; + role.quadrant = quadrant; + } - sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL); + sp = kvm_mmu_get_page(vcpu, gfn, role); ++sp->root_count; return __pa(sp->spt); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 252c77805eb9..c3909a07e938 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -683,8 +683,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, if (!is_shadow_present_pte(*it.sptep)) { table_gfn = gw->table_gfn[it.level - 2]; access = gw->pt_access[it.level - 2]; - sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr, - it.level-1, false, access); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, table_gfn, + false, access); + /* * We must synchronize the pagetable before linking it * because the guest doesn't need to flush tlb when @@ -740,8 +741,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, drop_large_spte(vcpu, it.sptep); if (!is_shadow_present_pte(*it.sptep)) { - sp = kvm_mmu_get_page(vcpu, base_gfn, fault->addr, - it.level - 1, true, direct_access); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, + true, direct_access); link_shadow_page(vcpu, it.sptep, sp); if (fault->huge_page_disallowed && fault->req_level >= it.level) From patchwork Fri Mar 11 00:25:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777172 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70FDBC433EF for ; Fri, 11 Mar 2022 00:25:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345136AbiCKA0m (ORCPT ); Thu, 10 Mar 2022 19:26:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233337AbiCKA0k (ORCPT ); 
Thu, 10 Mar 2022 19:26:40 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D43C91A1C71 for ; Thu, 10 Mar 2022 16:25:38 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id r11-20020a63440b000000b0038068f34b0cso3813094pga.0 for ; Thu, 10 Mar 2022 16:25:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=preyZmoxeWV7bpSqcaT1D2ICPD7/5D40rW8RXyddS98=; b=LgWaICfxIBaJDsFBkUTdBOob1Y2k3/FFvQsolbqLhuk8ZrTeTkJdlWwjlTc00oGLEe LBi95aRVL4ltQZa1ZoBqImyXhIE6KCt5BKv4apSd5/TOgPmKoi9AKXn5JzAOW8qfkaFV lZPfstcYqk94YJDubOqRH4/FF9knojxzys4+ZWckaWnU9yCfGXEnP/dJMk0PlsKerJEz RFtEO4H2uxJnJcfrKsAd8SgRK/Z+vZHqCgqBMoQjTSjUQPefL7dqPZMkO0BphwLCxzPg WXAwwCmQdYAe2L1ybmPKCSs84V9vIKdpb9I5nboekQUX1+Mq9bO8ySuBjnvDZm/f7Gzo TQrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=preyZmoxeWV7bpSqcaT1D2ICPD7/5D40rW8RXyddS98=; b=PhDOXwUgRE0cFmJXcRQew+i7o+PtYg9JMG3n/YeMbYePTy42kykxHUyQSOwPUVNsVV kl1AQAlfmFey2SUt+PjPh8iWTMT0B7JpbgNo3HHEPEGQe6hPIiWsPhd6OGBBcBkdmSGP JfIHHFR5PDfMtLG/11+St7bRNQRCpSaTLD1C4zQYFLbuzTTnDtIkSxu0d/ifkC03Y3G6 3lhVor+WJjt7meW+QZ5wyW09Bm/2VTSvDtpG/GdwdWa+oNl/cZ9/e+V5kexYHZjLHCRc EOQdumWKsi9oz4kGJdsmskdpQvvthsx5jtgzK/nZOI9EquS4UdQqMyuIpWb3Vq/cVKTY 20lw== X-Gm-Message-State: AOAM532jrQuwLgYJI6xDEWaplBc5CDkvwheX99jChp+cc/ritEMjPlDU oyrUCbtQxZTH2krp4+bd/jDKvV5ELHVR/Q== X-Google-Smtp-Source: ABdhPJw4zPv1Jtev5tqhHtyUpqsXBloi5wAf9YuGxFv3zUHJ0lLUE9syCfw79D+xC+vv/19YPx/1nSCKvLkKDQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:1e10:b0:1bf:6c78:54a9 with SMTP id pg16-20020a17090b1e1000b001bf6c7854a9mr321576pjb.1.1646958337790; Thu, 10 Mar 2022 16:25:37 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:06 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-5-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 04/26] KVM: x86/mmu: Decompose kvm_mmu_get_page() into separate functions From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Decompose kvm_mmu_get_page() into separate helper functions to increase readability and prepare for allocating shadow pages without a vcpu pointer. Specifically, pull the guts of kvm_mmu_get_page() into 3 helper functions: __kvm_mmu_find_shadow_page() - Walks the page hash checking for any existing mmu pages that match the given gfn and role. Does not attempt to synchronize the page if it is unsync. kvm_mmu_find_shadow_page() - Wraps __kvm_mmu_find_shadow_page() and handles syncing if necessary. kvm_mmu_new_shadow_page() Allocates and initializes an entirely new kvm_mmu_page. 
This currently requries a vcpu pointer for allocation and looking up the memslot but that will be removed in a future commit. Note, kvm_mmu_new_shadow_page() is temporary and will be removed in a subsequent commit. The name uses "new" rather than the more typical "alloc" to avoid clashing with the existing kvm_mmu_alloc_page(). No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 132 ++++++++++++++++++++++++--------- arch/x86/kvm/mmu/paging_tmpl.h | 5 +- arch/x86/kvm/mmu/spte.c | 5 +- 3 files changed, 101 insertions(+), 41 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 23c2004c6435..80dbfe07c87b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2027,16 +2027,25 @@ static void clear_sp_write_flooding_count(u64 *spte) __clear_sp_write_flooding_count(sptep_to_sp(spte)); } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, - union kvm_mmu_page_role role) +/* + * Searches for an existing SP for the given gfn and role. Makes no attempt to + * sync the SP if it is marked unsync. + * + * If creating an upper-level page table, zaps unsynced pages for the same + * gfn and adds them to the invalid_list. It's the callers responsibility + * to call kvm_mmu_commit_zap_page() on invalid_list. + */ +static struct kvm_mmu_page *__kvm_mmu_find_shadow_page(struct kvm *kvm, + gfn_t gfn, + union kvm_mmu_page_role role, + struct list_head *invalid_list) { struct hlist_head *sp_list; struct kvm_mmu_page *sp; int collisions = 0; - LIST_HEAD(invalid_list); - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; - for_each_valid_sp(vcpu->kvm, sp, sp_list) { + sp_list = &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; + for_each_valid_sp(kvm, sp, sp_list) { if (sp->gfn != gfn) { collisions++; continue; @@ -2053,60 +2062,109 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, * upper-level page will be write-protected. */ if (role.level > PG_LEVEL_4K && sp->unsync) - kvm_mmu_prepare_zap_page(vcpu->kvm, sp, - &invalid_list); + kvm_mmu_prepare_zap_page(kvm, sp, invalid_list); + continue; } - /* unsync and write-flooding only apply to indirect SPs. */ - if (sp->role.direct) - goto trace_get_page; + /* Write-flooding is only tracked for indirect SPs. */ + if (!sp->role.direct) + __clear_sp_write_flooding_count(sp); - if (sp->unsync) { - /* - * The page is good, but is stale. kvm_sync_page does - * get the latest guest state, but (unlike mmu_unsync_children) - * it doesn't write-protect the page or mark it synchronized! - * This way the validity of the mapping is ensured, but the - * overhead of write protection is not incurred until the - * guest invalidates the TLB mapping. This allows multiple - * SPs for a single gfn to be unsync. - * - * If the sync fails, the page is zapped. If so, break - * in order to rebuild it. - */ - if (!kvm_sync_page(vcpu, sp, &invalid_list)) - break; + goto out; + } - WARN_ON(!list_empty(&invalid_list)); - kvm_flush_remote_tlbs(vcpu->kvm); - } + sp = NULL; - __clear_sp_write_flooding_count(sp); +out: + if (collisions > kvm->stat.max_mmu_page_hash_collisions) + kvm->stat.max_mmu_page_hash_collisions = collisions; + + return sp; +} -trace_get_page: - trace_kvm_mmu_get_page(sp, false); +/* + * Looks up an existing SP for the given gfn and role if one exists. The + * return SP is guaranteed to be synced. 
+ */ +static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm_vcpu *vcpu, + gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + LIST_HEAD(invalid_list); + + sp = __kvm_mmu_find_shadow_page(vcpu->kvm, gfn, role, &invalid_list); + if (!sp) goto out; + + if (sp->unsync) { + /* + * The page is good, but is stale. kvm_sync_page does + * get the latest guest state, but (unlike mmu_unsync_children) + * it doesn't write-protect the page or mark it synchronized! + * This way the validity of the mapping is ensured, but the + * overhead of write protection is not incurred until the + * guest invalidates the TLB mapping. This allows multiple + * SPs for a single gfn to be unsync. + * + * If the sync fails, the page is zapped and added to the + * invalid_list. + */ + if (!kvm_sync_page(vcpu, sp, &invalid_list)) { + sp = NULL; + goto out; + } + + WARN_ON(!list_empty(&invalid_list)); + kvm_flush_remote_tlbs(vcpu->kvm); } +out: + kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list); + return sp; +} + +static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu, + gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + struct hlist_head *sp_list; + ++vcpu->kvm->stat.mmu_cache_miss; sp = kvm_mmu_alloc_page(vcpu, role.direct); - sp->gfn = gfn; sp->role = role; + + sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; hlist_add_head(&sp->hash_link, sp_list); + if (!role.direct) { account_shadowed(vcpu->kvm, sp); if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1); } - trace_kvm_mmu_get_page(sp, true); -out: - kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list); - if (collisions > vcpu->kvm->stat.max_mmu_page_hash_collisions) - vcpu->kvm->stat.max_mmu_page_hash_collisions = collisions; + return sp; +} + +static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + bool created = false; + + sp = kvm_mmu_find_shadow_page(vcpu, gfn, role); + if (sp) + goto out; + + created = true; + sp = kvm_mmu_new_shadow_page(vcpu, gfn, role); + +out: + trace_kvm_mmu_get_page(sp, created); return sp; } diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index c3909a07e938..55cac59b9c9b 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -692,8 +692,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, * the gpte is changed from non-present to present. * Otherwise, the guest may use the wrong mapping. * - * For PG_LEVEL_4K, kvm_mmu_get_page() has already - * synchronized it transiently via kvm_sync_page(). + * For PG_LEVEL_4K, kvm_mmu_get_existing_sp() has + * already synchronized it transiently via + * kvm_sync_page(). * * For higher level pagetable, we synchronize it via * the slower mmu_sync_children(). If it needs to diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c index 4739b53c9734..d10189d9c877 100644 --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -150,8 +150,9 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, /* * Optimization: for pte sync, if spte was writable the hash * lookup is unnecessary (and expensive). Write protection - * is responsibility of kvm_mmu_get_page / kvm_mmu_sync_roots. - * Same reasoning can be applied to dirty page accounting. + * is responsibility of kvm_mmu_create_sp() and + * kvm_mmu_sync_roots(). 
Same reasoning can be applied to dirty + * page accounting. */ if (is_writable_pte(old_spte)) goto out; From patchwork Fri Mar 11 00:25:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB736C433EF for ; Fri, 11 Mar 2022 00:25:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345166AbiCKA0o (ORCPT ); Thu, 10 Mar 2022 19:26:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48390 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345162AbiCKA0m (ORCPT ); Thu, 10 Mar 2022 19:26:42 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C853D1A1C7C for ; Thu, 10 Mar 2022 16:25:40 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id t18-20020a63dd12000000b00342725203b5so3766139pgg.16 for ; Thu, 10 Mar 2022 16:25:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=HT5iizWaNyX/scodM/UgaJfvTolyFgDlKXTIQK5ovbE=; b=ci1RV/cDxZwD5zNwmfmqMYmXEiXEpZeRiCx1bWeGOaV4x9h0XElhebmz7axQ94hcJK bZap8UCpYa0n+D0V1WGsj49i9GocZPpnOW/R6liGZwWWd+MLFvGP4SmRm18uGvNf7VJA hM4w3wETJZ3BZ98Lv4ivSwkHviUvB82J6N2Xmv9hmmGXSXClEANNQCTSmZl/thTSHvF3 qMaREJKuH529np34JrNojketBObofC9CldBCVw8lMyUMfILV1nVUnlrgaj9SjXRE3cnb ez8mqQgD4jcpfyLRKrLcfMsV8AyaZdBvsT3nFJZl/AvexHb6QnSL9UFeXZ9O9z2drwbU ZGeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=HT5iizWaNyX/scodM/UgaJfvTolyFgDlKXTIQK5ovbE=; b=aT+NaunHM7CDwP3SWQ2NNLmqj8lxAFK8mMp56iNIcGc7Q1+PYuU9IZdY6097L9VbQY GTk7gnEF67AkCSwImzJJFfEMOyGJJmbXCm1SMdh0Rxl8Mm02RDoAfGcwLlfRxJVaNJ1A eJGXJI7xOXo3AC8qMHc7tcguj5GOO9UiT6nDL+y424FcCdId4ktGmREbXxSw1Nm+ZLhj di55f/ZUs/PB9kF08NS90H+yskvY1vdeorcLa2wBDn+IoUEBBK4dJIX5wBIwCuLfnYfs 30N50gafwEhYAdDx7LRJjJI1WNroTvU0DWlWXjZeTqGsBlk5nSfV6M7SKtFHENVJqw+B IYsA== X-Gm-Message-State: AOAM533H+FgNvcw0ObU/5BLivoK8HHd+S8oQ9YbXyqSuUhXQ/pvBjfH9 EcTd5NwFgfD2alSVq8qVZUAN4fbz5KP4mA== X-Google-Smtp-Source: ABdhPJzSyoUJ2mS8Kxw0tWZmurwVG8N3wCZURTli9JtC3Otg5er4JZvYmqgznJT5OSafq4YssNwt2bbQk0+S3g== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:1e10:b0:1bf:6c78:54a9 with SMTP id pg16-20020a17090b1e1000b001bf6c7854a9mr321588pjb.1.1646958339516; Thu, 10 Mar 2022 16:25:39 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:07 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-6-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 05/26] KVM: x86/mmu: Rename shadow MMU functions that deal with shadow pages From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated 
list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Rename 3 functions: kvm_mmu_get_page() -> kvm_mmu_get_shadow_page() kvm_mmu_alloc_page() -> kvm_mmu_alloc_shadow_page() kvm_mmu_free_page() -> kvm_mmu_free_shadow_page() This change makes it clear that these functions deal with shadow pages rather than struct pages. Prefer "shadow_page" over the shorter "sp" since these are core routines. Signed-off-by: David Matlack Acked-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 80dbfe07c87b..b6fb50e32291 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1668,7 +1668,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) percpu_counter_add(&kvm_total_used_mmu_pages, nr); } -static void kvm_mmu_free_page(struct kvm_mmu_page *sp) +static void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp) { MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); hlist_del(&sp->hash_link); @@ -1706,7 +1706,8 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, bool direct) +static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, + bool direct) { struct kvm_mmu_page *sp; @@ -2134,7 +2135,7 @@ static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu, ++vcpu->kvm->stat.mmu_cache_miss; - sp = kvm_mmu_alloc_page(vcpu, role.direct); + sp = kvm_mmu_alloc_shadow_page(vcpu, role.direct); sp->gfn = gfn; sp->role = role; @@ -2150,8 +2151,9 @@ static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu, return sp; } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, - union kvm_mmu_page_role role) +static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu, + gfn_t gfn, + union kvm_mmu_page_role role) { struct kvm_mmu_page *sp; bool created = false; @@ -2210,7 +2212,7 @@ static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, union kvm_mmu_page_role role; role = kvm_mmu_child_role(sptep, direct, access); - return kvm_mmu_get_page(vcpu, gfn, role); + return kvm_mmu_get_shadow_page(vcpu, gfn, role); } static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator, @@ -2486,7 +2488,7 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, list_for_each_entry_safe(sp, nsp, invalid_list, link) { WARN_ON(!sp->role.invalid || sp->root_count); - kvm_mmu_free_page(sp); + kvm_mmu_free_shadow_page(sp); } } @@ -3417,7 +3419,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, role.quadrant = quadrant; } - sp = kvm_mmu_get_page(vcpu, gfn, role); + sp = kvm_mmu_get_shadow_page(vcpu, gfn, role); ++sp->root_count; return __pa(sp->spt); From patchwork Fri Mar 11 00:25:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B6949C433F5 
for ; Fri, 11 Mar 2022 00:25:47 +0000 (UTC)
Date: Fri, 11 Mar 2022 00:25:08 +0000
In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com>
Message-Id: <20220311002528.2230172-7-dmatlack@google.com>
References: <20220311002528.2230172-1-dmatlack@google.com>
Subject: [PATCH v2 06/26] KVM: x86/mmu: Pass memslot to kvm_mmu_new_shadow_page()
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson, Andrew Jones, Ben Gardon, Peter Xu, maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)", "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)", "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)", Peter Feiner, David Matlack
X-Mailing-List: kvm@vger.kernel.org

Passing the memslot to kvm_mmu_new_shadow_page() avoids the need for
the vCPU pointer when write-protecting indirect 4k shadow pages. This
moves us closer to being able to create new shadow pages during VM
ioctls for eager page splitting, where there is no vCPU pointer.

This change does not negatively impact "Populate memory time" for ept=Y
or ept=N configurations since kvm_vcpu_gfn_to_memslot() caches the last
used slot. So even though we now look up the slot more often, it is a
very cheap check.

Opportunistically move the code to write-protect GFNs shadowed by
PG_LEVEL_4K shadow pages into account_shadowed() to reduce indentation
and consolidate the code. This also eliminates a memslot lookup.

No functional change intended.

Signed-off-by: David Matlack
Reviewed-by: Peter Xu
---
 arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b6fb50e32291..519910938478 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -793,16 +793,14 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn)
 	update_gfn_disallow_lpage_count(slot, gfn, -1);
 }
 
-static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
+static void account_shadowed(struct kvm *kvm,
+			     struct kvm_memory_slot *slot,
+			     struct kvm_mmu_page *sp)
 {
-	struct kvm_memslots *slots;
-	struct kvm_memory_slot *slot;
 	gfn_t gfn;
 
 	kvm->arch.indirect_shadow_pages++;
 	gfn = sp->gfn;
-	slots = kvm_memslots_for_spte_role(kvm, sp->role);
-	slot = __gfn_to_memslot(slots, gfn);
 
 	/* the non-leaf shadow pages are keeping readonly. */
 	if (sp->role.level > PG_LEVEL_4K)
@@ -810,6 +808,9 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 			 KVM_PAGE_TRACK_WRITE);
 
 	kvm_mmu_gfn_disallow_lpage(slot, gfn);
+
+	if (kvm_mmu_slot_gfn_write_protect(kvm, slot, gfn, PG_LEVEL_4K))
+		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
 void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -2127,6 +2128,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm_vcpu *vcpu,
 }
 
 static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu,
+						    struct kvm_memory_slot *slot,
 						    gfn_t gfn,
 						    union kvm_mmu_page_role role)
 {
@@ -2142,11 +2144,8 @@ static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu,
 	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
 	hlist_add_head(&sp->hash_link, sp_list);
 
-	if (!role.direct) {
-		account_shadowed(vcpu->kvm, sp);
-		if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn))
-			kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1);
-	}
+	if (!role.direct)
+		account_shadowed(vcpu->kvm, slot, sp);
 
 	return sp;
 }
@@ -2155,6 +2154,7 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu,
 					gfn_t gfn,
 					union kvm_mmu_page_role role)
 {
+	struct kvm_memory_slot *slot;
 	struct kvm_mmu_page *sp;
 	bool created = false;
 
@@ -2163,7 +2163,8 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu,
 		goto out;
 
 	created = true;
-	sp = kvm_mmu_new_shadow_page(vcpu, gfn, role);
+	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+	sp = kvm_mmu_new_shadow_page(vcpu, slot, gfn, role);
 
 out:
 	trace_kvm_mmu_get_page(sp, created);

From patchwork Fri Mar 11 00:25:09 2022
X-Patchwork-Submitter: David Matlack
X-Patchwork-Id: 12777174
Date: Fri, 11 Mar 2022 00:25:09 +0000
In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com>
Message-Id: <20220311002528.2230172-8-dmatlack@google.com>
References: <20220311002528.2230172-1-dmatlack@google.com>
Subject: [PATCH v2 07/26] KVM: x86/mmu: Separate shadow MMU sp allocation from initialization
From: David Matlack
To: Paolo Bonzini
Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Sean Christopherson, Andrew Jones, Ben Gardon, Peter Xu, maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)", "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)", "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)", Peter Feiner, David Matlack
X-Mailing-List: kvm@vger.kernel.org

Separate the code that allocates a new shadow page from the vCPU caches
from the code that initializes it.
This is in preparation for creating new shadow pages from VM ioctls for eager page splitting, where we do not have access to the vCPU caches. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 38 ++++++++++++++++++-------------------- 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 519910938478..e866e05c4ba5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1716,16 +1716,9 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); if (!direct) sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache); + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - /* - * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages() - * depends on valid pages being added to the head of the list. See - * comments in kvm_zap_obsolete_pages(). - */ - sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; - list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages); - kvm_mod_used_mmu_pages(vcpu->kvm, +1); return sp; } @@ -2127,27 +2120,31 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm_vcpu *vcpu, return sp; } -static struct kvm_mmu_page *kvm_mmu_new_shadow_page(struct kvm_vcpu *vcpu, - struct kvm_memory_slot *slot, - gfn_t gfn, - union kvm_mmu_page_role role) +static void init_shadow_page(struct kvm *kvm, struct kvm_mmu_page *sp, + struct kvm_memory_slot *slot, gfn_t gfn, + union kvm_mmu_page_role role) { - struct kvm_mmu_page *sp; struct hlist_head *sp_list; - ++vcpu->kvm->stat.mmu_cache_miss; + ++kvm->stat.mmu_cache_miss; - sp = kvm_mmu_alloc_shadow_page(vcpu, role.direct); sp->gfn = gfn; sp->role = role; + sp->mmu_valid_gen = kvm->arch.mmu_valid_gen; - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; + /* + * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages() + * depends on valid pages being added to the head of the list. See + * comments in kvm_zap_obsolete_pages(). 
+ */ + list_add(&sp->link, &kvm->arch.active_mmu_pages); + kvm_mod_used_mmu_pages(kvm, 1); + + sp_list = &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; hlist_add_head(&sp->hash_link, sp_list); if (!role.direct) - account_shadowed(vcpu->kvm, slot, sp); - - return sp; + account_shadowed(kvm, slot, sp); } static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu, @@ -2164,7 +2161,8 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu, created = true; slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); - sp = kvm_mmu_new_shadow_page(vcpu, slot, gfn, role); + sp = kvm_mmu_alloc_shadow_page(vcpu, role.direct); + init_shadow_page(vcpu->kvm, sp, slot, gfn, role); out: trace_kvm_mmu_get_page(sp, created); From patchwork Fri Mar 11 00:25:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777176 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91CE7C433EF for ; Fri, 11 Mar 2022 00:25:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240061AbiCKA0t (ORCPT ); Thu, 10 Mar 2022 19:26:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345170AbiCKA0r (ORCPT ); Thu, 10 Mar 2022 19:26:47 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F4911A2701 for ; Thu, 10 Mar 2022 16:25:45 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id y3-20020a1709029b8300b0014c8bcb70a1so3567657plp.3 for ; Thu, 10 Mar 2022 16:25:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=hUhTNfsRlcAhlCAEwuaf6qTSRETsPLu85xICgb6Pw2I=; b=Ux2f+mr/mK9ysuQSi/sBCmxVVqyTzZtZcuO+SYe9GBnc8TQpBaRNkuO26UcTq9+K7m 9XCEug+JYAx7QgnxtFzMKpDzCZ2Kl3NLGHYaMWkmd4Ic4q6ryxK1fh3eirtILH29GWRl y4ndBbNMnsnmerSi8z/TQIBMaNCb13Zxm6I81Hm3D5/Iw1XB83q9r0puc/5thPme9yUe 5CQHUvdH3HOyCN2dpgX6tbRevneUEVUCc5bYxwqg6D0Njsn05gMV/tq0tNKd8PUtrCZL TKX0vUgoYTO6j6iEsFtBVDflNzJ0qcKCJCSdpf3D5PzBxpdR1jblF6G0g4UsVDlY5RTe t19g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=hUhTNfsRlcAhlCAEwuaf6qTSRETsPLu85xICgb6Pw2I=; b=D9j/d4qZ8UAY5eQSbrnx5yqFnl5yW7P7d2Wbws1IVidorGfH25FmjHpm9FfzLeY4sm A48wY8uIbvgF1TNy0EW+DsUqlLTN8yKCZepixc2gLx3saXYbCw8VzbgPqGaP3XV/N5PJ 4QJ2SMZefk9HpApNaiJQl+B4asu3QWUFLLjCvkWcxHCT4DJOfwxNw3fpNARugx8D14DB Rk0O5mIkVleu9WZU/Z50lMwkz6g8ypVuPCbvo/RKgyxlxTjF3tiUqVRrd9O5Th18kUif GqOLBzSWUe52EfWpfTFAECet6osPsH4h07SqhOwN+i44OG5HHhZ6g4XhHJ3dRyKKTYMb z23w== X-Gm-Message-State: AOAM5318nD0/Q1pl8WQK1LLpru66qv8UIXtd6IPqDyQ457cDqy3vJPi8 vpBmgckAoJbtfVC4aM+lPYYODqD2EJkZrQ== X-Google-Smtp-Source: ABdhPJyVkv6e6MFwOzwVPNnjhMEMTwjec6c9yLeYXj0Dnt6k4PAJzbfaJIjMW+/lDxdSlSq9DXN35GK1V5t4+g== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a63:9d8c:0:b0:378:4b73:5b3b with SMTP id i134-20020a639d8c000000b003784b735b3bmr6119017pgd.564.1646958344889; Thu, 10 Mar 2022 16:25:44 -0800 
(PST) Date: Fri, 11 Mar 2022 00:25:10 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-9-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 08/26] KVM: x86/mmu: Link spt to sp during allocation From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Link the shadow page table to the sp (via set_page_private()) during allocation rather than initialization. This is a more logical place to do it because allocation time is also where we do the reverse link (setting sp->spt). This creates one extra call to set_page_private(), but having multiple calls to set_page_private() is unavoidable anyway. We either do set_page_private() during allocation, which requires 1 per allocation function, or we do it during initialization, which requires 1 per initialization function. No functional change intended. Suggested-by: Ben Gardon Signed-off-by: David Matlack --- arch/x86/kvm/mmu/tdp_mmu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index af60922906ef..eecb0215e636 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -274,6 +274,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); return sp; } @@ -281,8 +282,6 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep, gfn_t gfn, union kvm_mmu_page_role role) { - set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - sp->role = role; sp->gfn = gfn; sp->ptep = sptep; @@ -1410,6 +1409,8 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp) return NULL; } + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); + return sp; } From patchwork Fri Mar 11 00:25:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73C74C433F5 for ; Fri, 11 Mar 2022 00:25:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345180AbiCKA0v (ORCPT ); Thu, 10 Mar 2022 19:26:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345181AbiCKA0u (ORCPT ); Thu, 10 Mar 2022 19:26:50 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE5511A2707 for ; Thu, 10 Mar 2022 16:25:47 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id b5-20020a17090a7ac500b001bfc1f48421so2640958pjl.4 for ; Thu, 10 Mar 2022 16:25:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=V/b+716hREpGluS5au7SFvhvrZUH1mP5ZRBJUo2aKek=; b=NrOteM9lQZRLTvGjh7Vk2YOgkSJONBJ9EpO+Rh0e4cd73vulmGH+hFimtPt0gE+8Ho k8JVHCrbIF0zjGII7zgfaIT8k7/FKxKSqC1qy+gJKY1Fn2gJfg1PT4ql7x80T5MMdSaX bC9lIGSQ2dpF23XdAOIxEeUHf4ph7pFTmtW6mlo8a7I2IbuHRD/FW6U8cEQZUD+p+OvW NX7TXaXYU4um5GdK+Pbvp3NhR93SLOD1Grc7+fg58Gq5DdoJsaPOmjlVu9RAJc+vFAXt UqlKZqaxGUd8fVlMc0DOlFb8fnPCoA2kMW1cjgW7bS5hxnqYubmgtW/N6whS2ZNh3U3y 2nmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=V/b+716hREpGluS5au7SFvhvrZUH1mP5ZRBJUo2aKek=; b=UmCV6zFvjQpdPfc9GG0k4rP34y04MjM2iQge/NoQnpyquN3AKZb/FelfWF9F0kD6lb TTPCzj7a1hk6zTSaoWD42zYcGOy9TVdck+K1Xlylg9EkBZY2VbZY9PuwiZGa/QU//ptf ChaGQb/IuvLt3BoUuiqii9NcdUrKdK4yMfYg41UTt8XtP2isgGqpQDV0tOBeQbB/GHyz u4Gi9VCkwLmzr71IUaCqjrHnZtPD8aCTqIKzY0t0Ro/IeeHEUZoA1zAKOJgV7LHl6VPQ rVsrmsGPy5pLydb3k4lJVK1RQqGeYtPj0kfJql6WBYg+7PiyfrGY6Dk/cKLXHh4v1Fql MFRg== X-Gm-Message-State: AOAM533maH/Oy6V4tTXMFq5SBZB8cy5mDwBhoAre4IZjHz+rJegGVqbR cxRRWWAopH0jq0HXcMixKMD4XEFys7Hz8A== X-Google-Smtp-Source: ABdhPJwJMvWtF2+lcHehUS9eYrjqEwNfEYfhcTLqNJ3E9xEg/1eZ72BHS9KkgILyrBX8MA8OHqfLuhNcfwn1aQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:4d81:b0:1bf:8ce4:4f51 with SMTP id oj1-20020a17090b4d8100b001bf8ce44f51mr322556pjb.0.1646958346675; Thu, 10 Mar 2022 16:25:46 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:11 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-10-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 09/26] KVM: x86/mmu: Move huge page split sp allocation code to mmu.c From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the code that allocates a new shadow page for splitting huge pages into mmu.c. Currently this code is only used by the TDP MMU but it will be reused in subsequent commits to also split huge pages mapped by the shadow MMU. While here, also shove the GFP complexity down into the allocation function so that it does not have to be duplicated when the shadow MMU needs to start allocating SPs for splitting. No functional change intended. 
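To make the locking contract behind the new bool parameter concrete, here is a minimal sketch of the intended call pattern, condensed from the TDP MMU caller in the diff below. The wrapper name is illustrative only; the real caller additionally handles the shared (read-locked) case and marks the TDP iterator as yielded before retrying.

static struct kvm_mmu_page *alloc_sp_for_split_sketch(struct kvm *kvm)
{
	struct kvm_mmu_page *sp;

	/* The MMU lock is held, so the allocation must not sleep or reclaim. */
	sp = kvm_mmu_alloc_direct_sp_for_split(true);
	if (sp)
		return sp;

	/* Drop the lock and retry with direct reclaim (and sleeping) allowed. */
	write_unlock(&kvm->mmu_lock);
	sp = kvm_mmu_alloc_direct_sp_for_split(false);
	write_lock(&kvm->mmu_lock);

	return sp;
}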
Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 34 +++++++++++++++++++++++++++++++++ arch/x86/kvm/mmu/mmu_internal.h | 2 ++ arch/x86/kvm/mmu/tdp_mmu.c | 34 ++------------------------------- 3 files changed, 38 insertions(+), 32 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e866e05c4ba5..c12d5016f6dc 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1722,6 +1722,40 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, return sp; } +/* + * Allocate a new shadow page, potentially while holding the MMU lock. + * + * Huge page splitting always uses direct shadow pages since the huge page is + * being mapped directly with a lower level page table. Thus there's no need to + * allocate the gfns array. + */ +struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked) +{ + struct kvm_mmu_page *sp; + gfp_t gfp; + + /* + * If under the MMU lock, use GFP_NOWAIT to avoid direct reclaim (which + * is slow) and to avoid making any filesystem callbacks (which can end + * up invoking KVM MMU notifiers, resulting in a deadlock). + */ + gfp = (locked ? GFP_NOWAIT : GFP_KERNEL) | __GFP_ACCOUNT | __GFP_ZERO; + + sp = kmem_cache_alloc(mmu_page_header_cache, gfp); + if (!sp) + return NULL; + + sp->spt = (void *)__get_free_page(gfp); + if (!sp->spt) { + kmem_cache_free(mmu_page_header_cache, sp); + return NULL; + } + + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); + + return sp; +} + static void mark_unsync(u64 *spte); static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp) { diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 1bff453f7cbe..a0648e7ddd33 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -171,4 +171,6 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); +struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked); + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index eecb0215e636..1a43f908d508 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1393,43 +1393,13 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, return spte_set; } -static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp) -{ - struct kvm_mmu_page *sp; - - gfp |= __GFP_ZERO; - - sp = kmem_cache_alloc(mmu_page_header_cache, gfp); - if (!sp) - return NULL; - - sp->spt = (void *)__get_free_page(gfp); - if (!sp->spt) { - kmem_cache_free(mmu_page_header_cache, sp); - return NULL; - } - - set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - - return sp; -} - static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, struct tdp_iter *iter, bool shared) { struct kvm_mmu_page *sp; - /* - * Since we are allocating while under the MMU lock we have to be - * careful about GFP flags. Use GFP_NOWAIT to avoid blocking on direct - * reclaim and to avoid making any filesystem callbacks (which can end - * up invoking KVM MMU notifiers, resulting in a deadlock). - * - * If this allocation fails we drop the lock and retry with reclaim - * allowed. 
- */ - sp = __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); + sp = kvm_mmu_alloc_direct_sp_for_split(true); if (sp) return sp; @@ -1441,7 +1411,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, write_unlock(&kvm->mmu_lock); iter->yielded = true; - sp = __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT); + sp = kvm_mmu_alloc_direct_sp_for_split(false); if (shared) read_lock(&kvm->mmu_lock); From patchwork Fri Mar 11 00:25:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777178 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67BE1C433F5 for ; Fri, 11 Mar 2022 00:25:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237427AbiCKA0x (ORCPT ); Thu, 10 Mar 2022 19:26:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345181AbiCKA0v (ORCPT ); Thu, 10 Mar 2022 19:26:51 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38DE31A1C7E for ; Thu, 10 Mar 2022 16:25:49 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id j5-20020a63e745000000b00378c359fac3so3791473pgk.2 for ; Thu, 10 Mar 2022 16:25:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=JTpQf7sMt9NLwDv0fAUwU2EA32MsJpiLJgXJ8737nsg=; b=nI8x0dteBcJQ0AeW7ZZZnCVKrEe68X0+npmVLf8lMk0Ag9nRT1msQlKpIkRULGuUI8 06ikZLRTCzLiUgNamd32eHHL90zFaZpJJSGCSIacLRIE5rwK5rwhE8gGl2fH/Qt7GyLM /hZTGRxO3Wm0LkCdC3D4JXrM3t3pqzYXHTarQbY0zLTLvXKJmLBwqD/zni0OcJqToo/c RxVg35KKH4nykhIefrkxNUzy6WjfNJ9TJUpctMWnzuDPi+ofAgYSLz+nuyuKLxJRSet2 nsIPDOHgYBYLQkkCzl53A3Lm+IRStEgvTXKbpKSxs0K+5Fi2pGo9vjO4eXKe0nSvQ0cZ Wq6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=JTpQf7sMt9NLwDv0fAUwU2EA32MsJpiLJgXJ8737nsg=; b=NMVVF0sGFrUlooPQ/RtJ3lBjm/B9Arny9J0PHBQ/ZTyF9VUdScy6R20jvIDFIWVMn1 DGYo3hiyg8BE1BA5LQWs8s6afsLGM3dGj7ULzlSU6hFlMC+YjArqtihwaJ7LPHsrUkvF HL18RcLb2fVbHmjmzYJ1/y0zLeXRDXfkDXo4ek4K9TuzU9Wx/XNsqKuUEDFJT25jqHcR XTzmZet0U7p7xtqQtMSaD+37GoVyiMSlXTLt/c4T2phrPdgumUKF1Z6/3CajddsoVMcP letRk/yu0buX08bSynTtrjwQid+BsF2lMwNr/AZSy023Om/rJDqAvyr7uV4B4Xllapjh 7RAg== X-Gm-Message-State: AOAM530tev/HaAH7uA0KqT0m6tD/LgiMx6N9LX4Vshc3HvKfQKp9TTpA Gw5iCgLIIy53bccte0Qz00AC72uqE/M2uQ== X-Google-Smtp-Source: ABdhPJx0jVLYy5OgA1/trFkRvNs31FrB39hnDd/rYlVAUt8N2QRI9jD4iGvKLypSffqncxLl/Lr39n8gAymcHg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:7296:b0:14f:2a67:b400 with SMTP id d22-20020a170902729600b0014f2a67b400mr7633127pll.172.1646958348502; Thu, 10 Mar 2022 16:25:48 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:12 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-11-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 
10/26] KVM: x86/mmu: Use common code to free kvm_mmu_page structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use a common function to free kvm_mmu_page structs in the TDP MMU and the shadow MMU. This reduces the amount of duplicate code and is needed in subsequent commits that allocate and free kvm_mmu_pages for eager page splitting. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 8 ++++---- arch/x86/kvm/mmu/mmu_internal.h | 2 ++ arch/x86/kvm/mmu/tdp_mmu.c | 3 +-- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c12d5016f6dc..2dcafbef5ffc 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1669,11 +1669,8 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) percpu_counter_add(&kvm_total_used_mmu_pages, nr); } -static void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp) +void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp) { - MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); - hlist_del(&sp->hash_link); - list_del(&sp->link); free_page((unsigned long)sp->spt); if (!sp->role.direct) free_page((unsigned long)sp->gfns); @@ -2521,6 +2518,9 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, list_for_each_entry_safe(sp, nsp, invalid_list, link) { WARN_ON(!sp->role.invalid || sp->root_count); + MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); + hlist_del(&sp->hash_link); + list_del(&sp->link); kvm_mmu_free_shadow_page(sp); } } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index a0648e7ddd33..5f91e4d07a95 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -173,4 +173,6 @@ void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked); +void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp); + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 1a43f908d508..184874a82a1b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -64,8 +64,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) static void tdp_mmu_free_sp(struct kvm_mmu_page *sp) { - free_page((unsigned long)sp->spt); - kmem_cache_free(mmu_page_header_cache, sp); + kvm_mmu_free_shadow_page(sp); } /* From patchwork Fri Mar 11 00:25:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EBAE1C433EF for ; Fri, 11 Mar 2022 00:25:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345181AbiCKA0z (ORCPT ); Thu, 10 Mar 2022 19:26:55 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:48790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242902AbiCKA0w (ORCPT ); Thu, 10 Mar 2022 19:26:52 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB6AD1A1C7A for ; Thu, 10 Mar 2022 16:25:50 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id b4-20020a170902a9c400b001532ec9005aso769921plr.10 for ; Thu, 10 Mar 2022 16:25:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=zCJDCK8B0lL6JubhIyGfaHMZb5Qp7vvPx+3RInS6vEE=; b=bEo8CbzbHriKnkqNIGJfYKHguuXOz576eX2T4cHlipuA8XvhukRnyam2cGRz+vvTpw RKJ42EslOVMjlUZ5CqQ4Ro08rbQLot9XVom2AtVhJJsFRpPuXeGRD7+kE5bpRt8Qchq7 mMho4sxejFkUfAZbxPpKnDuYiyy3NfeZqL47umZAHwKfmYzRsaUKEQSMQF6uEpU395dt bTBuze3KrAItL51NXFJk8Yx6OLDDZZ+pkLmxqY+EN/KoiYLwEqHKxOtAT/Xc+klpzLNy L6eSziMkwAPaIASt9BXbcqjCoVM19WcMyU3KXGrS5/AslQMgCIEuuN29MySTWd/dS5VG L/aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=zCJDCK8B0lL6JubhIyGfaHMZb5Qp7vvPx+3RInS6vEE=; b=KU1TJtLMSLW/IztfxtJmu9BCdMTNg++CwwdL4T7oVxkvOkextgmzi3FoRlDdUAakUV KLwknDgkyePyfKjwTCJPFRMKuhYY5V6UlmVaWP8XYS7x9h8hTH9joEtjy9CoEoRIQITZ L0WxhzCivErYAxyT3/EWADx3SDmdD4Y13DPSlrh0/gpKctJgt6pqWK3MfiOtLcq79uQk 3FoD+vSn13bYtlLhW6ZLgL5tjzC7RNIk69BJJtytFRXhj8x7WNC/tPJuKoCwASx5WjAN shqTTcFHehTU2F6RGC3dlTM3SmhiKrsYucbE1MTMK4MaJN2I1SZHjUgWPZMP8GJV2mZd j6rw== X-Gm-Message-State: AOAM532qWDht+DVEU3g3jgqpicSzyWZZWu57Sz/l96uZj85KURrDpRyg i6KaHLiFtNWqgG8KE6ItpLv9NRYJ5TPZww== X-Google-Smtp-Source: ABdhPJz4xnTY2m6Isi4IBgbuJhluKtEZACOCW6bqOZj5r5ynVvnXZt7dHD/2bSBFHfhe17fyQtZKmIVyCTxRMw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:238f:b0:4f7:78b1:2f6b with SMTP id f15-20020a056a00238f00b004f778b12f6bmr4873058pfc.17.1646958350133; Thu, 10 Mar 2022 16:25:50 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:13 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-12-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 11/26] KVM: x86/mmu: Use common code to allocate kvm_mmu_page structs from vCPU caches From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Now that allocating a kvm_mmu_page struct is isolated to a helper function, it can be re-used in the TDP MMU. No functional change intended. 
Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 3 +-- arch/x86/kvm/mmu/mmu_internal.h | 1 + arch/x86/kvm/mmu/tdp_mmu.c | 8 +------- 3 files changed, 3 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 2dcafbef5ffc..4c8feaeb063d 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1704,8 +1704,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, - bool direct) +struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direct) { struct kvm_mmu_page *sp; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 5f91e4d07a95..d4e2de5f2a6d 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -173,6 +173,7 @@ void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked); +struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direct); void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp); #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 184874a82a1b..f285fd76717b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -269,13 +269,7 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm, static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) { - struct kvm_mmu_page *sp; - - sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); - set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - - return sp; + return kvm_mmu_alloc_shadow_page(vcpu, true); } static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep, From patchwork Fri Mar 11 00:25:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777180 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9447DC43217 for ; Fri, 11 Mar 2022 00:25:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345192AbiCKA04 (ORCPT ); Thu, 10 Mar 2022 19:26:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48610 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345185AbiCKA0y (ORCPT ); Thu, 10 Mar 2022 19:26:54 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 067A41A1C7E for ; Thu, 10 Mar 2022 16:25:52 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id n11-20020a170902d2cb00b0015331a5d02fso543072plc.12 for ; Thu, 10 Mar 2022 16:25:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=FuGhM+ADTv/ZcWx+5nd4XLS/9fWvtmM0zVjBEofzrxU=; b=Ji7DjPBUNinn477C4vHC7v0CuY0YoR6Yspro2QMZ2A3u86aqpm1X0/aNnOB3Onp3w2 HgDtdMQ+gjJdoqlarq8fbzeMvb9spnuW9GS9uYaCSHgwIzu15JEmHu79mAnZ1tLJFwPT oFowkvO9jBF3oyujgESdSYeqoe0gYjAcidMXMJAHHt6OtFUYjbz9/XwZHovpDi/VB3a2 
2w1CxktgeA3NrBx+qyKN/4Ey7VJIa/lVAfIOLkSA5mAPxkDNWBz2kAed064YBhLxUxUP f0JOgrZf/UTO6C022foFU+9duN7WQjx0LGEL0QoLxC6tLxyIOe8si9f3ojrRG/e9lbPJ RSag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=FuGhM+ADTv/ZcWx+5nd4XLS/9fWvtmM0zVjBEofzrxU=; b=x+hKc4tYrYgI73UB/d9yE4gDsSuNBvY6dROzcJlLVfQANzarL1m5KuwqrHdzqBMs4t aEro/OfyG+0wXXuGlRHKI3WTIXGlj5YMkAX7uEX2ykoxD4yu/058mDOqCtS6MTf5pgST /7vJ9g0nTmBnxjLOiDuhuu/RnbPrh6JtnBU8SP+dAH0HHPovX3cjsJ0ApfN93ByTuj0N qeiP4jmvt53b9OJtlg57ahmFmM9YhVAzkhXmCCyY0APhjaToGA2on0FOMFpVvcr3YJG3 2HG47BJEhTAWW51hCA9PiOcVH5QVnp2kwhFxTCnTo95Rj0pG1/YYSJLd6kEP5JILTxbp 6dlA== X-Gm-Message-State: AOAM530eq0QpqJWsCnoeIMr5Ay3KtGzzfhzV/0yqLaJRwhd3FU3OJJ1/ dE3pBdoKrkfCPBmvvJkhz5aGmTQznXarDQ== X-Google-Smtp-Source: ABdhPJwexzqWNekNQlu0Z8Q7kGjrMeWTrgWkJ92wr1Nrhoh86lXtdckWOSYIgTE9CJlEh1iQr24fdow+vmFtsQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:903:192:b0:151:8df9:6cdb with SMTP id z18-20020a170903019200b001518df96cdbmr7919327plg.20.1646958351495; Thu, 10 Mar 2022 16:25:51 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:14 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-13-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 12/26] KVM: x86/mmu: Pass const memslot to rmap_add() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org rmap_add() only uses the slot to call gfn_to_rmap() which takes a const memslot. No functional change intended. 
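For illustration only (the caller below is hypothetical and not part of the patch): with the const-qualified parameter, code that merely holds a read-only view of the memslot can add rmap entries without casting the qualifier away.

static void rmap_add_sketch(struct kvm_vcpu *vcpu,
			    const struct kvm_memory_slot *slot,
			    u64 *sptep, gfn_t gfn)
{
	/* Compiles cleanly now; previously rmap_add() required a non-const slot. */
	rmap_add(vcpu, slot, sptep, gfn);
}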
Reviewed-by: Ben Gardon Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 4c8feaeb063d..23c0a36ac93f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1596,7 +1596,7 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, #define RMAP_RECYCLE_THRESHOLD 1000 -static void rmap_add(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, +static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, u64 *spte, gfn_t gfn) { struct kvm_mmu_page *sp; From patchwork Fri Mar 11 00:25:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C216C433EF for ; Fri, 11 Mar 2022 00:25:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345195AbiCKA06 (ORCPT ); Thu, 10 Mar 2022 19:26:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345191AbiCKA05 (ORCPT ); Thu, 10 Mar 2022 19:26:57 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B6E341A2718 for ; Thu, 10 Mar 2022 16:25:53 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id mt1-20020a17090b230100b001beef010919so6768612pjb.7 for ; Thu, 10 Mar 2022 16:25:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=v7qCsbgsr/gR22mozQcF+sQ2jUkkViok2U86H9+3SCg=; b=sxbiv6bxYZKji+KMkwgFhZgmKilK+x7JKaeklm0aaY/0tPHo6PqSDrSP8HWopQYwNb nYjE55/Ylu41BjN1Wr8tR6Vj6dzqCF34FD3W/C18qtOwHO8+ZBiBG1Y+12IK5ZU+3UoP CC+BQNMbs5hN4UK+uFlYFs4kOqkifWM+N5qlJK7DTJ7axe2fdbrrMPmRjJrIVbzuWq0r 3k81g/OAwBL5gKGcgy6IkuE3EHezxxR9kafqxsj1rgsGhwauVhdI2uyKyMeYKz8leCWX I4Bqze8qQcKReoPNBhoTBprcMm2NQA/bx4Z1RgJ2wXF5sWAbMswIfJlKoVg2OV4VMNvY kgtw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=v7qCsbgsr/gR22mozQcF+sQ2jUkkViok2U86H9+3SCg=; b=Ab83tUkR5sJVeI8Lp2Qr++Rj1r1XRAj/qdC4DxyoHeTU3ZCh7lMhaX4jGFtlXRNcJU QcaGyzCwrSvyCDx4WEodfzOtzJZ8yv8ipwlQ8AnbNxRcPCVFjJMdglTd9Bz0NczV4UvE iE8T6wX9dHf+FicUKKBp47evZfAk1kzgnfIoTgos4/2YeDbwmPtxsJyy+aFR/KmyKzWP Nk5qejmyiHDKRql04XEJUQxlEzu5puwQ3b91/3wqDT/z46vYrJmZsUPstHLrCTEcqsDM ofHBUxH2TRpWjWsYHbG18SlPuzPX9IwkRzWpciCWYuod9JYeOmQeO67zOWxa6yaOTbrk 7yhQ== X-Gm-Message-State: AOAM5336tAMFhVm+ykvXA+JctAcoUtoZNJXy/nsLkjcprdL5WMXbUXzu GsmXvTD06IA8adpN6VunrPD+BS6dPXy17Q== X-Google-Smtp-Source: ABdhPJzvjuqM/MeknIQuH530q9QhRKE8ELO29h5EvZaK896mLaUdORuS4wDYPcFM+SuVielHfBcrazhJK1uDSA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:be18:b0:153:2444:9c1a with SMTP id r24-20020a170902be1800b0015324449c1amr4691462pls.152.1646958353220; Thu, 10 Mar 2022 16:25:53 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:15 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> 
Message-Id: <20220311002528.2230172-14-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 13/26] KVM: x86/mmu: Pass const memslot to init_shadow_page() and descendants From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use a const pointer so that init_shadow_page() can be called from contexts where we have a const pointer. No functional change intended. Reviewed-by: Ben Gardon Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_page_track.h | 2 +- arch/x86/kvm/mmu/mmu.c | 6 +++--- arch/x86/kvm/mmu/mmu_internal.h | 2 +- arch/x86/kvm/mmu/page_track.c | 4 ++-- arch/x86/kvm/mmu/tdp_mmu.c | 2 +- arch/x86/kvm/mmu/tdp_mmu.h | 2 +- 6 files changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index eb186bc57f6a..3a2dc183ae9a 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -58,7 +58,7 @@ int kvm_page_track_create_memslot(struct kvm *kvm, unsigned long npages); void kvm_slot_page_track_add_page(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode); void kvm_slot_page_track_remove_page(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 23c0a36ac93f..d7ad71be6c52 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -794,7 +794,7 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn) } static void account_shadowed(struct kvm *kvm, - struct kvm_memory_slot *slot, + const struct kvm_memory_slot *slot, struct kvm_mmu_page *sp) { gfn_t gfn; @@ -1373,7 +1373,7 @@ int kvm_cpu_dirty_log_size(void) } bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn, + const struct kvm_memory_slot *slot, u64 gfn, int min_level) { struct kvm_rmap_head *rmap_head; @@ -2151,7 +2151,7 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm_vcpu *vcpu, } static void init_shadow_page(struct kvm *kvm, struct kvm_mmu_page *sp, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, union kvm_mmu_page_role role) { struct hlist_head *sp_list; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index d4e2de5f2a6d..b6e22ba9c654 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -134,7 +134,7 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot, void kvm_mmu_gfn_disallow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn, + const struct kvm_memory_slot *slot, u64 gfn, int min_level); void kvm_flush_remote_tlbs_with_address(struct kvm *kvm, 
u64 start_gfn, u64 pages); diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 2e09d1b6249f..3e7901294573 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -84,7 +84,7 @@ int kvm_page_track_write_tracking_alloc(struct kvm_memory_slot *slot) return 0; } -static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn, +static void update_gfn_track(const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode, short count) { int index, val; @@ -112,7 +112,7 @@ static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn, * @mode: tracking mode, currently only write track is supported. */ void kvm_slot_page_track_add_page(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode) { diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index f285fd76717b..85b7bc333302 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1768,7 +1768,7 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, * Returns true if an SPTE was set and a TLB flush is needed. */ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, int min_level) { struct kvm_mmu_page *root; diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index 54bc8118c40a..8308bfa4126b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -42,7 +42,7 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *slot); bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, int min_level); void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm, From patchwork Fri Mar 11 00:25:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777182 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D713EC433EF for ; Fri, 11 Mar 2022 00:26:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345200AbiCKA1B (ORCPT ); Thu, 10 Mar 2022 19:27:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345185AbiCKA07 (ORCPT ); Thu, 10 Mar 2022 19:26:59 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65D4D1A270C for ; Thu, 10 Mar 2022 16:25:55 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id e13-20020a17090301cd00b00150145346f9so3533026plh.23 for ; Thu, 10 Mar 2022 16:25:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=trCyse6kc5VaWQEWZpkBTeyDB9juUOFht/HeyrHGEd0=; b=ITUQYvDkX2h3RQafxt+nm82/5781iHDiISy4DhTqmrro4/iNkPMsCBDlg8t8eB6+vW WoKsbJ7zeDrbOuwfTkcWOWIHcZkJGUw2d4r+98Smj6dBLngOlw908mgtDChj3UyeFl76 FyOJizpm4NSOQnGdgQxrZIN82+kY5NHA6X6TJmku0dL5yKt8RAtw/MBwBf3lvsO9ufHQ Dqd5AKP8PlHMHCKEXNTDH4MaONsDnd2Q9BRw9Kd7sIUZhnRm7oa9dS0kBigX9ayD9ejf 
EqqRiqfDXYoPw8fi+Fl3DZk+QJK6k5SwEwYznvLftMTTXWKnvbsNNkitPFK/djwTdGSR vCjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=trCyse6kc5VaWQEWZpkBTeyDB9juUOFht/HeyrHGEd0=; b=Tf0dTRuuHWTesVJ5BmTnMGPIXSl4H9Bhc2QVRV9MV5Hqdi7tTBmw+GMS2rTu3hU48U 1rwvmPZbumM4MQ4kiLEmQ6YnbIjsAWSaVdils+CXhvbe0dl0KEK/mjIRY/bG2EUfgYfC fpOZmdAWOy/cRb5d/w8R4yP2XL9PlbBlbYKS/nvd7FEvrLILOQk/P7uegCcnpeF9oF7T HO2x3Ixg08LLTbhKhzoXz4w4y4WsjVJA8S86YX+Eqwvvd9LlrrkrvUPanfYbmAUIB7xU N8UWzO/2FGLvQ824AZhKAfDb6MEshwdyE+KbDmg9pMtkyhA3w4HIXMG231FYcm6aFlm7 Qqfw== X-Gm-Message-State: AOAM530xMGt3ktjT3hXEqlEemAfTNli7nnJeqp1G0L7hwf8L2n5GzgvI YkO4GjWyzAVHzbLFx7OUcS4aCwqk7L5Ygg== X-Google-Smtp-Source: ABdhPJzvAwzKoOAunNI85/tiPMvRTJXBrV2pzYGx34nWHsviDIj876IlOqB4GDjd2lcrh2rYpRjUo90RjQTUhw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:1941:b0:1bf:5440:d716 with SMTP id nk1-20020a17090b194100b001bf5440d716mr7787622pjb.147.1646958354821; Thu, 10 Mar 2022 16:25:54 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:16 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-15-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 14/26] KVM: x86/mmu: Decouple rmap_add() and link_shadow_page() from kvm_vcpu From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow adding new entries to the rmap and linking shadow pages without a struct kvm_vcpu pointer by moving the implementation of rmap_add() and link_shadow_page() into inner helper functions. No functional change intended. Reviewed-by: Ben Gardon Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 43 +++++++++++++++++++++++++++--------------- 1 file changed, 28 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index d7ad71be6c52..c57070ed157d 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -725,9 +725,9 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } -static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_vcpu *vcpu) +static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_mmu_memory_cache *cache) { - return kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_pte_list_desc_cache); + return kvm_mmu_memory_cache_alloc(cache); } static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc) @@ -874,7 +874,7 @@ gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn, /* * Returns the number of pointers in the rmap chain, not counting the new one. 
*/ -static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, +static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, struct kvm_rmap_head *rmap_head) { struct pte_list_desc *desc; @@ -885,7 +885,7 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, rmap_head->val = (unsigned long)spte; } else if (!(rmap_head->val & 1)) { rmap_printk("%p %llx 1->many\n", spte, *spte); - desc = mmu_alloc_pte_list_desc(vcpu); + desc = mmu_alloc_pte_list_desc(cache); desc->sptes[0] = (u64 *)rmap_head->val; desc->sptes[1] = spte; desc->spte_count = 2; @@ -897,7 +897,7 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, while (desc->spte_count == PTE_LIST_EXT) { count += PTE_LIST_EXT; if (!desc->more) { - desc->more = mmu_alloc_pte_list_desc(vcpu); + desc->more = mmu_alloc_pte_list_desc(cache); desc = desc->more; desc->spte_count = 0; break; @@ -1596,8 +1596,10 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, #define RMAP_RECYCLE_THRESHOLD 1000 -static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) +static void __rmap_add(struct kvm *kvm, + struct kvm_mmu_memory_cache *cache, + const struct kvm_memory_slot *slot, + u64 *spte, gfn_t gfn) { struct kvm_mmu_page *sp; struct kvm_rmap_head *rmap_head; @@ -1606,15 +1608,21 @@ static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, sp = sptep_to_sp(spte); kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn); rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); - rmap_count = pte_list_add(vcpu, spte, rmap_head); + rmap_count = pte_list_add(cache, spte, rmap_head); if (rmap_count > RMAP_RECYCLE_THRESHOLD) { - kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); + kvm_unmap_rmapp(kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); kvm_flush_remote_tlbs_with_address( - vcpu->kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); + kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); } } +static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, + u64 *spte, gfn_t gfn) +{ + __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn); +} + bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { bool young = false; @@ -1682,13 +1690,13 @@ static unsigned kvm_page_table_hashfn(gfn_t gfn) return hash_64(gfn, KVM_MMU_HASH_SHIFT); } -static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu, +static void mmu_page_add_parent_pte(struct kvm_mmu_memory_cache *cache, struct kvm_mmu_page *sp, u64 *parent_pte) { if (!parent_pte) return; - pte_list_add(vcpu, parent_pte, &sp->parent_ptes); + pte_list_add(cache, parent_pte, &sp->parent_ptes); } static void mmu_page_remove_parent_pte(struct kvm_mmu_page *sp, @@ -2307,8 +2315,8 @@ static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator) __shadow_walk_next(iterator, *iterator->sptep); } -static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, - struct kvm_mmu_page *sp) +static void __link_shadow_page(struct kvm_mmu_memory_cache *cache, u64 *sptep, + struct kvm_mmu_page *sp) { u64 spte; @@ -2318,12 +2326,17 @@ static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, mmu_spte_set(sptep, spte); - mmu_page_add_parent_pte(vcpu, sp, sptep); + mmu_page_add_parent_pte(cache, sp, sptep); if (sp->unsync_children || sp->unsync) mark_unsync(sptep); } +static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp) +{ + __link_shadow_page(&vcpu->arch.mmu_pte_list_desc_cache, sptep, sp); +} + static 
void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned direct_access) { From patchwork Fri Mar 11 00:25:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E24A4C43217 for ; Fri, 11 Mar 2022 00:26:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345202AbiCKA1C (ORCPT ); Thu, 10 Mar 2022 19:27:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49090 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345199AbiCKA1A (ORCPT ); Thu, 10 Mar 2022 19:27:00 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 026F81A1C75 for ; Thu, 10 Mar 2022 16:25:57 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id f18-20020a623812000000b004f6a259bbf4so4193894pfa.7 for ; Thu, 10 Mar 2022 16:25:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=O7PTjGgVVBCrsjwnVURq9XzsqcZsMncDOf7hGGBSbd0=; b=kmeAXlYSaDpRLGcUmN4Sv/xTcYDW6jclnp1q1rJJ4dL2RoKdVl5SeZYJBr8GYWH5m+ DKD4iQBnIfLFGWX9cwaqDbad8l6EyX7jK4BZPjX14nJp+Wg87hDMUKTEmjaE3Zz6UHm0 NFQQxbrHmJK9MHiqOLJ7843mXQn7NgLsSzLX/7CnhTcMy0PzlTku645wfWBqEuMBl3yU EwRBFkfr3fYm5vTZkWqKJ8rysORKE9UEYN6dw6cLPFscwbhIaxSWiLkf9LiPAIIADKaf vypriHQ917TpN7rZX2KwCHN3Tpuv7SMZQiwX93ecC+tARg/gwyRbtujWRsbhB0yYIilw sbQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=O7PTjGgVVBCrsjwnVURq9XzsqcZsMncDOf7hGGBSbd0=; b=kQkg2sJyGtRIWqHEApxCC9iSc5s9TzliWUrMl3nj0KqYcDDS666GBYrNzugNTo5nzi /o96yQm5GEolOlvLKl1BYNyktuVWNBODtNlxSW/4667t/0j4KgCM69DrzgPP/1r1y3aY E02JI8vc57KoeDXRjh1JR09MhA6boO8a62Ki8CeRcQ6n/zfTMaxFR1lMAENUb5W/wBtV sSau1LRrjypbiYoyNkE8V85Np8o/Hc5J4UY0wvJAdnHFPqIDxXEQ1SE1tRn+f2Tb1V6c JqwJ5PtmDKGC3K5I1bsuIa/fi5icNQiQSDDUrtWgQnxsB9lmrmS3bh+ntGckkLP6DY1E 202Q== X-Gm-Message-State: AOAM533772PtrSV6FHfuFW6A4n6Tg30/V4Fr/qzY/VwUStK58ZmrA0s7 UOwNUlBR/HqltBfHGCYkBT7x9TCEHEH2Xg== X-Google-Smtp-Source: ABdhPJyfGZbmtHRGGR2TxbnIQFRzmM6E4in1hGACCBqL+EJ7gFLqMSHgs8hy0AVD/5cl522W64O1wnhoHBbOLQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:b92:b0:4f6:dfe0:9abb with SMTP id g18-20020a056a000b9200b004f6dfe09abbmr7383915pfj.68.1646958356428; Thu, 10 Mar 2022 16:25:56 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:17 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-16-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 15/26] KVM: x86/mmu: Update page stats in __rmap_add() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 
(KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Update the page stats in __rmap_add() rather than at the call site. This will avoid having to manually update page stats when splitting huge pages in a subsequent commit. No functional change intended. Reviewed-by: Ben Gardon Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c57070ed157d..73a7077f9991 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1607,6 +1607,8 @@ static void __rmap_add(struct kvm *kvm, sp = sptep_to_sp(spte); kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn); + kvm_update_page_stats(kvm, sp->role.level, 1); + rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); rmap_count = pte_list_add(cache, spte, rmap_head); @@ -2847,7 +2849,6 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (!was_rmapped) { WARN_ON_ONCE(ret == RET_PF_SPURIOUS); - kvm_update_page_stats(vcpu->kvm, level, 1); rmap_add(vcpu, slot, sptep, gfn); } From patchwork Fri Mar 11 00:25:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777184 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DA35C433F5 for ; Fri, 11 Mar 2022 00:26:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345210AbiCKA1E (ORCPT ); Thu, 10 Mar 2022 19:27:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345205AbiCKA1D (ORCPT ); Thu, 10 Mar 2022 19:27:03 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9EA521A1C70 for ; Thu, 10 Mar 2022 16:25:58 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id m9-20020a17090ade0900b001bedf2d1d4cso6787199pjv.2 for ; Thu, 10 Mar 2022 16:25:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=ka577/Q99IMPRmJNwySEEB2HMNKyat5/nl7Ts9gRTDM=; b=MmYv+Rj5M/jSA2Ff3Se9bCgVKVRkMDvLGNqA7/gNeGpPE0ht9lEIJgS6sNOVq+Isze t/ymOK2sA9B+I6DLxG3QrvMdNOMB/bjitaErkmLNQdjFHfhqwPodK3Fhn1Iy2OHyZ19J 5pvzZRrcyKDFIzF+9U3G3rHK1TKLHSkn75JCz3/eLj1NRJMZjsJjl7asO8iIxU+bqDR1 AYieclXsjJ2XD32DoSZzMdRBwub12+YWqqm58qZ5yAEkYAG//8Tbv+dJgpHwu7EWTEsg NrL7n22PSbZf8iAQ9yMWfEsxDeZ/uzEubqfKjv2lpliaUMobGoZLYL4e5QsrRZFNOD0Q PuoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=ka577/Q99IMPRmJNwySEEB2HMNKyat5/nl7Ts9gRTDM=; b=7f4TOoMgNlA4V1c2NCI7VKR2bSoHnyQl+1Ygwlw9VJiThMB4Q5u5hWkfhsLdF6q0eQ 4K8Da2C78bqjt7mxf5GhwNrj9lD6C05QSTE81BGjOBbHtEXqxitEIypazmmHB7FPCmQb tZ3kafcwQT2PBXiykb1Lh0gQ4ZtuPs0epONH6dtbnWwE82I45pnFRarJvyUdzi15w3zD 
/0N9rVvpMmjP9xZVwo29uKtaq9d21XUR6bp/soWHjE6qCLJ2f65Rwq+9mMuF1t7Ua8Hr e6JKjyX6CWnfrRwcbTHkVJnGkaPZMKwVEyE1WwaxWis1xrTSfGyBv6y5EIyqfBjVCCaD JdUw== X-Gm-Message-State: AOAM5332+yiITwN36kUjZMYG/HzQFeqPGpupEjBmWCLMKQ6+xrMDP+yx nZEbgKv/83MxQ1rqDBZ91hQazI3tcAL6Aw== X-Google-Smtp-Source: ABdhPJxl1HbbReF9yQU4eyO9gbHYGhz96P3jHX6CHhtVHS6VGFBY3jktqYgtMbP+C8rBBM+D/bObxNupHpMttA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:903:120e:b0:151:71e4:dadf with SMTP id l14-20020a170903120e00b0015171e4dadfmr7886863plh.48.1646958358116; Thu, 10 Mar 2022 16:25:58 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:18 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-17-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 16/26] KVM: x86/mmu: Cache the access bits of shadowed translations From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to split a huge page we need to know what access bits to assign to the role of the new child page table. This can't be easily derived from the huge page SPTE itself since KVM applies its own access policies on top, such as for HugePage NX. We could walk the guest page tables to determine the correct access bits, but that is difficult to plumb outside of a vCPU fault context. Instead, we can store the original access bits for each leaf SPTE alongside the GFN in the gfns array. The access bits only take up 3 bits, which leaves 61 bits left over for gfns, which is more than enough. So this change does not require any additional memory. In order to keep the access bit cache in sync with the guest, we have to extend FNAME(sync_page) to also update the access bits. Now that the gfns array caches more information than just GFNs, rename it to shadowed_translation. 
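A minimal sketch of the packed entry (the field widths match the struct added to mmu_internal.h in the diff below; the helper is illustrative, as the patch open-codes these assignments): each 64-bit slot that previously held only a gfn now also carries the 3 guest access bits, so the array's memory footprint is unchanged.

struct shadowed_translation_entry {
	u64 access:3;	/* guest-provided ACC_* permission bits for this SPTE */
	u64 gfn:56;	/* shadowed guest GFN */
};

/* Illustrative helper: update one cached translation entry. */
static inline void cache_shadowed_translation(struct shadowed_translation_entry *e,
					      gfn_t gfn, u32 access)
{
	BUILD_BUG_ON(sizeof(*e) != sizeof(u64));	/* same size as the old gfn_t entry */
	e->gfn = gfn;
	e->access = access;
}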
Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 32 +++++++++++++++++++------------- arch/x86/kvm/mmu/mmu_internal.h | 15 +++++++++++++-- arch/x86/kvm/mmu/paging_tmpl.h | 7 +++++-- 4 files changed, 38 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f72e80178ffc..0f5a36772bdc 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -694,7 +694,7 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; struct kvm_mmu_memory_cache mmu_shadow_page_cache; - struct kvm_mmu_memory_cache mmu_gfn_array_cache; + struct kvm_mmu_memory_cache mmu_shadowed_translation_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 73a7077f9991..89a7a8d7a632 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -708,7 +708,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect) if (r) return r; if (maybe_indirect) { - r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_gfn_array_cache, + r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_translation_cache, PT64_ROOT_MAX_LEVEL); if (r) return r; @@ -721,7 +721,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_gfn_array_cache); + kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_translation_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } @@ -738,15 +738,17 @@ static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc) static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index) { if (!sp->role.direct) - return sp->gfns[index]; + return sp->shadowed_translation[index].gfn; return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS)); } -static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn) +static void kvm_mmu_page_set_gfn_access(struct kvm_mmu_page *sp, int index, + gfn_t gfn, u32 access) { if (!sp->role.direct) { - sp->gfns[index] = gfn; + sp->shadowed_translation[index].gfn = gfn; + sp->shadowed_translation[index].access = access; return; } @@ -1599,14 +1601,14 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, static void __rmap_add(struct kvm *kvm, struct kvm_mmu_memory_cache *cache, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) + u64 *spte, gfn_t gfn, u32 access) { struct kvm_mmu_page *sp; struct kvm_rmap_head *rmap_head; int rmap_count; sp = sptep_to_sp(spte); - kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn); + kvm_mmu_page_set_gfn_access(sp, spte - sp->spt, gfn, access); kvm_update_page_stats(kvm, sp->role.level, 1); rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); @@ -1620,9 +1622,9 @@ static void __rmap_add(struct kvm *kvm, } static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) + u64 *spte, gfn_t gfn, u32 access) { - __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn); + __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn, access); } bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) @@ -1683,7 +1685,7 @@ void kvm_mmu_free_shadow_page(struct kvm_mmu_page *sp) { free_page((unsigned long)sp->spt); if (!sp->role.direct) - free_page((unsigned long)sp->gfns); + 
free_page((unsigned long)sp->shadowed_translation); kmem_cache_free(mmu_page_header_cache, sp); } @@ -1720,8 +1722,12 @@ struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direc sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + + BUILD_BUG_ON(sizeof(sp->shadowed_translation[0]) != sizeof(u64)); + if (!direct) - sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache); + sp->shadowed_translation = + kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadowed_translation_cache); set_page_private(virt_to_page(sp->spt), (unsigned long)sp); @@ -1733,7 +1739,7 @@ struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direc * * Huge page splitting always uses direct shadow pages since the huge page is * being mapped directly with a lower level page table. Thus there's no need to - * allocate the gfns array. + * allocate the shadowed_translation array. */ struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked) { @@ -2849,7 +2855,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (!was_rmapped) { WARN_ON_ONCE(ret == RET_PF_SPURIOUS); - rmap_add(vcpu, slot, sptep, gfn); + rmap_add(vcpu, slot, sptep, gfn, pte_access); } return ret; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index b6e22ba9c654..c5b8ee625df7 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -32,6 +32,11 @@ extern bool dbg; typedef u64 __rcu *tdp_ptep_t; +struct shadowed_translation_entry { + u64 access:3; + u64 gfn:56; +}; + struct kvm_mmu_page { /* * Note, "link" through "spt" fit in a single 64 byte cache line on @@ -53,8 +58,14 @@ struct kvm_mmu_page { gfn_t gfn; u64 *spt; - /* hold the gfn of each spte inside spt */ - gfn_t *gfns; + /* + * For indirect shadow pages, caches the result of the intermediate + * guest translation being shadowed by each SPTE. + * + * NULL for direct shadow pages. + */ + struct shadowed_translation_entry *shadowed_translation; + /* Currently serving as active root */ union { int root_count; diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 55cac59b9c9b..128eccadf1de 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -1014,7 +1014,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, } /* - * Using the cached information from sp->gfns is safe because: + * Using the information in sp->shadowed_translation is safe because: * - The spte has a reference to the struct page, so the pfn for a given gfn * can't change unless all sptes pointing to it are nuked first. 
* @@ -1088,12 +1088,15 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp) if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access)) continue; - if (gfn != sp->gfns[i]) { + if (gfn != sp->shadowed_translation[i].gfn) { drop_spte(vcpu->kvm, &sp->spt[i]); flush = true; continue; } + if (pte_access != sp->shadowed_translation[i].access) + sp->shadowed_translation[i].access = pte_access; + sptep = &sp->spt[i]; spte = *sptep; host_writable = spte & shadow_host_writable_mask; From patchwork Fri Mar 11 00:25:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777185 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B251AC433EF for ; Fri, 11 Mar 2022 00:26:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345214AbiCKA1K (ORCPT ); Thu, 10 Mar 2022 19:27:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345196AbiCKA1F (ORCPT ); Thu, 10 Mar 2022 19:27:05 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE5181A2722 for ; Thu, 10 Mar 2022 16:26:00 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id t10-20020a17090a5d8a00b001bed9556134so6765481pji.5 for ; Thu, 10 Mar 2022 16:26:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=cgeEs8BEwlCwQg4uwdq2WGFHwIysPh6phqJCR0F6k5E=; b=rYx7Jqj6Gy3597TptoRhJdrqs/Hp8iKZ54PUtCCP1XWegMJL6/XfhmdhJT9uBJQiWH ZE5h7nYGSmoHWR3jekHywiWqVqzYnIn3iMjZLs1yTffpVYvhuqxw7VtoRdHcmknboDZV JaGspNzGf5/qUKHlZtXWJwiljrvD4FR+upssqyDkEZueUr53Bv3WvwOj/zzr8mYfrJiq k7f2lf8q256HKL3Wuv3+j022awWEg8sSZdUIR/trB9Yyd6Z8krtbJm0h2i+oQ8aXL1QH 3OLc2hD77uQGQ4LMYuHeQex2C+6qz2UP8W2rChJhSKxk3mMJLVkbpnGVBmgnfN6VoZen sB1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=cgeEs8BEwlCwQg4uwdq2WGFHwIysPh6phqJCR0F6k5E=; b=FJaubvKAPrXJt6cZ7jOo6B/FpTy/q3I5WZH/GFj3iOMMuMmrodkCjmoai3lZD1JcEk BCK13UPXoZxvCoAtL8o4Yfem093LsHAThxANls95Gj3o0RHRz+AnxphH21ewGXip5XtZ PCxcvvgL5aloc8nZ1qMYVUEX0AjdPLtp+l9LZQG1NndeBstmBb07wjDy/NYDjdmkV4su 2a8Z/gb+3/s2RjZu2g/tj5r8i8tHSGQgPTwHEuhdH1cjo0WGWyhVrJmw3Lphosb2yEDp D6VlYwqwhIWR880C8BFh5GjgcXySrvIzfB6uCQZxrp1PdlcGzMzerAuqk63akq95xwJE 65QA== X-Gm-Message-State: AOAM531gsihPZ242PIxUc/lcLkaeOJxOfK08XPD1w2/ennK85jPad5YW ymKRWhRTCd7oevTsAy1aqGcedK9a73lpaA== X-Google-Smtp-Source: ABdhPJy4jPXZbommb1140O0m7nX/f+s6PPe4tO5XvxEJ5QF7X5XtVfhvTF/UqH4vCyBIyTH3kSFZezPw37Zl0Q== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:1e10:b0:1bf:6c78:54a9 with SMTP id pg16-20020a17090b1e1000b001bf6c7854a9mr321715pjb.1.1646958359778; Thu, 10 Mar 2022 16:25:59 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:19 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-18-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> 
X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 17/26] KVM: x86/mmu: Pass access information to make_huge_page_split_spte() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently make_huge_page_split_spte() assumes execute permissions can be granted to any 4K SPTE when splitting huge pages. This is true for the TDP MMU but is not necessarily true for the shadow MMU. Huge pages mapped by the shadow MMU may be shadowing huge pages that the guest has disallowed execute permissions. No functional change intended. Reviewed-by: Ben Gardon Signed-off-by: David Matlack --- arch/x86/kvm/mmu/spte.c | 5 +++-- arch/x86/kvm/mmu/spte.h | 3 ++- arch/x86/kvm/mmu/tdp_mmu.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c index d10189d9c877..7294f95464a7 100644 --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -216,7 +216,8 @@ static u64 make_spte_executable(u64 spte) * This is used during huge page splitting to build the SPTEs that make up the * new page table. */ -u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index) +u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index, + unsigned int access) { u64 child_spte; int child_level; @@ -244,7 +245,7 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index) * When splitting to a 4K page, mark the page executable as the * NX hugepage mitigation no longer applies. */ - if (is_nx_huge_page_enabled()) + if (is_nx_huge_page_enabled() && (access & ACC_EXEC_MASK)) child_spte = make_spte_executable(child_spte); } diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 73f12615416f..c7ccdd5c440d 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -415,7 +415,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool prefetch, bool can_unsync, bool host_writable, u64 *new_spte); -u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index); +u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index, + unsigned int access); u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled); u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access); u64 mark_spte_for_access_track(u64 spte); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 85b7bc333302..541b145b2df2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1430,7 +1430,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter, * not been linked in yet and thus is not reachable from any other CPU. 
*/ for (i = 0; i < PT64_ENT_PER_PAGE; i++) - sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i); + sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i, ACC_ALL); /* * Replace the huge spte with a pointer to the populated lower level From patchwork Fri Mar 11 00:25:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777186 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3D56C433FE for ; Fri, 11 Mar 2022 00:26:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238498AbiCKA1O (ORCPT ); Thu, 10 Mar 2022 19:27:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345222AbiCKA1M (ORCPT ); Thu, 10 Mar 2022 19:27:12 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9333D1A1C7A for ; Thu, 10 Mar 2022 16:26:02 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id q7-20020a63e207000000b003801b9bb18dso3777929pgh.15 for ; Thu, 10 Mar 2022 16:26:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=sbKWGus3ebobJm3+FVUoYXRe2tQF3TQMqNwj5vm+oQs=; b=rVXfocbRBiQbddq7Mymh8A5+rpjWdaiL9E9VqTuaVfs85neIgNVqmcY2867zAHwnJa 9TkxrBsa1FXUF6fizdhytevf03ZL9ZGE36n9+/W481j9IRCF6RPrAqCOuCHs8jZNwKIx WZ8wZGpBYjcCDJjxR4zZSW6m8K6/KkXuyxJuluinWK57EZ80Yg2P+y55gHFkhA6w9Oa1 ZjUrekfxltWEIcYp6tKHwVmxG/FoOfIojaUB8GXZiciN0zPQtoYypB9jwqwoqblMNs58 x+sWxZzzU0AXvB2UhlDpsSCnQfHpWc1MNB/G1fw4SMqXvl1aRl6tolbMQZm49L18DnHb iqyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=sbKWGus3ebobJm3+FVUoYXRe2tQF3TQMqNwj5vm+oQs=; b=HnjovCkYfyHrMlzxNRvXUe9Ha8uecYqY04F2C9oSecVjaD8j8FP3nJmfSH19kcw1AE h55Nn1u42ADjwpvV6nF7eV1tK5LvzP8aBgT2apfmjx5sVNva0CyvED0qkkF93B1DuE57 RQgsG/4b8Kfe0+EKR4uNnmv/h6ct/kXCSKQc00AJgHCqO3rmJMRlwsv1RQ6WUApHLq3C vhUI7/fdA3HsMdzFwfinO/LIkC/GN75XCc9bswAAZLzRSYWeF6UFsuBsukIA4auK4kPy wS8LeBNeoWdGWIkHhCMPLGoU1TJM6RmGOivSc1SOPLrVQ7agk8fElwGD5wNdjfuCItpW GCyg== X-Gm-Message-State: AOAM532F3lHCrwqL3tzVwpYo08E0IptSJbhcuJZ0O63Jauf8ERNDdFzO yCQblkLJLF4mbkH3KCVIG4EjOZmyW6TFng== X-Google-Smtp-Source: ABdhPJzlF0dOeLQBpCrN6EQ/1uv182Ojcd8GXVsCcPw3S8P1mroh+hcuj+7wbKMkJW298Mx+vaz9w72iGDnQxg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:890:b0:4f6:686e:a8a9 with SMTP id q16-20020a056a00089000b004f6686ea8a9mr7226414pfj.83.1646958361665; Thu, 10 Mar 2022 16:26:01 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:20 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-19-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 18/26] KVM: x86/mmu: Zap collapsible SPTEs at all levels in the shadow MMU From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , 
Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e. in the rmap). This is fine for now KVM never creates intermediate huge pages during dirty logging, i.e. a 1GiB page is never partially split to a 2MiB page. However, this will stop being true once the shadow MMU participates in eager page splitting, which can in fact leave behind partially split huge pages. In preparation for that change, change the shadow MMU to iterate over all necessary levels when zapping collapsible SPTEs. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 89a7a8d7a632..2032be3edd71 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6142,18 +6142,30 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm, return need_tlb_flush; } +static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm, + const struct kvm_memory_slot *slot) +{ + bool flush; + + /* + * Note, use KVM_MAX_HUGEPAGE_LEVEL - 1 since there's no need to zap + * pages that are already mapped at the maximum possible level. + */ + flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte, + PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, + true); + + if (flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + +} + void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *slot) { if (kvm_memslots_have_rmaps(kvm)) { write_lock(&kvm->mmu_lock); - /* - * Zap only 4k SPTEs since the legacy MMU only supports dirty - * logging at a 4k granularity and never creates collapsible - * 2m SPTEs during dirty logging. 
- */ - if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true)) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + kvm_rmap_zap_collapsible_sptes(kvm, slot); write_unlock(&kvm->mmu_lock); } From patchwork Fri Mar 11 00:25:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777187 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E253FC433F5 for ; Fri, 11 Mar 2022 00:26:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345196AbiCKA1P (ORCPT ); Thu, 10 Mar 2022 19:27:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235520AbiCKA1N (ORCPT ); Thu, 10 Mar 2022 19:27:13 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 079B31A2712 for ; Thu, 10 Mar 2022 16:26:03 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id t62-20020a635f41000000b0037c9ae5fb8bso3766725pgb.19 for ; Thu, 10 Mar 2022 16:26:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=25+KDsWb88cITvAkDtK/k2hukUBdQk6h1+nWravTP4A=; b=K0TPB3uKAzhv2V8uMMeBatFQY8hZWLm9Du6sCL0+SuvLt4cuZsnxgLsdX/S1nOvVtD fOJgb/6kxC900RYeJvh122bm8r8COP0dUdAfk7iS0A5hw7OsHXDqi9bwu6Fcka2oVXTA wP2mdyJD6Bh+sA7+tTKBHIc8ZtBem1Ya8VZxq5wGKapUL1k9nJoMFuAfBjE8WKCxiWN1 OlgrdMpnZH9W0FOCWtVCHjw1n/zefXLlVMhEeaAI5THutPCrG55DAjR14Ub0YTKdRBav 4LhtJZs/D/q8/J1cBMRrkXueu0GUwMdyz890TM3Fupl3hdRaVYavWFTBmZ/OW44pfIAC zJdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=25+KDsWb88cITvAkDtK/k2hukUBdQk6h1+nWravTP4A=; b=f70+PgrRu3uVGSrrhTF839iiBXir7GfU0BoD9d/6YVHiXsaqaHvn3gcPnyik5bltj2 TMeth72K7rP67y1bPDVtWJWzF3f8qwfGSs3CQyW11KBBjuKEywn7tizUlzyazshh46qZ 3PPdb8vD8nSqBvgtQegYo6fEZPsndJzkVhjzSkJdw9Bxk7G1MJzOrTBUyehgU4IGkfje xgmBS2/W4c9ayzI4eld7gWhPhs4ovQGjJ2+dtxtfjvk4Tug1AosKkJaWdPvjo1CNjZi/ wVcd3FfdrGy2mn1+IlCPE99GvpBBb+Mdtct6vAJu1SxoVcM/8JpO5KmxwA/pkOUfA2zX Q/ug== X-Gm-Message-State: AOAM532vjI7/EVcRqW+mdb/8DCuKQn7PfiWNH7OaDEWmzNdwS4sbJLrX kTgJdHPIfY2TwQvjC6owNK6imh0rFTo3Wg== X-Google-Smtp-Source: ABdhPJzQQNzCUZydl9KF0V/z0eslYsBHitU/RkIzZ5f2qOcoErUZJMAOQ9EJr1dof3fsKDR0fIjt+SaohHYUBw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:6a86:b0:151:f1c5:2fa3 with SMTP id n6-20020a1709026a8600b00151f1c52fa3mr7493244plk.77.1646958363421; Thu, 10 Mar 2022 16:26:03 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:21 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-20-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 19/26] KVM: x86/mmu: Refactor drop_large_spte() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew 
Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org drop_large_spte() drops a large SPTE if it exists and then flushes TLBs. Its helper function, __drop_large_spte(), does the drop without the flush. In preparation for eager page splitting, which will need to sometimes flush when dropping large SPTEs (and sometimes not), push the flushing logic down into __drop_large_spte() and add a bool parameter to control it. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Peter Xu --- arch/x86/kvm/mmu/mmu.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 2032be3edd71..926ddfaa9e1a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1150,28 +1150,29 @@ static void drop_spte(struct kvm *kvm, u64 *sptep) rmap_remove(kvm, sptep); } - -static bool __drop_large_spte(struct kvm *kvm, u64 *sptep) +static void __drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush) { - if (is_large_pte(*sptep)) { - WARN_ON(sptep_to_sp(sptep)->role.level == PG_LEVEL_4K); - drop_spte(kvm, sptep); - return true; - } + struct kvm_mmu_page *sp; - return false; -} + if (!is_large_pte(*sptep)) + return; -static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep) -{ - if (__drop_large_spte(vcpu->kvm, sptep)) { - struct kvm_mmu_page *sp = sptep_to_sp(sptep); + sp = sptep_to_sp(sptep); + WARN_ON(sp->role.level == PG_LEVEL_4K); - kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn, + drop_spte(kvm, sptep); + + if (flush) { + kvm_flush_remote_tlbs_with_address(kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); } } +static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep) +{ + return __drop_large_spte(vcpu->kvm, sptep, true); +} + /* * Write-protect on the specified @sptep, @pt_protect indicates whether * spte write-protection is caused by protecting shadow page table. 
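To make the new flush parameter concrete, the deferred-flush pattern this refactor enables can be sketched in plain userspace C. This is not KVM code: the struct, function names, and printouts below are invented stand-ins, and the real callers are the existing fault path (immediate flush) and, in a later patch of this series, eager page splitting (batched flush).

#include <stdbool.h>
#include <stdio.h>

struct fake_kvm {
        int pending_drops;      /* stand-in for "TLB entries that may now be stale" */
};

static void flush_remote_tlbs(struct fake_kvm *kvm)
{
        printf("flush TLBs (%d drop(s) batched)\n", kvm->pending_drops);
        kvm->pending_drops = 0;
}

/* Analogue of __drop_large_spte(): do the drop, optionally flush. */
static void drop_one(struct fake_kvm *kvm, int spte, bool flush)
{
        kvm->pending_drops++;
        printf("dropped large spte %d\n", spte);
        if (flush)
                flush_remote_tlbs(kvm);
}

int main(void)
{
        struct fake_kvm kvm = { 0 };

        /* Existing callers keep the old behaviour: drop and flush immediately. */
        drop_one(&kvm, 1, true);

        /* A batch caller (eager page splitting) defers, then flushes once. */
        for (int spte = 2; spte <= 4; spte++)
                drop_one(&kvm, spte, false);
        flush_remote_tlbs(&kvm);

        return 0;
}

The design point is simply that the policy (flush now vs. flush later) moves to the caller while the mechanics stay in one helper.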
From patchwork Fri Mar 11 00:25:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777188 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7BECAC433EF for ; Fri, 11 Mar 2022 00:26:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345217AbiCKA1Q (ORCPT ); Thu, 10 Mar 2022 19:27:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49508 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345216AbiCKA1P (ORCPT ); Thu, 10 Mar 2022 19:27:15 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A7061A274E for ; Thu, 10 Mar 2022 16:26:05 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 196-20020a6307cd000000b0038027886594so3790052pgh.4 for ; Thu, 10 Mar 2022 16:26:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=WA6HST15t1Zfp54ZQZHtPJ91hXGCHATf6ynvL/uqJAw=; b=a47vI4e5DCLaCtH52PqEaW29IZ49EU6bu9Rtjt5nca1GCTKOBl18RsRsz75mldZGaG XnFfgd02wvDk2wnozXhQbnICqjNQs17WnM/32z1+uCSsnejAyzh8b0lDcVaydkV54BuQ V3vkbcLloBzuDdED20UrMdE3Il0cIOBfanM0k5idb7yf57pB7WbDwd0NOg5s/9YoDsXc KHFuZ//SEJSl+hkOJCD4FX9bUeeL1FUDpeJTzgxJ5j3IzEcRqlj5a846ofSfsEvxbauK 2oj/JLoCtYAnLAYlhHcJSawLqdh5mHiPv0oIRiaTtKtya1rdgCovUPMxIz7AwSjgxD0M 1qbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=WA6HST15t1Zfp54ZQZHtPJ91hXGCHATf6ynvL/uqJAw=; b=xx6/zj5MDfMvedMwyp1G0ZWfkT/OZRXf1PRMnY0/RBqwOYpq7rGFI/rwWNi1lSOd+f Ei1OsHEpqtiQxUIyJMwJxzE7rG/Ye8ANSHdjv89bWGSADZU6XJQqfxvcDPwwZPhVcvsO OZj6MfLg2O7EVdibMg6JIg0cACqiZK42o9rhYiu4DEHlyFgQQ2vUbwJ2U7NB3Rfi8THp 72oS9ExS4CrosrL9belQ9aSaiZ9bZVFI+8L1ZFUZmEndntpqFTWuOcU5bJWb57XPVICz DHP1hCPUebErELG9fvKZl+MkkdbhvCfbrUoEIWPrafbGEjYSsncFGQUsfCx+NhLIe5A1 xP8g== X-Gm-Message-State: AOAM533bJrMp404ek0idQF9sVNg36qld4FCWZRB63MH4/zlWorK1uHb6 7w8g13vhCcGodbeQZz2KdZ6MWfzXySUbzw== X-Google-Smtp-Source: ABdhPJwYWz8wUmi2nMsvX7GLlR+W5kd6r41rJtex5GTM+ljEkO3lCJi7MRe0hudTcE7ycgRLMGPLjQQzzRCeJg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90a:c984:b0:1bf:aee2:3503 with SMTP id w4-20020a17090ac98400b001bfaee23503mr7712377pjt.28.1646958365095; Thu, 10 Mar 2022 16:26:05 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:22 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-21-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 20/26] KVM: x86/mmu: Extend Eager Page Splitting to the shadow MMU From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , 
"open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Extend KVM's eager page splitting to also split huge pages that are mapped by the shadow MMU. Specifically, walk through the rmap splitting all 1GiB pages to 2MiB pages, and splitting all 2MiB pages to 4KiB pages. Splitting huge pages mapped by the shadow MMU requries dealing with some extra complexity beyond that of the TDP MMU: (1) The shadow MMU has a limit on the number of shadow pages that are allowed to be allocated. So, as a policy, Eager Page Splitting refuses to split if there are KVM_MIN_FREE_MMU_PAGES or fewer pages available. (2) Huge pages may be mapped by indirect shadow pages which have the possibility of being unsync. As a policy we opt not to split such pages as their translation may no longer be valid. (3) Splitting a huge page may end up re-using an existing lower level shadow page tables. This is unlike the TDP MMU which always allocates new shadow page tables when splitting. This commit does *not* handle such aliasing and opts not to split such huge pages. (4) When installing the lower level SPTEs, they must be added to the rmap which may require allocating additional pte_list_desc structs. This commit does *not* handle such cases and instead opts to leave such lower-level SPTEs non-present. In this situation TLBs must be flushed before dropping the MMU lock as a portion of the huge page region is being unmapped. Suggested-by: Peter Feiner [ This commit is based off of the original implementation of Eager Page Splitting from Peter in Google's kernel from 2016. ] Signed-off-by: David Matlack --- .../admin-guide/kernel-parameters.txt | 3 - arch/x86/kvm/mmu/mmu.c | 307 ++++++++++++++++++ 2 files changed, 307 insertions(+), 3 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 05161afd7642..495f6ac53801 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2360,9 +2360,6 @@ the KVM_CLEAR_DIRTY ioctl, and only for the pages being cleared. - Eager page splitting currently only supports splitting - huge pages mapped by the TDP MMU. - Default is Y (on). kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface. 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 926ddfaa9e1a..dd56b5b9624f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -727,6 +727,11 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_mmu_memory_cache *cache) { + static const gfp_t gfp_nocache = GFP_ATOMIC | __GFP_ACCOUNT | __GFP_ZERO; + + if (WARN_ON_ONCE(!cache)) + return kmem_cache_alloc(pte_list_desc_cache, gfp_nocache); + return kvm_mmu_memory_cache_alloc(cache); } @@ -743,6 +748,28 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index) return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS)); } +static gfn_t sptep_to_gfn(u64 *sptep) +{ + struct kvm_mmu_page *sp = sptep_to_sp(sptep); + + return kvm_mmu_page_get_gfn(sp, sptep - sp->spt); +} + +static unsigned int kvm_mmu_page_get_access(struct kvm_mmu_page *sp, int index) +{ + if (!sp->role.direct) + return sp->shadowed_translation[index].access; + + return sp->role.access; +} + +static unsigned int sptep_to_access(u64 *sptep) +{ + struct kvm_mmu_page *sp = sptep_to_sp(sptep); + + return kvm_mmu_page_get_access(sp, sptep - sp->spt); +} + static void kvm_mmu_page_set_gfn_access(struct kvm_mmu_page *sp, int index, gfn_t gfn, u32 access) { @@ -912,6 +939,9 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, return count; } +static struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level, + const struct kvm_memory_slot *slot); + static void pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head, struct pte_list_desc *desc, int i, @@ -2125,6 +2155,23 @@ static struct kvm_mmu_page *__kvm_mmu_find_shadow_page(struct kvm *kvm, return sp; } +static struct kvm_mmu_page *kvm_mmu_find_direct_sp(struct kvm *kvm, gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + LIST_HEAD(invalid_list); + + BUG_ON(!role.direct); + + sp = __kvm_mmu_find_shadow_page(kvm, gfn, role, &invalid_list); + + /* Direct SPs are never unsync. */ + WARN_ON_ONCE(sp && sp->unsync); + + kvm_mmu_commit_zap_page(kvm, &invalid_list); + return sp; +} + /* * Looks up an existing SP for the given gfn and role if one exists. The * return SP is guaranteed to be synced. 
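A quick, standalone illustration of the access lookup just above: direct shadow pages carry a single access value in their role, while indirect pages read it from the per-index shadowed_translation entry, which this series packs as a 3-bit access plus a 56-bit gfn in one 64-bit word. The structures below are simplified stand-ins, not KVM's definitions.

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct example_translation_entry {
        uint64_t access : 3;    /* guest-granted r/w/x bits */
        uint64_t gfn    : 56;   /* shadowed guest frame number */
};

struct example_sp {
        bool direct;
        unsigned int role_access;
        struct example_translation_entry *shadowed_translation; /* NULL when direct */
};

static unsigned int example_get_access(const struct example_sp *sp, int index)
{
        if (!sp->direct)
                return sp->shadowed_translation[index].access;
        return sp->role_access;
}

int main(void)
{
        struct example_translation_entry entries[2] = {
                { .access = 0x7, .gfn = 0x1234 },
                { .access = 0x3, .gfn = 0x1235 },
        };
        struct example_sp indirect = { .direct = false, .role_access = 0x7,
                                       .shadowed_translation = entries };
        struct example_sp direct = { .direct = true, .role_access = 0x5,
                                     .shadowed_translation = NULL };

        /* The packed entry stays the size of a u64, as the BUILD_BUG_ON added earlier in the series checks. */
        assert(sizeof(entries[0]) == sizeof(uint64_t));
        assert(example_get_access(&indirect, 1) == 0x3);
        assert(example_get_access(&direct, 0) == 0x5);

        return 0;
}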
@@ -6063,12 +6110,266 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); } +static int prepare_to_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, + struct kvm_mmu_page **spp, + bool *flush, + bool *dropped_lock) +{ + int r = 0; + + *dropped_lock = false; + + if (kvm_mmu_available_pages(kvm) <= KVM_MIN_FREE_MMU_PAGES) + return -ENOSPC; + + if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) + goto drop_lock; + + *spp = kvm_mmu_alloc_direct_sp_for_split(true); + if (r) + goto drop_lock; + + return 0; + +drop_lock: + if (*flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + + *flush = false; + *dropped_lock = true; + + write_unlock(&kvm->mmu_lock); + cond_resched(); + *spp = kvm_mmu_alloc_direct_sp_for_split(false); + if (!*spp) + r = -ENOMEM; + write_lock(&kvm->mmu_lock); + + return r; +} + +static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, + struct kvm_mmu_page **spp) +{ + struct kvm_mmu_page *split_sp; + union kvm_mmu_page_role role; + unsigned int access; + gfn_t gfn; + + gfn = sptep_to_gfn(huge_sptep); + access = sptep_to_access(huge_sptep); + + /* + * Huge page splitting always uses direct shadow pages since we are + * directly mapping the huge page GFN region with smaller pages. + */ + role = kvm_mmu_child_role(huge_sptep, true, access); + split_sp = kvm_mmu_find_direct_sp(kvm, gfn, role); + + /* + * Opt not to split if the lower-level SP already exists. This requires + * more complex handling as the SP may be already partially filled in + * and may need extra pte_list_desc structs to update parent_ptes. + */ + if (split_sp) + return NULL; + + swap(split_sp, *spp); + init_shadow_page(kvm, split_sp, slot, gfn, role); + trace_kvm_mmu_get_page(split_sp, true); + + return split_sp; +} + +static int kvm_mmu_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, struct kvm_mmu_page **spp, + bool *flush) + +{ + struct kvm_mmu_page *split_sp; + u64 huge_spte, split_spte; + int split_level, index; + unsigned int access; + u64 *split_sptep; + gfn_t split_gfn; + + split_sp = kvm_mmu_get_sp_for_split(kvm, slot, huge_sptep, spp); + if (!split_sp) + return -EOPNOTSUPP; + + /* + * Since we did not allocate pte_list_desc_structs for the split, we + * cannot add a new parent SPTE to parent_ptes. This should never happen + * in practice though since this is a fresh SP. + * + * Note, this makes it safe to pass NULL to __link_shadow_page() below. + */ + if (WARN_ON_ONCE(split_sp->parent_ptes.val)) + return -EINVAL; + + huge_spte = READ_ONCE(*huge_sptep); + + split_level = split_sp->role.level; + access = split_sp->role.access; + + for (index = 0; index < PT64_ENT_PER_PAGE; index++) { + split_sptep = &split_sp->spt[index]; + split_gfn = kvm_mmu_page_get_gfn(split_sp, index); + + BUG_ON(is_shadow_present_pte(*split_sptep)); + + /* + * Since we did not allocate pte_list_desc structs for the + * split, we can't add a new SPTE that maps this GFN. + * Skipping this SPTE means we're only partially mapping the + * huge page, which means we'll need to flush TLBs before + * dropping the MMU lock. + * + * Note, this make it safe to pass NULL to __rmap_add() below. 
+ */ + if (gfn_to_rmap(split_gfn, split_level, slot)->val) { + *flush = true; + continue; + } + + split_spte = make_huge_page_split_spte( + huge_spte, split_level + 1, index, access); + + mmu_spte_set(split_sptep, split_spte); + __rmap_add(kvm, NULL, slot, split_sptep, split_gfn, access); + } + + /* + * Replace the huge spte with a pointer to the populated lower level + * page table. Since we are making this change without a TLB flush vCPUs + * will see a mix of the split mappings and the original huge mapping, + * depending on what's currently in their TLB. This is fine from a + * correctness standpoint since the translation will either be identical + * or non-present. To account for non-present mappings, the TLB will be + * flushed prior to dropping the MMU lock. + */ + __drop_large_spte(kvm, huge_sptep, false); + __link_shadow_page(NULL, huge_sptep, split_sp); + + return 0; +} + +static bool should_split_huge_page(u64 *huge_sptep) +{ + struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep); + + if (WARN_ON_ONCE(!is_large_pte(*huge_sptep))) + return false; + + if (huge_sp->role.invalid) + return false; + + /* + * As a policy, do not split huge pages if SP on which they reside + * is unsync. Unsync means the guest is modifying the page table being + * shadowed by huge_sp, so splitting may be a waste of cycles and + * memory. + */ + if (huge_sp->unsync) + return false; + + return true; +} + +static bool rmap_try_split_huge_pages(struct kvm *kvm, + struct kvm_rmap_head *rmap_head, + const struct kvm_memory_slot *slot) +{ + struct kvm_mmu_page *sp = NULL; + struct rmap_iterator iter; + u64 *huge_sptep, spte; + bool flush = false; + bool dropped_lock; + int level; + gfn_t gfn; + int r; + +restart: + for_each_rmap_spte(rmap_head, &iter, huge_sptep) { + if (!should_split_huge_page(huge_sptep)) + continue; + + spte = *huge_sptep; + level = sptep_to_sp(huge_sptep)->role.level; + gfn = sptep_to_gfn(huge_sptep); + + r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &flush, &dropped_lock); + if (r) { + trace_kvm_mmu_split_huge_page(gfn, spte, level, r); + break; + } + + if (dropped_lock) + goto restart; + + r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp, &flush); + + trace_kvm_mmu_split_huge_page(gfn, spte, level, r); + + /* + * If splitting is successful we must restart the iterator + * because huge_sptep has just been removed from it. + */ + if (!r) + goto restart; + } + + if (sp) + kvm_mmu_free_shadow_page(sp); + + return flush; +} + +static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, + const struct kvm_memory_slot *slot, + gfn_t start, gfn_t end, + int target_level) +{ + bool flush; + int level; + + /* + * Split huge pages starting with KVM_MAX_HUGEPAGE_LEVEL and working + * down to the target level. This ensures pages are recursively split + * all the way to the target level. There's no need to split pages + * already at the target level. + * + * Note that TLB flushes must be done before dropping the MMU lock since + * rmap_try_split_huge_pages() may partially split any given huge page, + * i.e. it may effectively unmap (make non-present) a portion of the + * huge page. + */ + for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) { + flush = slot_handle_level_range(kvm, slot, + rmap_try_split_huge_pages, + level, level, start, end - 1, + true, flush); + } + + if (flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); +} + /* Must be called with the mmu_lock held in write-mode. 
*/ void kvm_mmu_try_split_huge_pages(struct kvm *kvm, const struct kvm_memory_slot *memslot, u64 start, u64 end, int target_level) { + if (kvm_memslots_have_rmaps(kvm)) + kvm_rmap_try_split_huge_pages(kvm, memslot, start, end, + target_level); + if (is_tdp_mmu_enabled(kvm)) kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level, false); @@ -6086,6 +6387,12 @@ void kvm_mmu_slot_try_split_huge_pages(struct kvm *kvm, u64 start = memslot->base_gfn; u64 end = start + memslot->npages; + if (kvm_memslots_have_rmaps(kvm)) { + write_lock(&kvm->mmu_lock); + kvm_rmap_try_split_huge_pages(kvm, memslot, start, end, target_level); + write_unlock(&kvm->mmu_lock); + } + if (is_tdp_mmu_enabled(kvm)) { read_lock(&kvm->mmu_lock); kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level, true); From patchwork Fri Mar 11 00:25:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777189 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 601DEC4332F for ; Fri, 11 Mar 2022 00:26:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345224AbiCKA1S (ORCPT ); Thu, 10 Mar 2022 19:27:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49726 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345219AbiCKA1R (ORCPT ); Thu, 10 Mar 2022 19:27:17 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6482A1A275B for ; Thu, 10 Mar 2022 16:26:07 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id e14-20020a17090a684e00b001bf09ac2385so4254064pjm.1 for ; Thu, 10 Mar 2022 16:26:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=GR+ED1Eu59W4o3nJAyG55WZVxFJ9/aqXQ0I8zOq5KWw=; b=OaktuYZQ/pvIKWnmmZaGYUKQVLGED+tckPX6EQ/Hge7NnOHi+0dW6e3BHGTVrJ5LSe zYj8P0q2JukbFWsyULC/88FupX6jIBpbFfharxjl39QDAs9lanUyXICTSDK8cj3pDZOV WEWEMonrACc5oHh0JE70KDRZRDGPSgYrCPwzJsWJm+sQ1qVmyolq8NXGcq0IrnRYyig7 nNTneUiLNKNAjzW7VEGLrEYY1dnIvKM+6ysBXeasZcBA2ng++NQvJ9sjXPuw854de0H0 XQPTUr5op2/4j+/a0wymfSm9VmfA19cou++hpRyPqqizmjhN5tVB2h0HSqgqbNMKiYrm VL0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=GR+ED1Eu59W4o3nJAyG55WZVxFJ9/aqXQ0I8zOq5KWw=; b=yJHxZMqL1iWeuAONko/Pd33cdZwohPMNhvYBdowDo2OVUOcqahoGxYSpBfbO7Eag0h skswaZi/k9kqXQnwNwgI/UawrwfZgyeBrXVeZehBiq+uQqHs27oO4/oQatypW35YMyCr Ka+mBBmR5LaKQUHfl2bHclYjqNO/9OEBIBwtckv989gS9Ue37j2Ye0Y6X09aZo9PJgnB ZbephaDDHBpTu/yUuXLblpry3U3NQL3RR9A3E8fyiqXbvkJXkLMbFbVlND8KeOLYrPVA FeTRenu4aKXRcFeiVNH8jpztnv6V87sAfOgioFj43X4+LwxTR3avnn8JPpe0ZedZqDP8 x0ow== X-Gm-Message-State: AOAM5337WT7nsfLaSDxxe+T6wLEDH2EQKv46o/W3cnjbqIh5bv3r+oHI g2glAVqmMxW9g9FGj67ruAC8AnIQD3JX9w== X-Google-Smtp-Source: ABdhPJzHaoRrrUKazsHH7xWX2bJBJdhTwzNXL7WwckHPvdlf9FodGi4JOJFbVyjzmnYSAL0OQxSui6or0+UizQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:7e4d:b0:14f:e295:5a41 with SMTP id 
a13-20020a1709027e4d00b0014fe2955a41mr7628012pln.27.1646958366783; Thu, 10 Mar 2022 16:26:06 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:23 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-22-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 21/26] KVM: Allow for different capacities in kvm_mmu_memory_cache structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at declaration time rather than being fixed for all declarations. This will be used in a follow-up commit to declare a cache in x86 with a capacity of 512+ objects without having to increase the capacity of all caches in KVM. This change requires that each cache now specify its capacity at runtime, since the cache struct itself no longer has a fixed capacity known at compile time. To protect against someone accidentally defining a kvm_mmu_memory_cache struct directly (without the extra storage), this commit includes a WARN_ON() in kvm_mmu_topup_memory_cache(). This change, unfortunately, adds some grottiness to kvm_phys_addr_ioremap() in arm64, which uses a function-local (i.e. stack-allocated) kvm_mmu_memory_cache struct. Since C does not allow anonymous structs in functions, the new wrapper struct that contains kvm_mmu_memory_cache and the objects pointer array must be named, which means dealing with an outer and inner struct. The outer struct can't be dropped since then there would be no guarantee the kvm_mmu_memory_cache struct and objects array would be laid out consecutively on the stack. No functional change intended.
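The declaration trick described above can be sketched in a few lines of standalone C. This is a simplified illustration, not the kernel's code: the names are invented, and nesting a struct that ends in a flexible array member inside another struct relies on the same GCC/Clang behaviour the kernel itself depends on.

#include <stdio.h>

struct example_cache {
        int nobjs;
        int capacity;
        void *objects[];        /* backing storage is expected to follow */
};

/* Analogue of __DEFINE_KVM_MMU_MEMORY_CACHE(): wrap the cache and a backing array. */
#define DEFINE_EXAMPLE_CACHE(_name, _capacity)          \
        struct {                                        \
                struct example_cache _name;             \
                void *_name##_objects[_capacity];       \
        }

int main(void)
{
        /* A stack-allocated declaration with a caller-chosen capacity. */
        DEFINE_EXAMPLE_CACHE(cache, 40) page_cache = {
                .cache = { .capacity = 40 },
        };
        struct example_cache *mc = &page_cache.cache;

        /* Code that only sees struct example_cache * works unchanged... */
        mc->objects[mc->nobjs++] = &page_cache; /* ...and lands in cache_objects[] */

        printf("capacity=%d nobjs=%d\n", mc->capacity, mc->nobjs);
        return 0;
}

The outer wrapper is what guarantees the object array sits directly after the struct, which is why the stack-allocated arm64 case in the diff below has to name it and reach through .cache.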
Signed-off-by: David Matlack --- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/mmu.c | 13 +++++++++---- arch/mips/include/asm/kvm_host.h | 2 +- arch/mips/kvm/mips.c | 2 ++ arch/riscv/include/asm/kvm_host.h | 2 +- arch/riscv/kvm/vcpu.c | 1 + arch/x86/include/asm/kvm_host.h | 8 ++++---- arch/x86/kvm/mmu/mmu.c | 9 +++++++++ include/linux/kvm_types.h | 19 +++++++++++++++++-- virt/kvm/kvm_main.c | 10 +++++++++- 11 files changed, 55 insertions(+), 14 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5bc01e62c08a..1369415290dd 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -357,7 +357,7 @@ struct kvm_vcpu_arch { bool pause; /* Cache some mmu pages needed inside spinlock regions */ - struct kvm_mmu_memory_cache mmu_page_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_cache); /* Target CPU and feature flags */ int target; diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index ecc5958e27fe..5e38385be0ef 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -319,6 +319,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.target = -1; bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO; /* Set up the timer */ diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index bc2aba953299..940089ba65ad 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -765,7 +765,12 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, { phys_addr_t addr; int ret = 0; - struct kvm_mmu_memory_cache cache = { 0, __GFP_ZERO, NULL, }; + DEFINE_KVM_MMU_MEMORY_CACHE(cache) page_cache = { + .cache = { + .gfp_zero = __GFP_ZERO, + .capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE, + }, + }; struct kvm_pgtable *pgt = kvm->arch.mmu.pgt; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_R | @@ -778,14 +783,14 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, guest_ipa &= PAGE_MASK; for (addr = guest_ipa; addr < guest_ipa + size; addr += PAGE_SIZE) { - ret = kvm_mmu_topup_memory_cache(&cache, + ret = kvm_mmu_topup_memory_cache(&page_cache.cache, kvm_mmu_cache_min_pages(kvm)); if (ret) break; spin_lock(&kvm->mmu_lock); ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot, - &cache); + &page_cache.cache); spin_unlock(&kvm->mmu_lock); if (ret) break; @@ -793,7 +798,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, pa += PAGE_SIZE; } - kvm_mmu_free_memory_cache(&cache); + kvm_mmu_free_memory_cache(&page_cache.cache); return ret; } diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 717716cc51c5..935511d7fc3a 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -347,7 +347,7 @@ struct kvm_vcpu_arch { unsigned long pending_exceptions_clr; /* Cache some mmu pages needed inside spinlock regions */ - struct kvm_mmu_memory_cache mmu_page_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_cache); /* vcpu's vzguestid is different on each host cpu in an smp system */ u32 vzguestid[NR_CPUS]; diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index a25e0b73ee70..45c7179144dc 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -387,6 +387,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) if (err) goto out_free_gebase; + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; + 
return 0; out_free_gebase: diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index 99ef6a120617..5bd4902ebda3 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -186,7 +186,7 @@ struct kvm_vcpu_arch { struct kvm_sbi_context sbi_context; /* Cache pages needed to program page tables with spinlock held */ - struct kvm_mmu_memory_cache mmu_page_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_cache); /* VCPU power-off state */ bool power_off; diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 624166004e36..6a5f5aa45bac 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -94,6 +94,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) /* Mark this VCPU never ran */ vcpu->arch.ran_atleast_once = false; + vcpu->arch.mmu_page_cache.capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO; /* Setup ISA features available to VCPU */ diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0f5a36772bdc..544dde11963b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -692,10 +692,10 @@ struct kvm_vcpu_arch { */ struct kvm_mmu *walk_mmu; - struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; - struct kvm_mmu_memory_cache mmu_shadow_page_cache; - struct kvm_mmu_memory_cache mmu_shadowed_translation_cache; - struct kvm_mmu_memory_cache mmu_page_header_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_pte_list_desc_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_shadow_page_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_shadowed_translation_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_header_cache); /* * QEMU userspace and the guest each have their own FPU state. diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index dd56b5b9624f..24e7e053e05b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5817,12 +5817,21 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) { int ret; + vcpu->arch.mmu_pte_list_desc_cache.capacity = + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache; vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO; + vcpu->arch.mmu_page_header_cache.capacity = + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; vcpu->arch.mmu_page_header_cache.kmem_cache = mmu_page_header_cache; vcpu->arch.mmu_page_header_cache.gfp_zero = __GFP_ZERO; + vcpu->arch.mmu_shadowed_translation_cache.capacity = + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; + + vcpu->arch.mmu_shadow_page_cache.capacity = + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO; vcpu->arch.mmu = &vcpu->arch.root_mmu; diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index ac1ebb37a0ff..579cf39986ec 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -83,14 +83,29 @@ struct gfn_to_pfn_cache { * MMU flows is problematic, as is triggering reclaim, I/O, etc... while * holding MMU locks. Note, these caches act more like prefetch buffers than * classical caches, i.e. objects are not returned to the cache on being freed. + * + * The storage for the cache object pointers is laid out after the struct, to + * allow different declarations to choose different capacities. The capacity + * field defines the number of object pointers available after the struct. 
*/ struct kvm_mmu_memory_cache { int nobjs; + int capacity; gfp_t gfp_zero; struct kmem_cache *kmem_cache; - void *objects[KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE]; + void *objects[]; }; -#endif + +#define __DEFINE_KVM_MMU_MEMORY_CACHE(_name, _capacity) \ + struct { \ + struct kvm_mmu_memory_cache _name; \ + void *_name##_objects[_capacity]; \ + } + +#define DEFINE_KVM_MMU_MEMORY_CACHE(_name) \ + __DEFINE_KVM_MMU_MEMORY_CACHE(_name, KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE) + +#endif /* KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE */ #define HALT_POLL_HIST_COUNT 32 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9581a24c3d17..1d849ba9529f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -371,9 +371,17 @@ int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) { void *obj; + /* + * The capacity fieldmust be initialized since the storage for the + * objects pointer array is laid out after the kvm_mmu_memory_cache + * struct and not known at compile time. + */ + if (WARN_ON(mc->capacity == 0)) + return -EINVAL; + if (mc->nobjs >= min) return 0; - while (mc->nobjs < ARRAY_SIZE(mc->objects)) { + while (mc->nobjs < mc->capacity) { obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT); if (!obj) return mc->nobjs >= min ? 0 : -ENOMEM; From patchwork Fri Mar 11 00:25:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777190 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A29C7C43217 for ; Fri, 11 Mar 2022 00:26:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345226AbiCKA1T (ORCPT ); Thu, 10 Mar 2022 19:27:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345220AbiCKA1S (ORCPT ); Thu, 10 Mar 2022 19:27:18 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B8CC1A1C79 for ; Thu, 10 Mar 2022 16:26:09 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id t12-20020a17090a448c00b001b9cbac9c43so4177817pjg.2 for ; Thu, 10 Mar 2022 16:26:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=Xxlsl5gMjKmF9SCaiAavzpq6LPoScWsJHzRFSDHyFso=; b=cMEUZuguiW6pmIHu9LU3eoo1FDRnKfP07X6qrVJz+feSqP00edJurvJ6fZVCoGCvaJ 5svA21y657lW0vnnvcB8DrJj0Wif0psmeyY8i7OHoLX//aZidsK2Kwctp17DBZkm6zks bNqQTe153ijee74TbEx9ka/WwLu8NnHNmhJ3rzu0hVNv/X65IxxIaEZsKTVXiz3k0pq/ PXKVNpYhcW10Mo0GvokzU2pQFbCQGCYnnW5ERLDk3Bglg0cMtbS8FePSpbkv3zkxP4/+ HypkhlHZXg2YdqZQ+p9wkqmsfUp8Sg4/Y0BzNMC+ciXTAMJksdrkfNpcGpiny8WuXLFB rq7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=Xxlsl5gMjKmF9SCaiAavzpq6LPoScWsJHzRFSDHyFso=; b=OAvVXujisD+f01oC/3GL9sq3inICEHsZ+PUtjayj+XPxQDF42BFO5ieji9EeVzWahZ zIsVLhQd4YQ73tFOhHbvjmwUmy8K/hDlwR/sbxV88WZLgk1m5pNT5qmCBaxN0BsSTYxy V6zaCWN2gZG3Euhqmmo5PPYNst21hoI2hplBhaSLB6jMkX88OpUKNBGz4hZnM8U5uVGE c85frCd2M5/rY1BCVIDdYcHmWHjwjwHZ4PacDZdoAJArJj/B1uN0EjbnIcwPwBvPq4Th 
6osb6UQhpOC497L/ulkF/RUVD1rjj7rBqTMxRLTMhLj1TbQ6O87qCjdMUFI49mYUPgwo Ox6A== X-Gm-Message-State: AOAM533q1JKD46geDXiyeQpB5gE94pHJiFMDgJTLopNlG2CHcYET5L2j Fga66MNw6ixMnVFH60NvjqGPsZEfR2vMmg== X-Google-Smtp-Source: ABdhPJywweZjzvrfhg7epKsXyYUluoBQw2WEtYQ9qKQxDvH8uHrVK9h/VGrnh31BcpoNycSVrImYh5be0PAtHQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:17ca:b0:1bf:6188:cc00 with SMTP id me10-20020a17090b17ca00b001bf6188cc00mr19161970pjb.2.1646958368553; Thu, 10 Mar 2022 16:26:08 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:24 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-23-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 22/26] KVM: Allow GFP flags to be passed when topping up MMU caches From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This will be used in a subsequent commit to top-up MMU caches under the MMU lock with GFP_NOWAIT as part of eager page splitting. No functional change intended. Reviewed-by: Ben Gardon Signed-off-by: David Matlack --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 9 +++++++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 252ee4a61b58..7d3a1f28beb2 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1335,6 +1335,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm); #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); +int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min, gfp_t gfp); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc); void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 1d849ba9529f..7861874af1c8 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -367,7 +367,7 @@ static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache *mc, return (void *)__get_free_page(gfp_flags); } -int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) +int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min, gfp_t gfp) { void *obj; @@ -382,7 +382,7 @@ int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) if (mc->nobjs >= min) return 0; while (mc->nobjs < mc->capacity) { - obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT); + obj = mmu_memory_cache_alloc_obj(mc, gfp); if (!obj) return mc->nobjs >= min ? 
0 : -ENOMEM; mc->objects[mc->nobjs++] = obj; @@ -390,6 +390,11 @@ int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) return 0; } +int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) +{ + return __kvm_mmu_topup_memory_cache(mc, min, GFP_KERNEL_ACCOUNT); +} + int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc) { return mc->nobjs; From patchwork Fri Mar 11 00:25:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777191 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39074C433FE for ; Fri, 11 Mar 2022 00:26:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345233AbiCKA1X (ORCPT ); Thu, 10 Mar 2022 19:27:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239797AbiCKA1U (ORCPT ); Thu, 10 Mar 2022 19:27:20 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E6431A3604 for ; Thu, 10 Mar 2022 16:26:10 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 1-20020a630c41000000b00378d9d6bd91so3777037pgm.17 for ; Thu, 10 Mar 2022 16:26:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=8sZhoWELpqEc2BsW/puOFOcuzB1aijCrHDgV+XGrT0A=; b=ZtJErbNbgiA2+RbqXKqBnxBr2Nq6m57xmo9PXp2yC127qBftR6dEDE/OtgMbXvzCe5 R3oHIyZtmvDsk5xP3N7/iIB8cGyt7UM8f0gNZlLM8zCj7BaKqgCdxPmM22RnEROZc6Jo GCFdagqCDehiE8hlKuTZVkv9tHCzqD3B+rrlOZXtI7E8Hc5in7dOI/qmFbGOIo09ztU1 V0po6MjiTapLs0qLGGWWVa9CvqbOzj55JzFU3GourgAtoEkN4GItGo5azol+kzBfzXqX sGDmznitJy/e3LnDvHMW04qf1JamRU4KvweC4AR38auxJrW7KOXkXENKvW4kGdA5Fyt2 1HHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=8sZhoWELpqEc2BsW/puOFOcuzB1aijCrHDgV+XGrT0A=; b=GwipfaI6X38BuqfJIBOVLBUKZEzh8XBRjAwZE1pI9CrUNf1z3y3hBPKSF3S8a/T9HQ xXulp4VlQP3pzzWUPjB1dERA3RdLLlybtNFxo1cLon9akjaQSyqtfttBBPZcGeyEaJN4 qaxEe6uLIxLjXXD5EbA5E3RSe4P72i3F1vCrzblqerqrVI/4U8PUUeAcm2Dg3kam3Wo+ 2ULTuWWNpiWdPtLV85WZpqvQGmgz5yENaeOfFZiyhFWorFGiRgz9xbdhV7ZyREeV+YAa 6LvoF/1PIX8zNH72TexyFIXRR1JXfZB8weIacH2u5ww3caOp0WgxhN6oyizHcSkuUtWr 3HuA== X-Gm-Message-State: AOAM530qewW5EP2l3x0q6XdSBBCrReA8Y6IQY1XIuuh8+vGbH/Fe4RNx 8RRZ7g23uSoXxwe9Hcxy0rtEY2CtqFRtaQ== X-Google-Smtp-Source: ABdhPJxp6KEAF9MQBj/R4AILw7oMFKjGojv2nE0u8xYk0LZhOg4P/GeokJiuxdu0z4Y/37+7y05xtg0eLGJuXg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:9007:b0:14f:3680:66d1 with SMTP id a7-20020a170902900700b0014f368066d1mr7904332plp.91.1646958370128; Thu, 10 Mar 2022 16:26:10 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:25 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-24-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 23/26] KVM: x86/mmu: Fully split huge 
pages that require extra pte_list_desc structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When splitting a huge page we need to add all of the lower level SPTEs to the memslot rmap. The current implementation of eager page splitting bails if adding an SPTE would require allocating an extra pte_list_desc struct. Fix this limitation by allocating enough pte_list_desc structs before splitting the huge page. This eliminates the need for TLB flushing under the MMU lock because the huge page is always entirely split (no subregion of the huge page is unmapped). Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 10 +++ arch/x86/kvm/mmu/mmu.c | 131 ++++++++++++++++++-------------- 2 files changed, 85 insertions(+), 56 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 544dde11963b..00a5c0bcc2eb 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1240,6 +1240,16 @@ struct kvm_arch { hpa_t hv_root_tdp; spinlock_t hv_root_tdp_lock; #endif + + /* + * Memory cache used to allocate pte_list_desc structs while splitting + * huge pages. In the worst case, to split one huge page we need 512 + * pte_list_desc structs to add each new lower level leaf sptep to the + * memslot rmap. + */ +#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 512 + __DEFINE_KVM_MMU_MEMORY_CACHE(huge_page_split_desc_cache, + HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY); }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 24e7e053e05b..95b8e2ef562f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1765,6 +1765,16 @@ struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direc return sp; } +static inline gfp_t gfp_flags_for_split(bool locked) +{ + /* + * If under the MMU lock, use GFP_NOWAIT to avoid direct reclaim (which + * is slow) and to avoid making any filesystem callbacks (which can end + * up invoking KVM MMU notifiers, resulting in a deadlock). + */ + return (locked ? GFP_NOWAIT : GFP_KERNEL) | __GFP_ACCOUNT; +} + /* * Allocate a new shadow page, potentially while holding the MMU lock. * @@ -1772,17 +1782,11 @@ struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm_vcpu *vcpu, bool direc * being mapped directly with a lower level page table. Thus there's no need to * allocate the shadowed_translation array. */ -struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked) +static struct kvm_mmu_page *__kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp) { struct kvm_mmu_page *sp; - gfp_t gfp; - /* - * If under the MMU lock, use GFP_NOWAIT to avoid direct reclaim (which - * is slow) and to avoid making any filesystem callbacks (which can end - * up invoking KVM MMU notifiers, resulting in a deadlock). - */ - gfp = (locked ? 
GFP_NOWAIT : GFP_KERNEL) | __GFP_ACCOUNT | __GFP_ZERO; + gfp |= __GFP_ZERO; sp = kmem_cache_alloc(mmu_page_header_cache, gfp); if (!sp) @@ -1799,6 +1803,13 @@ struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked) return sp; } +struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(bool locked) +{ + gfp_t gfp = gfp_flags_for_split(locked); + + return __kvm_mmu_alloc_direct_sp_for_split(gfp); +} + static void mark_unsync(u64 *spte); static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp) { @@ -5989,6 +6000,11 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_write = kvm_mmu_pte_write; node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); + + kvm->arch.huge_page_split_desc_cache.capacity = + HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY; + kvm->arch.huge_page_split_desc_cache.kmem_cache = pte_list_desc_cache; + kvm->arch.huge_page_split_desc_cache.gfp_zero = __GFP_ZERO; } void kvm_mmu_uninit_vm(struct kvm *kvm) @@ -6119,11 +6135,43 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); } +static int topup_huge_page_split_desc_cache(struct kvm *kvm, gfp_t gfp) +{ + /* + * We may need up to HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY descriptors + * to split any given huge page. We could more accurately calculate how + * many we actually need by inspecting all the rmaps and check which + * will need new descriptors, but that's not worth the extra cost or + * code complexity. + */ + return __kvm_mmu_topup_memory_cache( + &kvm->arch.huge_page_split_desc_cache, + HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY, + gfp); +} + +static int alloc_memory_for_split(struct kvm *kvm, struct kvm_mmu_page **spp, + bool locked) +{ + gfp_t gfp = gfp_flags_for_split(locked); + int r; + + r = topup_huge_page_split_desc_cache(kvm, gfp); + if (r) + return r; + + if (!*spp) { + *spp = __kvm_mmu_alloc_direct_sp_for_split(gfp); + r = *spp ? 0 : -ENOMEM; + } + + return r; +} + static int prepare_to_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, u64 *huge_sptep, struct kvm_mmu_page **spp, - bool *flush, bool *dropped_lock) { int r = 0; @@ -6136,24 +6184,18 @@ static int prepare_to_split_huge_page(struct kvm *kvm, if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) goto drop_lock; - *spp = kvm_mmu_alloc_direct_sp_for_split(true); + r = alloc_memory_for_split(kvm, spp, true); if (r) goto drop_lock; return 0; drop_lock: - if (*flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); - - *flush = false; *dropped_lock = true; write_unlock(&kvm->mmu_lock); cond_resched(); - *spp = kvm_mmu_alloc_direct_sp_for_split(false); - if (!*spp) - r = -ENOMEM; + r = alloc_memory_for_split(kvm, spp, false); write_lock(&kvm->mmu_lock); return r; @@ -6196,10 +6238,10 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, static int kvm_mmu_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, - u64 *huge_sptep, struct kvm_mmu_page **spp, - bool *flush) + u64 *huge_sptep, struct kvm_mmu_page **spp) { + struct kvm_mmu_memory_cache *cache; struct kvm_mmu_page *split_sp; u64 huge_spte, split_spte; int split_level, index; @@ -6212,9 +6254,9 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, return -EOPNOTSUPP; /* - * Since we did not allocate pte_list_desc_structs for the split, we - * cannot add a new parent SPTE to parent_ptes. This should never happen - * in practice though since this is a fresh SP. 
+ * We did not allocate an extra pte_list_desc struct to add huge_sptep + * to split_sp->parent_ptes. An extra pte_list_desc struct should never + * be necessary in practice though since split_sp is brand new. * * Note, this makes it safe to pass NULL to __link_shadow_page() below. */ @@ -6225,6 +6267,7 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, split_level = split_sp->role.level; access = split_sp->role.access; + cache = &kvm->arch.huge_page_split_desc_cache; for (index = 0; index < PT64_ENT_PER_PAGE; index++) { split_sptep = &split_sp->spt[index]; @@ -6232,25 +6275,11 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, BUG_ON(is_shadow_present_pte(*split_sptep)); - /* - * Since we did not allocate pte_list_desc structs for the - * split, we can't add a new SPTE that maps this GFN. - * Skipping this SPTE means we're only partially mapping the - * huge page, which means we'll need to flush TLBs before - * dropping the MMU lock. - * - * Note, this make it safe to pass NULL to __rmap_add() below. - */ - if (gfn_to_rmap(split_gfn, split_level, slot)->val) { - *flush = true; - continue; - } - split_spte = make_huge_page_split_spte( huge_spte, split_level + 1, index, access); mmu_spte_set(split_sptep, split_spte); - __rmap_add(kvm, NULL, slot, split_sptep, split_gfn, access); + __rmap_add(kvm, cache, slot, split_sptep, split_gfn, access); } /* @@ -6258,9 +6287,7 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, * page table. Since we are making this change without a TLB flush vCPUs * will see a mix of the split mappings and the original huge mapping, * depending on what's currently in their TLB. This is fine from a - * correctness standpoint since the translation will either be identical - * or non-present. To account for non-present mappings, the TLB will be - * flushed prior to dropping the MMU lock. + * correctness standpoint since the translation will be identical. */ __drop_large_spte(kvm, huge_sptep, false); __link_shadow_page(NULL, huge_sptep, split_sp); @@ -6297,7 +6324,6 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, struct kvm_mmu_page *sp = NULL; struct rmap_iterator iter; u64 *huge_sptep, spte; - bool flush = false; bool dropped_lock; int level; gfn_t gfn; @@ -6312,7 +6338,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, level = sptep_to_sp(huge_sptep)->role.level; gfn = sptep_to_gfn(huge_sptep); - r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &flush, &dropped_lock); + r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &dropped_lock); if (r) { trace_kvm_mmu_split_huge_page(gfn, spte, level, r); break; @@ -6321,7 +6347,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (dropped_lock) goto restart; - r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp, &flush); + r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); trace_kvm_mmu_split_huge_page(gfn, spte, level, r); @@ -6336,7 +6362,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (sp) kvm_mmu_free_shadow_page(sp); - return flush; + return false; } static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, @@ -6344,7 +6370,6 @@ static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, gfn_t start, gfn_t end, int target_level) { - bool flush; int level; /* @@ -6352,21 +6377,15 @@ static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, * down to the target level. This ensures pages are recursively split * all the way to the target level. There's no need to split pages * already at the target level. 
- * - * Note that TLB flushes must be done before dropping the MMU lock since - * rmap_try_split_huge_pages() may partially split any given huge page, - * i.e. it may effectively unmap (make non-present) a portion of the - * huge page. */ for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) { - flush = slot_handle_level_range(kvm, slot, - rmap_try_split_huge_pages, - level, level, start, end - 1, - true, flush); + slot_handle_level_range(kvm, slot, + rmap_try_split_huge_pages, + level, level, start, end - 1, + true, false); } - if (flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + kvm_mmu_free_memory_cache(&kvm->arch.huge_page_split_desc_cache); } /* Must be called with the mmu_lock held in write-mode. */ From patchwork Fri Mar 11 00:25:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777192 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE288C43217 for ; Fri, 11 Mar 2022 00:26:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345236AbiCKA1Y (ORCPT ); Thu, 10 Mar 2022 19:27:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244821AbiCKA1X (ORCPT ); Thu, 10 Mar 2022 19:27:23 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E65951A1C6F for ; Thu, 10 Mar 2022 16:26:12 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id l6-20020a170903120600b0014f43ba55f3so3556696plh.11 for ; Thu, 10 Mar 2022 16:26:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=+N8CyLDUI8dOs6IazZEgayEpuN+Pk+7Zaf+44sPVu9M=; b=fkEfUCit5tqWvdDlI8v5ZSGJVY9seUe6cbf2whXXIZ5bwxCgzP0IqvRSWDUoHnUm0C W3XxUPolVkh0JOGVkZrBPLTgYMhmCmL3KGSLazm6lEXEdrUp9eA1e+Zxh3obazelnzBh dZq3DeFT1PL7lfJmBODoMSeUte5Fn4XfcSn9I3jPcoaKyZaVho/0yEOenJX9x6bY1C/o FG6pjk4smgKYWPjXUN1uX4ZD02cguLFEYY/MSYP3fEl3xF5comBq5sUT/YInXbYrQJMv JiMc8dEQa7jZfyLvcEOZDz+Kp9eA+3xn4embgt7o+mxQ1YtN7sOkwpiLsRTj0LtlLVPv dRCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=+N8CyLDUI8dOs6IazZEgayEpuN+Pk+7Zaf+44sPVu9M=; b=xIJu+59rFXFVurkllwy4FcyfOBuvUcZ3aY8S02a2VOFw4Po89idniNkcFkZylW9GZT iSv7H0l/Zxbynph3Rf1PBE73O5fPUX8voyiBqCsFSSzsnM/X5arjN7e6rYc6VuTL8yE5 PMgRFpf4NHOvJ5wCZmQwrba0o9L0VtFzAbtdP/gQP88i+cQYpernZShYjqzY0i0fsD+D SDy6OqTYeSDGLplxVrffG+aNqp0LBDN1Sd3Gz40evxwl054Tw+vS2jfrA33D6I7eUcF4 LBa/U2uC2Z8sILHS3l1gBnYkhzBEXvgemf4/4GiK9ntQqk7bG4U25plw+zrl3OAijKPz fKnA== X-Gm-Message-State: AOAM530oVpW11RLCG1rbCbKetgGYcwXh80FCAV1ps+YRsKziP95DfWI7 G+1pwAoRB6PdFyzReoOkn1JXi0ZqpgJOPA== X-Google-Smtp-Source: ABdhPJzEurWCjkJGrlFQG7dN5CM2sI5brKP2s1Ygv17frKolnNrY7kNpKXDlaqlxOBpcKkeByFPsKnJeUISXWA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:4d81:b0:1bf:8ce4:4f51 with SMTP id oj1-20020a17090b4d8100b001bf8ce44f51mr322717pjb.0.1646958371853; Thu, 10 Mar 2022 16:26:11 -0800 (PST) 
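The worst-case arithmetic behind HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY above is worth spelling out. The standalone sketch below is illustrative only and is not kernel code; it assumes x86-64's 512-entry page tables (PT64_ENT_PER_PAGE in the kernel) and mirrors the reasoning in the cache's comment, including the extra descriptor that the next patch in the series reserves for parent_ptes.

/*
 * Illustrative only -- not kernel code. Worst-case pte_list_desc usage
 * when splitting a single huge page, assuming x86-64's 512-entry page
 * tables.
 */
#include <stdio.h>

#define SPTES_PER_PAGE_TABLE 512

int main(void)
{
	/* Each new leaf SPTE may need its own pte_list_desc in the memslot rmap. */
	unsigned int rmap_descs = SPTES_PER_PAGE_TABLE;

	/* One more descriptor to extend parent_ptes of the new page table. */
	unsigned int parent_desc = 1;

	printf("pte_list_desc structs per split: up to %u (+%u for parent_ptes)\n",
	       rmap_descs, parent_desc);
	return 0;
}
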
Date: Fri, 11 Mar 2022 00:25:26 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-25-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 24/26] KVM: x86/mmu: Split huge pages aliased by multiple SPTEs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The existing huge page splitting code bails if it encounters a huge page that is aliased by another SPTE that has already been split (either due to NX huge pages or eager page splitting). Extend the huge page splitting code to also handle such aliases. The thing we have to be careful about is dealing with what's already in the lower level page table. If eager page splitting was the only operation that split huge pages, this would be fine. However huge pages can also be split by NX huge pages. This means the lower level page table may only be partially filled in and may point to even lower level page tables that are partially filled in. We can fill in the rest of the page table but dealing with the lower level page tables would be too complex. To handle this we flush TLBs after dropping the huge SPTE whenever we are about to install a lower level page table that was partially filled in (*). We can skip the TLB flush if the lower level page table was empty (no aliasing) or identical to what we were already going to populate it with (aliased huge page that was just eagerly split). (*) This TLB flush could probably be delayed until we're about to drop the MMU lock, which would also let us batch flushes for multiple splits. However such scenarios should be rare in practice (a huge page must be aliased in multiple SPTEs and have been split for NX Huge Pages in only some of them). Flushing immediately is simpler to plumb and also reduces the chances of tripping over a CPU bug (e.g. see iTLB multi-hit). Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 5 ++- arch/x86/kvm/mmu/mmu.c | 73 +++++++++++++++------------------ 2 files changed, 36 insertions(+), 42 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 00a5c0bcc2eb..275d00528805 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1245,9 +1245,10 @@ struct kvm_arch { * Memory cache used to allocate pte_list_desc structs while splitting * huge pages. In the worst case, to split one huge page we need 512 * pte_list_desc structs to add each new lower level leaf sptep to the - * memslot rmap. + * memslot rmap plus 1 to extend the parent_ptes rmap of the new lower + * level page table. 
*/ -#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 512 +#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 513 __DEFINE_KVM_MMU_MEMORY_CACHE(huge_page_split_desc_cache, HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY); }; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 95b8e2ef562f..68785b422a08 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6208,6 +6208,7 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, { struct kvm_mmu_page *split_sp; union kvm_mmu_page_role role; + bool created = false; unsigned int access; gfn_t gfn; @@ -6220,25 +6221,21 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, */ role = kvm_mmu_child_role(huge_sptep, true, access); split_sp = kvm_mmu_find_direct_sp(kvm, gfn, role); - - /* - * Opt not to split if the lower-level SP already exists. This requires - * more complex handling as the SP may be already partially filled in - * and may need extra pte_list_desc structs to update parent_ptes. - */ if (split_sp) - return NULL; + goto out; + created = true; swap(split_sp, *spp); init_shadow_page(kvm, split_sp, slot, gfn, role); - trace_kvm_mmu_get_page(split_sp, true); +out: + trace_kvm_mmu_get_page(split_sp, created); return split_sp; } -static int kvm_mmu_split_huge_page(struct kvm *kvm, - const struct kvm_memory_slot *slot, - u64 *huge_sptep, struct kvm_mmu_page **spp) +static void kvm_mmu_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, struct kvm_mmu_page **spp) { struct kvm_mmu_memory_cache *cache; @@ -6246,22 +6243,11 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, u64 huge_spte, split_spte; int split_level, index; unsigned int access; + bool flush = false; u64 *split_sptep; gfn_t split_gfn; split_sp = kvm_mmu_get_sp_for_split(kvm, slot, huge_sptep, spp); - if (!split_sp) - return -EOPNOTSUPP; - - /* - * We did not allocate an extra pte_list_desc struct to add huge_sptep - * to split_sp->parent_ptes. An extra pte_list_desc struct should never - * be necessary in practice though since split_sp is brand new. - * - * Note, this makes it safe to pass NULL to __link_shadow_page() below. - */ - if (WARN_ON_ONCE(split_sp->parent_ptes.val)) - return -EINVAL; huge_spte = READ_ONCE(*huge_sptep); @@ -6273,7 +6259,20 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, split_sptep = &split_sp->spt[index]; split_gfn = kvm_mmu_page_get_gfn(split_sp, index); - BUG_ON(is_shadow_present_pte(*split_sptep)); + /* + * split_sp may have populated page table entries if this huge + * page is aliased in multiple shadow page table entries. We + * know the existing SP will be mapping the same GFN->PFN + * translation since this is a direct SP. However, the SPTE may + * point to an even lower level page table that may only be + * partially filled in (e.g. for NX huge pages). In other words, + * we may be unmapping a portion of the huge page, which + * requires a TLB flush. + */ + if (is_shadow_present_pte(*split_sptep)) { + flush |= !is_last_spte(*split_sptep, split_level); + continue; + } split_spte = make_huge_page_split_spte( huge_spte, split_level + 1, index, access); @@ -6284,15 +6283,12 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, /* * Replace the huge spte with a pointer to the populated lower level - * page table. Since we are making this change without a TLB flush vCPUs - * will see a mix of the split mappings and the original huge mapping, - * depending on what's currently in their TLB. 
This is fine from a - * correctness standpoint since the translation will be identical. + * page table. If the lower-level page table indentically maps the huge + * page, there's no need for a TLB flush. Otherwise, flush TLBs after + * dropping the huge page and before installing the shadow page table. */ - __drop_large_spte(kvm, huge_sptep, false); - __link_shadow_page(NULL, huge_sptep, split_sp); - - return 0; + __drop_large_spte(kvm, huge_sptep, flush); + __link_shadow_page(cache, huge_sptep, split_sp); } static bool should_split_huge_page(u64 *huge_sptep) @@ -6347,16 +6343,13 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (dropped_lock) goto restart; - r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); - - trace_kvm_mmu_split_huge_page(gfn, spte, level, r); - /* - * If splitting is successful we must restart the iterator - * because huge_sptep has just been removed from it. + * After splitting we must restart the iterator because + * huge_sptep has just been removed from it. */ - if (!r) - goto restart; + kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); + trace_kvm_mmu_split_huge_page(gfn, spte, level, 0); + goto restart; } if (sp) From patchwork Fri Mar 11 00:25:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777193 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4AEBCC4332F for ; Fri, 11 Mar 2022 00:26:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238759AbiCKA10 (ORCPT ); Thu, 10 Mar 2022 19:27:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345238AbiCKA1Z (ORCPT ); Thu, 10 Mar 2022 19:27:25 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 47BAE1A271D for ; Thu, 10 Mar 2022 16:26:14 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id lp2-20020a17090b4a8200b001bc449ecbceso6758130pjb.8 for ; Thu, 10 Mar 2022 16:26:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=//m+/4Erm4fn+rEULp2+N5ovNu4tAor1JU0kGbOkSFg=; b=DMOErCjEnqpcFOKygIlsSFTfCmw0C9Pj2tK0NYpYCgAcTI1iGWeaL6FRV1b2ZlfKX8 fHYQBmv/QbDdZc+efkMwruYG+KF4wDiQDaVEfR1WJ8mcUSF76wTcN32p8eRd3os+gbwe HmiPcVbSsqLkKI3pTJ0GlBqlqq9PA36fDqQPPRemYGpGS2VlQfbNt1iRDlvaB0K2kTnc XnxAxv59lS8YUiLJmKwZc6thsP7Jo1gj39W7PaGOaRSlJdD4vh92yg8lUDbEdT2KiQZ8 tIhYCc+gAjCABk9ZJUASFdr26iqSp1JW7do4XHUHG5vHE4zhj5bN+IYHo6EeO/tLY3TV 7glw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=//m+/4Erm4fn+rEULp2+N5ovNu4tAor1JU0kGbOkSFg=; b=ZkRnsLOT+aXaYtJkdLC71u2YulaQagGEqrlaFftlJtEinxuOjRf51D/0CQmCdKU99t 2xvjWH8U2vBGInL2HlJP2279Aws7jG10AeMKpDc1I9n36cNcsjiQak+d/NF2LsXHEAK4 jJ1KMmNCHnWpluq6vWTx41q+YIi/e0aEv/H0xG0r3XNKivM/nY4UO7RIS5AxSa+qG12Y UA0RuSy2OsG0TdKflkAXDub+4p8N+/v4jPv8k4k28X5sZ9XcNbj5KGa6XLpJQqWHqSQI qUPNLY3mgOfmVIw45ElxO14BQMbKvz2Jujc81t5FE5AW/zrJwSIDIF9U/Eky0Oeew1jM fCUw== X-Gm-Message-State: 
AOAM533BhSJ350iyy0R2Xbzdq6PsDpbs7/ZR+YunMapJGOgajgNenCoY SeC1eyzng2BxUEnpX06e7/+oUDkYGb/58g== X-Google-Smtp-Source: ABdhPJzhIkawyFt0f4OEQ5xaTKMM8Gn4o1vOGnaEYzvNSaPlKUnsrfCuFW5EPlURrJUZ7m52xqfE2I313J0YiA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90a:a510:b0:1bc:5887:d957 with SMTP id a16-20020a17090aa51000b001bc5887d957mr18553716pjq.38.1646958373723; Thu, 10 Mar 2022 16:26:13 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:27 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-26-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 25/26] KVM: x86/mmu: Drop NULL pte_list_desc_cache fallback From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Now that the eager page splitting code no longer passes in NULL cache pointers we can get rid of the debug WARN_ON() and allocation fallback. While here, also drop the helper function mmu_alloc_pte_list_desc() as it no longer serves any purpose. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 68785b422a08..d2ffebb659e0 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -725,16 +725,6 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } -static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_mmu_memory_cache *cache) -{ - static const gfp_t gfp_nocache = GFP_ATOMIC | __GFP_ACCOUNT | __GFP_ZERO; - - if (WARN_ON_ONCE(!cache)) - return kmem_cache_alloc(pte_list_desc_cache, gfp_nocache); - - return kvm_mmu_memory_cache_alloc(cache); -} - static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc) { kmem_cache_free(pte_list_desc_cache, pte_list_desc); @@ -914,7 +904,7 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, rmap_head->val = (unsigned long)spte; } else if (!(rmap_head->val & 1)) { rmap_printk("%p %llx 1->many\n", spte, *spte); - desc = mmu_alloc_pte_list_desc(cache); + desc = kvm_mmu_memory_cache_alloc(cache); desc->sptes[0] = (u64 *)rmap_head->val; desc->sptes[1] = spte; desc->spte_count = 2; @@ -926,7 +916,7 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, while (desc->spte_count == PTE_LIST_EXT) { count += PTE_LIST_EXT; if (!desc->more) { - desc->more = mmu_alloc_pte_list_desc(cache); + desc->more = kvm_mmu_memory_cache_alloc(cache); desc = desc->more; desc->spte_count = 0; break; From patchwork Fri Mar 11 00:25:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12777194 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A85DC433FE for ; Fri, 11 Mar 2022 00:26:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345246AbiCKA12 (ORCPT ); Thu, 10 Mar 2022 19:27:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244821AbiCKA10 (ORCPT ); Thu, 10 Mar 2022 19:27:26 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA8261A2724 for ; Thu, 10 Mar 2022 16:26:15 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id 67-20020a621446000000b004f739ef52f1so4207932pfu.0 for ; Thu, 10 Mar 2022 16:26:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=C/6kqtgpfTZHxI5VWOjwoNRlEkRWSpZD6XXLim3vJOQ=; b=Bs0Nt6VPOSGpoyrAZTXHtLQrqfTnnHH0UC9KK9A0vLWpFqtNCmQ76QkKCaoiU7vlME Q0bTazFlFcbJfqcpCjLO9tF13nAtANqrfJ4/weyLzjhfeuO+6qrSrtOmMS7TIYUEvmoj 3jWCRL/2/6pXWfByCKv+9rhnbh7rbU0XaID/OlkGYUR1Z4dQD7f/Cw2G6GMi4iWL1Uec P+O9K7voxCptiW2ggUfO7UL3R6VkHHKXqZbW0O4DhLGBotN31nqItGDXOibW5YjGwp66 nMmg3cEoPd8Q4NTBLdQxpNMCsJJ+tFv4B4qQ12ZQNrnMrYMMdODRkSRN3LkqTjAlbCWy wjpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=C/6kqtgpfTZHxI5VWOjwoNRlEkRWSpZD6XXLim3vJOQ=; b=bzYcTFUaGF/jxB2JKm9fsrrMRaRcvXpk+F/qG/kk9T4jRY1f3jtBFsIgagny7pRfzg Nh60c/SU4pgVCxHuVPvz10YS2WN3JL6pgXPL0xy/xyBxab6AiqvS9bNtcuPBglKFne1V G3ouLZC0alFMiY0N97q1S4NBOIyPWGtke+glKlhC6e6PtLw7ot+VfN29M56jF9oz7T3j qwJ0dLtac0xIOsFB/39jIrxaWDOpwvCEzXRDYCCMhr8ggIYEe2Rd33Z+VbT4mdb1meIE jYgIWvuJkplQQRyowyDpLuPIm/ya3hEYGfPJS1wmTsg7DC7lMq7f08ygpiLZZrsxvW1o QlnA== X-Gm-Message-State: AOAM532CyXjKEvzL8sxcybFCEzbIKUoo7iKXfBD1OMzKG0kSjARxfB2q svggseYuEJhrBiwm9nv45WR6GTP+8wJhag== X-Google-Smtp-Source: ABdhPJxWLEVXby9lk2SDFDSYjtNrAmaCeZHIJ0XIStrXC375fHHUc9G7571Y4uEmDYxZrmSHCBqVZt0xqwpthw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90a:12c8:b0:1bf:6484:3e27 with SMTP id b8-20020a17090a12c800b001bf64843e27mr7807022pjg.209.1646958375426; Thu, 10 Mar 2022 16:26:15 -0800 (PST) Date: Fri, 11 Mar 2022 00:25:28 +0000 In-Reply-To: <20220311002528.2230172-1-dmatlack@google.com> Message-Id: <20220311002528.2230172-27-dmatlack@google.com> Mime-Version: 1.0 References: <20220311002528.2230172-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.1.723.g4982287a31-goog Subject: [PATCH v2 26/26] KVM: selftests: Map x86_64 guest virtual memory with huge pages From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Andrew Jones , Ben Gardon , Peter Xu , maciej.szmigiero@oracle.com, "moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)" , "open list:KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)" , Peter Feiner , David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Override virt_map() in x86_64 selftests to use the largest page size possible when mapping guest virtual memory. 
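To make the mapping strategy concrete, here is a minimal standalone sketch of the page-size selection that the override relies on; the enum and the 1UL << (page_size * 9 + 12) arithmetic mirror the selftest headers shown in the diff below, while pick_page_size() and the example program are only illustrative.

/*
 * Standalone illustration of "use the largest page size that fits".
 * The enum mirrors the selftest's x86_page_size; the rest is a sketch.
 */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

enum x86_page_size { X86_PAGE_SIZE_4K = 0, X86_PAGE_SIZE_2M, X86_PAGE_SIZE_1G };

static size_t page_size_bytes(enum x86_page_size ps)
{
	return 1UL << (ps * 9 + 12);	/* 4KiB, 2MiB, 1GiB */
}

static enum x86_page_size pick_page_size(uint64_t vaddr, uint64_t paddr, size_t size)
{
	enum x86_page_size ps;

	/* Prefer 1GiB, then 2MiB, based on the alignment of all three inputs. */
	for (ps = X86_PAGE_SIZE_1G; ps > X86_PAGE_SIZE_4K; ps--) {
		size_t stride = page_size_bytes(ps);

		if (!(vaddr % stride) && !(paddr % stride) && !(size % stride))
			return ps;
	}
	return X86_PAGE_SIZE_4K;
}

int main(void)
{
	/* A 1GiB-aligned, 1GiB-sized region is mapped with a 1GiB stride. */
	printf("chosen stride: %zu bytes\n",
	       page_size_bytes(pick_page_size(1UL << 30, 1UL << 30, 1UL << 30)));
	return 0;
}

Whenever the virtual address, physical address, and size are all 1GiB- or 2MiB-aligned, the region is mapped with that larger stride instead of 4KiB pages.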
This enables testing eager page splitting with shadow paging (e.g. kvm_intel.ept=N), as it allows KVM to shadow guest memory with huge pages. Signed-off-by: David Matlack --- .../selftests/kvm/include/x86_64/processor.h | 6 ++++ tools/testing/selftests/kvm/lib/kvm_util.c | 4 +-- .../selftests/kvm/lib/x86_64/processor.c | 31 +++++++++++++++++++ 3 files changed, 39 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 37db341d4cc5..efb228d2fbf7 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -470,6 +470,12 @@ enum x86_page_size { X86_PAGE_SIZE_2M, X86_PAGE_SIZE_1G, }; + +static inline size_t page_size_bytes(enum x86_page_size page_size) +{ + return 1UL << (page_size * 9 + 12); +} + void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, enum x86_page_size page_size); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 1665a220abcb..60198587236d 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1432,8 +1432,8 @@ vm_vaddr_t vm_vaddr_alloc_page(struct kvm_vm *vm) * Within the VM given by @vm, creates a virtual translation for * @npages starting at @vaddr to the page range starting at @paddr. */ -void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, - unsigned int npages) +void __weak virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, + unsigned int npages) { size_t page_size = vm->page_size; size_t size = npages * page_size; diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 9f000dfb5594..7df84292d5de 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -282,6 +282,37 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr) __virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K); } +void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, unsigned int npages) +{ + size_t size = (size_t) npages * vm->page_size; + size_t vend = vaddr + size; + enum x86_page_size page_size; + size_t stride; + + TEST_ASSERT(vaddr + size > vaddr, "Vaddr overflow"); + TEST_ASSERT(paddr + size > paddr, "Paddr overflow"); + + /* + * Map the region with all 1G pages if possible, falling back to all + * 2M pages, and finally all 4K pages. This could be improved to use + * a mix of page sizes so that more of the region is mapped with large + * pages. + */ + for (page_size = X86_PAGE_SIZE_1G; page_size >= X86_PAGE_SIZE_4K; page_size--) { + stride = page_size_bytes(page_size); + + if (!(vaddr % stride) && !(paddr % stride) && !(size % stride)) + break; + } + + TEST_ASSERT(page_size >= X86_PAGE_SIZE_4K, + "Cannot map unaligned region: vaddr 0x%lx paddr 0x%lx npages 0x%x\n", + vaddr, paddr, npages); + + for (; vaddr < vend; vaddr += stride, paddr += stride) + __virt_pg_map(vm, vaddr, paddr, page_size); +} + static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid, uint64_t vaddr) {