From patchwork Tue Feb 2 18:57:07 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062661
Date: Tue, 2 Feb 2021 10:57:07 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-2-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 01/28] KVM: x86/mmu: change TDP MMU yield function returns
 to match cond_resched
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

Currently the TDP MMU yield / cond_resched functions either return
nothing or return true if the TLBs were not flushed. These are confusing
semantics, especially when making control flow decisions in calling
functions.

To clean things up, change both functions to have the same return value
semantics as cond_resched: true if the thread yielded, false if it did
not. If the function yielded in the _flush_ version, then the TLBs will
have been flushed.

Reviewed-by: Peter Feiner
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 39 ++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 2ef8615f9dba..e9f9ff81a38e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -413,8 +413,15 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
 		 _mmu->shadow_root_level, _start, _end)
 
 /*
- * Flush the TLB if the process should drop kvm->mmu_lock.
- * Return whether the caller still needs to flush the tlb.
+ * Flush the TLB and yield if the MMU lock is contended or this thread needs to
+ * return control to the scheduler.
+ *
+ * If this function yields, it will also reset the tdp_iter's walk over the
+ * paging structure and the calling function should allow the iterator to
+ * continue its traversal from the paging structure root.
+ *
+ * Return true if this function yielded, the TLBs were flushed, and the
+ * iterator's traversal was reset. Return false if a yield was not needed.
  */
 static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
 {
@@ -422,18 +429,32 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it
 		kvm_flush_remote_tlbs(kvm);
 		cond_resched_lock(&kvm->mmu_lock);
 		tdp_iter_refresh_walk(iter);
-		return false;
-	} else {
 		return true;
 	}
+
+	return false;
 }
 
-static void tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+/*
+ * Yield if the MMU lock is contended or this thread needs to return control
+ * to the scheduler.
+ *
+ * If this function yields, it will also reset the tdp_iter's walk over the
+ * paging structure and the calling function should allow the iterator to
+ * continue its traversal from the paging structure root.
+ *
+ * Return true if this function yielded and the iterator's traversal was reset.
+ * Return false if a yield was not needed.
+ */
+static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
 {
 	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
 		cond_resched_lock(&kvm->mmu_lock);
 		tdp_iter_refresh_walk(iter);
+		return true;
 	}
+
+	return false;
 }
 
 /*
@@ -469,10 +490,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		tdp_mmu_set_spte(kvm, &iter, 0);
 
-		if (can_yield)
-			flush_needed = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
-		else
-			flush_needed = true;
+		flush_needed = !can_yield ||
+			       !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
 	}
 	return flush_needed;
 }
@@ -1072,7 +1091,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
 
 		tdp_mmu_set_spte(kvm, &iter, 0);
 
-		spte_set = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+		spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
 	}
 
 	if (spte_set)
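For illustration only (not part of this patch): a minimal sketch of how a
caller consumes the new cond_resched-style return values. The helpers and
iterator macros are the ones visible in the diffs in this series; the
wrapper name example_zap_loop is hypothetical.

/*
 * Illustrative sketch: zap a GFN range and track whether the caller still
 * needs to flush the TLBs. With the new semantics, a "true" return from
 * tdp_mmu_iter_flush_cond_resched() means this thread yielded and the TLBs
 * were already flushed, so no flush is pending for the SPTEs zapped so far.
 */
static bool example_zap_loop(struct kvm *kvm, struct kvm_mmu_page *root,
			     gfn_t start, gfn_t end)
{
	struct tdp_iter iter;
	bool flush_needed = false;

	tdp_root_for_each_pte(iter, root, start, end) {
		if (!is_shadow_present_pte(iter.old_spte))
			continue;

		tdp_mmu_set_spte(kvm, &iter, 0);

		/* Yielded (and flushed) => nothing left to flush right now. */
		flush_needed = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
	}

	return flush_needed;
}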
From patchwork Tue Feb 2 18:57:08 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062663
Date: Tue, 2 Feb 2021 10:57:08 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-3-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 02/28] KVM: x86/mmu: Add comment on __tdp_mmu_set_spte
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

__tdp_mmu_set_spte is a very important function in the TDP MMU which
already accepts several arguments and will take more in future commits.
To offset this complexity, add a comment to the function describing each
of the arguments.

No functional change intended.

Reviewed-by: Peter Feiner
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e9f9ff81a38e..3d8cca238eba 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -357,6 +357,22 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 			      new_spte, level);
 }
 
+/*
+ * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
+ * @kvm: kvm instance
+ * @iter: a tdp_iter instance currently on the SPTE that should be set
+ * @new_spte: The value the SPTE should be set to
+ * @record_acc_track: Notify the MM subsystem of changes to the accessed state
+ *		      of the page. Should be set unless handling an MMU
+ *		      notifier for access tracking. Leaving record_acc_track
+ *		      unset in that case prevents page accesses from being
+ *		      double counted.
+ * @record_dirty_log: Record the page as dirty in the dirty bitmap if
+ *		      appropriate for the change being made. Should be set
+ *		      unless performing certain dirty logging operations.
+ *		      Leaving record_dirty_log unset in that case prevents page
+ *		      writes from being double counted.
+ */
 static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 				      u64 new_spte, bool record_acc_track,
 				      bool record_dirty_log)

From patchwork Tue Feb 2 18:57:09 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062739
Date: Tue, 2 Feb 2021 10:57:09 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-4-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 03/28] KVM: x86/mmu: Add lockdep when setting a TDP MMU SPTE
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

Add lockdep to __tdp_mmu_set_spte to ensure that SPTEs are only modified
under the MMU lock.

No functional change intended.

Reviewed-by: Peter Feiner
Reviewed-by: Sean Christopherson
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 3d8cca238eba..b83a6a3ad29c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -381,6 +381,8 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
 	struct kvm_mmu_page *root = sptep_to_sp(root_pt);
 	int as_id = kvm_mmu_page_as_id(root);
 
+	lockdep_assert_held(&kvm->mmu_lock);
+
 	WRITE_ONCE(*iter->sptep, new_spte);
 
 	__handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,

From patchwork Tue Feb 2 18:57:10 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062669
Date: Tue, 2 Feb 2021 10:57:10 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-5-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 04/28] KVM: x86/mmu: Don't redundantly clear TDP MMU pt memory
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

The KVM MMU caches already guarantee that shadow page table memory will
be zeroed, so there is no reason to re-zero the page in the TDP MMU page
fault handler.

No functional change intended.
Reviewed-by: Peter Feiner
Reviewed-by: Sean Christopherson
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b83a6a3ad29c..3828c0e83466 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -655,7 +655,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 			sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
 			list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages);
 			child_pt = sp->spt;
-			clear_page(child_pt);
 			new_spte = make_nonleaf_spte(child_pt,
 						     !shadow_accessed_mask);

From patchwork Tue Feb 2 18:57:11 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062667
Date: Tue, 2 Feb 2021 10:57:11 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-6-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 05/28] KVM: x86/mmu: Factor out handling of removed page tables
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

Factor out the code to handle a disconnected subtree of the TDP paging
structure from the code to handle the change to an individual SPTE.
Future commits will build on this to allow asynchronous page freeing.

No functional change intended.

Reviewed-by: Peter Feiner
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---

v1 -> v2
 - Replaced "disconnected" with "removed", updated derivative comments
   and code

 arch/x86/kvm/mmu/tdp_mmu.c | 71 ++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 3828c0e83466..c3075fb568eb 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -234,6 +234,45 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
 	}
 }
 
+/**
+ * handle_removed_tdp_mmu_page - handle a pt removed from the TDP structure
+ *
+ * @kvm: kvm instance
+ * @pt: the page removed from the paging structure
+ *
+ * Given a page table that has been removed from the TDP paging structure,
+ * iterates through the page table to clear SPTEs and free child page tables.
+ */
+static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
+{
+	struct kvm_mmu_page *sp = sptep_to_sp(pt);
+	int level = sp->role.level;
+	gfn_t gfn = sp->gfn;
+	u64 old_child_spte;
+	int i;
+
+	trace_kvm_mmu_prepare_zap_page(sp);
+
+	list_del(&sp->link);
+
+	if (sp->lpage_disallowed)
+		unaccount_huge_nx_page(kvm, sp);
+
+	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
+		old_child_spte = READ_ONCE(*(pt + i));
+		WRITE_ONCE(*(pt + i), 0);
+		handle_changed_spte(kvm, kvm_mmu_page_as_id(sp),
+			gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
+			old_child_spte, 0, level - 1);
+	}
+
+	kvm_flush_remote_tlbs_with_address(kvm, gfn,
+					   KVM_PAGES_PER_HPAGE(level));
+
+	free_page((unsigned long)pt);
+	kmem_cache_free(mmu_page_header_cache, sp);
+}
+
 /**
  * handle_changed_spte - handle bookkeeping associated with an SPTE change
  * @kvm: kvm instance
@@ -254,10 +293,6 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	bool was_leaf = was_present && is_last_spte(old_spte, level);
 	bool is_leaf = is_present && is_last_spte(new_spte, level);
 	bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
-	u64 *pt;
-	struct kvm_mmu_page *sp;
-	u64 old_child_spte;
-	int i;
 
 	WARN_ON(level > PT64_ROOT_MAX_LEVEL);
 	WARN_ON(level < PG_LEVEL_4K);
@@ -321,31 +356,9 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 	 * Recursively handle child PTs if the change removed a subtree from
 	 * the paging structure.
 	 */
-	if (was_present && !was_leaf && (pfn_changed || !is_present)) {
-		pt = spte_to_child_pt(old_spte, level);
-		sp = sptep_to_sp(pt);
-
-		trace_kvm_mmu_prepare_zap_page(sp);
-
-		list_del(&sp->link);
-
-		if (sp->lpage_disallowed)
-			unaccount_huge_nx_page(kvm, sp);
-
-		for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
-			old_child_spte = READ_ONCE(*(pt + i));
-			WRITE_ONCE(*(pt + i), 0);
-			handle_changed_spte(kvm, as_id,
-				gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
-				old_child_spte, 0, level - 1);
-		}
-
-		kvm_flush_remote_tlbs_with_address(kvm, gfn,
-						   KVM_PAGES_PER_HPAGE(level));
-
-		free_page((unsigned long)pt);
-		kmem_cache_free(mmu_page_header_cache, sp);
-	}
+	if (was_present && !was_leaf && (pfn_changed || !is_present))
+		handle_removed_tdp_mmu_page(kvm,
+				spte_to_child_pt(old_spte, level));
 }
 
 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,

From patchwork Tue Feb 2 18:57:12 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062675
Date: Tue, 2 Feb 2021 10:57:12 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-7-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 06/28] locking/rwlocks: Add contention detection for rwlocks
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

rwlocks do not currently have any facility to detect contention
like spinlocks do. In order to allow users of rwlocks to better manage
latency, add contention detection for queued rwlocks.
CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/asm-generic/qrwlock.h | 24 ++++++++++++++++++------
 include/linux/rwlock.h        |  7 +++++++
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 84ce841ce735..0020d3b820a7 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -14,6 +14,7 @@
 #include
 #include
+#include
 
 /*
  * Writer states & reader shift and bias.
@@ -116,15 +117,26 @@ static inline void queued_write_unlock(struct qrwlock *lock)
 	smp_store_release(&lock->wlocked, 0);
 }
 
+/**
+ * queued_rwlock_is_contended - check if the lock is contended
+ * @lock : Pointer to queue rwlock structure
+ * Return: 1 if lock contended, 0 otherwise
+ */
+static inline int queued_rwlock_is_contended(struct qrwlock *lock)
+{
+	return arch_spin_is_locked(&lock->wait_lock);
+}
+
 /*
  * Remapping rwlock architecture specific functions to the corresponding
  * queue rwlock functions.
  */
-#define arch_read_lock(l)		queued_read_lock(l)
-#define arch_write_lock(l)		queued_write_lock(l)
-#define arch_read_trylock(l)		queued_read_trylock(l)
-#define arch_write_trylock(l)		queued_write_trylock(l)
-#define arch_read_unlock(l)		queued_read_unlock(l)
-#define arch_write_unlock(l)		queued_write_unlock(l)
+#define arch_read_lock(l)		queued_read_lock(l)
+#define arch_write_lock(l)		queued_write_lock(l)
+#define arch_read_trylock(l)		queued_read_trylock(l)
+#define arch_write_trylock(l)		queued_write_trylock(l)
+#define arch_read_unlock(l)		queued_read_unlock(l)
+#define arch_write_unlock(l)		queued_write_unlock(l)
+#define arch_rwlock_is_contended(l)	queued_rwlock_is_contended(l)
 
 #endif /* __ASM_GENERIC_QRWLOCK_H */
diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index 3dcd617e65ae..7ce9a51ae5c0 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -128,4 +128,11 @@ do {								\
 	1 : ({ local_irq_restore(flags); 0; }); \
 })
 
+#ifdef arch_rwlock_is_contended
+#define rwlock_is_contended(lock) \
+	 arch_rwlock_is_contended(&(lock)->raw_lock)
+#else
+#define rwlock_is_contended(lock)	((void)(lock), 0)
+#endif /* arch_rwlock_is_contended */
+
 #endif /* __LINUX_RWLOCK_H */
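For illustration only (not part of this patch): a sketch of how a lock
holder might poll the new interface. The function example_scan and its list
argument are hypothetical; rwlock_is_contended() is the wrapper added to
include/linux/rwlock.h above and evaluates to 0 on configurations without
contention detection.

#include <linux/list.h>
#include <linux/spinlock.h>

/* Walk a list under a read lock, stopping early if another task is waiting. */
static void example_scan(rwlock_t *lock, struct list_head *head)
{
	struct list_head *pos;

	read_lock(lock);
	list_for_each(pos, head) {
		/* ... process one element under the read lock ... */

		if (rwlock_is_contended(lock)) {
			/*
			 * Another task is queued on the lock; stop early so it
			 * can make progress. A real user would record where it
			 * stopped and resume the walk later.
			 */
			break;
		}
	}
	read_unlock(lock);
}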
From patchwork Tue Feb 2 18:57:13 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062671
Date: Tue, 2 Feb 2021 10:57:13 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-8-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 07/28] sched: Add needbreak for rwlocks
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

Contention awareness while holding a spin lock is essential for reducing
latency when long running kernel operations can hold that lock. Add the
same contention detection interface for read/write spin locks.

CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/linux/sched.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..5d1378e5a040 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1912,6 +1912,23 @@ static inline int spin_needbreak(spinlock_t *lock)
 #endif
 }
 
+/*
+ * Check if a rwlock is contended.
+ * Returns non-zero if there is another task waiting on the rwlock.
+ * Returns zero if the lock is not contended or the system / underlying
+ * rwlock implementation does not support contention detection.
+ * Technically does not depend on CONFIG_PREEMPTION, but a general need
+ * for low latency.
+ */
+static inline int rwlock_needbreak(rwlock_t *lock)
+{
+#ifdef CONFIG_PREEMPTION
+	return rwlock_is_contended(lock);
+#else
+	return 0;
+#endif
+}
+
 static __always_inline bool need_resched(void)
 {
 	return unlikely(tif_need_resched());
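For illustration only (not part of this patch): rwlock_needbreak() is meant
to be used the same way as spin_needbreak(), typically paired with
need_resched() to decide whether a long-running lock holder should drop the
lock. The helper name example_should_yield is hypothetical.

#include <linux/sched.h>
#include <linux/spinlock.h>

/* True when rescheduling is requested or another task waits on the rwlock. */
static inline bool example_should_yield(rwlock_t *lock)
{
	return need_resched() || rwlock_needbreak(lock);
}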
From patchwork Tue Feb 2 18:57:14 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062673
Date: Tue, 2 Feb 2021 10:57:14 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-9-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 08/28] sched: Add cond_resched_rwlock
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon, Ingo Molnar, Will Deacon, Peter Zijlstra,
 Davidlohr Bueso, Waiman Long

Safely rescheduling while holding a spin lock is essential for keeping
long running kernel operations running smoothly. Add the facility to
cond_resched rwlocks.

CC: Ingo Molnar
CC: Will Deacon
Acked-by: Peter Zijlstra
Acked-by: Davidlohr Bueso
Acked-by: Waiman Long
Acked-by: Paolo Bonzini
Signed-off-by: Ben Gardon
---
 include/linux/sched.h | 12 ++++++++++++
 kernel/sched/core.c   | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5d1378e5a040..3052d16da3cf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1883,12 +1883,24 @@ static inline int _cond_resched(void) { return 0; }
 })
 
 extern int __cond_resched_lock(spinlock_t *lock);
+extern int __cond_resched_rwlock_read(rwlock_t *lock);
+extern int __cond_resched_rwlock_write(rwlock_t *lock);
 
 #define cond_resched_lock(lock) ({				\
 	___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
 	__cond_resched_lock(lock);				\
 })
 
+#define cond_resched_rwlock_read(lock) ({			\
+	__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);	\
+	__cond_resched_rwlock_read(lock);			\
+})
+
+#define cond_resched_rwlock_write(lock) ({			\
+	__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);	\
+	__cond_resched_rwlock_write(lock);			\
+})
+
 static inline void cond_resched_rcu(void)
 {
 #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ff74fca39ed2..efed1bf202d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6709,6 +6709,46 @@ int __cond_resched_lock(spinlock_t *lock)
 }
 EXPORT_SYMBOL(__cond_resched_lock);
 
+int __cond_resched_rwlock_read(rwlock_t *lock)
+{
+	int resched = should_resched(PREEMPT_LOCK_OFFSET);
+	int ret = 0;
+
+	lockdep_assert_held_read(lock);
+
+	if (rwlock_needbreak(lock) || resched) {
+		read_unlock(lock);
+		if (resched)
+			preempt_schedule_common();
+		else
+			cpu_relax();
+		ret = 1;
+		read_lock(lock);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(__cond_resched_rwlock_read);
+
+int __cond_resched_rwlock_write(rwlock_t *lock)
+{
+	int resched = should_resched(PREEMPT_LOCK_OFFSET);
+	int ret = 0;
+
+	lockdep_assert_held_write(lock);
+
+	if (rwlock_needbreak(lock) || resched) {
+		write_unlock(lock);
+		if (resched)
+			preempt_schedule_common();
+		else
+			cpu_relax();
+		ret = 1;
+		write_lock(lock);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(__cond_resched_rwlock_write);
+
 /**
  * yield - yield the current processor to other threads.
 *
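For illustration only (not part of this patch): a sketch of a long-running
walk that uses the new helper while holding a rwlock for read. The function
example_long_walk and its item count are hypothetical; the helper behaves
like cond_resched_lock(), returning nonzero if it dropped and re-acquired
the lock.

#include <linux/sched.h>
#include <linux/spinlock.h>

static void example_long_walk(rwlock_t *lock, unsigned long nr_items)
{
	unsigned long i;

	read_lock(lock);
	for (i = 0; i < nr_items; i++) {
		/* ... process item i under the read lock ... */

		if (cond_resched_rwlock_read(lock)) {
			/*
			 * The lock was dropped and re-acquired; any state
			 * derived under the lock must be revalidated here.
			 */
		}
	}
	read_unlock(lock);
}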
From patchwork Tue Feb 2 18:57:15 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062737
Date: Tue, 2 Feb 2021 10:57:15 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-10-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 09/28] KVM: x86/mmu: Fix braces in kvm_recover_nx_lpages
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

No functional change intended.

Fixes: 29cf0f5007a2 ("kvm: x86/mmu: NX largepage recovery for TDP MMU")
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d16481aa29d..60ff6837655a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6005,10 +6005,10 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 				      struct kvm_mmu_page,
 				      lpage_disallowed_link);
 		WARN_ON_ONCE(!sp->lpage_disallowed);
-		if (sp->tdp_mmu_page)
+		if (sp->tdp_mmu_page) {
 			kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn,
 				sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level));
-		else {
+		} else {
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
 			WARN_ON_ONCE(sp->lpage_disallowed);
 		}

From patchwork Tue Feb 2 18:57:16 2021
X-Patchwork-Submitter: Ben Gardon
X-Patchwork-Id: 12062735
Date: Tue, 2 Feb 2021 10:57:16 -0800
In-Reply-To: <20210202185734.1680553-1-bgardon@google.com>
Message-Id: <20210202185734.1680553-11-bgardon@google.com>
References: <20210202185734.1680553-1-bgardon@google.com>
Subject: [PATCH v2 10/28] KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
 Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
 Xiao Guangrong, Ben Gardon

There is a bug in the TDP MMU function that zaps SPTEs which could be
replaced with a larger mapping: the bug prevents the function from doing
anything. Fix this by correctly zapping the last level SPTEs.

Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Ben Gardon
---
 arch/x86/kvm/mmu/tdp_mmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c3075fb568eb..e3066d08c1dc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1098,8 +1098,8 @@ bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot)
 }
 
 /*
- * Clear non-leaf entries (and free associated page tables) which could
- * be replaced by large mappings, for GFNs within the slot.
+ * Clear leaf entries which could be replaced by large mappings, for
+ * GFNs within the slot.
*/ static void zap_collapsible_spte_range(struct kvm *kvm, struct kvm_mmu_page *root, @@ -1111,7 +1111,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, tdp_root_for_each_pte(iter, root, start, end) { if (!is_shadow_present_pte(iter.old_spte) || - is_last_spte(iter.old_spte, iter.level)) + !is_last_spte(iter.old_spte, iter.level)) continue; pfn = spte_to_pfn(iter.old_spte); From patchwork Tue Feb 2 18:57:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5EE5C433E9 for ; Tue, 2 Feb 2021 19:18:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6820064DD6 for ; Tue, 2 Feb 2021 19:18:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239554AbhBBTRy (ORCPT ); Tue, 2 Feb 2021 14:17:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239288AbhBBS74 (ORCPT ); Tue, 2 Feb 2021 13:59:56 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD2B4C06121C for ; Tue, 2 Feb 2021 10:57:58 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id f127so24629397ybf.12 for ; Tue, 02 Feb 2021 10:57:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=dElFoNeHiYfE+Clh2h/Ld01Wkmf8wMNyWJL/CRUJm2o=; b=nr6Agelw/nIeikKIswusjI6QHczTI7BP2L8ss6Ng/iwcx1l7Gvyqs/+0iJNeDaHI6b pppHxuVpNNpdmW0oAfDoruE5z9LaAoaN/5G0N7tYkeUVnzuCVdxNvGv9RuxfIZGpijNF Vg//mNSI6BSbh6kwBhFy8c13xNfd9F0NQqQhpPvTDOHKPbXV8qzxBNrf02EjStjIpfNb bjuVOHEolnnlgmAxJY7x9PXhcPt1Lk2fWucV+x2Q9UpYOo9rAV9uVnFMDXQ0N+Sti2S1 0YxfQYxp8tUbNXbXqFEXzY1to/bJ3ugE4CCWUP3HGB1Ny9XhDKeMJEBmrmzRXJzGtbMY w9xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=dElFoNeHiYfE+Clh2h/Ld01Wkmf8wMNyWJL/CRUJm2o=; b=j6gEb2dSaigCeHeUlrJftj4r3+/xYk3hWFgc0lldJQk0Dc9sXz0EmlAQ8e3tXtqgPQ N89hbdtfm5OJr1kinB56WhUbX8IFvQCsCvwYgFMlF/8yqHsH6SopceCFU/88dcfAvAIl varqpSEb6uIMBfJFRXhJPoZAGwrdprL0VzHNXTa8mlSmvImzwksfiHtRbosFNb8ITtpP kC4Z07LiHl+eik+Rwj8DfyEvgzo+t45De7ny+pmgexTK9aFoL2dIK07ZWW4SrhcNjtvy n85MxBgh8RODgr4irKY+hmiYtMahj2KJRooKK75gO+nvaVbAKzCxuecj4D/gkHqdSvgE R/gA== X-Gm-Message-State: AOAM531s+Bxa2+1v4hb3ZdgeQyv3lPj+jtfsEbirR90GxFAXZjuqmYnE VXDwuMiBBXOxF85gnh18/sMHX2FPM7sH X-Google-Smtp-Source: ABdhPJy3lGkJJpTPbhJWSyiIyOghQNskwzb0YjuAuPVyrowqWyUiNgZ/nWZEXqmNn+hZuu+XFZUHt0fBpw1w Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a25:fc2:: with SMTP 
id 185mr36753132ybp.385.1612292278050; Tue, 02 Feb 2021 10:57:58 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:17 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-12-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 11/28] KVM: x86/mmu: Merge flush and non-flush tdp_mmu_iter_cond_resched From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The flushing and non-flushing variants of tdp_mmu_iter_cond_resched have almost identical implementations. Merge the two functions and add a flush parameter. Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 42 ++++++++++++-------------------------- 1 file changed, 13 insertions(+), 29 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index e3066d08c1dc..8f7b120597f3 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -443,33 +443,13 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, for_each_tdp_pte(_iter, __va(_mmu->root_hpa), \ _mmu->shadow_root_level, _start, _end) -/* - * Flush the TLB and yield if the MMU lock is contended or this thread needs to - * return control to the scheduler. - * - * If this function yields, it will also reset the tdp_iter's walk over the - * paging structure and the calling function should allow the iterator to - * continue its traversal from the paging structure root. - * - * Return true if this function yielded, the TLBs were flushed, and the - * iterator's traversal was reset. Return false if a yield was not needed. - */ -static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter) -{ - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { - kvm_flush_remote_tlbs(kvm); - cond_resched_lock(&kvm->mmu_lock); - tdp_iter_refresh_walk(iter); - return true; - } - - return false; -} - /* * Yield if the MMU lock is contended or this thread needs to return control * to the scheduler. * + * If this function should yield and flush is set, it will perform a remote + * TLB flush before yielding. + * * If this function yields, it will also reset the tdp_iter's walk over the * paging structure and the calling function should allow the iterator to * continue its traversal from the paging structure root. @@ -477,9 +457,13 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it * Return true if this function yielded and the iterator's traversal was reset. * Return false if a yield was not needed. 
*/ -static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter) +static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, + struct tdp_iter *iter, bool flush) { if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (flush) + kvm_flush_remote_tlbs(kvm); + cond_resched_lock(&kvm->mmu_lock); tdp_iter_refresh_walk(iter); return true; @@ -522,7 +506,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, 0); flush_needed = !can_yield || - !tdp_mmu_iter_flush_cond_resched(kvm, &iter); + !tdp_mmu_iter_cond_resched(kvm, &iter, true); } return flush_needed; } @@ -894,7 +878,7 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -953,7 +937,7 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -1069,7 +1053,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, new_spte); spte_set = true; - tdp_mmu_iter_cond_resched(kvm, &iter); + tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; @@ -1121,7 +1105,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, tdp_mmu_set_spte(kvm, &iter, 0); - spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter); + spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true); } if (spte_set) From patchwork Tue Feb 2 18:57:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062731 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 378C7C433E0 for ; Tue, 2 Feb 2021 19:18:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F21EB64E45 for ; Tue, 2 Feb 2021 19:18:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239528AbhBBTRw (ORCPT ); Tue, 2 Feb 2021 14:17:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57690 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239291AbhBBTAE (ORCPT ); Tue, 2 Feb 2021 14:00:04 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6ECB5C06121E for ; Tue, 2 Feb 2021 10:58:00 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id b125so17222619ybg.10 for ; Tue, 02 Feb 2021 10:58:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=r9zf4JapqsUbKkhg3MdDCwfnvpgINJBQ+t3m5xFJi3A=; 
b=BthiZkDKRBnfpRaHuUlcedzUCwp0gA+v6ZySt9zENYUU17VUhTMuRiinMwXBUIHTjd y3QZQ1WBGtOmWvL6Aj0QltfBQTI76nX0dBd2Ix+RKTL+xEvstYKw9QMZl40BFs/kZk+5 UE33gnArZmtCj9VVelChvI5iAbLvWWsrlBrXBXaaVFv6aF9HE1KvnJ2TIr/fscZO+sz7 ttyPCQv1jd2vZ3wNHcrjRvYb2R6nYRXAfR40/dw6asH+JqVU1+hQTciR1Tismk0qSMEX qUtOjcVm3Y6B2eLG8uXGBRHU7WhJdvvmfGfpUXEhgpI0PZVTjl6deS3zXgV1EsVaVBv5 0xzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=r9zf4JapqsUbKkhg3MdDCwfnvpgINJBQ+t3m5xFJi3A=; b=ULAJmUPBIUfqzSFbFH/ZCeqMA2tH1vKniLTQadSM21XpxjcNGGEnczkdwCBXjJ2DV3 kK/UwVxu0Tkie0jpOv73NuCv6tnb49Drkpkj28WMxMu45kVNMwe2JM8Z7GLCFoREX0Ua DkEkBJtu/jkX2xQfzhh4qLUn/OTteqLBJLJ61Ir0uO9ey0GcX/gNcSzovEDm0+/qOYsV lgCClJomKiIwVvuEwLaXpen8hHgUvv+AyS0Si1Hp/UhiMeHXSUXMNb6aZb8YSh1gXEcq gOfsYXxhDI5agvoJSgjj3o2zipT/RKOyy8pLWadLCpmr/xIbwXjFGeMHlDdbKt1Z3vEO AdaQ== X-Gm-Message-State: AOAM531lSg76ZtVzqDddPST+vjB/yL36Ve7kVstSGXSi0s0F3GVmSdtE dlji+5mhX009BbkgvXJaOVivWRWVKkXV X-Google-Smtp-Source: ABdhPJwpXiHO1n0Vvt+COE83RSFbSRP15c1huQEKtMuIA/E3dznJXAwBdC2hdQ7te4BbpK/5Bw26Ax7AsXT4 Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a25:be13:: with SMTP id h19mr33995134ybk.53.1612292279737; Tue, 02 Feb 2021 10:57:59 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:18 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-13-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 12/28] KVM: x86/mmu: Rename goal_gfn to next_last_level_gfn From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The goal_gfn field in tdp_iter can be misleading as it implies that it is the iterator's final goal. It is really a taget for the lowest gfn mapped by the leaf level SPTE the iterator will traverse towards. Change the field's name to be more precise. Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_iter.c | 20 ++++++++++---------- arch/x86/kvm/mmu/tdp_iter.h | 4 ++-- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 87b7e16911db..9917c55b7d24 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -22,21 +22,21 @@ static gfn_t round_gfn_for_level(gfn_t gfn, int level) /* * Sets a TDP iterator to walk a pre-order traversal of the paging structure - * rooted at root_pt, starting with the walk to translate goal_gfn. + * rooted at root_pt, starting with the walk to translate next_last_level_gfn. 
*/ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, - int min_level, gfn_t goal_gfn) + int min_level, gfn_t next_last_level_gfn) { WARN_ON(root_level < 1); WARN_ON(root_level > PT64_ROOT_MAX_LEVEL); - iter->goal_gfn = goal_gfn; + iter->next_last_level_gfn = next_last_level_gfn; iter->root_level = root_level; iter->min_level = min_level; iter->level = root_level; iter->pt_path[iter->level - 1] = root_pt; - iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level); + iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level); tdp_iter_refresh_sptep(iter); iter->valid = true; @@ -82,7 +82,7 @@ static bool try_step_down(struct tdp_iter *iter) iter->level--; iter->pt_path[iter->level - 1] = child_pt; - iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level); + iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level); tdp_iter_refresh_sptep(iter); return true; @@ -106,7 +106,7 @@ static bool try_step_side(struct tdp_iter *iter) return false; iter->gfn += KVM_PAGES_PER_HPAGE(iter->level); - iter->goal_gfn = iter->gfn; + iter->next_last_level_gfn = iter->gfn; iter->sptep++; iter->old_spte = READ_ONCE(*iter->sptep); @@ -166,13 +166,13 @@ void tdp_iter_next(struct tdp_iter *iter) */ void tdp_iter_refresh_walk(struct tdp_iter *iter) { - gfn_t goal_gfn = iter->goal_gfn; + gfn_t next_last_level_gfn = iter->next_last_level_gfn; - if (iter->gfn > goal_gfn) - goal_gfn = iter->gfn; + if (iter->gfn > next_last_level_gfn) + next_last_level_gfn = iter->gfn; tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], - iter->root_level, iter->min_level, goal_gfn); + iter->root_level, iter->min_level, next_last_level_gfn); } u64 *tdp_iter_root_pt(struct tdp_iter *iter) diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index 47170d0dc98e..b2dd269c631f 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -15,7 +15,7 @@ struct tdp_iter { * The iterator will traverse the paging structure towards the mapping * for this GFN. 
*/ - gfn_t goal_gfn; + gfn_t next_last_level_gfn; /* Pointers to the page tables traversed to reach the current SPTE */ u64 *pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ @@ -52,7 +52,7 @@ struct tdp_iter { u64 *spte_to_child_pt(u64 pte, int level); void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, - int min_level, gfn_t goal_gfn); + int min_level, gfn_t next_last_level_gfn); void tdp_iter_next(struct tdp_iter *iter); void tdp_iter_refresh_walk(struct tdp_iter *iter); u64 *tdp_iter_root_pt(struct tdp_iter *iter); From patchwork Tue Feb 2 18:57:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE58FC433E0 for ; Tue, 2 Feb 2021 19:03:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 98AD164EC1 for ; Tue, 2 Feb 2021 19:03:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239298AbhBBTCw (ORCPT ); Tue, 2 Feb 2021 14:02:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239313AbhBBTAb (ORCPT ); Tue, 2 Feb 2021 14:00:31 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F32DBC061220 for ; Tue, 2 Feb 2021 10:58:01 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id k27so14143933pfg.8 for ; Tue, 02 Feb 2021 10:58:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=n8fhJr5V3N84UjdeiH+rWqHQKMtsaVAMI036LQjLnUY=; b=ijbEgJRV/Zt77PMExttUhXYB/VcaH+EEAn54a2as91X1nzovPhXC4+4pmqFCAWflWb pI2/lYkeJkQSIGNMdCk4dGY2z6gDWCJEMwz/Fh8ZJjWtNorwfkiEyCZzBMzS7LwBaW1p KjY7REOyWcdk26GftMco8oNfW1HLvNQzBzQJWsQpiXZ8EOdx4D/ArP5Sc3A0QGtL0xlu i9bnJJ+RkraYJmA+MvP2bBJHVQ1J/YvyWRoTYGoqManS3Vqbl4k1gtQ5yA7iUQjtBNBq 1ajlNRBNFmnIrhAQeuURXGtrBPKdKjwYOyRJIsyowJV9u93s2rC5CEvQuCxdOQgEuigs 71Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=n8fhJr5V3N84UjdeiH+rWqHQKMtsaVAMI036LQjLnUY=; b=NQJHJ4F3tCF74sOodxAEUIHALcl0S4VZgFyYMoBwKBN5TfAH8m0mQ7QMRyKB8VaTz3 YRNVk+FHDl54SEkUepkynOLiOxxkeY9F8yi2uCANg3LaaPMolZqoAEKiADjGyTEULGjx 1VpkLxCcxuZu09DXgakpijaHOMkqTxzUUIs/eAM8nVa7JtUi8D7IhbYNQpGsCuZfgzNg em9wuXkpelC7U+CWt72T5ryMsAnkh4O3B9KtsQ1JNQi2/2iGpOVY3cRh7YS0UP6/58zu H/TK1fC7Pa53h32k3xOs7FeepcfaaoByQ0uKRV7DWN0vgEclGEh1tp8I10AylBI6izJW JKFw== X-Gm-Message-State: AOAM532MDXVXHbLuytda6F49WwcyLe3fBTJHl/QgEq7vcvPXdW+UMngX ZdItH1CXd8sUpK5sjJyzepCyWhKMO6cy X-Google-Smtp-Source: ABdhPJwBU9rxvm3bj6kk46P+sxuaQA2kcB389w5ls0j2KxNqxATQ24vyhoBKehbvq+dVwSBGmSGgGM2Iozgp 
Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a65:4983:: with SMTP id r3mr23767928pgs.288.1612292281437; Tue, 02 Feb 2021 10:58:01 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:19 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-14-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 13/28] KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In some functions the TDP iter risks not making forward progress if two threads livelock yielding to one another. This is possible if two threads are trying to execute wrprot_gfn_range. Each could write protect an entry and then yield. This would reset the tdp_iter's walk over the paging structure and the loop would end up repeating the same entry over and over, preventing either thread from making forward progress. Fix this issue by only yielding if the loop has made forward progress since the last yield. Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- v1 -> v2 - Moved forward progress check into tdp_mmu_iter_cond_resched - Folded tdp_iter_refresh_walk into tdp_mmu_iter_cond_resched - Split patch into three and renamed all arch/x86/kvm/mmu/tdp_iter.c | 18 +----------------- arch/x86/kvm/mmu/tdp_iter.h | 7 ++++++- arch/x86/kvm/mmu/tdp_mmu.c | 21 ++++++++++++++++----- 3 files changed, 23 insertions(+), 23 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 9917c55b7d24..1a09d212186b 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -31,6 +31,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, WARN_ON(root_level > PT64_ROOT_MAX_LEVEL); iter->next_last_level_gfn = next_last_level_gfn; + iter->yielded_gfn = iter->next_last_level_gfn; iter->root_level = root_level; iter->min_level = min_level; iter->level = root_level; @@ -158,23 +159,6 @@ void tdp_iter_next(struct tdp_iter *iter) iter->valid = false; } -/* - * Restart the walk over the paging structure from the root, starting from the - * highest gfn the iterator had previously reached. Assumes that the entire - * paging structure, except the root page, may have been completely torn down - * and rebuilt. - */ -void tdp_iter_refresh_walk(struct tdp_iter *iter) -{ - gfn_t next_last_level_gfn = iter->next_last_level_gfn; - - if (iter->gfn > next_last_level_gfn) - next_last_level_gfn = iter->gfn; - - tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], - iter->root_level, iter->min_level, next_last_level_gfn); -} - u64 *tdp_iter_root_pt(struct tdp_iter *iter) { return iter->pt_path[iter->root_level - 1]; diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index b2dd269c631f..d480c540ee27 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -16,6 +16,12 @@ struct tdp_iter { * for this GFN. */ gfn_t next_last_level_gfn; + /* + * The next_last_level_gfn at the time when the thread last + * yielded. 
Only yielding when the next_last_level_gfn != + * yielded_gfn helps ensure forward progress. + */ + gfn_t yielded_gfn; /* Pointers to the page tables traversed to reach the current SPTE */ u64 *pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ @@ -54,7 +60,6 @@ u64 *spte_to_child_pt(u64 pte, int level); void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, int min_level, gfn_t next_last_level_gfn); void tdp_iter_next(struct tdp_iter *iter); -void tdp_iter_refresh_walk(struct tdp_iter *iter); u64 *tdp_iter_root_pt(struct tdp_iter *iter); #endif /* __KVM_X86_MMU_TDP_ITER_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 8f7b120597f3..7cfc0639b1ef 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -451,8 +451,9 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, * TLB flush before yielding. * * If this function yields, it will also reset the tdp_iter's walk over the - * paging structure and the calling function should allow the iterator to - * continue its traversal from the paging structure root. + * paging structure and the calling function should skip to the next + * iteration to allow the iterator to continue its traversal from the + * paging structure root. * * Return true if this function yielded and the iterator's traversal was reset. * Return false if a yield was not needed. @@ -460,12 +461,22 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter, bool flush) { + /* Ensure forward progress has been made before yielding. */ + if (iter->next_last_level_gfn == iter->yielded_gfn) + return false; + if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { if (flush) kvm_flush_remote_tlbs(kvm); cond_resched_lock(&kvm->mmu_lock); - tdp_iter_refresh_walk(iter); + + WARN_ON(iter->gfn > iter->next_last_level_gfn); + + tdp_iter_start(iter, iter->pt_path[iter->root_level - 1], + iter->root_level, iter->min_level, + iter->next_last_level_gfn); + return true; } @@ -505,8 +516,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, 0); - flush_needed = !can_yield || - !tdp_mmu_iter_cond_resched(kvm, &iter, true); + flush_needed = !(can_yield && + tdp_mmu_iter_cond_resched(kvm, &iter, true)); } return flush_needed; } From patchwork Tue Feb 2 18:57:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C4C1C433E6 for ; Tue, 2 Feb 2021 19:17:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 39D4C64E46 for ; Tue, 2 Feb 2021 19:17:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239243AbhBBTR0 (ORCPT ); Tue, 2 Feb 2021 14:17:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57308 
"EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239294AbhBBTAQ (ORCPT ); Tue, 2 Feb 2021 14:00:16 -0500 Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06B4EC061222 for ; Tue, 2 Feb 2021 10:58:04 -0800 (PST) Received: by mail-qk1-x749.google.com with SMTP id e5so18185595qkn.2 for ; Tue, 02 Feb 2021 10:58:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=D/IHf2NWbkBu2/dnLT5S1455F/NSQH7Bez/0/zsY4ns=; b=kqy9t70KqYBTQoFXYLtydJxlgpe/KOQmV+GFvXqKC4syTSP2zegsnljJty/ZSZjbKF hC6HDrYaOBKMYYn5y9j7TW8QcOcs3Y+RX1i0b57aS1i6P8ZDGKKdWfrAQzyZq4KGb0Uj cX3hxHQPifN7RwXkVCfdnUSVGZH4fu+r8ukip92w6KgZfwHNdzxyEIZzxU47TptN7+If JKYieZhqF3ZRRxDuZjTMIoAoGiMKMl2WqhpjNiNctOGdMmovVR3LTV7TL+/uivZn6KcK klfj0MzH91YgBL3XCneEfdP0UjAch7XgqlrHeJFx7XcLuxj/fw5iJITl/Tg1v7JnYTMj lEAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=D/IHf2NWbkBu2/dnLT5S1455F/NSQH7Bez/0/zsY4ns=; b=qkLXpS77h1KkMvEn9ymVynVrNzbheciXo+DcXnc0kv+EC+i+AiGfSATNqG4m2WYf7C RLGY8dUcx7fblR+82nslM5vxFuAw1ZA30+dm/KwR6AFSaF6ko0NcCkhl4/cJTbd833K7 X3dH5QRCDq7KpoH0XGGkcAbJc1B47w/Q0qvwemhzuPaV3VpqKBLnHF3tlDRxUT7VY+QS HWcp2o37rYt5IeU/nzHJcwCPsd/CZA5o0vmke+zyHVEqZgP7BN6LCx3z7rEQ0oPLqBba FtlPYoDLxfOm0rMVT3S4FRJl7m1fnqaUJTnxVcTBqjoksbMazFzC3afbKVoMAU2OJ2wF ZHcA== X-Gm-Message-State: AOAM530uKIEs58Bx5KGRYUsH9uzpKnqdyqbCAy14L3UNEkRaHTPwPOMv rNPl5ZIHfXJIGs2h4Zzt5p3S+PzToVjb X-Google-Smtp-Source: ABdhPJxR58WTTtS8AKXqF8trL+Tq7n5Af2kg5RhvQyrSPnU4lJrad5XcrhDLdqTGryVNTb31qr53mNC6nH4t Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a05:6214:108d:: with SMTP id o13mr21210649qvr.16.1612292283198; Tue, 02 Feb 2021 10:58:03 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:20 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-15-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 14/28] KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Given certain conditions, some TDP MMU functions may not yield reliably / frequently enough. For example, if a paging structure was very large but had few, if any writable entries, wrprot_gfn_range could traverse many entries before finding a writable entry and yielding because the check for yielding only happens after an SPTE is modified. Fix this issue by moving the yield to the beginning of the loop. 
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- v1 -> v2 - Split patch into three arch/x86/kvm/mmu/tdp_mmu.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7cfc0639b1ef..c8a1149cb229 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -501,6 +501,12 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool flush_needed = false; tdp_root_for_each_pte(iter, root, start, end) { + if (can_yield && + tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) { + flush_needed = false; + continue; + } + if (!is_shadow_present_pte(iter.old_spte)) continue; @@ -515,9 +521,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, continue; tdp_mmu_set_spte(kvm, &iter, 0); - - flush_needed = !(can_yield && - tdp_mmu_iter_cond_resched(kvm, &iter, true)); + flush_needed = true; } return flush_needed; } @@ -880,6 +884,9 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, for_each_tdp_pte_min_level(iter, root->spt, root->role.level, min_level, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (!is_shadow_present_pte(iter.old_spte) || !is_last_spte(iter.old_spte, iter.level)) continue; @@ -888,8 +895,6 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -933,6 +938,9 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool spte_set = false; tdp_root_for_each_leaf_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (spte_ad_need_write_protect(iter.old_spte)) { if (is_writable_pte(iter.old_spte)) new_spte = iter.old_spte & ~PT_WRITABLE_MASK; @@ -947,8 +955,6 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; } @@ -1056,6 +1062,9 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, bool spte_set = false; tdp_root_for_each_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + continue; + if (!is_shadow_present_pte(iter.old_spte)) continue; @@ -1063,8 +1072,6 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, new_spte); spte_set = true; - - tdp_mmu_iter_cond_resched(kvm, &iter, false); } return spte_set; @@ -1105,6 +1112,11 @@ static void zap_collapsible_spte_range(struct kvm *kvm, bool spte_set = false; tdp_root_for_each_pte(iter, root, start, end) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) { + spte_set = false; + continue; + } + if (!is_shadow_present_pte(iter.old_spte) || !is_last_spte(iter.old_spte, iter.level)) continue; @@ -1116,7 +1128,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, tdp_mmu_set_spte(kvm, &iter, 0); - spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true); + spte_set = true; } if (spte_set) From patchwork Tue Feb 2 18:57:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 957A4C433E0 for ; Tue, 2 Feb 2021 19:02:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5750564F6A for ; Tue, 2 Feb 2021 19:02:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239311AbhBBTCn (ORCPT ); Tue, 2 Feb 2021 14:02:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239309AbhBBTA3 (ORCPT ); Tue, 2 Feb 2021 14:00:29 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FDA0C061224 for ; Tue, 2 Feb 2021 10:58:05 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id o11so11986461pgn.1 for ; Tue, 02 Feb 2021 10:58:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=TvzNxZEmeauTq2Q8+NC26kwWaevzAKsfhvaBGBmdGZw=; b=T+DimJrwM2NuQebN6GQIVbIHXPhvE6MCkly0TwLwZfbQAzn865KoEtBuZzk4JO3R1v pkgXztB7dyVKmSGuyYfBNMzHE6JbhBjNKjHfFnkxuO/LBRg8BQ4pmpxd0RjE6ZR+jOUw OTxW/3s9L2DcmM6t60b5OZZm26CfiQ97p0RLtZIOLom5oz7QW+MKJh3UDiV5P66CfoBH end2v+qmW3oic7P1yvm0YYujO6S29t9glJYKRRYHX11iNNEHXG8m8Ij6l5k9EPVd1bj+ rUA9Yaq3tfSpD2e0a6z1SWatxGFl53QN1fL/D3uATKsvN62XdSxngWdyGsvO2cAeERZs UOVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=TvzNxZEmeauTq2Q8+NC26kwWaevzAKsfhvaBGBmdGZw=; b=XcSrQ0QD5YPa+Krje4kdiqWaNVkRq43twS5EJHC3L9d7GRBJN1DpNWdxtEst1yCTRs 84I1xnm5nzzR+CLtmhL0m+CuPUUOJrusJYRnW/5qs/LjGoEScFmQa3tDw6AmbLNrJYUd ZiCs2VAUZwRwsbd2RwcpyTy2MBIRgCUrZ8cUl5PYgmZJTwqwAuh73lx9jolZ56E7m4IN wNvfIpadOpwKCHAzWnPfb6HB2ammLbzf94Dobv5vSwU64AWiyocbSrgy3YieyI0Qwo0S sI/B1NoxiGOuKfP/OfY91xwkoncQJgcNuupzgLrjjbYTWPiTfwz5d82sjCva0D4pev2y J5jg== X-Gm-Message-State: AOAM532erHSeSXScVxiO0YZ8p4Q06hOY9PsydhZIrf7oqUuPTEMS77mC jTlZFGyNI685pvqtuGRW6VP75NSop25l X-Google-Smtp-Source: ABdhPJxAoifktld/mM+14GeUwIjNAf6XLVqMicqaxaKI0BFtXVEKCZdf91ZkAToJ7z7NNV1jwuaH7UXFTYdR Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a63:703:: with SMTP id 3mr23018613pgh.272.1612292285011; Tue, 02 Feb 2021 10:58:05 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:21 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-16-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 15/28] KVM: x86/mmu: Skip no-op changes in TDP MMU functions From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng 
Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Skip setting SPTEs if no change is expected. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- v1 -> v2 - Merged no-op checks into exiting old_spte check arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index c8a1149cb229..aeb05f626b55 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -888,7 +888,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, continue; if (!is_shadow_present_pte(iter.old_spte) || - !is_last_spte(iter.old_spte, iter.level)) + !is_last_spte(iter.old_spte, iter.level) || + !(iter.old_spte & PT_WRITABLE_MASK)) continue; new_spte = iter.old_spte & ~PT_WRITABLE_MASK; @@ -1065,7 +1066,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) continue; - if (!is_shadow_present_pte(iter.old_spte)) + if (!is_shadow_present_pte(iter.old_spte) || + iter.old_spte & shadow_dirty_mask) continue; new_spte = iter.old_spte | shadow_dirty_mask; From patchwork Tue Feb 2 18:57:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062681 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6B4AC433E0 for ; Tue, 2 Feb 2021 19:03:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6DD9764EC1 for ; Tue, 2 Feb 2021 19:03:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239237AbhBBTDh (ORCPT ); Tue, 2 Feb 2021 14:03:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239176AbhBBTB1 (ORCPT ); Tue, 2 Feb 2021 14:01:27 -0500 Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F7EFC061226 for ; Tue, 2 Feb 2021 10:58:07 -0800 (PST) Received: by mail-qt1-x84a.google.com with SMTP id c19so6794640qtp.2 for ; Tue, 02 Feb 2021 10:58:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=863KxC1/6wBl/LmIsFwRhyyYIQNzbKOVVmGrH1lkP6c=; b=gfvaiNF1NesUPe/yD6AqhBvDIl5DAFchc6PFaK0jSnOdybyrSDKCylQvIGkLEXblHG rK8cpdJLhOHMv1bDVrraUfpnmJCuzVDKrJV5iYNsAKOYKCHDjQ+dzA9QDZinOrMc09qP es4LquQhqfvgoj2/xhLvLYjNZ0O89UQyeFctLDUL5y79Mk6khvhBsSQsnOakpRfs1hhu elkBUzH4B58FDZIBsg+U3M5VhqQ+1mSTnhVecz6qCUySWr0C22+DrjTZiA7+DYjyMeZh bhZO00jg/Cah5u1uA4fHUVWJSLDzFsbfftXbInJb1FKvtgWKmAx5nQYth8znMBdc98aH SYlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version 
:references:subject:from:to:cc; bh=863KxC1/6wBl/LmIsFwRhyyYIQNzbKOVVmGrH1lkP6c=; b=qY3IYJlt6ZAR04JOvJmle5b1tstNt3sls6VtoRdpofMbZQSGDIW+hrwpTWboNhCsxy b/vf47lencB6YMwPCSV43MsLRjuB0W8A79AGgmuya+nSkS4/CGZjiAGrFQTvrdkI6Ka/ 12jd+Fbncwtaf/aCOROZldv7541qnH69a58sxzS+iPCm/ajICL4c4bADA/hJm6OLXNTQ a2kZpbGlzqIvLsCzE4JDBFkKJE9m/JJp7pJ1sJA95WUG/QJ+x7shPB0T0ecyOu0DCUWD Xv6fYwKgTxSKi+pY1q74nAasc4ppNpcjfbcDJla2znp1lllI7XmCMgl1ArdtplKiO5+O f2KQ== X-Gm-Message-State: AOAM532I+MbK8Baw5RN4dFU5BOlxnxqu4drWIHhKGoMfApu/U+VRo4ib l4MXQE7NAhDf11uhjV6eKy7as/94Plvd X-Google-Smtp-Source: ABdhPJwrmTTtJgDxk79E2bRzdxXBq4ABPCviRTbWISZT3iVQAzKTtwCOhfo+RFy+5vn9G+bIlyfZ80k96DmZ Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a0c:fcca:: with SMTP id i10mr22127299qvq.38.1612292286580; Tue, 02 Feb 2021 10:58:06 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:22 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-17-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 16/28] KVM: x86/mmu: Clear dirtied pages mask bit before early break From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In clear_dirty_pt_masked, the loop is intended to exit early after processing each of the GFNs with corresponding bits set in mask. This does not work as intended if another thread has already cleared the dirty bit or writable bit on the SPTE. In that case, the loop would proceed to the next iteration early and the bit in mask would not be cleared. As a result the loop could not exit early and would proceed uselessly. Move the unsetting of the mask bit before the check for a no-op SPTE change. 
Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Suggested-by: Sean Christopherson Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index aeb05f626b55..a75e92164a8b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1007,6 +1007,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, !(mask & (1UL << (iter.gfn - gfn)))) continue; + mask &= ~(1UL << (iter.gfn - gfn)); + if (wrprot || spte_ad_need_write_protect(iter.old_spte)) { if (is_writable_pte(iter.old_spte)) new_spte = iter.old_spte & ~PT_WRITABLE_MASK; @@ -1020,8 +1022,6 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, } tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); - - mask &= ~(1UL << (iter.gfn - gfn)); } } From patchwork Tue Feb 2 18:57:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7009C433E9 for ; Tue, 2 Feb 2021 19:17:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7F5C864E4D for ; Tue, 2 Feb 2021 19:17:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231806AbhBBTRX (ORCPT ); Tue, 2 Feb 2021 14:17:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57580 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239243AbhBBTBl (ORCPT ); Tue, 2 Feb 2021 14:01:41 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98DEBC0611C1 for ; Tue, 2 Feb 2021 10:58:09 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id 145so5495500pgh.2 for ; Tue, 02 Feb 2021 10:58:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=PUjq1PmHRqanFbRtZvCFLqPwTIM7m5ji/Z5EVfp8UJ8=; b=S8YoCVyZ5DUGM/oyyP1/qwQro5DsIPqTY6Y1+CcTD0/eSiHuZ+wC6F/vv+lJYMn1GF zmMfNsOpL3FDjIjQpXp0zVjwj1sOjPCnUszIBDTbVHpranL+1sOktF3yz/Db2q+R7t3a bzV+jsXbk0NjTUoRNiwgzzVzXI1tdRAw9J7S4okJsAm3vElMp1GXXGnhtLmLR0LfZ9+1 8rFNExDgFKrOPOWUxM3JZKL3cMwhp7cTrCxOURH5jU5TPa1mPJxsdI/eS+2J9Z5nG9Na PMFG44qctZTLbo4NgfGJDLY6KgvVpM0bes0o/BphFflY4DaGoaXZNgTuwfbe6elg84Sp O5pw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=PUjq1PmHRqanFbRtZvCFLqPwTIM7m5ji/Z5EVfp8UJ8=; b=p0lrSjPQ1BYeIg0vWSklq4zwqP5/d8wFTN265yXVNbbV0Chd3h+ZT2MMIUYALfZAUH hhQ/VOUybPeHP1o0nRWAjFPdzTJVbpUWElKVDCdEOMndYeyU3U3CJbSSKHN8MxZYdIQ4 
B+k2U7o6WaJYNUDAtRTcxBflVpeLUoHF6rzO74sQKcPz8Cq5partX946iZEFWmiSfyal 9VL1aaZYMfchc8fUiiCYOesZRYDBBNrTsjg9up8Qhauv7n175Nqel/XLSW04H5doTrfX lYC94oPYXY2W+9+e1ZFFWSJyLzftwnUAL9nYqbzHXzvJhyz3w1KzIe9pOpCQcM0HG6ON hB1A== X-Gm-Message-State: AOAM532v3FzfeUgxJbukNew2PYlRmu6gAs53dd5Dc4akOF7MDu/VTClL iA11gPBJpN47PrqkYivPFI0BD/v3ZjlR X-Google-Smtp-Source: ABdhPJxGKEq11F1hW7rwkQqAfEKRiHg/+xDfc56d7O/I/IKDaG6iyUQlp3+GiAtQ891MZ5Zg9FgRglVpH6BT Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:90a:3f08:: with SMTP id l8mr614277pjc.1.1612292288437; Tue, 02 Feb 2021 10:58:08 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:23 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-18-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 17/28] KVM: x86/mmu: Protect TDP MMU page table memory with RCU From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to enable concurrent modifications to the paging structures in the TDP MMU, threads must be able to safely remove pages of page table memory while other threads are traversing the same memory. To ensure threads do not access PT memory after it is freed, protect PT memory with RCU. Protecting concurrent accesses to page table memory from use-after-free bugs could also have been acomplished using walk_shadow_page_lockless_begin/end() and READING_SHADOW_PAGE_TABLES, coupling with the barriers in a TLB flush. The use of RCU for this case has several distinct advantages over that approach. 1. Disabling interrupts for long running operations is not desirable. Future commits will allow operations besides page faults to operate without the exclusive protection of the MMU lock and those operations are too long to disable iterrupts for their duration. 2. The use of RCU here avoids long blocking / spinning operations in perfromance critical paths. By freeing memory with an asynchronous RCU API we avoid the longer wait times TLB flushes experience when overlapping with a thread in walk_shadow_page_lockless_begin/end(). 3. RCU provides a separation of concerns when removing memory from the paging structure. Because the RCU callback to free memory can be scheduled immediately after a TLB flush, there's no need for the thread to manually free a queue of pages later, as commit_zap_pages does. 
Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU") Reviewed-by: Peter Feiner Suggested-by: Sean Christopherson Signed-off-by: Ben Gardon --- v1 -> v2 - Moved RCU read unlock before the TLB flush - Merged the RCU commits from v1 into a single commit - Changed the way accesses to page table memory are annotated with RCU in the TDP iterator arch/x86/kvm/mmu/mmu_internal.h | 3 ++ arch/x86/kvm/mmu/tdp_iter.c | 16 +++--- arch/x86/kvm/mmu/tdp_iter.h | 10 ++-- arch/x86/kvm/mmu/tdp_mmu.c | 95 +++++++++++++++++++++++++++++---- 4 files changed, 103 insertions(+), 21 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index bfc6389edc28..7f599cc64178 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -57,6 +57,9 @@ struct kvm_mmu_page { atomic_t write_flooding_count; bool tdp_mmu_page; + + /* Used for freeing the page asyncronously if it is a TDP MMU page. */ + struct rcu_head rcu_head; }; extern struct kmem_cache *mmu_page_header_cache; diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c index 1a09d212186b..e5f148106e20 100644 --- a/arch/x86/kvm/mmu/tdp_iter.c +++ b/arch/x86/kvm/mmu/tdp_iter.c @@ -12,7 +12,7 @@ static void tdp_iter_refresh_sptep(struct tdp_iter *iter) { iter->sptep = iter->pt_path[iter->level - 1] + SHADOW_PT_INDEX(iter->gfn << PAGE_SHIFT, iter->level); - iter->old_spte = READ_ONCE(*iter->sptep); + iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep)); } static gfn_t round_gfn_for_level(gfn_t gfn, int level) @@ -35,7 +35,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, iter->root_level = root_level; iter->min_level = min_level; iter->level = root_level; - iter->pt_path[iter->level - 1] = root_pt; + iter->pt_path[iter->level - 1] = (tdp_ptep_t)root_pt; iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level); tdp_iter_refresh_sptep(iter); @@ -48,7 +48,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, * address of the child page table referenced by the SPTE. Returns null if * there is no such entry. */ -u64 *spte_to_child_pt(u64 spte, int level) +tdp_ptep_t spte_to_child_pt(u64 spte, int level) { /* * There's no child entry if this entry isn't present or is a @@ -57,7 +57,7 @@ u64 *spte_to_child_pt(u64 spte, int level) if (!is_shadow_present_pte(spte) || is_last_spte(spte, level)) return NULL; - return __va(spte_to_pfn(spte) << PAGE_SHIFT); + return (tdp_ptep_t)__va(spte_to_pfn(spte) << PAGE_SHIFT); } /* @@ -66,7 +66,7 @@ u64 *spte_to_child_pt(u64 spte, int level) */ static bool try_step_down(struct tdp_iter *iter) { - u64 *child_pt; + tdp_ptep_t child_pt; if (iter->level == iter->min_level) return false; @@ -75,7 +75,7 @@ static bool try_step_down(struct tdp_iter *iter) * Reread the SPTE before stepping down to avoid traversing into page * tables that are no longer linked from this entry. 
*/ - iter->old_spte = READ_ONCE(*iter->sptep); + iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep)); child_pt = spte_to_child_pt(iter->old_spte, iter->level); if (!child_pt) @@ -109,7 +109,7 @@ static bool try_step_side(struct tdp_iter *iter) iter->gfn += KVM_PAGES_PER_HPAGE(iter->level); iter->next_last_level_gfn = iter->gfn; iter->sptep++; - iter->old_spte = READ_ONCE(*iter->sptep); + iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep)); return true; } @@ -159,7 +159,7 @@ void tdp_iter_next(struct tdp_iter *iter) iter->valid = false; } -u64 *tdp_iter_root_pt(struct tdp_iter *iter) +tdp_ptep_t tdp_iter_root_pt(struct tdp_iter *iter) { return iter->pt_path[iter->root_level - 1]; } diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index d480c540ee27..4cc177d75c4a 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -7,6 +7,8 @@ #include "mmu.h" +typedef u64 __rcu *tdp_ptep_t; + /* * A TDP iterator performs a pre-order walk over a TDP paging structure. */ @@ -23,9 +25,9 @@ struct tdp_iter { */ gfn_t yielded_gfn; /* Pointers to the page tables traversed to reach the current SPTE */ - u64 *pt_path[PT64_ROOT_MAX_LEVEL]; + tdp_ptep_t pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ - u64 *sptep; + tdp_ptep_t sptep; /* The lowest GFN mapped by the current SPTE */ gfn_t gfn; /* The level of the root page given to the iterator */ @@ -55,11 +57,11 @@ struct tdp_iter { #define for_each_tdp_pte(iter, root, root_level, start, end) \ for_each_tdp_pte_min_level(iter, root, root_level, PG_LEVEL_4K, start, end) -u64 *spte_to_child_pt(u64 pte, int level); +tdp_ptep_t spte_to_child_pt(u64 pte, int level); void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level, int min_level, gfn_t next_last_level_gfn); void tdp_iter_next(struct tdp_iter *iter); -u64 *tdp_iter_root_pt(struct tdp_iter *iter); +tdp_ptep_t tdp_iter_root_pt(struct tdp_iter *iter); #endif /* __KVM_X86_MMU_TDP_ITER_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index a75e92164a8b..9e4009068920 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -42,6 +42,12 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) return; WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots)); + + /* + * Ensure that all the outstanding RCU callbacks to free shadow pages + * can run before the VM is torn down. + */ + rcu_barrier(); } static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root) @@ -196,6 +202,28 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) return __pa(root->spt); } +static void tdp_mmu_free_sp(struct kvm_mmu_page *sp) +{ + free_page((unsigned long)sp->spt); + kmem_cache_free(mmu_page_header_cache, sp); +} + +/* + * This is called through call_rcu in order to free TDP page table memory + * safely with respect to other kernel threads that may be operating on + * the memory. + * By only accessing TDP MMU page table memory in an RCU read critical + * section, and freeing it after a grace period, lockless access to that + * memory won't use it after it is freed. 
+ */ +static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head) +{ + struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page, + rcu_head); + + tdp_mmu_free_sp(sp); +} + static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, u64 old_spte, u64 new_spte, int level); @@ -269,8 +297,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt) kvm_flush_remote_tlbs_with_address(kvm, gfn, KVM_PAGES_PER_HPAGE(level)); - free_page((unsigned long)pt); - kmem_cache_free(mmu_page_header_cache, sp); + call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback); } /** @@ -390,13 +417,13 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, u64 new_spte, bool record_acc_track, bool record_dirty_log) { - u64 *root_pt = tdp_iter_root_pt(iter); + tdp_ptep_t root_pt = tdp_iter_root_pt(iter); struct kvm_mmu_page *root = sptep_to_sp(root_pt); int as_id = kvm_mmu_page_as_id(root); lockdep_assert_held(&kvm->mmu_lock); - WRITE_ONCE(*iter->sptep, new_spte); + WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte); __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, iter->level); @@ -466,10 +493,13 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, return false; if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + rcu_read_unlock(); + if (flush) kvm_flush_remote_tlbs(kvm); cond_resched_lock(&kvm->mmu_lock); + rcu_read_lock(); WARN_ON(iter->gfn > iter->next_last_level_gfn); @@ -500,6 +530,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, struct tdp_iter iter; bool flush_needed = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { if (can_yield && tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) { @@ -523,6 +555,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte(kvm, &iter, 0); flush_needed = true; } + + rcu_read_unlock(); return flush_needed; } @@ -568,13 +602,15 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, if (unlikely(is_noslot_pfn(pfn))) { new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL); - trace_mark_mmio_spte(iter->sptep, iter->gfn, new_spte); + trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn, + new_spte); } else { make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn, pfn, iter->old_spte, prefault, true, map_writable, !shadow_accessed_mask, &new_spte); - trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep); + trace_kvm_mmu_set_spte(iter->level, iter->gfn, + rcu_dereference(iter->sptep)); } if (new_spte == iter->old_spte) @@ -597,7 +633,8 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, if (unlikely(is_mmio_spte(new_spte))) ret = RET_PF_EMULATE; - trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep); + trace_kvm_mmu_set_spte(iter->level, iter->gfn, + rcu_dereference(iter->sptep)); if (!prefault) vcpu->stat.pf_fixed++; @@ -635,6 +672,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, huge_page_disallowed, &req_level); trace_kvm_mmu_spte_requested(gpa, level, pfn); + + rcu_read_lock(); + tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { if (nx_huge_page_workaround_enabled) disallowed_hugepage_adjust(iter.old_spte, gfn, @@ -660,7 +700,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, * because the new value informs the !present * path below. 
*/ - iter.old_spte = READ_ONCE(*iter.sptep); + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); } if (!is_shadow_present_pte(iter.old_spte)) { @@ -678,11 +718,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, } } - if (WARN_ON(iter.level != level)) + if (WARN_ON(iter.level != level)) { + rcu_read_unlock(); return RET_PF_RETRY; + } ret = tdp_mmu_map_handle_target_level(vcpu, write, map_writable, &iter, pfn, prefault); + rcu_read_unlock(); return ret; } @@ -753,6 +796,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, int young = 0; u64 new_spte = 0; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, start, end) { /* * If we have a non-accessed entry we don't need to change the @@ -784,6 +829,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, trace_kvm_age_page(iter.gfn, iter.level, slot, young); } + rcu_read_unlock(); + return young; } @@ -829,6 +876,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot, u64 new_spte; int need_flush = 0; + rcu_read_lock(); + WARN_ON(pte_huge(*ptep)); new_pfn = pte_pfn(*ptep); @@ -857,6 +906,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot, if (need_flush) kvm_flush_remote_tlbs_with_address(kvm, gfn, 1); + rcu_read_unlock(); + return 0; } @@ -880,6 +931,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, u64 new_spte; bool spte_set = false; + rcu_read_lock(); + BUG_ON(min_level > KVM_MAX_HUGEPAGE_LEVEL); for_each_tdp_pte_min_level(iter, root->spt, root->role.level, @@ -897,6 +950,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; } + + rcu_read_unlock(); return spte_set; } @@ -938,6 +993,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, u64 new_spte; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, start, end) { if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) continue; @@ -957,6 +1014,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); spte_set = true; } + + rcu_read_unlock(); return spte_set; } @@ -998,6 +1057,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, struct tdp_iter iter; u64 new_spte; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, gfn + __ffs(mask), gfn + BITS_PER_LONG) { if (!mask) @@ -1023,6 +1084,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root, tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); } + + rcu_read_unlock(); } /* @@ -1062,6 +1125,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, u64 new_spte; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) continue; @@ -1076,6 +1141,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, spte_set = true; } + rcu_read_unlock(); return spte_set; } @@ -1113,6 +1179,8 @@ static void zap_collapsible_spte_range(struct kvm *kvm, kvm_pfn_t pfn; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_pte(iter, root, start, end) { if (tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) { spte_set = false; @@ -1133,6 +1201,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, spte_set = true; } + rcu_read_unlock(); if (spte_set) kvm_flush_remote_tlbs(kvm); } @@ 
-1169,6 +1238,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, u64 new_spte; bool spte_set = false; + rcu_read_lock(); + tdp_root_for_each_leaf_pte(iter, root, gfn, gfn + 1) { if (!is_writable_pte(iter.old_spte)) break; @@ -1180,6 +1251,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, spte_set = true; } + rcu_read_unlock(); + return spte_set; } @@ -1220,10 +1293,14 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, *root_level = vcpu->arch.mmu->shadow_root_level; + rcu_read_lock(); + tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { leaf = iter.level; sptes[leaf] = iter.old_spte; } + rcu_read_unlock(); + return leaf; } From patchwork Tue Feb 2 18:57:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 363E2C433DB for ; Tue, 2 Feb 2021 19:16:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0570864E45 for ; Tue, 2 Feb 2021 19:16:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239610AbhBBTPm (ORCPT ); Tue, 2 Feb 2021 14:15:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57590 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239315AbhBBTBu (ORCPT ); Tue, 2 Feb 2021 14:01:50 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1FE5C0611C3 for ; Tue, 2 Feb 2021 10:58:11 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id z3so14131905pfj.3 for ; Tue, 02 Feb 2021 10:58:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=SGEmEAOkonOeY//vvlD5pwUR/L9cbqKIAPZrPaHHSAA=; b=rLFpqPW2++uiwbs8DScIia5KmQ/LWwgSG65kx7WtbZxEAPCp7j6G8Lw7nF2doj+3Fh HUsGMqmjczwB9uG7dfY8EJ4vAHA6TI1beGGw84mNdMIH18pHNrlQZc0ac7QfCcC1z/U+ FqCfJmrwRPufganyO1g4J3QLHZdWXiq0b4P6kOmz1xcvmvbjKQFQgikoO9dplD9pgQm/ YrpyQfXlvBZ4DzsjPd8QrdZoUJvgkX22G2M6Xk0KLNr9FoT4R5PWGdG/bGr30e+nHi0r +PIVHY4q09hpKXq2F+RrCtQOOx/pQucPr0js9ZsjQHKvc8b4j0BQ54U5mqmUO4igbgeG 3DgQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=SGEmEAOkonOeY//vvlD5pwUR/L9cbqKIAPZrPaHHSAA=; b=sl6ZvWOcUqhSGzVTVShZlhnNtqKETpAc3tcTyYKacxVphRO/VsJp5GmjW8JWILVhp5 TDilNNinDwQQgloCC+ET4FOJog8BD05B7WhwRCom0adCmSExUuHd+wy73dZaxF+N9Bdr 0gd9WIwcBa0U/NUFJ03I/znyQE4sSgUgnYYnz1JGNNNcRrMTvEPtPTjAWJuF2j007LJX xHFZzOgJQrJBz3tcGw+mY37FzSpkgurT9bFdcrk7XA6NLg9xKN/3CVEEL2BHMIdgI+YF /aJd8l0HvgIRn3N4cmHWPACxglY0HeU0HmUPATGEKxVP7+NMQspLvcWj6h6kfv3D/mbA 5o1Q== X-Gm-Message-State: 
AOAM530ZN0nHBwXzogltMaa7zIOCkhO3tyJwe86O3knlBgH39hPswe3l 4DSfZKZtyl5NHyODBIeN98QY7IZC1lBS X-Google-Smtp-Source: ABdhPJwxylqU6vPiDAJ030SBBCDz8KKpTxXZcrTnOUmcLhIvHsaSkyR+MZzwuF4uzlI+cKB5vZ+PTEl3UZDK Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:90a:8d83:: with SMTP id d3mr614627pjo.0.1612292290910; Tue, 02 Feb 2021 10:58:10 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:24 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-19-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 18/28] KVM: x86/mmu: Use an rwlock for the x86 MMU From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a read / write lock to be used in place of the MMU spinlock on x86. The rwlock will enable the TDP MMU to handle page faults, and other operations in parallel in future commits. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- v1 -> v2 - Removed MMU lock wrappers - Completely replaced the MMU spinlock with an rwlock for x86 arch/x86/include/asm/kvm_host.h | 2 + arch/x86/kvm/mmu/mmu.c | 90 ++++++++++++++++----------------- arch/x86/kvm/mmu/page_track.c | 8 +-- arch/x86/kvm/mmu/paging_tmpl.h | 8 +-- arch/x86/kvm/mmu/tdp_mmu.c | 20 ++++---- arch/x86/kvm/x86.c | 4 +- include/linux/kvm_host.h | 5 ++ virt/kvm/dirty_ring.c | 10 ++++ virt/kvm/kvm_main.c | 46 +++++++++++------ 9 files changed, 112 insertions(+), 81 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3d6616f6f6ef..b6ebf2558386 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -337,6 +337,8 @@ struct kvm_mmu_root_info { #define KVM_MMU_NUM_PREV_ROOTS 3 +#define KVM_HAVE_MMU_RWLOCK + struct kvm_mmu_page; /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 60ff6837655a..b4d6709c240e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2016,9 +2016,9 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu, flush |= kvm_sync_page(vcpu, sp, &invalid_list); mmu_pages_clear_parents(&parents); } - if (need_resched() || spin_needbreak(&vcpu->kvm->mmu_lock)) { + if (need_resched() || rwlock_needbreak(&vcpu->kvm->mmu_lock)) { kvm_mmu_flush_or_zap(vcpu, &invalid_list, false, flush); - cond_resched_lock(&vcpu->kvm->mmu_lock); + cond_resched_rwlock_write(&vcpu->kvm->mmu_lock); flush = false; } } @@ -2470,7 +2470,7 @@ static int make_mmu_pages_available(struct kvm_vcpu *vcpu) */ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long goal_nr_mmu_pages) { - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); if (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages) { kvm_mmu_zap_oldest_mmu_pages(kvm, kvm->arch.n_used_mmu_pages - @@ -2481,7 +2481,7 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long goal_nr_mmu_pages) kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages; - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) @@ -2492,7 +2492,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) pgprintk("%s: looking for gfn %llx\n", __func__, gfn); r = 
0; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); for_each_gfn_indirect_valid_sp(kvm, sp, gfn) { pgprintk("%s: gfn %llx role %x\n", __func__, gfn, sp->role.word); @@ -2500,7 +2500,7 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); return r; } @@ -3192,7 +3192,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, return; } - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) if (roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) @@ -3215,7 +3215,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } EXPORT_SYMBOL_GPL(kvm_mmu_free_roots); @@ -3236,16 +3236,16 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, { struct kvm_mmu_page *sp; - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); if (make_mmu_pages_available(vcpu)) { - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); return INVALID_PAGE; } sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL); ++sp->root_count; - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); return __pa(sp->spt); } @@ -3416,17 +3416,17 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) !smp_load_acquire(&sp->unsync_children)) return; - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC); mmu_sync_children(vcpu, sp); kvm_mmu_audit(vcpu, AUDIT_POST_SYNC); - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); return; } - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC); for (i = 0; i < 4; ++i) { @@ -3440,7 +3440,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) } kvm_mmu_audit(vcpu, AUDIT_POST_SYNC); - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); } EXPORT_SYMBOL_GPL(kvm_mmu_sync_roots); @@ -3724,7 +3724,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, return r; r = RET_PF_RETRY; - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; r = make_mmu_pages_available(vcpu); @@ -3739,7 +3739,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, prefault, is_tdp); out_unlock: - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(pfn); return r; } @@ -4999,7 +4999,7 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, */ mmu_topup_memory_caches(vcpu, true); - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); gentry = mmu_pte_write_fetch_gpte(vcpu, &gpa, &bytes); @@ -5035,7 +5035,7 @@ static void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, } kvm_mmu_flush_or_zap(vcpu, &invalid_list, remote_flush, local_flush); kvm_mmu_audit(vcpu, AUDIT_POST_PTE_WRITE); - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); } int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva) @@ -5233,14 +5233,14 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot, if (iterator.rmap) flush |= fn(kvm, iterator.rmap); - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || 
rwlock_needbreak(&kvm->mmu_lock)) { if (flush && lock_flush_tlb) { kvm_flush_remote_tlbs_with_address(kvm, start_gfn, iterator.gfn - start_gfn + 1); flush = false; } - cond_resched_lock(&kvm->mmu_lock); + cond_resched_rwlock_write(&kvm->mmu_lock); } } @@ -5390,7 +5390,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) * be in active use by the guest. */ if (batch >= BATCH_ZAP_PAGES && - cond_resched_lock(&kvm->mmu_lock)) { + cond_resched_rwlock_write(&kvm->mmu_lock)) { batch = 0; goto restart; } @@ -5423,7 +5423,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) { lockdep_assert_held(&kvm->slots_lock); - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); trace_kvm_mmu_zap_all_fast(kvm); /* @@ -5450,7 +5450,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_all(kvm); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) @@ -5492,7 +5492,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) int i; bool flush; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); kvm_for_each_memslot(memslot, slots) { @@ -5516,7 +5516,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) kvm_flush_remote_tlbs(kvm); } - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } static bool slot_rmap_write_protect(struct kvm *kvm, @@ -5531,12 +5531,12 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); flush = slot_handle_level(kvm, memslot, slot_rmap_write_protect, start_level, KVM_MAX_HUGEPAGE_LEVEL, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_4K); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); /* * We can flush all the TLBs out of the mmu lock without TLB @@ -5596,13 +5596,13 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *memslot) { /* FIXME: const-ify all uses of struct kvm_memory_slot. 
*/ - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot, kvm_mmu_zap_collapsible_spte, true); if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm, @@ -5625,11 +5625,11 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_clear_dirty_slot(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); /* * It's also safe to flush TLBs out of mmu lock here as currently this @@ -5647,12 +5647,12 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); flush = slot_handle_large_level(kvm, memslot, slot_rmap_write_protect, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_2M); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); @@ -5664,11 +5664,11 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm, { bool flush; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false); if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_slot_set_dirty(kvm, memslot); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); @@ -5681,14 +5681,14 @@ void kvm_mmu_zap_all(struct kvm *kvm) LIST_HEAD(invalid_list); int ign; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); restart: list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) { if (WARN_ON(sp->role.invalid)) continue; if (__kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, &ign)) goto restart; - if (cond_resched_lock(&kvm->mmu_lock)) + if (cond_resched_rwlock_write(&kvm->mmu_lock)) goto restart; } @@ -5697,7 +5697,7 @@ void kvm_mmu_zap_all(struct kvm *kvm) if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_all(kvm); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) @@ -5757,7 +5757,7 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) continue; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); if (kvm_has_zapped_obsolete_pages(kvm)) { kvm_mmu_commit_zap_page(kvm, @@ -5768,7 +5768,7 @@ mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) freed = kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan); unlock: - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, idx); /* @@ -5988,7 +5988,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) ulong to_zap; rcu_idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); ratio = READ_ONCE(nx_huge_pages_recovery_ratio); to_zap = ratio ? 
DIV_ROUND_UP(kvm->stat.nx_lpage_splits, ratio) : 0; @@ -6013,14 +6013,14 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) WARN_ON_ONCE(sp->lpage_disallowed); } - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) { kvm_mmu_commit_zap_page(kvm, &invalid_list); - cond_resched_lock(&kvm->mmu_lock); + cond_resched_rwlock_write(&kvm->mmu_lock); } } kvm_mmu_commit_zap_page(kvm, &invalid_list); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, rcu_idx); } diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 8443a675715b..34bb0ec69bd8 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -184,9 +184,9 @@ kvm_page_track_register_notifier(struct kvm *kvm, head = &kvm->arch.track_notifier_head; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); hlist_add_head_rcu(&n->node, &head->track_notifier_list); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); } EXPORT_SYMBOL_GPL(kvm_page_track_register_notifier); @@ -202,9 +202,9 @@ kvm_page_track_unregister_notifier(struct kvm *kvm, head = &kvm->arch.track_notifier_head; - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); hlist_del_rcu(&n->node); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); synchronize_srcu(&head->track_srcu); } EXPORT_SYMBOL_GPL(kvm_page_track_unregister_notifier); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 50e268eb8e1a..d9f66cc459e8 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -868,7 +868,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, } r = RET_PF_RETRY; - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; @@ -881,7 +881,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gpa_t addr, u32 error_code, kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT); out_unlock: - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(pfn); return r; } @@ -919,7 +919,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa) return; } - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); for_each_shadow_entry_using_root(vcpu, root_hpa, gva, iterator) { level = iterator.level; sptep = iterator.sptep; @@ -954,7 +954,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa) if (!is_shadow_present_pte(*sptep) || !sp->unsync_children) break; } - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); } /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GPA. 
*/ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 9e4009068920..f1fbed72e149 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -59,7 +59,7 @@ static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root) static inline bool tdp_mmu_next_root_valid(struct kvm *kvm, struct kvm_mmu_page *root) { - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held_write(&kvm->mmu_lock); if (list_entry_is_head(root, &kvm->arch.tdp_mmu_roots, link)) return false; @@ -117,7 +117,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) { gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT); - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held_write(&kvm->mmu_lock); WARN_ON(root->root_count); WARN_ON(!root->tdp_mmu_page); @@ -170,13 +170,13 @@ static struct kvm_mmu_page *get_tdp_mmu_vcpu_root(struct kvm_vcpu *vcpu) role = page_role_for_level(vcpu, vcpu->arch.mmu->shadow_root_level); - spin_lock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); /* Check for an existing root before allocating a new one. */ for_each_tdp_mmu_root(kvm, root) { if (root->role.word == role.word) { kvm_mmu_get_root(kvm, root); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); return root; } } @@ -186,7 +186,7 @@ static struct kvm_mmu_page *get_tdp_mmu_vcpu_root(struct kvm_vcpu *vcpu) list_add(&root->link, &kvm->arch.tdp_mmu_roots); - spin_unlock(&kvm->mmu_lock); + write_unlock(&kvm->mmu_lock); return root; } @@ -421,7 +421,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, struct kvm_mmu_page *root = sptep_to_sp(root_pt); int as_id = kvm_mmu_page_as_id(root); - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held_write(&kvm->mmu_lock); WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte); @@ -492,13 +492,13 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, if (iter->next_last_level_gfn == iter->yielded_gfn) return false; - if (need_resched() || spin_needbreak(&kvm->mmu_lock)) { + if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) { rcu_read_unlock(); if (flush) kvm_flush_remote_tlbs(kvm); - cond_resched_lock(&kvm->mmu_lock); + cond_resched_rwlock_write(&kvm->mmu_lock); rcu_read_lock(); WARN_ON(iter->gfn > iter->next_last_level_gfn); @@ -1103,7 +1103,7 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root; int root_as_id; - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held_write(&kvm->mmu_lock); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) @@ -1268,7 +1268,7 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, int root_as_id; bool spte_set = false; - lockdep_assert_held(&kvm->mmu_lock); + lockdep_assert_held_write(&kvm->mmu_lock); for_each_tdp_mmu_root(kvm, root) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 76bce832cade..b544f59b6952 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7092,9 +7092,9 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, if (vcpu->arch.mmu->direct_map) { unsigned int indirect_shadow_pages; - spin_lock(&vcpu->kvm->mmu_lock); + write_lock(&vcpu->kvm->mmu_lock); indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages; - spin_unlock(&vcpu->kvm->mmu_lock); + write_unlock(&vcpu->kvm->mmu_lock); if (indirect_shadow_pages) kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa)); diff --git a/include/linux/kvm_host.h 
b/include/linux/kvm_host.h index f3b1013fb22c..f417447129b9 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -451,7 +451,12 @@ struct kvm_memslots { }; struct kvm { +#ifdef KVM_HAVE_MMU_RWLOCK + rwlock_t mmu_lock; +#else spinlock_t mmu_lock; +#endif /* KVM_HAVE_MMU_RWLOCK */ + struct mutex slots_lock; struct mm_struct *mm; /* userspace tied to this vm */ struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM]; diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 9d01299563ee..dc7052a6e033 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -60,9 +60,19 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask) if (!memslot || (offset + __fls(mask)) >= memslot->npages) return; +#ifdef KVM_HAVE_MMU_RWLOCK + write_lock(&kvm->mmu_lock); +#else spin_lock(&kvm->mmu_lock); +#endif /* KVM_HAVE_MMU_RWLOCK */ + kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask); + +#ifdef KVM_HAVE_MMU_RWLOCK + write_unlock(&kvm->mmu_lock); +#else spin_unlock(&kvm->mmu_lock); +#endif /* KVM_HAVE_MMU_RWLOCK */ } int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 8367d88ce39b..44b55f9387c4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -450,6 +450,14 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); } +#ifdef KVM_HAVE_MMU_RWLOCK +#define KVM_MMU_LOCK(kvm) write_lock(&kvm->mmu_lock) +#define KVM_MMU_UNLOCK(kvm) write_unlock(&kvm->mmu_lock) +#else +#define KVM_MMU_LOCK(kvm) spin_lock(&kvm->mmu_lock) +#define KVM_MMU_UNLOCK(kvm) spin_unlock(&kvm->mmu_lock) +#endif /* KVM_HAVE_MMU_RWLOCK */ + static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address, @@ -459,13 +467,15 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, int idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + + KVM_MMU_LOCK(kvm); + kvm->mmu_notifier_seq++; if (kvm_set_spte_hva(kvm, address, pte)) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); } @@ -476,7 +486,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, int need_tlb_flush = 0, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); /* * The count increase must become visible at unlock time as no * spte can be established without taking the mmu_lock and @@ -489,7 +499,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, if (need_tlb_flush || kvm->tlbs_dirty) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); return 0; @@ -500,7 +510,7 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, { struct kvm *kvm = mmu_notifier_to_kvm(mn); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); /* * This sequence increase will notify the kvm page fault that * the page that is going to be mapped in the spte could have @@ -514,7 +524,7 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, * in conjunction with the smp_rmb in mmu_notifier_retry(). 
*/ kvm->mmu_notifier_count--; - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); BUG_ON(kvm->mmu_notifier_count < 0); } @@ -528,13 +538,13 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); young = kvm_age_hva(kvm, start, end); if (young) kvm_flush_remote_tlbs(kvm); - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -549,7 +559,7 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); /* * Even though we do not flush TLB, this will still adversely * affect performance on pre-Haswell Intel EPT, where there is @@ -564,7 +574,7 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, * more sophisticated heuristic later. */ young = kvm_age_hva(kvm, start, end); - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -578,9 +588,9 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, int young, idx; idx = srcu_read_lock(&kvm->srcu); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); young = kvm_test_age_hva(kvm, address); - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; @@ -745,7 +755,11 @@ static struct kvm *kvm_create_vm(unsigned long type) if (!kvm) return ERR_PTR(-ENOMEM); +#ifdef KVM_HAVE_MMU_RWLOCK + rwlock_init(&kvm->mmu_lock); +#else spin_lock_init(&kvm->mmu_lock); +#endif /* KVM_HAVE_MMU_RWLOCK */ mmgrab(current->mm); kvm->mm = current->mm; kvm_eventfd_init(kvm); @@ -1525,7 +1539,7 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) dirty_bitmap_buffer = kvm_second_dirty_bitmap(memslot); memset(dirty_bitmap_buffer, 0, n); - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); for (i = 0; i < n / sizeof(long); i++) { unsigned long mask; gfn_t offset; @@ -1541,7 +1555,7 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask); } - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); } if (flush) @@ -1636,7 +1650,7 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, if (copy_from_user(dirty_bitmap_buffer, log->dirty_bitmap, n)) return -EFAULT; - spin_lock(&kvm->mmu_lock); + KVM_MMU_LOCK(kvm); for (offset = log->first_page, i = offset / BITS_PER_LONG, n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--; i++, offset += BITS_PER_LONG) { @@ -1659,7 +1673,7 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, offset, mask); } } - spin_unlock(&kvm->mmu_lock); + KVM_MMU_UNLOCK(kvm); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); From patchwork Tue Feb 2 18:57:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org 
(Postfix) with ESMTP id CABC2C433E0 for ; Tue, 2 Feb 2021 19:13:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 86DF764E0A for ; Tue, 2 Feb 2021 19:13:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239416AbhBBTNY (ORCPT ); Tue, 2 Feb 2021 14:13:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239350AbhBBTCG (ORCPT ); Tue, 2 Feb 2021 14:02:06 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8F8C0C0611BD for ; Tue, 2 Feb 2021 10:58:13 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id z2so10915798pln.18 for ; Tue, 02 Feb 2021 10:58:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=7wwE6UVlfYIz1bD+MH7rvktLq1BWQ7Niw1vVFs8Dr9U=; b=hykkCo6ukRsr8EfpcpDNvsfhfSfG7G0UiuPEUQFCTVj45UvL+vT39H9nqqYK3GsmV0 VPEqKGmHRy1S5o6aAiO4ZnYQdjj3qcdYwgVutWx6dEuhmo6vZo517WBA+cm6oJjDR8EN dassTMz4E+Sfi5Yc34vwXDzlbAkpdkt87VvDIZiF1FTtPsFM1QkvJvS2xSZa415ItSeG Jh1PNDhkrzJHexl21g1NXJFHiukosH/2Gx3x/UOxPd33UWFkytVVDX5N4pmWQWGzocrQ ddAOf3RY3rRPt24IPkuR75uHm6cV4NjtcfX4pg6I8CrUXWaikvMFdow5zoWY65cBr0iO jIZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=7wwE6UVlfYIz1bD+MH7rvktLq1BWQ7Niw1vVFs8Dr9U=; b=mlbrB9CAVmtvLvVt6MXua40Na/r4+F1c1+nkFZwNdc/FVpzHKYGyzXXgmfQpTf1ts+ zlkuscRHCXRKUGJuyYs00DrE99odiPt5DJyykxrxAXkLk+0C4s4eIUNMO+DSQBsbDOxT QPUYSVZHifsD4MI4J/hNVEuapR15GAHPA3sY0yx/F26gK8f9jg9BTp79fxABLFhCeb1/ Gkefx47h/QWplAzjWjxPJyasrCOC71mdhH0vCCrawFW2UTRi+WwnwbraEaUL5uMclpmB VwfUM35GEByboW9iePnBSt7BjB1DD0k79wWrBvxdzLrk+Y+pyoWMcFJkGFzg2uHCBrVF OcNw== X-Gm-Message-State: AOAM533l1BEgNmK2y75dcjKll3e+R3+oBk40ii0SBYbJWJgQ/9QgFMX5 6XeGFGikdJnPivNxhU9CBUUe7BVTX8gp X-Google-Smtp-Source: ABdhPJzhz23yIx2witmBOtZ9LieRs9XiDxy0qRVz/6phPhCsMap8VaSEb4v8MRK9hQSuqWIl1hzEf9hcAc80 Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a05:6a00:138f:b029:1b8:b9d5:3a2c with SMTP id t15-20020a056a00138fb02901b8b9d53a2cmr23099752pfg.10.1612292293034; Tue, 02 Feb 2021 10:58:13 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:25 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-20-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 19/28] KVM: x86/mmu: Factor out functions to add/remove TDP MMU pages From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the work of adding and removing TDP MMU pages to/from "secondary" data structures to helper functions. These functions will be built on in future commits to enable MMU operations to proceed (mostly) in parallel. No functional change expected. 
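To make the value of this refactor concrete, the sketch below is a minimal, self-contained userspace C illustration of the same pattern; it is not KVM code, and struct demo_page, demo_pages, link_demo_page() and unlink_demo_page() are invented names, with a pthread mutex standing in for the dedicated lock a later patch in this series takes inside tdp_mmu_link_page()/tdp_mmu_unlink_page(). Once every caller funnels through the two helpers, synchronization and accounting only have to be added in one place rather than at each call site.

#include <pthread.h>

/* Toy stand-in for struct kvm_mmu_page; all names here are invented. */
struct demo_page {
	struct demo_page *prev, *next;	/* intrusive list linkage */
	int nx_huge_page;		/* needs extra accounting when set */
};

static struct demo_page demo_pages = { &demo_pages, &demo_pages }; /* list head */
static pthread_mutex_t demo_pages_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_nx_huge_pages;

/* Additions all go through this helper, so locking and NX accounting live
 * in exactly one place instead of being repeated at every call site. */
static void link_demo_page(struct demo_page *p)
{
	pthread_mutex_lock(&demo_pages_lock);
	p->next = demo_pages.next;
	p->prev = &demo_pages;
	demo_pages.next->prev = p;
	demo_pages.next = p;
	if (p->nx_huge_page)
		nr_nx_huge_pages++;
	pthread_mutex_unlock(&demo_pages_lock);
}

/* Removal mirrors addition, keeping the accounting balanced. */
static void unlink_demo_page(struct demo_page *p)
{
	pthread_mutex_lock(&demo_pages_lock);
	p->prev->next = p->next;
	p->next->prev = p->prev;
	if (p->nx_huge_page)
		nr_nx_huge_pages--;
	pthread_mutex_unlock(&demo_pages_lock);
}

int main(void)
{
	struct demo_page p = { .nx_huge_page = 1 };

	link_demo_page(&p);
	unlink_demo_page(&p);
	return nr_nx_huge_pages;	/* 0: link/unlink stayed balanced */
}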
Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 47 +++++++++++++++++++++++++++++++------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index f1fbed72e149..5a9e964e0178 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -262,6 +262,39 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, } } +/** + * tdp_mmu_link_page - Add a new page to the list of pages used by the TDP MMU + * + * @kvm: kvm instance + * @sp: the new page + * @account_nx: This page replaces a NX large page and should be marked for + * eventual reclaim. + */ +static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp, + bool account_nx) +{ + lockdep_assert_held_write(&kvm->mmu_lock); + + list_add(&sp->link, &kvm->arch.tdp_mmu_pages); + if (account_nx) + account_huge_nx_page(kvm, sp); +} + +/** + * tdp_mmu_unlink_page - Remove page from the list of pages used by the TDP MMU + * + * @kvm: kvm instance + * @sp: the page to be removed + */ +static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp) +{ + lockdep_assert_held_write(&kvm->mmu_lock); + + list_del(&sp->link); + if (sp->lpage_disallowed) + unaccount_huge_nx_page(kvm, sp); +} + /** * handle_removed_tdp_mmu_page - handle a pt removed from the TDP structure * @@ -281,10 +314,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt) trace_kvm_mmu_prepare_zap_page(sp); - list_del(&sp->link); - - if (sp->lpage_disallowed) - unaccount_huge_nx_page(kvm, sp); + tdp_mmu_unlink_page(kvm, sp); for (i = 0; i < PT64_ENT_PER_PAGE; i++) { old_child_spte = READ_ONCE(*(pt + i)); @@ -705,15 +735,16 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, if (!is_shadow_present_pte(iter.old_spte)) { sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level); - list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages); child_pt = sp->spt; + + tdp_mmu_link_page(vcpu->kvm, sp, + huge_page_disallowed && + req_level >= iter.level); + new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask); trace_kvm_mmu_get_page(sp, true); - if (huge_page_disallowed && req_level >= iter.level) - account_huge_nx_page(vcpu->kvm, sp); - tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte); } } From patchwork Tue Feb 2 18:57:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A242CC433DB for ; Tue, 2 Feb 2021 19:13:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5894964E45 for ; Tue, 2 Feb 2021 19:13:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239378AbhBBTNS (ORCPT ); Tue, 2 Feb 2021 14:13:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239352AbhBBTCH 
(ORCPT ); Tue, 2 Feb 2021 14:02:07 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2558C061A10 for ; Tue, 2 Feb 2021 10:58:15 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id b125so17223370ybg.10 for ; Tue, 02 Feb 2021 10:58:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=rQ/2uzOGEB0AGr9GTok3ymLIkQsPkRXMIAaXffzGwQU=; b=ItxIvvc6bbX8LPNf1NUdolXQpQ3f1ig9h1u4bngpEtSfgU4vUZOLWHOQtjHecUb7bl 4Jp3XK77EGDPHsSHZzI3At0CkHaoXHLE+dBntZwHFexRMYZepUlITjMlW4TWrn+3gCXH wJWrXvSE+wdwG9U5mX+LM2RVUuhFaGeqzZO2CBY4fKYdyO8uJLHZQCeetE4ykhSqXjRg YK8qmGZuyRnclGhQMBJLbWTk50IXcblG8JOLdVapQjWHlxOHjirj3IHvb9S0b9Z8Cbrt DtW/447LFuB6vbRcC5d8qUX+xQmyZJxaHSaUXr2R9QGN4/fMa3ZMZWVGsNHR5bV0A0f9 2CxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=rQ/2uzOGEB0AGr9GTok3ymLIkQsPkRXMIAaXffzGwQU=; b=AuV/buSUUIPZHJTLS2W0UXwXyBrrIlUxvvstmnuIfirl4RH0oC5Fyd4j/72VUzraD4 elY3ntofK4+r5t0jEaBsM9wnKXf9IZnJDvFuvwNc6P7rboW9KunVMk3sJfV/CwdgzEbr aEMdXJ5TWwNCRWjjWI0akLCU/S12y/nPw1g8CH1rRx7UzIpUFjumfAbWityQ1XAdsR1D AG/qwlPpRG7sNI/lLP3694QLc+Rql4xipcqx1fiJ8daUxRfuURaoas66Sf5BhvEgd6Am DVbTvgGxjV0fQs01lzizKnhLvwbTKlA/w1X99wBnYrvzEdYG9GgkfFgZuf2o9rRUI2TT UbyQ== X-Gm-Message-State: AOAM530xZTY3X0NGDmTgw8/n77pFONkn3SIxZ5GPjTTqVzAJDV5xeNnC ULrpG9GCYgv9lDA+Jh32OxnS58Y5R9WY X-Google-Smtp-Source: ABdhPJzvaZISwD/AYwji6K6c68C1mtJjmS3pJ3jB/VnOzKUamjTIQDWhwkGJWwx2ow1YaBLERWypNnJjNSe3 Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a25:e08a:: with SMTP id x132mr19078901ybg.121.1612292295005; Tue, 02 Feb 2021 10:58:15 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:26 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-21-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 20/28] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To prepare for handling page faults in parallel, change the TDP MMU page fault handler to use atomic operations to set SPTEs so that changes are not lost if multiple threads attempt to modify the same SPTE. 
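As a rough illustration of the technique, here is a self-contained userspace C11 sketch, not the kernel implementation: spte_t, try_set_spte() and install_or_retry() are invented names, whereas the patch itself uses cmpxchg64() on the SPTE with the MMU lock held for read. The key idea is that an entry is only overwritten if it still holds the value the faulting thread last observed, so a concurrent change makes the update fail visibly instead of being silently lost, and the fault can simply be retried.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 64-bit page table entry; the name is invented for this sketch. */
typedef _Atomic uint64_t spte_t;

/*
 * Install new_val only if the entry still contains expected, the value this
 * thread saw while walking the table. On a race the exchange fails and no
 * concurrent update is overwritten.
 */
static bool try_set_spte(spte_t *sptep, uint64_t expected, uint64_t new_val)
{
	return atomic_compare_exchange_strong(sptep, &expected, new_val);
}

/* A caller re-reads the entry and retries (or bails out) after a failure. */
static void install_or_retry(spte_t *sptep, uint64_t wanted)
{
	uint64_t seen = atomic_load(sptep);

	while (seen != wanted && !try_set_spte(sptep, seen, wanted))
		seen = atomic_load(sptep);	/* lost a race; observe the winner */
}

int main(void)
{
	spte_t entry = 0;

	install_or_retry(&entry, 0x1234);
	return atomic_load(&entry) == 0x1234 ? 0 : 1;
}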
Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon Reported-by: kernel test robot --- v1 -> v2 - Rename "atomic" arg to "shared" in multiple functions - Merged the commit that protects the lists of TDP MMU pages with a new lock - Merged the commits to add an atomic option for setting SPTEs and to use that option in the TDP MMU page fault handler arch/x86/include/asm/kvm_host.h | 13 +++ arch/x86/kvm/mmu/tdp_mmu.c | 142 ++++++++++++++++++++++++-------- 2 files changed, 122 insertions(+), 33 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b6ebf2558386..78ebf56f2b37 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1028,6 +1028,19 @@ struct kvm_arch { * tdp_mmu_page set and a root_count of 0. */ struct list_head tdp_mmu_pages; + + /* + * Protects accesses to the following fields when the MMU lock + * is held in read mode: + * - tdp_mmu_pages (above) + * - the link field of struct kvm_mmu_pages used by the TDP MMU + * - lpage_disallowed_mmu_pages + * - the lpage_disallowed_link field of struct kvm_mmu_pages used + * by the TDP MMU + * It is acceptable, but not necessary, to acquire this lock when + * the thread holds the MMU lock in write mode. + */ + spinlock_t tdp_mmu_pages_lock; }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 5a9e964e0178..0b5a9339ac55 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -7,6 +7,7 @@ #include "tdp_mmu.h" #include "spte.h" +#include #include #ifdef CONFIG_X86_64 @@ -33,6 +34,7 @@ void kvm_mmu_init_tdp_mmu(struct kvm *kvm) kvm->arch.tdp_mmu_enabled = true; INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots); + spin_lock_init(&kvm->arch.tdp_mmu_pages_lock); INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages); } @@ -225,7 +227,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head) } static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level); + u64 old_spte, u64 new_spte, int level, + bool shared); static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp) { @@ -267,17 +270,26 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, * * @kvm: kvm instance * @sp: the new page + * @shared: This operation may not be running under the exclusive use of + * the MMU lock and the operation must synchronize with other + * threads that might be adding or removing pages. * @account_nx: This page replaces a NX large page and should be marked for * eventual reclaim. */ static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp, - bool account_nx) + bool shared, bool account_nx) { - lockdep_assert_held_write(&kvm->mmu_lock); + if (shared) + spin_lock(&kvm->arch.tdp_mmu_pages_lock); + else + lockdep_assert_held_write(&kvm->mmu_lock); list_add(&sp->link, &kvm->arch.tdp_mmu_pages); if (account_nx) account_huge_nx_page(kvm, sp); + + if (shared) + spin_unlock(&kvm->arch.tdp_mmu_pages_lock); } /** @@ -285,14 +297,24 @@ static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp, * * @kvm: kvm instance * @sp: the page to be removed + * @shared: This operation may not be running under the exclusive use of + * the MMU lock and the operation must synchronize with other + * threads that might be adding or removing pages. 
*/ -static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp) +static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp, + bool shared) { - lockdep_assert_held_write(&kvm->mmu_lock); + if (shared) + spin_lock(&kvm->arch.tdp_mmu_pages_lock); + else + lockdep_assert_held_write(&kvm->mmu_lock); list_del(&sp->link); if (sp->lpage_disallowed) unaccount_huge_nx_page(kvm, sp); + + if (shared) + spin_unlock(&kvm->arch.tdp_mmu_pages_lock); } /** @@ -300,28 +322,39 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp) * * @kvm: kvm instance * @pt: the page removed from the paging structure + * @shared: This operation may not be running under the exclusive use + * of the MMU lock and the operation must synchronize with other + * threads that might be modifying SPTEs. * * Given a page table that has been removed from the TDP paging structure, * iterates through the page table to clear SPTEs and free child page tables. */ -static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt) +static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt, + bool shared) { struct kvm_mmu_page *sp = sptep_to_sp(pt); int level = sp->role.level; gfn_t gfn = sp->gfn; u64 old_child_spte; + u64 *sptep; int i; trace_kvm_mmu_prepare_zap_page(sp); - tdp_mmu_unlink_page(kvm, sp); + tdp_mmu_unlink_page(kvm, sp, shared); for (i = 0; i < PT64_ENT_PER_PAGE; i++) { - old_child_spte = READ_ONCE(*(pt + i)); - WRITE_ONCE(*(pt + i), 0); + sptep = pt + i; + + if (shared) { + old_child_spte = xchg(sptep, 0); + } else { + old_child_spte = READ_ONCE(*sptep); + WRITE_ONCE(*sptep, 0); + } handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), - old_child_spte, 0, level - 1); + old_child_spte, 0, level - 1, shared); } kvm_flush_remote_tlbs_with_address(kvm, gfn, @@ -338,12 +371,16 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt) * @old_spte: The value of the SPTE before the change * @new_spte: The value of the SPTE after the change * @level: the level of the PT the SPTE is part of in the paging structure + * @shared: This operation may not be running under the exclusive use of + * the MMU lock and the operation must synchronize with other + * threads that might be modifying SPTEs. * * Handle bookkeeping that might result from the modification of a SPTE. * This function must be called for all TDP SPTE modifications. 
*/ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level) + u64 old_spte, u64 new_spte, int level, + bool shared) { bool was_present = is_shadow_present_pte(old_spte); bool is_present = is_shadow_present_pte(new_spte); @@ -415,18 +452,51 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, */ if (was_present && !was_leaf && (pfn_changed || !is_present)) handle_removed_tdp_mmu_page(kvm, - spte_to_child_pt(old_spte, level)); + spte_to_child_pt(old_spte, level), shared); } static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level) + u64 old_spte, u64 new_spte, int level, + bool shared) { - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level); + __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, + shared); handle_changed_spte_acc_track(old_spte, new_spte, level); handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte, level); } +/* + * tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the + * associated bookkeeping + * + * @kvm: kvm instance + * @iter: a tdp_iter instance currently on the SPTE that should be set + * @new_spte: The value the SPTE should be set to + * Returns: true if the SPTE was set, false if it was not. If false is returned, + * this function will have no side-effects. + */ +static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter, + u64 new_spte) +{ + u64 *root_pt = tdp_iter_root_pt(iter); + struct kvm_mmu_page *root = sptep_to_sp(root_pt); + int as_id = kvm_mmu_page_as_id(root); + + lockdep_assert_held_read(&kvm->mmu_lock); + + if (cmpxchg64(rcu_dereference(iter->sptep), iter->old_spte, + new_spte) != iter->old_spte) + return false; + + handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, + iter->level, true); + + return true; +} + + /* * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping * @kvm: kvm instance @@ -456,7 +526,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte); __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, - iter->level); + iter->level, false); if (record_acc_track) handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); @@ -630,23 +700,18 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, int ret = 0; int make_spte_ret = 0; - if (unlikely(is_noslot_pfn(pfn))) { + if (unlikely(is_noslot_pfn(pfn))) new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL); - trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn, - new_spte); - } else { + else make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn, pfn, iter->old_spte, prefault, true, map_writable, !shadow_accessed_mask, &new_spte); - trace_kvm_mmu_set_spte(iter->level, iter->gfn, - rcu_dereference(iter->sptep)); - } if (new_spte == iter->old_spte) ret = RET_PF_SPURIOUS; - else - tdp_mmu_set_spte(vcpu->kvm, iter, new_spte); + else if (!tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte)) + return RET_PF_RETRY; /* * If the page fault was caused by a write but the page is write @@ -660,8 +725,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write, } /* If a MMIO SPTE is installed, the MMIO will need to be emulated. 
*/ - if (unlikely(is_mmio_spte(new_spte))) + if (unlikely(is_mmio_spte(new_spte))) { + trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn, + new_spte); ret = RET_PF_EMULATE; + } else + trace_kvm_mmu_set_spte(iter->level, iter->gfn, + rcu_dereference(iter->sptep)); trace_kvm_mmu_set_spte(iter->level, iter->gfn, rcu_dereference(iter->sptep)); @@ -720,7 +790,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, */ if (is_shadow_present_pte(iter.old_spte) && is_large_pte(iter.old_spte)) { - tdp_mmu_set_spte(vcpu->kvm, &iter, 0); + if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0)) + break; kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn, KVM_PAGES_PER_HPAGE(iter.level)); @@ -737,19 +808,24 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level); child_pt = sp->spt; - tdp_mmu_link_page(vcpu->kvm, sp, - huge_page_disallowed && - req_level >= iter.level); - new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask); - trace_kvm_mmu_get_page(sp, true); - tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte); + if (tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, + new_spte)) { + tdp_mmu_link_page(vcpu->kvm, sp, true, + huge_page_disallowed && + req_level >= iter.level); + + trace_kvm_mmu_get_page(sp, true); + } else { + tdp_mmu_free_sp(sp); + break; + } } } - if (WARN_ON(iter.level != level)) { + if (iter.level != level) { rcu_read_unlock(); return RET_PF_RETRY; } From patchwork Tue Feb 2 18:57:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BDBFC433E9 for ; Tue, 2 Feb 2021 19:13:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C2E3C64E45 for ; Tue, 2 Feb 2021 19:13:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239375AbhBBTNN (ORCPT ); Tue, 2 Feb 2021 14:13:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239372AbhBBTCL (ORCPT ); Tue, 2 Feb 2021 14:02:11 -0500 Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EDD1C061A28 for ; Tue, 2 Feb 2021 10:58:17 -0800 (PST) Received: by mail-qk1-x749.google.com with SMTP id e187so18190785qkf.0 for ; Tue, 02 Feb 2021 10:58:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=1ycHnD8yBYRgNZEKk0ejNdseFr7uCl6yWeGqf8KX/VU=; b=mG0aFaq5nsmo6fjZ/HD5Y2NZ0DvhSk++/kMPBovaVNTtOxQH7d/KjMq21euaAx9EFs roqwbN959lta0upadznPxBEK31CgnAsiHZExHxlmrg6GnzSghCM9T1XNwEbm0NNPHlOp hW0NZJyLvVKaVbcJT7SsPue+3cRwiO+kV2dEaSePzRtr1ZhcdvY4Raa9DvyX/owDVMME 
Bv/hBZAb20Qi5um5jdMo67wrTZS18SQnAQkQnUzQIiRNwGESA2ByMI4ixiYwrVrk6eJM 7VUBaWOBQemHX5YMYkVJTPTTTW1eW/zYGDgZn0w99zVTuRsYqqJIhHY7OCadqhneNCrF zhbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=1ycHnD8yBYRgNZEKk0ejNdseFr7uCl6yWeGqf8KX/VU=; b=PKtXoe9EB7H6hb/vb9l5oHk/3JKe6cTxIeoL++cvVJmgR+oFERVCV+Q2EByYu8NQDC KoYB4G9xVcowK9Z0XmEnM4WJ9tzw9Us0rvTkyYCGsA/ra6oVa8JenusmVwjr9/T9sWZL DNxBeT0abfZFRYlLI2MwZwUg9dPUT+74QN/NU9e2UjfV04hNjYk4krGqF7EuHFCeaCBu CLJ9I4AZrOBzZqELXNgYL+B6Bx4QiZmfKChTqplg9Mp0/pmSrPpmpz87o6SK951IS3W1 1o+FaSNCxY4U5vtgyjj4iS39yGzPGyfCoqd3MZ4C4SkLTPI9nfKq6S4udliMunowdB1o cjdA== X-Gm-Message-State: AOAM532oFmNixjb8pRqbHBKBKnzNGNgCh6LzvAOs76CpECjBbfsgi7eE iFFi4SzsRoML/opNPdQPH9OHGrve6cO9 X-Google-Smtp-Source: ABdhPJzGa8t3f1SiTVspjcJISfRIFygOEpq5RCtshb8W8OkSAdVoTK9Vg/AgSxM6PsQQJt3fiLPnXddL2nue Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a0c:8485:: with SMTP id m5mr22102786qva.14.1612292296548; Tue, 02 Feb 2021 10:58:16 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:27 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-22-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 21/28] KVM: x86/mmu: Flush TLBs after zap in TDP MMU PF handler From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When the TDP MMU is allowed to handle page faults in parallel, there is the possibility of a race where an SPTE is cleared and then immediately replaced with a present SPTE pointing to a different PFN, before the TLBs can be flushed. This race would violate architectural specs. Ensure that the TLBs are flushed properly before other threads are allowed to install any present value for the SPTE. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- v1 -> v2 - Renamed "FROZEN_SPTE" to "REMOVED_SPTE" and updated derivative comments and code arch/x86/kvm/mmu/spte.h | 21 ++++++++++++- arch/x86/kvm/mmu/tdp_mmu.c | 63 ++++++++++++++++++++++++++++++++------ 2 files changed, 74 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 2b3a30bd38b0..3f974006cfb6 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -130,6 +130,25 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask; PT64_EPT_EXECUTABLE_MASK) #define SHADOW_ACC_TRACK_SAVED_BITS_SHIFT PT64_SECOND_AVAIL_BITS_SHIFT +/* + * If a thread running without exclusive control of the MMU lock must perform a + * multi-part operation on an SPTE, it can set the SPTE to REMOVED_SPTE as a + * non-present intermediate value. Other threads which encounter this value + * should not modify the SPTE. + * + * This constant works because it is considered non-present on both AMD and + * Intel CPUs and does not create a L1TF vulnerability because the pfn section + * is zeroed out. + * + * Only used by the TDP MMU.
+ */ +#define REMOVED_SPTE (1ull << 59) + +static inline bool is_removed_spte(u64 spte) +{ + return spte == REMOVED_SPTE; +} + /* * In some cases, we need to preserve the GFN of a non-present or reserved * SPTE when we usurp the upper five bits of the physical address space to @@ -187,7 +206,7 @@ static inline bool is_access_track_spte(u64 spte) static inline int is_shadow_present_pte(u64 pte) { - return (pte != 0) && !is_mmio_spte(pte); + return (pte != 0) && !is_mmio_spte(pte) && !is_removed_spte(pte); } static inline int is_large_pte(u64 pte) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 0b5a9339ac55..7a2cdfeac4d2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -427,15 +427,19 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, */ if (!was_present && !is_present) { /* - * If this change does not involve a MMIO SPTE, it is - * unexpected. Log the change, though it should not impact the - * guest since both the former and current SPTEs are nonpresent. + * If this change does not involve a MMIO SPTE or removed SPTE, + * it is unexpected. Log the change, though it should not + * impact the guest since both the former and current SPTEs + * are nonpresent. */ - if (WARN_ON(!is_mmio_spte(old_spte) && !is_mmio_spte(new_spte))) + if (WARN_ON(!is_mmio_spte(old_spte) && + !is_mmio_spte(new_spte) && + !is_removed_spte(new_spte))) pr_err("Unexpected SPTE change! Nonpresent SPTEs\n" "should not be replaced with another,\n" "different nonpresent SPTE, unless one or both\n" - "are MMIO SPTEs.\n" + "are MMIO SPTEs, or the new SPTE is\n" + "a temporary removed SPTE.\n" "as_id: %d gfn: %llx old_spte: %llx new_spte: %llx level: %d", as_id, gfn, old_spte, new_spte, level); return; @@ -486,6 +490,13 @@ static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, lockdep_assert_held_read(&kvm->mmu_lock); + /* + * Do not change removed SPTEs. Only the thread that froze the SPTE + * may modify it. + */ + if (iter->old_spte == REMOVED_SPTE) + return false; + if (cmpxchg64(rcu_dereference(iter->sptep), iter->old_spte, new_spte) != iter->old_spte) return false; @@ -496,6 +507,34 @@ static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, return true; } +static inline bool tdp_mmu_zap_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter) +{ + /* + * Freeze the SPTE by setting it to a special, + * non-present value. This will stop other threads from + * immediately installing a present entry in its place + * before the TLBs are flushed. + */ + if (!tdp_mmu_set_spte_atomic(kvm, iter, REMOVED_SPTE)) + return false; + + kvm_flush_remote_tlbs_with_address(kvm, iter->gfn, + KVM_PAGES_PER_HPAGE(iter->level)); + + /* + * No other thread can overwrite the removed SPTE as they + * must either wait on the MMU lock or use + * tdp_mmu_set_spte_atomic which will not overrite the + * special removed SPTE value. No bookkeeping is needed + * here since the SPTE is going from non-present + * to non-present. + */ + WRITE_ONCE(*iter->sptep, 0); + + return true; +} + /* * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping @@ -523,6 +562,15 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter, lockdep_assert_held_write(&kvm->mmu_lock); + /* + * No thread should be using this function to set SPTEs to the + * temporary removed SPTE value. + * If operating under the MMU lock in read mode, tdp_mmu_set_spte_atomic + * should be used. 
If operating under the MMU lock in write mode, the + * use of the removed SPTE should not be necessary. + */ + WARN_ON(iter->old_spte == REMOVED_SPTE); + WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte); __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, @@ -790,12 +838,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, */ if (is_shadow_present_pte(iter.old_spte) && is_large_pte(iter.old_spte)) { - if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0)) + if (!tdp_mmu_zap_spte_atomic(vcpu->kvm, &iter)) break; - kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn, - KVM_PAGES_PER_HPAGE(iter.level)); - /* * The iter must explicitly re-read the spte here * because the new value informs the !present From patchwork Tue Feb 2 18:57:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062697 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 503B1C433E9 for ; Tue, 2 Feb 2021 19:08:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F132C64E92 for ; Tue, 2 Feb 2021 19:08:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239604AbhBBTI3 (ORCPT ); Tue, 2 Feb 2021 14:08:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239380AbhBBTCU (ORCPT ); Tue, 2 Feb 2021 14:02:20 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2675FC061A2A for ; Tue, 2 Feb 2021 10:58:19 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id l7so2883810pjy.0 for ; Tue, 02 Feb 2021 10:58:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=AT93/ogWHXZz20/nWmto+tVuhh8JYGtfkCQW8Y2OAc4=; b=r5R8y+ePnaAaeFSdFKCGU7w4lreH34R3dbk7cfBMhYkTaOHLEhd2TwMXIbp2vgFYi3 Gd/u1N8eXJchw09Pj82KEmSH5oCErngyK4OhDuq+mzuwOp3lpfgSp9+1HQ27i7YJXnmr k90Cuvq3phZ7aKAKk8rsgizzFnYhuzngDJeoGoxSPmCgG0nPlJsipQTbR/jfIztrVdmH PChpmj8LgDQ5h6LSSW2KJf/diK7nV4A5xnjbi3T91sat9hXOZuoS8B9rCG2oy0PPUbzE oeBetr1O4R4IPOqoZ61HiXZTMb2cSBR6jwSXXgE8SEYEVw4HV027RlgtmhDE63G10nUC 4ISw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=AT93/ogWHXZz20/nWmto+tVuhh8JYGtfkCQW8Y2OAc4=; b=ETiyQP9flKGPoiSTAQeCo0H7fhWASYFgBkhg/F0j/d4Hnlm84Qef7w8ARKgKcMlO9E /UJaNI4nTiEoTQQKmhe+Uc38RRsJ1dCCQUzOFWMz8FEUjIj1K8OShvFuqJWXXNHSlEje Tzr7w5jNNVbvmZ7/svJ0WhKWDxf3NxNg6UZLgBSolQroxHunrLiOR/nIrPgCNSVfXPw4 SBfisu2K0DWBV7QMi2ty66oMOfE9B4afhfYrInsiEe/2H3DElvlbktBIUtzcqIH+Opvj 0GvIjBRJDjp3PfIwK/fa5IopKGMFVjHUGm/boeNs84ksa0K1gd4e8eBGW9rDxQ8smAyx WL0Q== X-Gm-Message-State: 
AOAM531wnkp/xjgI9cNGcZoOZ3dnMZrfyk+hgK0Wn1jmhSHeu7Cu1wZV rcV8UrzZ/8OMv7cXt77f4iKkuMtdZ7KL X-Google-Smtp-Source: ABdhPJwCA8GyUrhXpxf2tdjZ130kf6w37Af0AcDtX/wCAZtKVUM1t3M3pKL4+fHk82e59W3QYalqp5VOzjzD Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:90a:8d83:: with SMTP id d3mr614671pjo.0.1612292298314; Tue, 02 Feb 2021 10:58:18 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:28 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-23-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 22/28] KVM: x86/mmu: Mark SPTEs in disconnected pages as removed From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When clearing TDP MMU pages what have been disconnected from the paging structure root, set the SPTEs to a special non-present value which will not be overwritten by other threads. This is needed to prevent races in which a thread is clearing a disconnected page table, but another thread has already acquired a pointer to that memory and installs a mapping in an already cleared entry. This can lead to memory leaks and accounting errors. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/tdp_mmu.c | 36 ++++++++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7a2cdfeac4d2..0dd27e000dd0 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -334,9 +334,10 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt, { struct kvm_mmu_page *sp = sptep_to_sp(pt); int level = sp->role.level; - gfn_t gfn = sp->gfn; + gfn_t base_gfn = sp->gfn; u64 old_child_spte; u64 *sptep; + gfn_t gfn; int i; trace_kvm_mmu_prepare_zap_page(sp); @@ -345,16 +346,39 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt, for (i = 0; i < PT64_ENT_PER_PAGE; i++) { sptep = pt + i; + gfn = base_gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)); if (shared) { - old_child_spte = xchg(sptep, 0); + /* + * Set the SPTE to a nonpresent value that other + * threads will not overwrite. If the SPTE was + * already marked as removed then another thread + * handling a page fault could overwrite it, so + * set the SPTE until it is set from some other + * value to the removed SPTE value. + */ + for (;;) { + old_child_spte = xchg(sptep, REMOVED_SPTE); + if (!is_removed_spte(old_child_spte)) + break; + cpu_relax(); + } } else { old_child_spte = READ_ONCE(*sptep); - WRITE_ONCE(*sptep, 0); + + /* + * Marking the SPTE as a removed SPTE is not + * strictly necessary here as the MMU lock should + * stop other threads from concurrentrly modifying + * this SPTE. Using the removed SPTE value keeps + * the shared and non-atomic cases consistent and + * simplifies the function. 
+ */ + WRITE_ONCE(*sptep, REMOVED_SPTE); } - handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), - gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)), - old_child_spte, 0, level - 1, shared); + handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn, + old_child_spte, REMOVED_SPTE, level - 1, + shared); } kvm_flush_remote_tlbs_with_address(kvm, gfn, From patchwork Tue Feb 2 18:57:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF033C433E0 for ; Tue, 2 Feb 2021 19:04:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 923CF64EC1 for ; Tue, 2 Feb 2021 19:04:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239474AbhBBTEg (ORCPT ); Tue, 2 Feb 2021 14:04:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57308 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239134AbhBBTC2 (ORCPT ); Tue, 2 Feb 2021 14:02:28 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E7210C061A2D for ; Tue, 2 Feb 2021 10:58:20 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id bg11so12606346plb.16 for ; Tue, 02 Feb 2021 10:58:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=7arfRLL2Z2CtnM0YUHBbJ+FLJFgLhV7kQmsV+ql5Wjw=; b=I4XvQXp2ht+2a41TenE/roJ1Ett778dW7gHugcfOPCOJRwFKzXH81fcTcsBv2atCd9 AYPcKfbLDHtPedau0a9TVeDLqDORxZH01r3RoKiBwxAyM8FrDzlNsvMJu8q7pqY8g+et 4xmVbVRZ8BLTu8bBGlfAfBmH1MLqhoZXHiQocMogRtc/GgebeTKNxUWsPZmHTNInLhd3 EwH5KJICDoGq5rc2HGa0tM6u3GgbFGSWXSzCFnCrz0dp3uI+unkeBajE55ILXgaiilCl FKbtJVgy9le9npYsArei5igdzuUaCSjknWA2xeapzvgej/A6m4+74B3Tn5+BAgLFV0kD 5euw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=7arfRLL2Z2CtnM0YUHBbJ+FLJFgLhV7kQmsV+ql5Wjw=; b=MFXqRvqN8x5yNDEUNqQAW0sqjZlRKQBFBV9UyUl6ZOR9H7E52nmfLl12MnMREDtnJW 9j8SDIg440oPi2Ml/b3d4+j/cw34c6h5wEIvwAeOdptG1Tr09Jqx4idmsG0YtNNwq9au BLCoeA/quKk+juW9h4xaFVCxVDxUChjBiLCqVw9YiswfCiOSuNlJE24bC9bF2Jbwg5tU TdGRGFq3aL6tD45vW+4RVXqevlUtbOmNzezZ5z+Rb+FwPS1oXZRHPcN+eOb9Miz/HNQ0 gij/RjOle1ABU2GsFIAj1zXPRU/62LA2v3GK8DzuynV2xYJ/nwkpb4xDngyJLLReS+XK yBCA== X-Gm-Message-State: AOAM532Kv0CgSGJI1ZvoifHzZTABcxVh1V/BgZJphfaZyimomTvcUdKm TDzXPxpwBYEGCEKw9N3oL+jlTLSLdr07 X-Google-Smtp-Source: ABdhPJzXNvUKye10Fi3gdAai/uJC9OZeI8HcXCz3jyHrLKGlj9WEWKBgFqLkRaBCXC1/W6Dl77imlc+CAAZX Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:90b:1247:: with SMTP id gx7mr5877515pjb.22.1612292300509; Tue, 02 Feb 2021 
10:58:20 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:29 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-24-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 23/28] KVM: x86/mmu: Allow parallel page faults for the TDP MMU From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Make the last few changes necessary to enable the TDP MMU to handle page faults in parallel while holding the mmu_lock in read mode. Reviewed-by: Peter Feiner Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b4d6709c240e..3d181a2b2485 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3724,7 +3724,12 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, return r; r = RET_PF_RETRY; - write_lock(&vcpu->kvm->mmu_lock); + + if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) + read_lock(&vcpu->kvm->mmu_lock); + else + write_lock(&vcpu->kvm->mmu_lock); + if (mmu_notifier_retry(vcpu->kvm, mmu_seq)) goto out_unlock; r = make_mmu_pages_available(vcpu); @@ -3739,7 +3744,10 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, prefault, is_tdp); out_unlock: - write_unlock(&vcpu->kvm->mmu_lock); + if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) + read_unlock(&vcpu->kvm->mmu_lock); + else + write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(pfn); return r; } From patchwork Tue Feb 2 18:57:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062703 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46CB9C433DB for ; Tue, 2 Feb 2021 19:09:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8CBA64EC1 for ; Tue, 2 Feb 2021 19:09:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238928AbhBBTHb (ORCPT ); Tue, 2 Feb 2021 14:07:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239396AbhBBTCo (ORCPT ); Tue, 2 Feb 2021 14:02:44 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27390C061A30 for ; Tue, 2 Feb 2021 10:58:23 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id bc11so12606236plb.17 for ; Tue, 02 Feb 2021 10:58:23 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=/3ZQzUuaWzQcmBzEG3kq4CNXokRjzWqOYdXtOTnfHDo=; b=Le+bNWezEFxaDZOKrNLQ4fH7bWg5iMGI+cjUv60qgHbWxOe+u/2TbMwhpEk2aqoDsR ZPvqP95APopbB04XcSnsKw5/p34amZhFuWWblhbufZ8Up9HNqEI/qn59ohk/piAIuWVJ oabL0b94itqfddvIZ3DoDU3ArirLVyG2mUXJx+1j+SjKFqqOd8peXceH5s8l0eVkPCsG o28hj1CC6o0aNfE0jdtMtMKeZTMUtu3cXJ1tl4pp36NcQrQDpGx2it33t4AUpNVnlkQ0 nGsMcGYM1ftOYlM8l4kbbDr0rjHWbS+i+DiiyELdPhsILNeyRfmdNLaZCc2Z17+Hlq9b cH3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=/3ZQzUuaWzQcmBzEG3kq4CNXokRjzWqOYdXtOTnfHDo=; b=A2FcQw+X63/ZQH8Za5uDEk4tkV5T9xlWCw6OjUpTBr76aEw+zjGglj5pEHxK7Ql/eG OUJEGypipXV2suOuHcjqYhk+jbMpiQdrpewtUgQ8GlCLP5L7Zn738BH5UFaSLa4cHj32 /cHAMKnlJbeKHFpRwwUMJrJ2dlxw883XPjRYA+Jl52qB0gznZw6jo0PDOnu+hlE8w0gI E0S2NtyB5PehW2umSaCJCs0ksESE5bTJ9T+OTNw9iEMycCTcBhSfBAJ3dI4vR+ZL6+Nl pRLHCHMeJFStFvNhIOruewkvyVSOYR815aOcBronZAhSH8z1GaHcVJUQPbPl4AJQwAkT FbHg== X-Gm-Message-State: AOAM530OiROY1/5u4GlY1OC7lGvJGFag7qxMp8+27nqLg6SpADSEyOKM p6dRL9IDX78gyHNaTlMFOj70SIoV74vW X-Google-Smtp-Source: ABdhPJy1BMU80b5YgTS5yqKaDrFz5HZVkcONiJE3x3mdnFBkjJM+WarVov1mVVwmTVl9yoyev5MCWaZ1yYgH Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:90a:8d83:: with SMTP id d3mr614698pjo.0.1612292302292; Tue, 02 Feb 2021 10:58:22 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:30 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-25-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 24/28] KVM: x86/mmu: Allow zap gfn range to operate under the mmu read lock From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To reduce lock contention and interference with page fault handlers, allow the TDP MMU function to zap a GFN range to operate under the MMU read lock. 
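The retry idiom this relies on, visible in the zap_gfn_range() hunk below, is: attempt an atomic compare-and-exchange on the SPTE, and if another thread changed the entry first, re-read it and try again rather than falling back to the write lock. A minimal userspace sketch of that idiom, using C11 atomics and an assumed sentinel in place of the kernel's cmpxchg64() and REMOVED_SPTE (zap_entry_atomic and REMOVED_SENTINEL below are illustrative names, not kernel symbols):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define REMOVED_SENTINEL ((uint64_t)1 << 59)	/* assumed stand-in for REMOVED_SPTE */

/*
 * Atomically "zap" one entry: freeze it with the sentinel, then clear it.
 * Returns 0 on success, -1 if another thread won the race, in which case the
 * caller re-reads the entry and retries, mirroring the "goto retry" pattern
 * added in the diff below.
 */
static int zap_entry_atomic(_Atomic uint64_t *entry, uint64_t old_val)
{
	if (old_val == REMOVED_SENTINEL)
		return -1;	/* another thread is already tearing this entry down */

	if (!atomic_compare_exchange_strong(entry, &old_val, REMOVED_SENTINEL))
		return -1;	/* lost the race; caller must re-read and retry */

	/* In the kernel, the remote TLB flush happens here, before the final clear. */
	atomic_store(entry, 0);
	return 0;
}

int main(void)
{
	_Atomic uint64_t spte = 0xabc000;
	uint64_t old = atomic_load(&spte);

	while (zap_entry_atomic(&spte, old))
		old = atomic_load(&spte);	/* re-read and retry */

	printf("entry zapped: %#llx\n", (unsigned long long)atomic_load(&spte));
	return 0;
}

The point of the sentinel, as with REMOVED_SPTE in the earlier patches, is that concurrent threads observe a non-present value they are not allowed to overwrite until the flush and the final clear have completed.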
Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 13 ++- arch/x86/kvm/mmu/mmu_internal.h | 6 +- arch/x86/kvm/mmu/tdp_mmu.c | 165 +++++++++++++++++++++++++------- arch/x86/kvm/mmu/tdp_mmu.h | 3 +- 4 files changed, 145 insertions(+), 42 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3d181a2b2485..254ff87d2a61 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5518,13 +5518,17 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) } } + kvm_mmu_unlock(kvm); + if (kvm->arch.tdp_mmu_enabled) { - flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end); + read_lock(&kvm->mmu_lock); + flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end, + true); if (flush) kvm_flush_remote_tlbs(kvm); - } - write_unlock(&kvm->mmu_lock); + read_unlock(&kvm->mmu_lock); + } } static bool slot_rmap_write_protect(struct kvm *kvm, @@ -6015,7 +6019,8 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) WARN_ON_ONCE(!sp->lpage_disallowed); if (sp->tdp_mmu_page) { kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, - sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level)); + sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level), + false); } else { kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list); WARN_ON_ONCE(sp->lpage_disallowed); diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 7f599cc64178..7df209fb8051 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -40,7 +40,11 @@ struct kvm_mmu_page { u64 *spt; /* hold the gfn of each spte inside spt */ gfn_t *gfns; - int root_count; /* Currently serving as active root */ + /* Currently serving as active root */ + union { + int root_count; + refcount_t tdp_mmu_root_count; + }; unsigned int unsync_children; struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */ DECLARE_BITMAP(unsync_child_bitmap, 512); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 0dd27e000dd0..de26762433ea 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -52,46 +52,104 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) rcu_barrier(); } -static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root) +static __always_inline __must_check bool tdp_mmu_get_root(struct kvm *kvm, + struct kvm_mmu_page *root) { - if (kvm_mmu_put_root(kvm, root)) - kvm_tdp_mmu_free_root(kvm, root); + return refcount_inc_not_zero(&root->tdp_mmu_root_count); } -static inline bool tdp_mmu_next_root_valid(struct kvm *kvm, - struct kvm_mmu_page *root) +static __always_inline void tdp_mmu_put_root(struct kvm *kvm, + struct kvm_mmu_page *root, + bool shared) { - lockdep_assert_held_write(&kvm->mmu_lock); + int root_count; + int r; - if (list_entry_is_head(root, &kvm->arch.tdp_mmu_roots, link)) - return false; + if (shared) { + lockdep_assert_held_read(&kvm->mmu_lock); - kvm_mmu_get_root(kvm, root); - return true; + root_count = atomic_read(&root->tdp_mmu_root_count.refs); + + /* + * If this is not the last reference on the root, it can be + * dropped under the MMU read lock. + */ + if (root_count > 1) { + r = atomic_cmpxchg(&root->tdp_mmu_root_count.refs, + root_count, root_count - 1); + if (r == root_count) + return; + } + + /* + * If the cmpxchg failed because of a race or this is the + * last reference on the root, drop the read lock, and + * reacquire the MMU lock in write mode. 
+ */ + read_unlock(&kvm->mmu_lock); + write_lock(&kvm->mmu_lock); + } else { + lockdep_assert_held_write(&kvm->mmu_lock); + } + + /* + * No other thread can modify the root count since this thread holds + * the MMU lock in write mode. + */ + BUG_ON(!atomic_read(&root->tdp_mmu_root_count.refs)); + + if (refcount_dec_and_test(&root->tdp_mmu_root_count)) + kvm_tdp_mmu_free_root(kvm, root); + + if (shared) { + write_unlock(&kvm->mmu_lock); + read_lock(&kvm->mmu_lock); + + } } static inline struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm, - struct kvm_mmu_page *root) + struct kvm_mmu_page *root, + bool shared) { struct kvm_mmu_page *next_root; next_root = list_next_entry(root, link); - tdp_mmu_put_root(kvm, root); + tdp_mmu_put_root(kvm, root, shared); return next_root; } +static inline bool tdp_mmu_next_root_valid(struct kvm *kvm, + struct kvm_mmu_page *root) +{ + for (;;) { + if (list_entry_is_head(root, &kvm->arch.tdp_mmu_roots, link)) + return false; + + if (tdp_mmu_get_root(kvm, root)) + return true; + + root = list_next_entry(root, link); + } + +} + /* * Note: this iterator gets and puts references to the roots it iterates over. * This makes it safe to release the MMU lock and yield within the loop, but * if exiting the loop early, the caller must drop the reference to the most * recent root. (Unless keeping a live reference is desirable.) + * + * If shared is set, this function is operating under the MMU lock in read + * mode. In the unlikely event that this thread must free a root, the lock + * will be temporarily dropped and reacquired in write mode. */ -#define for_each_tdp_mmu_root_yield_safe(_kvm, _root) \ +#define for_each_tdp_mmu_root_yield_safe(_kvm, _root, _shared) \ for (_root = list_first_entry(&_kvm->arch.tdp_mmu_roots, \ typeof(*_root), link); \ tdp_mmu_next_root_valid(_kvm, _root); \ - _root = tdp_mmu_next_root(_kvm, _root)) + _root = tdp_mmu_next_root(_kvm, _root, _shared)) #define for_each_tdp_mmu_root(_kvm, _root) \ list_for_each_entry(_root, &_kvm->arch.tdp_mmu_roots, link) @@ -113,7 +171,7 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa) } static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, - gfn_t start, gfn_t end, bool can_yield); + gfn_t start, gfn_t end, bool can_yield, bool shared); void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) { @@ -126,7 +184,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root) list_del(&root->link); - zap_gfn_range(kvm, root, 0, max_gfn, false); + zap_gfn_range(kvm, root, 0, max_gfn, false, false); free_page((unsigned long)root->spt); kmem_cache_free(mmu_page_header_cache, root); @@ -658,7 +716,8 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, * Return false if a yield was not needed. */ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, - struct tdp_iter *iter, bool flush) + struct tdp_iter *iter, bool flush, + bool shared) { /* Ensure forward progress has been made before yielding. */ if (iter->next_last_level_gfn == iter->yielded_gfn) @@ -670,7 +729,11 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, if (flush) kvm_flush_remote_tlbs(kvm); - cond_resched_rwlock_write(&kvm->mmu_lock); + if (shared) + cond_resched_rwlock_read(&kvm->mmu_lock); + else + cond_resched_rwlock_write(&kvm->mmu_lock); + rcu_read_lock(); WARN_ON(iter->gfn > iter->next_last_level_gfn); @@ -690,23 +753,38 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm, * non-root pages mapping GFNs strictly within that range. 
Returns true if * SPTEs have been cleared and a TLB flush is needed before releasing the * MMU lock. + * * If can_yield is true, will release the MMU lock and reschedule if the * scheduler needs the CPU or there is contention on the MMU lock. If this * function cannot yield, it will not release the MMU lock or reschedule and * the caller must ensure it does not supply too large a GFN range, or the * operation can cause a soft lockup. + * + * If shared is true, this thread holds the MMU lock in read mode and must + * account for the possibility that other threads are modifying the paging + * structures concurrently. If shared is false, this thread should hold the + * MMU in write mode. */ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, - gfn_t start, gfn_t end, bool can_yield) + gfn_t start, gfn_t end, bool can_yield, bool shared) { struct tdp_iter iter; bool flush_needed = false; +#ifdef CONFIG_LOCKDEP + if (shared) + lockdep_assert_held_read(&kvm->mmu_lock); + else + lockdep_assert_held_write(&kvm->mmu_lock); +#endif /* CONFIG_LOCKDEP */ + rcu_read_lock(); tdp_root_for_each_pte(iter, root, start, end) { +retry: if (can_yield && - tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) { + tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed, + shared)) { flush_needed = false; continue; } @@ -724,8 +802,17 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, !is_last_spte(iter.old_spte, iter.level)) continue; - tdp_mmu_set_spte(kvm, &iter, 0); - flush_needed = true; + if (!shared) { + tdp_mmu_set_spte(kvm, &iter, 0); + flush_needed = true; + } else if (!tdp_mmu_zap_spte_atomic(kvm, &iter)) { + /* + * The iter must explicitly re-read the SPTE because + * the atomic cmpxchg failed. + */ + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); + goto retry; + } } rcu_read_unlock(); @@ -737,14 +824,20 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, * non-root pages mapping GFNs strictly within that range. Returns true if * SPTEs have been cleared and a TLB flush is needed before releasing the * MMU lock. + * + * If shared is true, this thread holds the MMU lock in read mode and must + * account for the possibility that other threads are modifying the paging + * structures concurrently. If shared is false, this thread should hold the + * MMU in write mode. 
*/ -bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end) +bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end, + bool shared) { struct kvm_mmu_page *root; bool flush = false; - for_each_tdp_mmu_root_yield_safe(kvm, root) - flush |= zap_gfn_range(kvm, root, start, end, true); + for_each_tdp_mmu_root_yield_safe(kvm, root, shared) + flush |= zap_gfn_range(kvm, root, start, end, true, shared); return flush; } @@ -754,7 +847,7 @@ void kvm_tdp_mmu_zap_all(struct kvm *kvm) gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT); bool flush; - flush = kvm_tdp_mmu_zap_gfn_range(kvm, 0, max_gfn); + flush = kvm_tdp_mmu_zap_gfn_range(kvm, 0, max_gfn, false); if (flush) kvm_flush_remote_tlbs(kvm); } @@ -918,7 +1011,7 @@ static int kvm_tdp_mmu_handle_hva_range(struct kvm *kvm, unsigned long start, int ret = 0; int as_id; - for_each_tdp_mmu_root_yield_safe(kvm, root) { + for_each_tdp_mmu_root_yield_safe(kvm, root, false) { as_id = kvm_mmu_page_as_id(root); slots = __kvm_memslots(kvm, as_id); kvm_for_each_memslot(memslot, slots) { @@ -950,7 +1043,7 @@ static int zap_gfn_range_hva_wrapper(struct kvm *kvm, struct kvm_mmu_page *root, gfn_t start, gfn_t end, unsigned long unused) { - return zap_gfn_range(kvm, root, start, end, false); + return zap_gfn_range(kvm, root, start, end, false, false); } int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start, @@ -1113,7 +1206,7 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, for_each_tdp_pte_min_level(iter, root->spt, root->role.level, min_level, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) continue; if (!is_shadow_present_pte(iter.old_spte) || @@ -1143,7 +1236,7 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, struct kvm_memory_slot *slot, int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root) { + for_each_tdp_mmu_root_yield_safe(kvm, root, false) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1172,7 +1265,7 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, rcu_read_lock(); tdp_root_for_each_leaf_pte(iter, root, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) continue; if (spte_ad_need_write_protect(iter.old_spte)) { @@ -1208,7 +1301,7 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm, struct kvm_memory_slot *slot) int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root) { + for_each_tdp_mmu_root_yield_safe(kvm, root, false) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1304,7 +1397,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, rcu_read_lock(); tdp_root_for_each_pte(iter, root, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false)) + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) continue; if (!is_shadow_present_pte(iter.old_spte) || @@ -1332,7 +1425,7 @@ bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot) int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root) { + for_each_tdp_mmu_root_yield_safe(kvm, root, false) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1358,7 +1451,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm, rcu_read_lock(); tdp_root_for_each_pte(iter, root, start, end) { - if 
(tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) { + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) { spte_set = false; continue; } @@ -1392,7 +1485,7 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, struct kvm_mmu_page *root; int root_as_id; - for_each_tdp_mmu_root_yield_safe(kvm, root) { + for_each_tdp_mmu_root_yield_safe(kvm, root, false) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index cbbdbadd1526..10ada884270b 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -12,7 +12,8 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t root); hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu); void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root); -bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end); +bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end, + bool shared); void kvm_tdp_mmu_zap_all(struct kvm *kvm); int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, From patchwork Tue Feb 2 18:57:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062701 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 983D3C433E9 for ; Tue, 2 Feb 2021 19:09:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 596A664F55 for ; Tue, 2 Feb 2021 19:09:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239353AbhBBTHl (ORCPT ); Tue, 2 Feb 2021 14:07:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239420AbhBBTDi (ORCPT ); Tue, 2 Feb 2021 14:03:38 -0500 Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34717C0610D6 for ; Tue, 2 Feb 2021 10:58:25 -0800 (PST) Received: by mail-qk1-x749.google.com with SMTP id g80so15019500qke.17 for ; Tue, 02 Feb 2021 10:58:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=dSL9YAt4PaftCcOpv1mtWutPCMjGb6A4qY7KXrR77Zs=; b=HVFtFk8XQEi/jo2mOt8kTLivy+jMrjzX6ilURGID8u6SifizNdR9NeLzsF/Yky3E83 E9aC3FEqreGwuWr5UzWztygFojnG3nMggEsSDQbgl2qEuEqwpekxFy8zqcYa1LkkrQj6 297PX126fqD6HBRq+hnjj5KKW1LZXpNDcCFDJLCaIMk74CcKYNRNo8DOpXBJZp5i6Xf2 mb0Z6aYnXBPmumaMTAQoIDLY+o5q2wMmNtSUTdgBRqBSoXmQjV7G/axKA08Ku6OEo7px qSNcdSi/K8jDDn8mCGi5RsMmuMOy46XRjGz6zM8qAck7cGmG9jTa+aNw07W83a9L4Jpf zJ4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=dSL9YAt4PaftCcOpv1mtWutPCMjGb6A4qY7KXrR77Zs=; 
b=V7VuQumgS49Zvd6VGXKishLYye0E9Ihhzl+LG1+RJUUUjdY2AHlpDbtbmtWMmPju/u 3HUHNjP1376iE/4Vjx2+20D5DeRYT1TjmBAtXfHClqjvTb2BSo/1fiQ3QvokcUqLQ/Tj OhKov0QQupyO/lY9bgFTc3c9u8oZrHnRxJUMFuuh1X1rLw+ov+qIKhe1QIBOPvLOtRin VnvMDtxw0SEXrSc/JWoUuA/FX8/c9qd2ECPNfvryOz7bb0SPIx5Ki5zk2vZEEZA7R64j FYdJvUa5YlqlcD4y56XinFCqdm29SYhEcoW3R1419z6U3eI88Sfp1T7+lPRpoj/wA5dv nNpw== X-Gm-Message-State: AOAM530y9247LCjgDE7FSIuLWPdW0laRnrKFUDtwqiiWOCldL/YEmZdb lVXMlv0dRonQm/g4KbykgiFjlJ2jDsyn X-Google-Smtp-Source: ABdhPJzLHex+zt5wNPunEWO9byWg68HeubOeBmvoz/w1E1lA2GAwzBgmByUNVCg3p1uvGpPh4Wp573NYlolF Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a05:6214:1703:: with SMTP id db3mr21420509qvb.43.1612292304407; Tue, 02 Feb 2021 10:58:24 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:31 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-26-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 25/28] KVM: x86/mmu: Allow zapping collapsible SPTEs to use MMU read lock From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To speed the process of disabling dirty logging, change the TDP MMU function which zaps collapsible SPTEs to run under the MMU read lock. Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 5 ++--- arch/x86/kvm/mmu/tdp_mmu.c | 22 +++++++++++++++------- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 254ff87d2a61..e3cf868be6bd 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5517,8 +5517,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) start, end - 1, true); } } - - kvm_mmu_unlock(kvm); + write_unlock(&kvm->mmu_lock); if (kvm->arch.tdp_mmu_enabled) { read_lock(&kvm->mmu_lock); @@ -5611,10 +5610,10 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, write_lock(&kvm->mmu_lock); slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot, kvm_mmu_zap_collapsible_spte, true); + write_unlock(&kvm->mmu_lock); if (kvm->arch.tdp_mmu_enabled) kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot); - write_unlock(&kvm->mmu_lock); } void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm, diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index de26762433ea..cfe66b8d39fa 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1451,10 +1451,9 @@ static void zap_collapsible_spte_range(struct kvm *kvm, rcu_read_lock(); tdp_root_for_each_pte(iter, root, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) { - spte_set = false; +retry: + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true)) continue; - } if (!is_shadow_present_pte(iter.old_spte) || !is_last_spte(iter.old_spte, iter.level)) @@ -1465,9 +1464,14 @@ static void zap_collapsible_spte_range(struct kvm *kvm, !PageTransCompoundMap(pfn_to_page(pfn))) continue; - tdp_mmu_set_spte(kvm, &iter, 0); - - spte_set = true; + if (!tdp_mmu_zap_spte_atomic(kvm, &iter)) { + /* + * The iter must explicitly re-read the SPTE because + * the atomic cmpxchg failed. 
+ */ + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); + goto retry; + } } rcu_read_unlock(); @@ -1485,7 +1489,9 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, struct kvm_mmu_page *root; int root_as_id; - for_each_tdp_mmu_root_yield_safe(kvm, root, false) { + read_lock(&kvm->mmu_lock); + + for_each_tdp_mmu_root_yield_safe(kvm, root, true) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1493,6 +1499,8 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, zap_collapsible_spte_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages); } + + read_unlock(&kvm->mmu_lock); } /* From patchwork Tue Feb 2 18:57:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062691 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0A28C433E0 for ; Tue, 2 Feb 2021 19:08:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5EE5164F55 for ; Tue, 2 Feb 2021 19:08:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239456AbhBBTIA (ORCPT ); Tue, 2 Feb 2021 14:08:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239432AbhBBTDv (ORCPT ); Tue, 2 Feb 2021 14:03:51 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CFF79C061BD1 for ; Tue, 2 Feb 2021 10:58:26 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id z2so10916137pln.18 for ; Tue, 02 Feb 2021 10:58:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=Ea/WfcZfvg2RR5BqxG5prQgL5AJRdv/sOQkRymma1pQ=; b=KkwrCFNyw92sZ83NDZJSCE35gzQiP7qB54jgVjdHH/FfchU1kzyYP2mVrsxUfA/NV5 k0lWkoueYtDF3SNduP/l8UmLhWsVyhMsHdUGJBrpo4Kv4e2iVoJMHpTYkN3WngxpFnH1 VQvj8ItpN+2gWxFoQLy9gc6me3kIt5QrHCf8E1xWtbykmmNROxArfsD79UvdZ4suAbQZ cNufnBytD9FrmMnpfO1fNVLYgWPvTrP7kvolakobbEaebBSvVmN4qLcZQQZf1FBxtVcz mwUn7KUddvVDQMwpozQhp/ywo05Rdj0RiZyyletUQ8zFEYVuptozieGsgfyl9ZMlrFdS zuHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=Ea/WfcZfvg2RR5BqxG5prQgL5AJRdv/sOQkRymma1pQ=; b=lTR75rMRpc9usUp5cka5e3qqebKRmioOwFJuNwE72pQ++TY32maXkCna9gfShTCFkY AN32z8CZfM5iCgtdDXeyxIkGnf13Rw2sBsa1cc7QTaurESPznxc58rZ/WPQV3u9pVsZW OKA1AbIZ2OaJRQFUxcs6ERfhjDhTodN/RGmD6GUGMDWrOk+STYtq6Vhkx9UBGu6ws0F4 +9CLDwDIk0N7nPdcGpY5mLQlngL5KqsOK1bpNJBLSy2onzI9It6ft3IId7mjEo0VijHC E8FwoIUHJ9B0OtgZ/2vJ2IivjP7uXgYYRjiKg0pDzxL1px4urfa8Q2aFWowXFFOuxE0V COOg== X-Gm-Message-State: AOAM530f/OVT1NE2bdnwmmcH6g/y6Bbl618zX2pU4bUXPg2jM9bmHTMs OCcrgX4vuKIPgpHVsEpcYSgg7rGJOPRy 
X-Google-Smtp-Source: ABdhPJxyiV/VNdVF4hzY9QnU1Q6fO/z/S3yQ+VmReVkk9xb0WGhltjZWUgFeH6ioH3fhefmV2oLsrZtUlA9o Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a17:902:724c:b029:e1:4aae:c72 with SMTP id c12-20020a170902724cb02900e14aae0c72mr13132359pll.81.1612292306332; Tue, 02 Feb 2021 10:58:26 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:32 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-27-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 26/28] KVM: x86/mmu: Allow enabling / disabling dirty logging under MMU read lock From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To reduce lock contention and interference with page fault handlers, allow the TDP MMU functions which enable and disable dirty logging to operate under the MMU read lock. Extend dirty logging enable disable functions read lock-ness Signed-off-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 14 +++--- arch/x86/kvm/mmu/tdp_mmu.c | 93 ++++++++++++++++++++++++++++++-------- arch/x86/kvm/mmu/tdp_mmu.h | 2 +- 3 files changed, 84 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e3cf868be6bd..6ba2a72d4330 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5638,9 +5638,10 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm, write_lock(&kvm->mmu_lock); flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false); + write_unlock(&kvm->mmu_lock); + if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_clear_dirty_slot(kvm, memslot); - write_unlock(&kvm->mmu_lock); /* * It's also safe to flush TLBs out of mmu lock here as currently this @@ -5661,9 +5662,10 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm, write_lock(&kvm->mmu_lock); flush = slot_handle_large_level(kvm, memslot, slot_rmap_write_protect, false); + write_unlock(&kvm->mmu_lock); + if (kvm->arch.tdp_mmu_enabled) flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_2M); - write_unlock(&kvm->mmu_lock); if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); @@ -5677,12 +5679,12 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm, write_lock(&kvm->mmu_lock); flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false); - if (kvm->arch.tdp_mmu_enabled) - flush |= kvm_tdp_mmu_slot_set_dirty(kvm, memslot); - write_unlock(&kvm->mmu_lock); - if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); + write_unlock(&kvm->mmu_lock); + + if (kvm->arch.tdp_mmu_enabled) + kvm_tdp_mmu_slot_set_dirty(kvm, memslot); } EXPORT_SYMBOL_GPL(kvm_mmu_slot_set_dirty); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index cfe66b8d39fa..6093926a6bc5 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -553,18 +553,22 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, } /* - * tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the + * __tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the * associated bookkeeping * * @kvm: kvm instance * @iter: a tdp_iter instance currently on the 
SPTE that should be set * @new_spte: The value the SPTE should be set to + * @record_dirty_log: Record the page as dirty in the dirty bitmap if + * appropriate for the change being made. Should be set + * unless performing certain dirty logging operations. + * Leaving record_dirty_log unset in that case prevents page + * writes from being double counted. * Returns: true if the SPTE was set, false if it was not. If false is returned, * this function will have no side-effects. */ -static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, - struct tdp_iter *iter, - u64 new_spte) +static inline bool __tdp_mmu_set_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter, u64 new_spte, bool record_dirty_log) { u64 *root_pt = tdp_iter_root_pt(iter); struct kvm_mmu_page *root = sptep_to_sp(root_pt); @@ -583,12 +587,31 @@ static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, new_spte) != iter->old_spte) return false; - handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, - iter->level, true); + __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte, + iter->level, true); + handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); + if (record_dirty_log) + handle_changed_spte_dirty_log(kvm, as_id, iter->gfn, + iter->old_spte, new_spte, + iter->level); return true; } +static inline bool tdp_mmu_set_spte_atomic_no_dirty_log(struct kvm *kvm, + struct tdp_iter *iter, + u64 new_spte) +{ + return __tdp_mmu_set_spte_atomic(kvm, iter, new_spte, false); +} + +static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter, + u64 new_spte) +{ + return __tdp_mmu_set_spte_atomic(kvm, iter, new_spte, true); +} + static inline bool tdp_mmu_zap_spte_atomic(struct kvm *kvm, struct tdp_iter *iter) { @@ -1206,7 +1229,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, for_each_tdp_pte_min_level(iter, root->spt, root->role.level, min_level, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) +retry: + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true)) continue; if (!is_shadow_present_pte(iter.old_spte) || @@ -1216,7 +1240,15 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, new_spte = iter.old_spte & ~PT_WRITABLE_MASK; - tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); + if (!tdp_mmu_set_spte_atomic_no_dirty_log(kvm, &iter, + new_spte)) { + /* + * The iter must explicitly re-read the SPTE because + * the atomic cmpxchg failed. 
+ */ + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); + goto retry; + } spte_set = true; } @@ -1236,7 +1268,8 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, struct kvm_memory_slot *slot, int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root, false) { + read_lock(&kvm->mmu_lock); + for_each_tdp_mmu_root_yield_safe(kvm, root, true) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1244,6 +1277,7 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, struct kvm_memory_slot *slot, spte_set |= wrprot_gfn_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages, min_level); } + read_unlock(&kvm->mmu_lock); return spte_set; } @@ -1265,7 +1299,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, rcu_read_lock(); tdp_root_for_each_leaf_pte(iter, root, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) +retry: + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true)) continue; if (spte_ad_need_write_protect(iter.old_spte)) { @@ -1280,7 +1315,15 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, continue; } - tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte); + if (!tdp_mmu_set_spte_atomic_no_dirty_log(kvm, &iter, + new_spte)) { + /* + * The iter must explicitly re-read the SPTE because + * the atomic cmpxchg failed. + */ + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); + goto retry; + } spte_set = true; } @@ -1301,7 +1344,8 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm, struct kvm_memory_slot *slot) int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root, false) { + read_lock(&kvm->mmu_lock); + for_each_tdp_mmu_root_yield_safe(kvm, root, true) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1309,6 +1353,7 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm, struct kvm_memory_slot *slot) spte_set |= clear_dirty_gfn_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages); } + read_unlock(&kvm->mmu_lock); return spte_set; } @@ -1397,7 +1442,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, rcu_read_lock(); tdp_root_for_each_pte(iter, root, start, end) { - if (tdp_mmu_iter_cond_resched(kvm, &iter, false, false)) +retry: + if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true)) continue; if (!is_shadow_present_pte(iter.old_spte) || @@ -1406,7 +1452,14 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, new_spte = iter.old_spte | shadow_dirty_mask; - tdp_mmu_set_spte(kvm, &iter, new_spte); + if (!tdp_mmu_set_spte_atomic(kvm, &iter, new_spte)) { + /* + * The iter must explicitly re-read the SPTE because + * the atomic cmpxchg failed. + */ + iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep)); + goto retry; + } spte_set = true; } @@ -1417,15 +1470,15 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, /* * Set the dirty status of all the SPTEs mapping GFNs in the memslot. This is * only used for PML, and so will involve setting the dirty bit on each SPTE. - * Returns true if an SPTE has been changed and the TLBs need to be flushed. 
*/ -bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot) +void kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot) { struct kvm_mmu_page *root; int root_as_id; bool spte_set = false; - for_each_tdp_mmu_root_yield_safe(kvm, root, false) { + read_lock(&kvm->mmu_lock); + for_each_tdp_mmu_root_yield_safe(kvm, root, true) { root_as_id = kvm_mmu_page_as_id(root); if (root_as_id != slot->as_id) continue; @@ -1433,7 +1486,11 @@ bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot) spte_set |= set_dirty_gfn_range(kvm, root, slot->base_gfn, slot->base_gfn + slot->npages); } - return spte_set; + + if (spte_set) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + + read_unlock(&kvm->mmu_lock); } /* diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index 10ada884270b..848b41b20985 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -38,7 +38,7 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, unsigned long mask, bool wrprot); -bool kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot); +void kvm_tdp_mmu_slot_set_dirty(struct kvm *kvm, struct kvm_memory_slot *slot); void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *slot); From patchwork Tue Feb 2 18:57:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062693 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BED2AC43381 for ; Tue, 2 Feb 2021 19:08:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 854C864D79 for ; Tue, 2 Feb 2021 19:08:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239481AbhBBTII (ORCPT ); Tue, 2 Feb 2021 14:08:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239325AbhBBTEG (ORCPT ); Tue, 2 Feb 2021 14:04:06 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF10AC061D7C for ; Tue, 2 Feb 2021 10:58:28 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id 16so14136462pfn.12 for ; Tue, 02 Feb 2021 10:58:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=3w/Fz3KRal32EzN7XgdofPfwSeoRKBs/rliSL0B+258=; b=oFJlf1jVlU3U6YYI2jjAGljxr5RlncvhscYzLLUQHdnBZOxaQc5HGLz1W9vxOmt3FV T2yILdv5+AxHIPRh5jfOIjxZ8HAyIyndC5jRSnau6MpMep9EuYTop62ZGfc7UnsxnUAL /l95e0QM3Crb7Fs4IruOrTUx7w9NdvXmbdW100osMr78UMa40axPMlvrR0P1LaxuELGX zagONAb3w2bZIQf+sZOKUftoE3c6abv++b9lyhTnfQpNs6K6TB1KY8lh5OD8uPaSJwf6 vYdqYxwwFMv6FjbijA8deacUiS6uAFCF88wAJDDrbPTViikaC8/G7zioF3+ubGDinT3O s1vw== X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=3w/Fz3KRal32EzN7XgdofPfwSeoRKBs/rliSL0B+258=; b=my5bFcNs3uBsck0DYo3BOSfDAd7ORIIBSXRuvu+vRl+wyBP2VfjBYfNSCctHnMI1ed u7i86fShOXKwiFd4C4XkqnR/2RORyCJG22MOw0LMXtazaKKbURs0W1sYQiq7BSXwrI96 /YfnBs+8hrOseqxUkUbKM+YK9Q7yuUIVEq+LsU7mLNUA0mS5Xfn7n90IkXv/XuiSQRoW cSnQu9vzxrhNOAGouz89SVFjmgObUozVnGpmHfU/vF1RIgWfvT4tnU406ZJyypDoHtEN H0yIugnS3wMHU5oQTh9zdJDWebZ3YiCqsOBIilUezluy14DohC6MELdCOFUk+oe77d2H 5RfA== X-Gm-Message-State: AOAM533kZUM9OzK9arwoOISy4A1breUpPQNbQNyBOF5ZCJpGzhhbvK0z YI4Hm+EHqI4fvvgK25zeEsn8FdXSg7GH X-Google-Smtp-Source: ABdhPJwrOmmC+jea7XiyZdD3HaqZEF2v2jSVttQD60y9Xg6aQitnJ+ZqVuqrX9DY76rktQ3d2hqZ9TV0yGpD Sender: "bgardon via sendgmr" X-Received: from bgardon.sea.corp.google.com ([2620:15c:100:202:9090:561:5a98:6d47]) (user=bgardon job=sendgmr) by 2002:a62:9258:0:b029:1c8:c970:56d with SMTP id o85-20020a6292580000b02901c8c970056dmr23015975pfd.70.1612292308218; Tue, 02 Feb 2021 10:58:28 -0800 (PST) Date: Tue, 2 Feb 2021 10:57:33 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-28-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 27/28] KVM: selftests: Add backing src parameter to dirty_log_perf_test From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon , Andrew Jones , Thomas Huth Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a parameter to control the backing memory type for dirty_log_perf_test so that the test can be run with hugepages. 
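The '-s' option added below hands its argument to parse_backing_src_type(), which is declared alongside backing_src_help() and struct vm_mem_backing_src_alias in the test_util.h hunk, with the implementations landing in lib/test_util.c. A plausible standalone sketch of that lookup follows; the alias strings and the error handling here are assumptions for illustration, not the selftest's actual code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Mirrors the enum and alias struct added to test_util.h. The string names
 * are illustrative guesses, not necessarily those used in lib/test_util.c.
 */
enum vm_mem_backing_src_type {
	VM_MEM_SRC_ANONYMOUS,
	VM_MEM_SRC_ANONYMOUS_THP,
	VM_MEM_SRC_ANONYMOUS_HUGETLB,
};

struct vm_mem_backing_src_alias {
	const char *name;
	enum vm_mem_backing_src_type type;
};

static const struct vm_mem_backing_src_alias aliases[] = {
	{ "anonymous",         VM_MEM_SRC_ANONYMOUS },
	{ "anonymous_thp",     VM_MEM_SRC_ANONYMOUS_THP },
	{ "anonymous_hugetlb", VM_MEM_SRC_ANONYMOUS_HUGETLB },
};

static void backing_src_help(void)
{
	size_t i;

	printf("Available backing src types:\n");
	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
		printf("  %s\n", aliases[i].name);
}

static enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name)
{
	size_t i;

	for (i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++)
		if (!strcmp(type_name, aliases[i].name))
			return aliases[i].type;

	backing_src_help();
	fprintf(stderr, "Unknown backing src type: %s\n", type_name);
	exit(1);
}

int main(int argc, char *argv[])
{
	/* e.g. ./a.out anonymous_hugetlb */
	enum vm_mem_backing_src_type src =
		parse_backing_src_type(argc > 1 ? argv[1] : "anonymous");

	printf("selected backing src type: %d\n", (int)src);
	return 0;
}

Under those assumptions the test would be invoked as, for example, ./dirty_log_perf_test -v 4 -s anonymous_hugetlb, where the '-s' flag comes from the getopt change below and the alias spelling is a guess.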
To: linux-kselftest@vger.kernel.org CC: Peter Xu CC: Andrew Jones CC: Thomas Huth Signed-off-by: Ben Gardon --- .../selftests/kvm/demand_paging_test.c | 3 +- .../selftests/kvm/dirty_log_perf_test.c | 15 ++++++++-- .../testing/selftests/kvm/include/kvm_util.h | 6 ---- .../selftests/kvm/include/perf_test_util.h | 3 +- .../testing/selftests/kvm/include/test_util.h | 14 +++++++++ .../selftests/kvm/lib/perf_test_util.c | 6 ++-- tools/testing/selftests/kvm/lib/test_util.c | 29 +++++++++++++++++++ 7 files changed, 62 insertions(+), 14 deletions(-) diff --git a/tools/testing/selftests/kvm/demand_paging_test.c b/tools/testing/selftests/kvm/demand_paging_test.c index cdad1eca72f7..9e3254ff0821 100644 --- a/tools/testing/selftests/kvm/demand_paging_test.c +++ b/tools/testing/selftests/kvm/demand_paging_test.c @@ -265,7 +265,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) int vcpu_id; int r; - vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size); + vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, + VM_MEM_SRC_ANONYMOUS); perf_test_args.wr_fract = 1; diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c index 2283a0ec74a9..604ccefd6e76 100644 --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -92,6 +92,7 @@ struct test_params { unsigned long iterations; uint64_t phys_offset; int wr_fract; + enum vm_mem_backing_src_type backing_src; }; static void run_test(enum vm_guest_mode mode, void *arg) @@ -111,7 +112,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct kvm_enable_cap cap = {}; struct timespec clear_dirty_log_total = (struct timespec){0}; - vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size); + vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size, + p->backing_src); perf_test_args.wr_fract = p->wr_fract; @@ -236,7 +238,7 @@ static void help(char *name) { puts(""); printf("usage: %s [-h] [-i iterations] [-p offset] " - "[-m mode] [-b vcpu bytes] [-v vcpus]\n", name); + "[-m mode] [-b vcpu bytes] [-v vcpus] [-s mem type]\n", name); puts(""); printf(" -i: specify iteration counts (default: %"PRIu64")\n", TEST_HOST_LOOP_N); @@ -251,6 +253,9 @@ static void help(char *name) " 1/.\n" " (default: 1 i.e. 
all pages are written to.)\n"); printf(" -v: specify the number of vCPUs to run.\n"); + printf(" -s: specify the type of memory that should be used to\n" + " back the guest data region.\n"); + backing_src_help(); puts(""); exit(0); } @@ -261,6 +266,7 @@ int main(int argc, char *argv[]) struct test_params p = { .iterations = TEST_HOST_LOOP_N, .wr_fract = 1, + .backing_src = VM_MEM_SRC_ANONYMOUS, }; int opt; @@ -271,7 +277,7 @@ int main(int argc, char *argv[]) guest_modes_append_default(); - while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:")) != -1) { + while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:s:")) != -1) { switch (opt) { case 'i': p.iterations = strtol(optarg, NULL, 10); @@ -295,6 +301,9 @@ int main(int argc, char *argv[]) TEST_ASSERT(nr_vcpus > 0 && nr_vcpus <= max_vcpus, "Invalid number of vcpus, must be between 1 and %d", max_vcpus); break; + case 's': + p.backing_src = parse_backing_src_type(optarg); + break; case 'h': default: help(argv[0]); diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 5cbb861525ed..2d7eb6989e83 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -79,12 +79,6 @@ struct vm_guest_mode_params { }; extern const struct vm_guest_mode_params vm_guest_mode_params[]; -enum vm_mem_backing_src_type { - VM_MEM_SRC_ANONYMOUS, - VM_MEM_SRC_ANONYMOUS_THP, - VM_MEM_SRC_ANONYMOUS_HUGETLB, -}; - int kvm_check_cap(long cap); int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap); int vcpu_enable_cap(struct kvm_vm *vm, uint32_t vcpu_id, diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h index b1188823c31b..8b66ab300175 100644 --- a/tools/testing/selftests/kvm/include/perf_test_util.h +++ b/tools/testing/selftests/kvm/include/perf_test_util.h @@ -44,7 +44,8 @@ extern struct perf_test_args perf_test_args; extern uint64_t guest_test_phys_mem; struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus, - uint64_t vcpu_memory_bytes); + uint64_t vcpu_memory_bytes, + enum vm_mem_backing_src_type backing_src); void perf_test_destroy_vm(struct kvm_vm *vm); void perf_test_setup_vcpus(struct kvm_vm *vm, int vcpus, uint64_t vcpu_memory_bytes); diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index ffffa560436b..749b24a239a1 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -67,4 +67,18 @@ struct timespec timespec_sub(struct timespec ts1, struct timespec ts2); struct timespec timespec_diff_now(struct timespec start); struct timespec timespec_div(struct timespec ts, int divisor); +enum vm_mem_backing_src_type { + VM_MEM_SRC_ANONYMOUS, + VM_MEM_SRC_ANONYMOUS_THP, + VM_MEM_SRC_ANONYMOUS_HUGETLB, +}; + +struct vm_mem_backing_src_alias { + const char *name; + enum vm_mem_backing_src_type type; +}; + +void backing_src_help(void); +enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name); + #endif /* SELFTEST_KVM_TEST_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c index 9be1944c2d1c..7f1571924347 100644 --- a/tools/testing/selftests/kvm/lib/perf_test_util.c +++ b/tools/testing/selftests/kvm/lib/perf_test_util.c @@ -49,7 +49,8 @@ static void guest_code(uint32_t vcpu_id) } struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus, - uint64_t 
vcpu_memory_bytes) + uint64_t vcpu_memory_bytes, + enum vm_mem_backing_src_type backing_src) { struct kvm_vm *vm; uint64_t guest_num_pages; @@ -93,8 +94,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus, pr_info("guest physical test memory offset: 0x%lx\n", guest_test_phys_mem); /* Add an extra memory slot for testing */ - vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, - guest_test_phys_mem, + vm_userspace_mem_region_add(vm, backing_src, guest_test_phys_mem, PERF_TEST_MEM_SLOT_INDEX, guest_num_pages, 0); diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c index 8e04c0b1608e..9fd60b142c23 100644 --- a/tools/testing/selftests/kvm/lib/test_util.c +++ b/tools/testing/selftests/kvm/lib/test_util.c @@ -10,6 +10,7 @@ #include #include #include +#include "linux/kernel.h" #include "test_util.h" @@ -109,3 +110,31 @@ void print_skip(const char *fmt, ...) va_end(ap); puts(", skipping test"); } + +const struct vm_mem_backing_src_alias backing_src_aliases[] = { + {"anonymous", VM_MEM_SRC_ANONYMOUS,}, + {"anonymous_thp", VM_MEM_SRC_ANONYMOUS_THP,}, + {"anonymous_hugetlb", VM_MEM_SRC_ANONYMOUS_HUGETLB,}, +}; + +void backing_src_help(void) +{ + int i; + + printf("Available backing src types:\n"); + for (i = 0; i < ARRAY_SIZE(backing_src_aliases); i++) + printf("\t%s\n", backing_src_aliases[i].name); +} + +enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(backing_src_aliases); i++) + if (!strcmp(type_name, backing_src_aliases[i].name)) + return backing_src_aliases[i].type; + + backing_src_help(); + TEST_FAIL("Unknown backing src type: %s", type_name); + return -1; +} From patchwork Tue Feb 2 18:57:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Gardon X-Patchwork-Id: 12062695
Date: Tue, 2 Feb 2021 10:57:34 -0800 In-Reply-To: <20210202185734.1680553-1-bgardon@google.com> Message-Id: <20210202185734.1680553-29-bgardon@google.com> Mime-Version: 1.0 References: <20210202185734.1680553-1-bgardon@google.com> X-Mailer: git-send-email 2.30.0.365.g02bc693789-goog Subject: [PATCH v2 28/28] KVM: selftests: Disable dirty logging with vCPUs running From: Ben Gardon To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , Peter Xu , Sean Christopherson , Peter Shier , Peter Feiner , Junaid Shahid , Jim Mattson , Yulei Zhang , Wanpeng Li , Vitaly Kuznetsov , Xiao Guangrong , Ben Gardon , Andrew Jones , Thomas Huth Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Disabling dirty logging is much more interesting from a testing perspective if the vCPUs are still running. This also exercises the code path in which collapsible SPTEs must be faulted back in at a higher level after disabling dirty logging.
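The key ordering change, distilled from the hunk below into a simplified sketch (timing bookkeeping and error handling omitted; host_quit, vcpu_threads and PERF_TEST_MEM_SLOT_INDEX are the test's existing globals and constants):

/*
 * Sketch of the end of run_test() after this change: dirty logging is
 * disabled while the vCPU worker threads are still writing guest memory.
 */
static void run_test_shutdown_order(struct kvm_vm *vm, int nr_vcpus)
{
	int vcpu_id;

	/*
	 * Disable dirty logging while the vCPUs keep dirtying memory, so
	 * KVM must fault collapsible SPTEs back in at a higher level while
	 * the guest is live.
	 */
	vm_mem_region_set_flags(vm, PERF_TEST_MEM_SLOT_INDEX, 0);

	/* Only afterwards tell the vCPU threads to quit and reap them. */
	host_quit = true;
	for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++)
		pthread_join(vcpu_threads[vcpu_id], NULL);
}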
To: linux-kselftest@vger.kernel.org CC: Peter Xu CC: Andrew Jones CC: Thomas Huth Signed-off-by: Ben Gardon --- tools/testing/selftests/kvm/dirty_log_perf_test.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c index 604ccefd6e76..d44a5b8ef232 100644 --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c @@ -205,11 +205,6 @@ static void run_test(enum vm_guest_mode mode, void *arg) } } - /* Tell the vcpu thread to quit */ - host_quit = true; - for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) - pthread_join(vcpu_threads[vcpu_id], NULL); - /* Disable dirty logging */ clock_gettime(CLOCK_MONOTONIC, &start); vm_mem_region_set_flags(vm, PERF_TEST_MEM_SLOT_INDEX, 0); @@ -217,6 +212,11 @@ static void run_test(enum vm_guest_mode mode, void *arg) pr_info("Disabling dirty logging time: %ld.%.9lds\n", ts_diff.tv_sec, ts_diff.tv_nsec); + /* Tell the vcpu thread to quit */ + host_quit = true; + for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) + pthread_join(vcpu_threads[vcpu_id], NULL); + avg = timespec_div(get_dirty_log_total, p->iterations); pr_info("Get dirty log over %lu iterations took %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n", p->iterations, get_dirty_log_total.tv_sec,