From patchwork Tue Jan 25 23:05:14 2022
From: David Matlack <dmatlack@google.com>
To: pbonzini@redhat.com
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
    jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
    David Matlack <dmatlack@google.com>
Subject: [PATCH 1/5] KVM: x86/mmu: Move SPTE writable invariant checks to a
 helper function
Date: Tue, 25 Jan 2022 23:05:14 +0000
Message-Id: <20220125230518.1697048-2-dmatlack@google.com>
In-Reply-To: <20220125230518.1697048-1-dmatlack@google.com>
References: <20220125230518.1697048-1-dmatlack@google.com>

Move the WARNs in spte_can_locklessly_be_made_writable() to a separate
helper function.
This is in preparation for moving these checks to the places where SPTEs
are set.

Opportunistically add error messages to the WARNs that include the SPTE,
to make future debugging of these warnings easier.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/spte.h | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

base-commit: e2e83a73d7ce66f62c7830a85619542ef59c90e4

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index be6a007a4af3..912e66859ea0 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -339,15 +339,21 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
 	       __is_rsvd_bits_set(rsvd_check, spte, level);
 }
 
-static inline bool spte_can_locklessly_be_made_writable(u64 spte)
+static inline void check_spte_writable_invariants(u64 spte)
 {
-	if (spte & shadow_mmu_writable_mask) {
-		WARN_ON_ONCE(!(spte & shadow_host_writable_mask));
-		return true;
-	}
+	if (spte & shadow_mmu_writable_mask)
+		WARN_ONCE(!(spte & shadow_host_writable_mask),
+			  "kvm: MMU-writable SPTE is not Host-writable: %llx",
+			  spte);
+	else
+		WARN_ONCE(spte & PT_WRITABLE_MASK,
+			  "kvm: Writable SPTE is not MMU-writable: %llx", spte);
+}
 
-	WARN_ON_ONCE(spte & PT_WRITABLE_MASK);
-	return false;
+static inline bool spte_can_locklessly_be_made_writable(u64 spte)
+{
+	check_spte_writable_invariants(spte);
+	return spte & shadow_mmu_writable_mask;
 }
 
 static inline u64 get_mmio_spte_generation(u64 spte)
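The invariants the new helper enforces are easiest to see outside the
kernel tree. The following is a standalone userspace sketch, not KVM
code: WARN_ONCE is approximated with fprintf, and the mask globals are
stubbed with the default (non-EPT) bit positions used elsewhere in this
series (the pre-rename WRITEABLE spelling matches the tree at this point
in the series).

/*
 * Illustrative harness for the SPTE writable invariants; not kernel code.
 * Build with a plain C compiler, e.g. "cc -o spte_check spte_check.c".
 */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

#define BIT_ULL(n)			(1ULL << (n))
#define PT_WRITABLE_MASK		BIT_ULL(1)	/* hardware W bit */
#define DEFAULT_SPTE_HOST_WRITEABLE	BIT_ULL(9)
#define DEFAULT_SPTE_MMU_WRITEABLE	BIT_ULL(10)

static u64 shadow_host_writable_mask = DEFAULT_SPTE_HOST_WRITEABLE;
static u64 shadow_mmu_writable_mask = DEFAULT_SPTE_MMU_WRITEABLE;

static void check_spte_writable_invariants(u64 spte)
{
	if (spte & shadow_mmu_writable_mask) {
		/* MMU-writable implies Host-writable. */
		if (!(spte & shadow_host_writable_mask))
			fprintf(stderr,
				"kvm: MMU-writable SPTE is not Host-writable: %llx\n",
				(unsigned long long)spte);
	} else {
		/* Hardware-writable implies MMU-writable. */
		if (spte & PT_WRITABLE_MASK)
			fprintf(stderr,
				"kvm: Writable SPTE is not MMU-writable: %llx\n",
				(unsigned long long)spte);
	}
}

int main(void)
{
	/* Valid: all three writable bits set. */
	check_spte_writable_invariants(PT_WRITABLE_MASK |
				       DEFAULT_SPTE_HOST_WRITEABLE |
				       DEFAULT_SPTE_MMU_WRITEABLE);
	/* Invalid: hardware-writable but not MMU-writable -> warns. */
	check_spte_writable_invariants(PT_WRITABLE_MASK);
	return 0;
}

The second call prints the new, SPTE-bearing warning text, which is
exactly the debugging improvement this patch is after.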
From patchwork Tue Jan 25 23:05:15 2022
From: David Matlack <dmatlack@google.com>
To: pbonzini@redhat.com
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
    jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
    David Matlack <dmatlack@google.com>
Subject: [PATCH 2/5] KVM: x86/mmu: Check SPTE writable invariants when
 setting leaf SPTEs
Date: Tue, 25 Jan 2022 23:05:15 +0000
Message-Id: <20220125230518.1697048-3-dmatlack@google.com>
In-Reply-To: <20220125230518.1697048-1-dmatlack@google.com>
References: <20220125230518.1697048-1-dmatlack@google.com>

Check SPTE writable invariants when setting SPTEs rather than in
spte_can_locklessly_be_made_writable(). By the time KVM checks
spte_can_locklessly_be_made_writable(), the SPTE has long since been
corrupted.

Note that these invariants only apply to shadow-present leaf SPTEs (i.e.
not to MMIO SPTEs, non-leaf SPTEs, etc.). Add a comment explaining the
restriction and only instrument the code paths that set shadow-present
leaf SPTEs.

To account for access tracking, also check the SPTE writable invariants
when marking an SPTE as an access track SPTE. This also lets us remove
a redundant WARN from mark_spte_for_access_track().

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c     | 1 +
 arch/x86/kvm/mmu/spte.c    | 9 +--------
 arch/x86/kvm/mmu/spte.h    | 2 +-
 arch/x86/kvm/mmu/tdp_mmu.c | 3 +++
 4 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 593093b52395..795db506c230 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -529,6 +529,7 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
 	u64 old_spte = *sptep;
 
 	WARN_ON(!is_shadow_present_pte(new_spte));
+	check_spte_writable_invariants(new_spte);
 
 	if (!is_shadow_present_pte(old_spte)) {
 		mmu_spte_set(sptep, new_spte);

diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index f8677404c93c..24d66bb899a4 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -249,14 +249,7 @@ u64 mark_spte_for_access_track(u64 spte)
 	if (is_access_track_spte(spte))
 		return spte;
 
-	/*
-	 * Making an Access Tracking PTE will result in removal of write access
-	 * from the PTE. So, verify that we will be able to restore the write
-	 * access in the fast page fault path later on.
-	 */
-	WARN_ONCE((spte & PT_WRITABLE_MASK) &&
-		  !spte_can_locklessly_be_made_writable(spte),
-		  "kvm: Writable SPTE is not locklessly dirty-trackable\n");
+	check_spte_writable_invariants(spte);
 
 	WARN_ONCE(spte & (SHADOW_ACC_TRACK_SAVED_BITS_MASK <<
 			  SHADOW_ACC_TRACK_SAVED_BITS_SHIFT),

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 912e66859ea0..b8fd055acdbd 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -339,6 +339,7 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
 	       __is_rsvd_bits_set(rsvd_check, spte, level);
 }
 
+/* Note: spte must be a shadow-present leaf SPTE. */
 static inline void check_spte_writable_invariants(u64 spte)
 {
 	if (spte & shadow_mmu_writable_mask)
@@ -352,7 +353,6 @@ static inline void check_spte_writable_invariants(u64 spte)
 
 static inline bool spte_can_locklessly_be_made_writable(u64 spte)
 {
-	check_spte_writable_invariants(spte);
 	return spte & shadow_mmu_writable_mask;
 }
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bc9e3553fba2..814c42def6e7 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -435,6 +435,9 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 
 	trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte);
 
+	if (is_leaf)
+		check_spte_writable_invariants(new_spte);
+
 	/*
 	 * The only times a SPTE should be changed from a non-present to
 	 * non-present state is when an MMIO entry is installed/modified/
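To illustrate the motivation, here is a hedged sketch (stub names and
masks, not KVM APIs) of the difference between validating at read time
and at write time: with the check in a hypothetical set_leaf_spte()
setter, a bad bit combination trips the assertion at the call site that
corrupted the SPTE, rather than much later in
spte_can_locklessly_be_made_writable().

/* Standalone illustration; SPTE_PRESENT and set_leaf_spte() are stand-ins. */
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

#define PT_WRITABLE_MASK	(1ULL << 1)
#define HOST_WRITABLE		(1ULL << 9)
#define MMU_WRITABLE		(1ULL << 10)
#define SPTE_PRESENT		(1ULL << 0)	/* stand-in for shadow-present */

/* The invariants only apply to shadow-present leaf SPTEs. */
static void check_writable_invariants(u64 spte)
{
	if (spte & MMU_WRITABLE)
		assert(spte & HOST_WRITABLE);
	else
		assert(!(spte & PT_WRITABLE_MASK));
}

/*
 * Hypothetical setter: validate at write time, analogous to what
 * mmu_spte_update_no_track() and __handle_changed_spte() do after this
 * patch.
 */
static void set_leaf_spte(u64 *sptep, u64 new_spte)
{
	if (new_spte & SPTE_PRESENT)	/* skip MMIO/non-leaf/non-present */
		check_writable_invariants(new_spte);
	*sptep = new_spte;
}

int main(void)
{
	u64 spte = 0;

	set_leaf_spte(&spte, SPTE_PRESENT | HOST_WRITABLE | MMU_WRITABLE |
			     PT_WRITABLE_MASK);	/* fine */
	/*
	 * set_leaf_spte(&spte, SPTE_PRESENT | PT_WRITABLE_MASK) would
	 * assert here, at the buggy call site, not later.
	 */
	return 0;
}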
From patchwork Tue Jan 25 23:05:16 2022
From: David Matlack <dmatlack@google.com>
To: pbonzini@redhat.com
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
    jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
    David Matlack <dmatlack@google.com>
Subject: [PATCH 3/5] KVM: x86/mmu: Move is_writable_pte() to spte.h
Date: Tue, 25 Jan 2022 23:05:16 +0000
Message-Id: <20220125230518.1697048-4-dmatlack@google.com>
In-Reply-To: <20220125230518.1697048-1-dmatlack@google.com>
References: <20220125230518.1697048-1-dmatlack@google.com>

Move is_writable_pte() close to the other functions that check
writability information about SPTEs. While here, opportunistically
replace the open-coded bit arithmetic in
check_spte_writable_invariants() with a call to is_writable_pte().

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu.h      | 38 --------------------------------------
 arch/x86/kvm/mmu/spte.h | 40 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e9fbb2c8bbe2..51faa2c76ca5 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -202,44 +202,6 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	return vcpu->arch.mmu->page_fault(vcpu, &fault);
 }
 
-/*
- * Currently, we have two sorts of write-protection, a) the first one
- * write-protects guest page to sync the guest modification, b) another one is
- * used to sync dirty bitmap when we do KVM_GET_DIRTY_LOG. The differences
- * between these two sorts are:
- * 1) the first case clears MMU-writable bit.
- * 2) the first case requires flushing tlb immediately avoiding corrupting
- *    shadow page table between all vcpus so it should be in the protection of
- *    mmu-lock. And the another case does not need to flush tlb until returning
- *    the dirty bitmap to userspace since it only write-protects the page
- *    logged in the bitmap, that means the page in the dirty bitmap is not
- *    missed, so it can flush tlb out of mmu-lock.
- *
- * So, there is the problem: the first case can meet the corrupted tlb caused
- * by another case which write-protects pages but without flush tlb
- * immediately. In order to making the first case be aware this problem we let
- * it flush tlb if we try to write-protect a spte whose MMU-writable bit
- * is set, it works since another case never touches MMU-writable bit.
- *
- * Anyway, whenever a spte is updated (only permission and status bits are
- * changed) we need to check whether the spte with MMU-writable becomes
- * readonly, if that happens, we need to flush tlb. Fortunately,
- * mmu_spte_update() has already handled it perfectly.
- *
- * The rules to use MMU-writable and PT_WRITABLE_MASK:
- * - if we want to see if it has writable tlb entry or if the spte can be
- *   writable on the mmu mapping, check MMU-writable, this is the most
- *   case, otherwise
- * - if we fix page fault on the spte or do write-protection by dirty logging,
- *   check PT_WRITABLE_MASK.
- *
- * TODO: introduce APIs to split these two cases.
- */
-static inline bool is_writable_pte(unsigned long pte)
-{
-	return pte & PT_WRITABLE_MASK;
-}
-
 /*
  * Check if a given access (described through the I/D, W/R and U/S bits of a
  * page fault error code pfec) causes a permission fault with the given PTE

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index b8fd055acdbd..e1ddba45bba1 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -339,6 +339,44 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
 	       __is_rsvd_bits_set(rsvd_check, spte, level);
 }
 
+/*
+ * Currently, we have two sorts of write-protection, a) the first one
+ * write-protects guest page to sync the guest modification, b) another one is
+ * used to sync dirty bitmap when we do KVM_GET_DIRTY_LOG. The differences
+ * between these two sorts are:
+ * 1) the first case clears MMU-writable bit.
+ * 2) the first case requires flushing tlb immediately avoiding corrupting
+ *    shadow page table between all vcpus so it should be in the protection of
+ *    mmu-lock. And the another case does not need to flush tlb until returning
+ *    the dirty bitmap to userspace since it only write-protects the page
+ *    logged in the bitmap, that means the page in the dirty bitmap is not
+ *    missed, so it can flush tlb out of mmu-lock.
+ *
+ * So, there is the problem: the first case can meet the corrupted tlb caused
+ * by another case which write-protects pages but without flush tlb
+ * immediately. In order to making the first case be aware this problem we let
+ * it flush tlb if we try to write-protect a spte whose MMU-writable bit
+ * is set, it works since another case never touches MMU-writable bit.
+ *
+ * Anyway, whenever a spte is updated (only permission and status bits are
+ * changed) we need to check whether the spte with MMU-writable becomes
+ * readonly, if that happens, we need to flush tlb. Fortunately,
+ * mmu_spte_update() has already handled it perfectly.
+ *
+ * The rules to use MMU-writable and PT_WRITABLE_MASK:
+ * - if we want to see if it has writable tlb entry or if the spte can be
+ *   writable on the mmu mapping, check MMU-writable, this is the most
+ *   case, otherwise
+ * - if we fix page fault on the spte or do write-protection by dirty logging,
+ *   check PT_WRITABLE_MASK.
+ *
+ * TODO: introduce APIs to split these two cases.
+ */
+static inline bool is_writable_pte(unsigned long pte)
+{
+	return pte & PT_WRITABLE_MASK;
+}
+
 /* Note: spte must be a shadow-present leaf SPTE. */
 static inline void check_spte_writable_invariants(u64 spte)
 {
@@ -347,7 +385,7 @@ static inline void check_spte_writable_invariants(u64 spte)
 			  "kvm: MMU-writable SPTE is not Host-writable: %llx",
 			  spte);
 	else
-		WARN_ONCE(spte & PT_WRITABLE_MASK,
+		WARN_ONCE(is_writable_pte(spte),
 			  "kvm: Writable SPTE is not MMU-writable: %llx", spte);
 }
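The TLB-flush rule that the moved comment describes can be condensed
into a few lines. The sketch below mirrors the spirit of
mmu_spte_update()'s flush decision with stubbed masks; it is not the
kernel's actual implementation.

/* Illustrative only: when does dropping write access require a TLB flush? */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

#define PT_WRITABLE_MASK	(1ULL << 1)
#define MMU_WRITABLE		(1ULL << 10)

static bool is_writable_pte(unsigned long pte)
{
	return pte & PT_WRITABLE_MASK;
}

/* Returns true if remote TLBs must be flushed after the update. */
static bool spte_update_needs_flush(u64 old_spte, u64 new_spte)
{
	/* An MMU-writable SPTE may have a stale writable TLB entry. */
	return (old_spte & MMU_WRITABLE) && !is_writable_pte(new_spte);
}

int main(void)
{
	u64 writable = MMU_WRITABLE | PT_WRITABLE_MASK;

	/* Dropping write access from an MMU-writable SPTE -> flush needed. */
	return spte_update_needs_flush(writable, 0) ? 0 : 1;
}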
From patchwork Tue Jan 25 23:07:13 2022
From: David Matlack <dmatlack@google.com>
To: pbonzini@redhat.com
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
    jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
    David Matlack <dmatlack@google.com>
Subject: [PATCH 4/5] KVM: x86/mmu: Rename DEFAULT_SPTE_MMU_WRITEABLE to
 DEFAULT_SPTE_MMU_WRITABLE
Date: Tue, 25 Jan 2022 23:07:13 +0000
Message-Id: <20220125230713.1700406-1-dmatlack@google.com>

Both "writeable" and "writable" are valid, but we should be consistent
about which we use. DEFAULT_SPTE_MMU_WRITEABLE was the odd one out in
the SPTE code, so rename it to DEFAULT_SPTE_MMU_WRITABLE.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c  | 2 +-
 arch/x86/kvm/mmu/spte.c | 4 ++--
 arch/x86/kvm/mmu/spte.h | 8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 795db506c230..88f3aa5f2a36 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5847,7 +5847,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 	 * will clear a separate software-only bit (MMU-writable) and skip the
 	 * flush if-and-only-if this bit was already clear.
 	 *
-	 * See DEFAULT_SPTE_MMU_WRITEABLE for more details.
+	 * See DEFAULT_SPTE_MMU_WRITABLE for more details.
 	 */
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);

diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 24d66bb899a4..ad6acdd61a9f 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -360,8 +360,8 @@ void kvm_mmu_reset_all_pte_masks(void)
 	shadow_acc_track_mask	= 0;
 	shadow_me_mask		= sme_me_mask;
 
-	shadow_host_writable_mask = DEFAULT_SPTE_HOST_WRITEABLE;
-	shadow_mmu_writable_mask  = DEFAULT_SPTE_MMU_WRITEABLE;
+	shadow_host_writable_mask = DEFAULT_SPTE_HOST_WRITABLE;
+	shadow_mmu_writable_mask  = DEFAULT_SPTE_MMU_WRITABLE;
 
 	/*
 	 * Set a reserved PA bit in MMIO SPTEs to generate page faults with

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index e1ddba45bba1..a179f089e3dd 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -75,7 +75,7 @@ static_assert(SPTE_TDP_AD_ENABLED_MASK == 0);
 static_assert(!(SPTE_TDP_AD_MASK & SHADOW_ACC_TRACK_SAVED_MASK));
 
 /*
- * *_SPTE_HOST_WRITEABLE (aka Host-writable) indicates whether the host permits
+ * *_SPTE_HOST_WRITABLE (aka Host-writable) indicates whether the host permits
  * writes to the guest page mapped by the SPTE. This bit is cleared on SPTEs
  * that map guest pages in read-only memslots and read-only VMAs.
  *
@@ -83,7 +83,7 @@ static_assert(!(SPTE_TDP_AD_MASK & SHADOW_ACC_TRACK_SAVED_MASK));
  * - If Host-writable is clear, PT_WRITABLE_MASK must be clear.
  *
  *
- * *_SPTE_MMU_WRITEABLE (aka MMU-writable) indicates whether the shadow MMU
+ * *_SPTE_MMU_WRITABLE (aka MMU-writable) indicates whether the shadow MMU
  * allows writes to the guest page mapped by the SPTE. This bit is cleared when
  * the guest page mapped by the SPTE contains a page table that is being
  * monitored for shadow paging. In this case the SPTE can only be made writable
@@ -100,8 +100,8 @@ static_assert(!(SPTE_TDP_AD_MASK & SHADOW_ACC_TRACK_SAVED_MASK));
  */
 
 /* Bits 9 and 10 are ignored by all non-EPT PTEs. */
-#define DEFAULT_SPTE_HOST_WRITEABLE	BIT_ULL(9)
-#define DEFAULT_SPTE_MMU_WRITEABLE	BIT_ULL(10)
+#define DEFAULT_SPTE_HOST_WRITABLE	BIT_ULL(9)
+#define DEFAULT_SPTE_MMU_WRITABLE	BIT_ULL(10)
 
 /*
  * Low ignored bits are at a premium for EPT, use high ignored bits, taking care
From patchwork Tue Jan 25 23:07:23 2022
From: David Matlack <dmatlack@google.com>
To: pbonzini@redhat.com
Cc: seanjc@google.com, vkuznets@redhat.com, wanpengli@tencent.com,
    jmattson@google.com, joro@8bytes.org, kvm@vger.kernel.org,
    David Matlack <dmatlack@google.com>
Subject: [PATCH 5/5] KVM: x86/mmu: Consolidate comments about
 {Host,MMU}-writable
Date: Tue, 25 Jan 2022 23:07:23 +0000
Message-Id: <20220125230723.1701061-1-dmatlack@google.com>

Consolidate the large comment above DEFAULT_SPTE_HOST_WRITABLE with the
large comment above is_writable_pte() into one comment. This comment
explains the different reasons why an SPTE may be non-writable and how
KVM keeps track of them with the {Host,MMU}-writable bits.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c  |  10 ++--
 arch/x86/kvm/mmu/spte.h | 105 +++++++++++++++++++++-------------------
 2 files changed, 60 insertions(+), 55 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 88f3aa5f2a36..f8a508b3c3e7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -549,11 +549,9 @@ static u64 mmu_spte_update_no_track(u64 *sptep, u64 new_spte)
 /* Rules for using mmu_spte_update:
  * Update the state bits, it means the mapped pfn is not changed.
  *
- * Whenever we overwrite a writable spte with a read-only one we
- * should flush remote TLBs. Otherwise rmap_write_protect
- * will find a read-only spte, even though the writable spte
- * might be cached on a CPU's TLB, the return value indicates this
- * case.
+ * Whenever an MMU-writable SPTE is overwritten with a read-only SPTE, remote
+ * TLBs must be flushed. Otherwise rmap_write_protect will find a read-only
+ * spte, even though the writable spte might be cached on a CPU's TLB.
  *
  * Returns true if the TLB needs to be flushed
  */
@@ -5847,7 +5845,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 	 * will clear a separate software-only bit (MMU-writable) and skip the
 	 * flush if-and-only-if this bit was already clear.
 	 *
-	 * See DEFAULT_SPTE_MMU_WRITABLE for more details.
+	 * See is_writable_pte() for more details.
 	 */
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index a179f089e3dd..08f471d8e409 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -75,28 +75,8 @@ static_assert(SPTE_TDP_AD_ENABLED_MASK == 0);
 static_assert(!(SPTE_TDP_AD_MASK & SHADOW_ACC_TRACK_SAVED_MASK));
 
 /*
- * *_SPTE_HOST_WRITABLE (aka Host-writable) indicates whether the host permits
- * writes to the guest page mapped by the SPTE. This bit is cleared on SPTEs
- * that map guest pages in read-only memslots and read-only VMAs.
- *
- * Invariants:
- *  - If Host-writable is clear, PT_WRITABLE_MASK must be clear.
- *
- *
- * *_SPTE_MMU_WRITABLE (aka MMU-writable) indicates whether the shadow MMU
- * allows writes to the guest page mapped by the SPTE. This bit is cleared when
- * the guest page mapped by the SPTE contains a page table that is being
- * monitored for shadow paging. In this case the SPTE can only be made writable
- * by unsyncing the shadow page under the mmu_lock.
- *
- * Invariants:
- *  - If MMU-writable is clear, PT_WRITABLE_MASK must be clear.
- *  - If MMU-writable is set, Host-writable must be set.
- *
- * If MMU-writable is set, PT_WRITABLE_MASK is normally set but can be cleared
- * to track writes for dirty logging. For such SPTEs, KVM will locklessly set
- * PT_WRITABLE_MASK upon the next write from the guest and record the write in
- * the dirty log (see fast_page_fault()).
+ * {DEFAULT,EPT}_SPTE_{HOST,MMU}_WRITABLE are used to keep track of why a given
+ * SPTE is write-protected. See is_writable_pte() for details.
  */
 
 /* Bits 9 and 10 are ignored by all non-EPT PTEs. */
@@ -340,37 +320,64 @@ static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
 }
 
 /*
- * Currently, we have two sorts of write-protection, a) the first one
- * write-protects guest page to sync the guest modification, b) another one is
- * used to sync dirty bitmap when we do KVM_GET_DIRTY_LOG. The differences
- * between these two sorts are:
- * 1) the first case clears MMU-writable bit.
- * 2) the first case requires flushing tlb immediately avoiding corrupting
- *    shadow page table between all vcpus so it should be in the protection of
- *    mmu-lock. And the another case does not need to flush tlb until returning
- *    the dirty bitmap to userspace since it only write-protects the page
- *    logged in the bitmap, that means the page in the dirty bitmap is not
- *    missed, so it can flush tlb out of mmu-lock.
+ * A shadow-present leaf SPTE may be non-writable for 3 possible reasons:
+ *
+ *  1. To intercept writes for dirty logging. KVM write-protects huge pages
+ *     so that they can be split down into the dirty logging granularity
+ *     (4KiB) whenever the guest writes to them. KVM also write-protects
+ *     4KiB pages so that writes can be recorded in the dirty log (e.g. if
+ *     not using PML). SPTEs are write-protected for dirty logging during
+ *     the VM-ioctls that enable dirty logging.
+ *
+ *  2. To intercept writes to guest page tables that KVM is shadowing. When
+ *     a guest writes to its page table the corresponding shadow page table
+ *     will be marked "unsync". That way KVM knows which shadow page tables
+ *     need to be updated on the next TLB flush, INVLPG, etc. and which do
+ *     not.
+ *
+ *  3. To prevent guest writes to read-only memory, such as for memory in a
+ *     read-only memslot or guest memory backed by a read-only VMA. Writes
+ *     to such pages are disallowed entirely.
+ *
+ * To keep track of why a given SPTE is write-protected, KVM uses 2
+ * software-only bits in the SPTE:
+ *
+ *  shadow_mmu_writable_mask, aka MMU-writable -
+ *    Cleared on SPTEs that KVM is currently write-protecting for shadow
+ *    paging purposes (case 2 above).
+ *
+ *  shadow_host_writable_mask, aka Host-writable -
+ *    Cleared on SPTEs that are not host-writable (case 3 above)
+ *
+ * Note, not all possible combinations of PT_WRITABLE_MASK,
+ * shadow_mmu_writable_mask, and shadow_host_writable_mask are valid. A given
+ * SPTE can be in only one of the following states, which map to the
+ * aforementioned 3 cases:
+ *
+ *   shadow_host_writable_mask | shadow_mmu_writable_mask | PT_WRITABLE_MASK
+ *   ------------------------- | ------------------------ | ----------------
+ *   1                         | 1                        | 1 (writable)
+ *   1                         | 1                        | 0 (case 1)
+ *   1                         | 0                        | 0 (case 2)
+ *   0                         | 0                        | 0 (case 3)
  *
- * So, there is the problem: the first case can meet the corrupted tlb caused
- * by another case which write-protects pages but without flush tlb
- * immediately. In order to making the first case be aware this problem we let
- * it flush tlb if we try to write-protect a spte whose MMU-writable bit
- * is set, it works since another case never touches MMU-writable bit.
+ * The valid combinations of these bits are checked by
+ * check_spte_writable_invariants() whenever an SPTE is modified.
  *
- * Anyway, whenever a spte is updated (only permission and status bits are
- * changed) we need to check whether the spte with MMU-writable becomes
- * readonly, if that happens, we need to flush tlb. Fortunately,
- * mmu_spte_update() has already handled it perfectly.
+ * Clearing the MMU-writable bit is always done under the MMU lock and always
+ * accompanied by a TLB flush before dropping the lock to avoid corrupting the
+ * shadow page tables between vCPUs. Write-protecting an SPTE for dirty logging
+ * (which does not clear the MMU-writable bit) does not flush TLBs before
+ * dropping the lock, as it only needs to synchronize guest writes with the
+ * dirty bitmap.
  *
- * The rules to use MMU-writable and PT_WRITABLE_MASK:
- * - if we want to see if it has writable tlb entry or if the spte can be
- *   writable on the mmu mapping, check MMU-writable, this is the most
- *   case, otherwise
- * - if we fix page fault on the spte or do write-protection by dirty logging,
- *   check PT_WRITABLE_MASK.
+ * So, there is the problem: clearing the MMU-writable bit can encounter a
+ * write-protected SPTE while CPUs still have writable mappings for that SPTE
+ * cached in their TLB. To address this, KVM always flushes TLBs when
+ * write-protecting SPTEs if the MMU-writable bit is set on the old SPTE.
  *
- * TODO: introduce APIs to split these two cases.
+ * The Host-writable bit is not modified on present SPTEs, it is only set or
+ * cleared when an SPTE is first faulted in from non-present and then remains
+ * immutable.
  */
 static inline bool is_writable_pte(unsigned long pte)
 {
 	return pte & PT_WRITABLE_MASK;
 }
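The state table in the consolidated comment can be checked mechanically.
The following standalone program (illustrative only; the bit positions
are the non-EPT defaults, and spte_writability_valid() is not a KVM
function) enumerates all eight bit combinations and confirms that
exactly the four rows of the table are valid.

/* Enumerate the writability states from the comment's truth table. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

#define PT_WRITABLE_MASK	(1ULL << 1)
#define HOST_WRITABLE		(1ULL << 9)
#define MMU_WRITABLE		(1ULL << 10)

static bool spte_writability_valid(u64 spte)
{
	bool host = spte & HOST_WRITABLE;
	bool mmu = spte & MMU_WRITABLE;
	bool hw = spte & PT_WRITABLE_MASK;

	/* MMU-writable implies Host-writable; HW-writable implies MMU-writable. */
	return (!mmu || host) && (!hw || mmu);
}

int main(void)
{
	for (int host = 0; host <= 1; host++)
		for (int mmu = 0; mmu <= 1; mmu++)
			for (int hw = 0; hw <= 1; hw++) {
				u64 spte = (host ? HOST_WRITABLE : 0) |
					   (mmu ? MMU_WRITABLE : 0) |
					   (hw ? PT_WRITABLE_MASK : 0);

				printf("host=%d mmu=%d hw=%d -> %s\n",
				       host, mmu, hw,
				       spte_writability_valid(spte) ?
				       "valid" : "invalid");
			}
	return 0;
}

Only (1,1,1), (1,1,0), (1,0,0) and (0,0,0) print "valid", matching the
writable, case 1, case 2 and case 3 rows respectively.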