From patchwork Tue Jul 30 13:02:01 2013
X-Patchwork-Submitter: Xiao Guangrong
X-Patchwork-Id: 2835633
From: Xiao Guangrong
To: gleb@redhat.com
Cc: avi.kivity@gmail.com, mtosatti@redhat.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Xiao Guangrong
Subject: [PATCH 03/12] KVM: MMU: lazily drop large spte
Date: Tue, 30 Jul 2013 21:02:01 +0800
Message-Id: <1375189330-24066-4-git-send-email-xiaoguangrong@linux.vnet.ibm.com>
In-Reply-To: <1375189330-24066-1-git-send-email-xiaoguangrong@linux.vnet.ibm.com>
References: <1375189330-24066-1-git-send-email-xiaoguangrong@linux.vnet.ibm.com>
List-ID: <kvm.vger.kernel.org>

Currently, kvm zaps the large spte if write protection is needed, so a
later read can fault on that spte.
Instead, we can make the large spte read-only rather than non-present,
so the page fault caused by a read access is avoided.

The idea is from Avi:
| As I mentioned before, write-protecting a large spte is a good idea,
| since it moves some work from protect-time to fault-time, so it reduces
| jitter.  This removes the need for the return value.

[ It has fixed the issue reported in 6b73a9606 by stopping fast page fault
  marking the large spte writable ]

Signed-off-by: Xiao Guangrong
---
 arch/x86/kvm/mmu.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cf163ca..35d4b50 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1181,8 +1181,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
 
 /*
  * Write-protect on the specified @sptep, @pt_protect indicates whether
- * spte writ-protection is caused by protecting shadow page table.
- * @flush indicates whether tlb need be flushed.
+ * spte write-protection is caused by protecting shadow page table.
  *
  * Note: write protection is difference between drity logging and spte
  * protection:
@@ -1191,10 +1190,9 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
  * - for spte protection, the spte can be writable only after unsync-ing
  *   shadow page.
  *
- * Return true if the spte is dropped.
+ * Return true if tlb need be flushed.
  */
-static bool
-spte_write_protect(struct kvm *kvm, u64 *sptep, bool *flush, bool pt_protect)
+static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool pt_protect)
 {
 	u64 spte = *sptep;
 
@@ -1204,17 +1202,11 @@ spte_write_protect(struct kvm *kvm, u64 *sptep, bool *flush, bool pt_protect)
 
 	rmap_printk("rmap_write_protect: spte %p %llx\n", sptep, *sptep);
 
-	if (__drop_large_spte(kvm, sptep)) {
-		*flush |= true;
-		return true;
-	}
-
 	if (pt_protect)
 		spte &= ~SPTE_MMU_WRITEABLE;
 	spte = spte & ~PT_WRITABLE_MASK;
 
-	*flush |= mmu_spte_update(sptep, spte);
-	return false;
+	return mmu_spte_update(sptep, spte);
 }
 
 static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
@@ -1226,11 +1218,8 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
 	for (sptep = rmap_get_first(*rmapp, &iter); sptep;) {
 		BUG_ON(!(*sptep & PT_PRESENT_MASK));
 
-		if (spte_write_protect(kvm, sptep, &flush, pt_protect)) {
-			sptep = rmap_get_first(*rmapp, &iter);
-			continue;
-		}
 
+		flush |= spte_write_protect(kvm, sptep, pt_protect);
 		sptep = rmap_get_next(&iter);
 	}
 
@@ -2701,6 +2690,8 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 			break;
 		}
 
+		drop_large_spte(vcpu, iterator.sptep);
+
 		if (!is_shadow_present_pte(*iterator.sptep)) {
 			u64 base_addr = iterator.addr;
 
@@ -2855,7 +2846,7 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
  * - false: let the real page fault path to fix it.
  */
 static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
-			    u32 error_code)
+			    u32 error_code, bool force_pt_level)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -2884,6 +2875,13 @@ static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, int level,
 		goto exit;
 
 	/*
+	 * Can not map the large spte to writable if the page is dirty
+	 * logged.
+	 */
+	if (sp->role.level > PT_PAGE_TABLE_LEVEL && force_pt_level)
+		goto exit;
+
+	/*
 	 * Check if it is a spurious fault caused by TLB lazily flushed.
	 *
	 * Need not check the access of upper level table entries since
@@ -2944,7 +2942,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	} else
 		level = PT_PAGE_TABLE_LEVEL;
 
-	if (fast_page_fault(vcpu, v, level, error_code))
+	if (fast_page_fault(vcpu, v, level, error_code, force_pt_level))
 		return 0;
 
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
@@ -3422,7 +3420,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 	} else
 		level = PT_PAGE_TABLE_LEVEL;
 
-	if (fast_page_fault(vcpu, gpa, level, error_code))
+	if (fast_page_fault(vcpu, gpa, level, error_code, force_pt_level))
 		return 0;
 
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
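
For readers skimming the series without the tree handy, the gist of the
new flow is that spte_write_protect() now only clears the writable bits
and tells the caller whether a TLB flush is needed; the large spte itself
stays present and read-only, and is dropped lazily on the write-fault
path via drop_large_spte() in __direct_map().  A minimal stand-alone
sketch of that behaviour follows; the bit layout, the simplified
mmu_spte_update() and the main() driver are illustrative stand-ins, not
the kernel's real definitions:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PT_WRITABLE_MASK	(1ull << 1)	/* hypothetical bit layout */
#define SPTE_MMU_WRITEABLE	(1ull << 2)	/* hypothetical bit layout */

/*
 * Update an spte and report whether the old value may still be cached
 * with write access in the TLB, i.e. whether a flush is needed.
 */
static bool mmu_spte_update(uint64_t *sptep, uint64_t new_spte)
{
	uint64_t old = *sptep;

	*sptep = new_spte;
	return (old & PT_WRITABLE_MASK) != 0;
}

/*
 * New behaviour: only strip write access.  The (large) spte stays
 * present and read-only, so a later read access does not fault.
 */
static bool spte_write_protect(uint64_t *sptep, bool pt_protect)
{
	uint64_t spte = *sptep;

	if (pt_protect)
		spte &= ~SPTE_MMU_WRITEABLE;
	spte &= ~PT_WRITABLE_MASK;

	return mmu_spte_update(sptep, spte);
}

int main(void)
{
	/* one writable "large" spte */
	uint64_t spte = 0x1000 | PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE;
	bool flush = false;

	flush |= spte_write_protect(&spte, true);
	printf("spte = %#llx, flush needed = %d\n",
	       (unsigned long long)spte, flush);
	return 0;
}

Keeping the spte present but read-only is what moves the zap from
protect-time to write-fault-time, which is the jitter reduction Avi
describes in the quote above.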