From patchwork Thu Jun 9 14:05:24 2011
X-Patchwork-Submitter: Takuya Yoshikawa
X-Patchwork-Id: 865552
Date: Thu, 9 Jun 2011 23:05:24 +0900
From: Takuya Yoshikawa
To: avi@redhat.com, mtosatti@redhat.com
Cc: kvm@vger.kernel.org, yoshikawa.takuya@oss.ntt.co.jp, mingo@elte.hu
Subject: [PATCH 4/4] KVM: MMU: Split out the main body of walk_addr_generic()
Message-Id: <20110609230524.346b3d1c.takuya.yoshikawa@gmail.com>
In-Reply-To: <20110609225949.91cce4a0.takuya.yoshikawa@gmail.com>
References: <20110609225949.91cce4a0.takuya.yoshikawa@gmail.com>

From: Takuya Yoshikawa

The code has clearly suffered from over inlining.  So make the body of
the walk loop a separate function: do_walk().

This will make it easy to do more cleanups and optimizations later.

This was suggested by Ingo Molnar.

Cc: Ingo Molnar
Signed-off-by: Takuya Yoshikawa
---
 arch/x86/kvm/mmu.c         |   21 ++++
 arch/x86/kvm/paging_tmpl.h |  227 ++++++++++++++++++++++++--------------------
 2 files changed, 145 insertions(+), 103 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 2d14434..16ccf4b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -69,6 +69,27 @@ char *audit_point_name[] = {
         "post sync"
 };
 
+/*
+ * do_walk() returns one of these.
+ *
+ * WALK_NEXT: Continue the walk loop.
+ * WALK_DONE: Break from the walk loop.
+ * WALK_RETRY: Retry walk.
+ * WALK_NOT_PRESENT: Set PFERR_PRESENT_MASK and goto error.
+ * WALK_RSVD_FAULT: Set PFERR_RSVD_MASK and goto error.
+ * WALK_ERROR: Goto error.
+ * WALK_ABORT: Return immediately.
+ */
+enum {
+        WALK_NEXT,
+        WALK_DONE,
+        WALK_RETRY,
+        WALK_NOT_PRESENT,
+        WALK_RSVD_FAULT,
+        WALK_ERROR,
+        WALK_ABORT
+};
+
 #undef MMU_DEBUG
 
 #ifdef MMU_DEBUG
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 711336b..4913aa5 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -114,6 +114,111 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte)
 }
 
 /*
+ * Walk one level.
+ * Guest pte and its user address will be put in *pte and *ptep_user.
+ */
+static inline int
+FNAME(do_walk)(struct guest_walker *walker, struct kvm_vcpu *vcpu,
+               struct kvm_mmu *mmu, gva_t addr, u32 access, bool *eperm,
+               pt_element_t *pte, pt_element_t __user **ptep_user)
+{
+        gfn_t real_gfn;
+        unsigned long host_addr;
+        unsigned index = PT_INDEX(addr, walker->level);
+        int offset = index * sizeof(pt_element_t);
+        gfn_t table_gfn = gpte_to_gfn(*pte);
+        gpa_t pte_gpa = gfn_to_gpa(table_gfn) + offset;
+        const int write_fault = access & PFERR_WRITE_MASK;
+        const int user_fault = access & PFERR_USER_MASK;
+        const int fetch_fault = access & PFERR_FETCH_MASK;
+
+        walker->table_gfn[walker->level - 1] = table_gfn;
+        walker->pte_gpa[walker->level - 1] = pte_gpa;
+
+        real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
+                                      PFERR_USER_MASK|PFERR_WRITE_MASK);
+        if (unlikely(real_gfn == UNMAPPED_GVA))
+                return WALK_NOT_PRESENT;
+        real_gfn = gpa_to_gfn(real_gfn);
+
+        host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
+        if (unlikely(kvm_is_error_hva(host_addr)))
+                return WALK_NOT_PRESENT;
+
+        *ptep_user = (pt_element_t __user *)((void *)host_addr + offset);
+        if (unlikely(__copy_from_user(pte, *ptep_user, sizeof(*pte))))
+                return WALK_NOT_PRESENT;
+
+        trace_kvm_mmu_paging_element(*pte, walker->level);
+
+        if (unlikely(!is_present_gpte(*pte)))
+                return WALK_NOT_PRESENT;
+
+        if (unlikely(is_rsvd_bits_set(&vcpu->arch.mmu, *pte, walker->level)))
+                return WALK_RSVD_FAULT;
+
+        if (unlikely(write_fault && !is_writable_pte(*pte)
+                     && (user_fault || is_write_protection(vcpu))))
+                *eperm = true;
+
+        if (unlikely(user_fault && !(*pte & PT_USER_MASK)))
+                *eperm = true;
+
+#if PTTYPE == 64
+        if (unlikely(fetch_fault && (*pte & PT64_NX_MASK)))
+                *eperm = true;
+#endif
+
+        if (!*eperm && unlikely(!(*pte & PT_ACCESSED_MASK))) {
+                int ret;
+
+                trace_kvm_mmu_set_accessed_bit(table_gfn, index, sizeof(*pte));
+                ret = FNAME(cmpxchg_gpte)(vcpu, mmu, *ptep_user, index,
+                                          *pte, *pte|PT_ACCESSED_MASK);
+                if (unlikely(ret < 0))
+                        return WALK_NOT_PRESENT;
+                else if (ret)
+                        return WALK_RETRY;
+
+                mark_page_dirty(vcpu->kvm, table_gfn);
+                *pte |= PT_ACCESSED_MASK;
+        }
+
+        walker->pte_access = walker->pt_access & FNAME(gpte_access)(vcpu, *pte);
+
+        walker->ptes[walker->level - 1] = *pte;
+
+        if ((walker->level == PT_PAGE_TABLE_LEVEL) ||
+            ((walker->level == PT_DIRECTORY_LEVEL) && is_large_pte(*pte) &&
+             (PTTYPE == 64 || is_pse(vcpu))) ||
+            ((walker->level == PT_PDPE_LEVEL) && is_large_pte(*pte) &&
+             (mmu->root_level == PT64_ROOT_LEVEL))) {
+                gpa_t real_gpa;
+                gfn_t gfn;
+                u32 ac;
+
+                gfn = gpte_to_gfn_lvl(*pte, walker->level);
+                gfn += (addr & PT_LVL_OFFSET_MASK(walker->level)) >> PAGE_SHIFT;
+
+                if (PTTYPE == 32 && (walker->level == PT_DIRECTORY_LEVEL) &&
+                    is_cpuid_PSE36())
+                        gfn += pse36_gfn_delta(*pte);
+
+                ac = write_fault | fetch_fault | user_fault;
+
+                real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn),
+                                              ac);
+                if (real_gpa == UNMAPPED_GVA)
+                        return WALK_ABORT;
+
+                walker->gfn = real_gpa >> PAGE_SHIFT;
+
+                return WALK_DONE;
+        }
+
+        return WALK_NEXT;
+}
+
+/*
  * Fetch a guest pte for a guest virtual address
  */
 static int FNAME(walk_addr_generic)(struct guest_walker *walker,
@@ -130,7 +235,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 
         trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
                                      fetch_fault);
-walk:
+walk_retry:
         eperm = false;
         walker->level = mmu->root_level;
         pte = mmu->get_cr3(vcpu);
@@ -152,118 +257,34 @@ walk:
         walker->pt_access = ACC_ALL;
 
         for (;;) {
-                gfn_t real_gfn;
-                unsigned long host_addr;
-                unsigned index = PT_INDEX(addr, walker->level);
-                int offset = index * sizeof(pt_element_t);
-                gfn_t table_gfn = gpte_to_gfn(pte);
-                gpa_t pte_gpa = gfn_to_gpa(table_gfn) + offset;
-
-                walker->table_gfn[walker->level - 1] = table_gfn;
-                walker->pte_gpa[walker->level - 1] = pte_gpa;
-
-                real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
-                                              PFERR_USER_MASK|PFERR_WRITE_MASK);
-                if (unlikely(real_gfn == UNMAPPED_GVA)) {
-                        errcode |= PFERR_PRESENT_MASK;
-                        goto error;
-                }
-                real_gfn = gpa_to_gfn(real_gfn);
-
-                host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
-                if (unlikely(kvm_is_error_hva(host_addr))) {
-                        errcode |= PFERR_PRESENT_MASK;
-                        goto error;
-                }
-
-                ptep_user = (pt_element_t __user *)((void *)host_addr + offset);
-                if (unlikely(__copy_from_user(&pte, ptep_user, sizeof(pte)))) {
-                        errcode |= PFERR_PRESENT_MASK;
-                        goto error;
-                }
-
-                trace_kvm_mmu_paging_element(pte, walker->level);
+                int ret;
 
-                if (unlikely(!is_present_gpte(pte))) {
+                ret = FNAME(do_walk)(walker, vcpu, mmu, addr, access,
+                                     &eperm, &pte, &ptep_user);
+                switch (ret) {
+                case WALK_NEXT:
+                        break;
+                case WALK_DONE:
+                        goto walk_done;
+                case WALK_RETRY:
+                        goto walk_retry;
+                case WALK_NOT_PRESENT:
                         errcode |= PFERR_PRESENT_MASK;
                         goto error;
-                }
-
-                if (unlikely(is_rsvd_bits_set(&vcpu->arch.mmu, pte,
-                                              walker->level))) {
+                case WALK_RSVD_FAULT:
                         errcode |= PFERR_RSVD_MASK;
                         goto error;
-                }
-
-                if (unlikely(write_fault && !is_writable_pte(pte)
-                             && (user_fault || is_write_protection(vcpu))))
-                        eperm = true;
-
-                if (unlikely(user_fault && !(pte & PT_USER_MASK)))
-                        eperm = true;
-
-#if PTTYPE == 64
-                if (unlikely(fetch_fault && (pte & PT64_NX_MASK)))
-                        eperm = true;
-#endif
-
-                if (!eperm && unlikely(!(pte & PT_ACCESSED_MASK))) {
-                        int ret;
-                        trace_kvm_mmu_set_accessed_bit(table_gfn, index,
-                                                       sizeof(pte));
-                        ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index,
-                                                  pte, pte|PT_ACCESSED_MASK);
-                        if (unlikely(ret < 0)) {
-                                errcode |= PFERR_PRESENT_MASK;
-                                goto error;
-                        } else if (ret)
-                                goto walk;
-
-                        mark_page_dirty(vcpu->kvm, table_gfn);
-                        pte |= PT_ACCESSED_MASK;
-                }
-
-                walker->pte_access = walker->pt_access &
-                                     FNAME(gpte_access)(vcpu, pte);
-
-                walker->ptes[walker->level - 1] = pte;
-
-                if ((walker->level == PT_PAGE_TABLE_LEVEL) ||
-                    ((walker->level == PT_DIRECTORY_LEVEL) &&
-                     is_large_pte(pte) &&
-                     (PTTYPE == 64 || is_pse(vcpu))) ||
-                    ((walker->level == PT_PDPE_LEVEL) &&
-                     is_large_pte(pte) &&
-                     mmu->root_level == PT64_ROOT_LEVEL)) {
-                        int lvl = walker->level;
-                        gpa_t real_gpa;
-                        gfn_t gfn;
-                        u32 ac;
-
-                        gfn = gpte_to_gfn_lvl(pte, lvl);
-                        gfn += (addr & PT_LVL_OFFSET_MASK(lvl)) >> PAGE_SHIFT;
-
-                        if (PTTYPE == 32 &&
-                            walker->level == PT_DIRECTORY_LEVEL &&
-                            is_cpuid_PSE36())
-                                gfn += pse36_gfn_delta(pte);
-
-                        ac = write_fault | fetch_fault | user_fault;
-
-                        real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn),
-                                                      ac);
-                        if (real_gpa == UNMAPPED_GVA)
-                                return 0;
-
-                        walker->gfn = real_gpa >> PAGE_SHIFT;
-
-                        break;
+                case WALK_ERROR:
+                        goto error;
+                case WALK_ABORT:
+                        return 0;
                 }
 
                 walker->pt_access = walker->pte_access;
                 --walker->level;
         }
 
+walk_done:
         if (unlikely(eperm))
                 goto error;
@@ -279,7 +300,7 @@ walk:
                         errcode |= PFERR_PRESENT_MASK;
                         goto error;
                 } else if (ret)
-                        goto walk;
+                        goto walk_retry;
 
                 mark_page_dirty(vcpu->kvm, table_gfn);
                 pte |= PT_DIRTY_MASK;
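
For readers who want to see the resulting control flow in isolation: below is a
minimal userspace sketch of the dispatch pattern that walk_addr_generic() uses
on do_walk()'s return codes, where the caller keeps the retry/done labels and
the error bookkeeping while the per-level work only reports a WALK_* code.
Only the WALK_* names come from the patch; walk_one_level(), MAX_LEVELS and the
stub logic are invented for illustration and are not part of the patch.

#include <stdio.h>

/* Mirrors the enum added to mmu.c above. */
enum {
        WALK_NEXT,
        WALK_DONE,
        WALK_RETRY,
        WALK_NOT_PRESENT,
        WALK_RSVD_FAULT,
        WALK_ERROR,
        WALK_ABORT
};

#define MAX_LEVELS 4

/*
 * Stand-in for FNAME(do_walk)(): walk one level and tell the caller what
 * to do next.  Here it simply succeeds once the last level is reached.
 */
static int walk_one_level(int level)
{
        return (level == 1) ? WALK_DONE : WALK_NEXT;
}

static int walk(void)
{
        int level;

walk_retry:
        level = MAX_LEVELS;
        for (;;) {
                switch (walk_one_level(level)) {
                case WALK_NEXT:
                        break;                  /* descend to the next level */
                case WALK_DONE:
                        goto walk_done;
                case WALK_RETRY:
                        goto walk_retry;        /* e.g. a racing gpte update */
                case WALK_ABORT:
                        return 0;
                default:                        /* NOT_PRESENT, RSVD_FAULT, ERROR */
                        goto error;
                }
                --level;
        }
walk_done:
        printf("walk finished at level %d\n", level);
        return 1;
error:
        return 0;
}

int main(void)
{
        return walk() ? 0 : 1;
}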