From patchwork Thu Aug 31 11:22:23 2017
X-Patchwork-Submitter: Wei Liu
X-Patchwork-Id: 9931925
From: Wei Liu <wei.liu2@citrix.com>
To: Xen-devel
Date: Thu, 31 Aug 2017 12:22:23 +0100
Message-ID: <20170831112223.24761-3-wei.liu2@citrix.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170831112223.24761-1-wei.liu2@citrix.com>
References: <20170831112223.24761-1-wei.liu2@citrix.com>
Cc: Andrew Cooper, Wei Liu, Jan Beulich
Subject: [Xen-devel] [PATCH v2 2/2] x86/mm: merge ptwr and mmio_ro page fault handlers

Provide a unified entry point to avoid going through the PTE look-up, decode
and emulation cycle more than necessary. The path taken is determined by the
faulting address.

Note that the order of checks is changed in the new function, but the order
in which the checks are performed shouldn't matter.

The sole caller is changed to use the new function.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich
Cc: Andrew Cooper

v2: rebase and (sorta) merge Andrew's changes

Dom0 boots fine. XTF is happy with this change. Let me know if more tests
can be done.
---
 xen/arch/x86/mm.c        | 290 +++++++++++++++++++++--------------------------
 xen/arch/x86/traps.c     |  20 ++--
 xen/include/asm-x86/mm.h |   5 +-
 3 files changed, 140 insertions(+), 175 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3306088255..429442aa1d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5176,91 +5176,6 @@ static const struct x86_emulate_ops ptwr_emulate_ops = {
     .cpuid = pv_emul_cpuid,
 };
 
-/* Write page fault handler: check if guest is trying to modify a PTE. */
-int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
-                       struct cpu_user_regs *regs)
-{
-    struct domain *d = v->domain;
-    struct page_info *page;
-    l1_pgentry_t pte;
-    struct ptwr_emulate_ctxt ptwr_ctxt;
-    struct x86_emulate_ctxt ctxt = {
-        .regs = regs,
-        .vendor = d->arch.cpuid->x86_vendor,
-        .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-        .sp_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-        .lma = !is_pv_32bit_domain(d),
-        .data = &ptwr_ctxt,
-    };
-    int rc;
-
-    /* Attempt to read the PTE that maps the VA being accessed. */
-    pte = guest_get_eff_l1e(addr);
-
-    /* We are looking only for read-only mappings of p.t. pages. */
-    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) ||
-         rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte)) ||
-         !get_page_from_mfn(l1e_get_mfn(pte), d) )
-        goto bail;
-
-    page = l1e_get_page(pte);
-    if ( !page_lock(page) )
-    {
-        put_page(page);
-        goto bail;
-    }
-
-    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
-    {
-        page_unlock(page);
-        put_page(page);
-        goto bail;
-    }
-
-    ptwr_ctxt = (struct ptwr_emulate_ctxt) {
-        .cr2 = addr,
-        .pte = pte,
-    };
-
-    rc = x86_emulate(&ctxt, &ptwr_emulate_ops);
-
-    page_unlock(page);
-    put_page(page);
-
-    switch ( rc )
-    {
-    case X86EMUL_EXCEPTION:
-        /*
-         * This emulation only covers writes to pagetables which are marked
-         * read-only by Xen. We tolerate #PF (in case a concurrent pagetable
-         * update has succeeded on a different vcpu). Anything else is an
-         * emulation bug, or a guest playing with the instruction stream under
-         * Xen's feet.
-         */
-        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
-             ctxt.event.vector == TRAP_page_fault )
-            pv_inject_event(&ctxt.event);
-        else
-            gdprintk(XENLOG_WARNING,
-                     "Unexpected event (type %u, vector %#x) from emulation\n",
-                     ctxt.event.type, ctxt.event.vector);
-
-        /* Fallthrough */
-    case X86EMUL_OKAY:
-
-        if ( ctxt.retire.singlestep )
-            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-
-        /* Fallthrough */
-    case X86EMUL_RETRY:
-        perfc_incr(ptwr_emulations);
-        return EXCRET_fault_fixed;
-    }
-
- bail:
-    return 0;
-}
-
 /*************************
  * fault handling for read-only MMIO pages
  */
@@ -5333,83 +5248,6 @@ static const struct x86_emulate_ops mmcfg_intercept_ops = {
     .cpuid = pv_emul_cpuid,
 };
 
-/* Check if guest is trying to modify a r/o MMIO page. */
-int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
-                          struct cpu_user_regs *regs)
-{
-    l1_pgentry_t pte;
-    unsigned long mfn;
-    unsigned int addr_size = is_pv_32bit_vcpu(v) ? 32 : BITS_PER_LONG;
-    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = addr };
-    struct x86_emulate_ctxt ctxt = {
-        .regs = regs,
-        .vendor = v->domain->arch.cpuid->x86_vendor,
-        .addr_size = addr_size,
-        .sp_size = addr_size,
-        .lma = !is_pv_32bit_vcpu(v),
-        .data = &mmio_ro_ctxt,
-    };
-    int rc;
-
-    /* Attempt to read the PTE that maps the VA being accessed. */
-    pte = guest_get_eff_l1e(addr);
-
-    /* We are looking only for read-only mappings of MMIO pages. */
-    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) )
-        return 0;
-
-    mfn = l1e_get_pfn(pte);
-    if ( mfn_valid(_mfn(mfn)) )
-    {
-        struct page_info *page = mfn_to_page(_mfn(mfn));
-        struct domain *owner = page_get_owner_and_reference(page);
-
-        if ( owner )
-            put_page(page);
-        if ( owner != dom_io )
-            return 0;
-    }
-
-    if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
-        return 0;
-
-    if ( pci_ro_mmcfg_decode(mfn, &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) )
-        rc = x86_emulate(&ctxt, &mmcfg_intercept_ops);
-    else
-        rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
-
-    switch ( rc )
-    {
-    case X86EMUL_EXCEPTION:
-        /*
-         * This emulation only covers writes to MMCFG space or read-only MFNs.
-         * We tolerate #PF (from hitting an adjacent page or a successful
-         * concurrent pagetable update). Anything else is an emulation bug,
-         * or a guest playing with the instruction stream under Xen's feet.
-         */
-        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
-             ctxt.event.vector == TRAP_page_fault )
-            pv_inject_event(&ctxt.event);
-        else
-            gdprintk(XENLOG_WARNING,
-                     "Unexpected event (type %u, vector %#x) from emulation\n",
-                     ctxt.event.type, ctxt.event.vector);
-
-        /* Fallthrough */
-    case X86EMUL_OKAY:
-
-        if ( ctxt.retire.singlestep )
-            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-
-        /* Fallthrough */
-    case X86EMUL_RETRY:
-        perfc_incr(mmio_ro_emulations);
-        return EXCRET_fault_fixed;
-    }
-
-    return 0;
-}
-
 void *alloc_xen_pagetable(void)
 {
     if ( system_state != SYS_STATE_early_boot )
@@ -6441,6 +6279,134 @@ void write_32bit_pse_identmap(uint32_t *l2)
                  _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
 }
 
+/* Check if guest is trying to modify a r/o MMIO page. */
+static int mmio_ro_do_page_fault(struct x86_emulate_ctxt *ctxt,
+                                 unsigned long addr, l1_pgentry_t pte)
+{
+    struct mmio_ro_emulate_ctxt mmio_ro_ctxt = { .cr2 = addr };
+    mfn_t mfn = l1e_get_mfn(pte);
+
+    if ( mfn_valid(mfn) )
+    {
+        struct page_info *page = mfn_to_page(mfn);
+        struct domain *owner = page_get_owner_and_reference(page);
+
+        if ( owner )
+            put_page(page);
+        if ( owner != dom_io )
+            return X86EMUL_UNHANDLEABLE;
+    }
+
+    ctxt->data = &mmio_ro_ctxt;
+    if ( pci_ro_mmcfg_decode(mfn_x(mfn), &mmio_ro_ctxt.seg, &mmio_ro_ctxt.bdf) )
+        return x86_emulate(ctxt, &mmcfg_intercept_ops);
+    else
+        return x86_emulate(ctxt, &mmio_ro_emulate_ops);
+}
+
+/* Write page fault handler: check if guest is trying to modify a PTE. */
+static int ptwr_do_page_fault(struct x86_emulate_ctxt *ctxt, struct domain *d,
+                              unsigned long addr, l1_pgentry_t pte)
+{
+    struct ptwr_emulate_ctxt ptwr_ctxt = {
+        .cr2 = addr,
+        .pte = pte,
+    };
+    struct page_info *page;
+    int rc = X86EMUL_UNHANDLEABLE;
+
+    if ( !get_page_from_mfn(l1e_get_mfn(pte), d) )
+        goto out;
+
+    page = l1e_get_page(pte);
+    if ( !page_lock(page) )
+    {
+        put_page(page);
+        goto out;
+    }
+
+    if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_l1_page_table )
+    {
+        page_unlock(page);
+        put_page(page);
+        goto out;
+    }
+
+    ctxt->data = &ptwr_ctxt;
+    rc = x86_emulate(ctxt, &ptwr_emulate_ops);
+
+    page_unlock(page);
+    put_page(page);
+
+ out:
+    return rc;
+}
+
+int pv_ro_page_fault(struct vcpu *v, unsigned long addr,
+                     struct cpu_user_regs *regs)
+{
+    l1_pgentry_t pte;
+    struct domain *d = v->domain;
+    unsigned int addr_size = is_pv_32bit_vcpu(v) ? 32 : BITS_PER_LONG;
+    struct x86_emulate_ctxt ctxt = {
+        .regs = regs,
+        .vendor = d->arch.cpuid->x86_vendor,
+        .addr_size = addr_size,
+        .sp_size = addr_size,
+        .lma = !is_pv_32bit_vcpu(v),
+    };
+    int rc;
+    bool mmio_ro;
+
+    /* Attempt to read the PTE that maps the VA being accessed. */
+    pte = guest_get_eff_l1e(addr);
+
+    /* We are only looking for read-only mappings */
+    if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT | _PAGE_RW)) != _PAGE_PRESENT) )
+        return 0;
+
+    mmio_ro = rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte));
+    if ( mmio_ro )
+        rc = mmio_ro_do_page_fault(&ctxt, addr, pte);
+    else
+        rc = ptwr_do_page_fault(&ctxt, d, addr, pte);
+
+    switch ( rc )
+    {
+    case X86EMUL_EXCEPTION:
+        /*
+         * This emulation covers writes to:
+         *  - L1 pagetables.
+         *  - MMCFG space or read-only MFNs.
+         * We tolerate #PF (from hitting an adjacent page or a successful
+         * concurrent pagetable update). Anything else is an emulation bug,
+         * or a guest playing with the instruction stream under Xen's feet.
+         */
+        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             ctxt.event.vector == TRAP_page_fault )
+            pv_inject_event(&ctxt.event);
+        else
+            gdprintk(XENLOG_WARNING,
+                     "Unexpected event (type %u, vector %#x) from emulation\n",
+                     ctxt.event.type, ctxt.event.vector);
+
+        /* Fallthrough */
+    case X86EMUL_OKAY:
+        if ( ctxt.retire.singlestep )
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+        /* Fallthrough */
+    case X86EMUL_RETRY:
+        if ( mmio_ro )
+            perfc_incr(mmio_ro_emulations);
+        else
+            perfc_incr(ptwr_emulations);
+        return EXCRET_fault_fixed;
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f525fa28d3..8cdd3c892a 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1308,16 +1308,18 @@ static int fixup_page_fault(unsigned long addr, struct cpu_user_regs *regs)
          !(regs->error_code & (PFEC_reserved_bit | PFEC_insn_fetch)) &&
          (regs->error_code & PFEC_write_access) )
     {
-        if ( VM_ASSIST(d, writable_pagetables) &&
-             /* Do not check if access-protection fault since the page may
-                legitimately be not present in shadow page tables */
-             (paging_mode_enabled(d) ||
-              (regs->error_code & PFEC_page_present)) &&
-             ptwr_do_page_fault(v, addr, regs) )
-            return EXCRET_fault_fixed;
+        bool ptwr, mmio_ro;
+
+        ptwr = VM_ASSIST(d, writable_pagetables) &&
+               /* Do not check if access-protection fault since the page may
+                  legitimately be not present in shadow page tables */
+               (paging_mode_enabled(d) ||
+                (regs->error_code & PFEC_page_present));
+
+        mmio_ro = is_hardware_domain(d) &&
+                  (regs->error_code & PFEC_page_present);
 
-        if ( is_hardware_domain(d) && (regs->error_code & PFEC_page_present) &&
-             mmio_ro_do_page_fault(v, addr, regs) )
+        if ( (ptwr || mmio_ro) && pv_ro_page_fault(v, addr, regs) )
             return EXCRET_fault_fixed;
     }
 
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 4c03a33e79..2dc4898bed 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -511,10 +511,7 @@ extern int mmcfg_intercept_write(enum x86_segment seg,
 int pv_emul_cpuid(uint32_t leaf, uint32_t subleaf, struct cpuid_leaf *res,
                   struct x86_emulate_ctxt *ctxt);
 
-int ptwr_do_page_fault(struct vcpu *, unsigned long,
-                       struct cpu_user_regs *);
-int mmio_ro_do_page_fault(struct vcpu *, unsigned long,
-                          struct cpu_user_regs *);
+int pv_ro_page_fault(struct vcpu *, unsigned long, struct cpu_user_regs *);
 
 int audit_adjust_pgtables(struct domain *d, int dir, int noisy);
 
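--
Aside for reviewers: the dispatch that pv_ro_page_fault() introduces boils down
to one present/read-only check on the effective L1e followed by an
mmio_ro_ranges lookup on the faulting frame. Below is a minimal, standalone toy
model of just that decision, not Xen code: the PAGE_* values, the
classify_ro_fault() and pfn_is_ro_mmio() names and the example MMIO range are
hypothetical stand-ins; in the patch the lookup is
rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte)).

#include <stdbool.h>
#include <stdio.h>

#define PAGE_PRESENT 0x1
#define PAGE_RW      0x2

enum ro_fault_path { PATH_NONE, PATH_MMIO_RO, PATH_PTWR };

/* Stand-in for rangeset_contains_singleton(mmio_ro_ranges, pfn). */
static bool pfn_is_ro_mmio(unsigned long pfn)
{
    return pfn >= 0xf0000 && pfn < 0xf0100;   /* arbitrary example range */
}

static enum ro_fault_path classify_ro_fault(unsigned int pte_flags,
                                            unsigned long pfn)
{
    /* Only present, read-only mappings are handled; anything else bails. */
    if ( (pte_flags & (PAGE_PRESENT | PAGE_RW)) != PAGE_PRESENT )
        return PATH_NONE;

    /* The faulting frame number picks the emulation path. */
    return pfn_is_ro_mmio(pfn) ? PATH_MMIO_RO : PATH_PTWR;
}

int main(void)
{
    printf("%d\n", classify_ro_fault(PAGE_PRESENT, 0xf0010));          /* 1: r/o MMIO path */
    printf("%d\n", classify_ro_fault(PAGE_PRESENT, 0x1234));           /* 2: ptwr path */
    printf("%d\n", classify_ro_fault(PAGE_PRESENT | PAGE_RW, 0x1234)); /* 0: not handled */
    return 0;
}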