From patchwork Fri Nov 17 06:22:23 2017
X-Patchwork-Submitter: Chao Gao
X-Patchwork-Id: 10062329
From: Chao Gao
To: xen-devel@lists.xen.org
Date: Fri, 17 Nov 2017 14:22:23 +0800
Message-Id: <1510899755-40237-17-git-send-email-chao.gao@intel.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1510899755-40237-1-git-send-email-chao.gao@intel.com>
References: <1510899755-40237-1-git-send-email-chao.gao@intel.com>
Cc: Lan Tianyu, Kevin Tian, Stefano Stabellini, Wei Liu,
    Konrad Rzeszutek Wilk, George Dunlap, Ian Jackson, Tim Deegan,
    Jan Beulich, Andrew Cooper, Chao Gao, Roger Pau Monné
Subject: [Xen-devel] [PATCH v4 16/28] x86/vvtd: Add queued invalidation (QI) support

The Queued Invalidation (QI) interface is an expanded invalidation interface
with extended capabilities. Hardware implementations report support for the
queued invalidation interface through the Extended Capability Register.

The queued invalidation interface uses an Invalidation Queue (IQ), a circular
buffer in system memory. Software submits commands by writing Invalidation
Descriptors to the IQ.

This patch adds a new function, vvtd_process_iq(), which emulates how hardware
handles invalidation requests submitted through QI.

Signed-off-by: Chao Gao
Signed-off-by: Lan Tianyu

---
v4:
 - Introduce a lock to protect invalidation related registers.
---
 xen/drivers/passthrough/vtd/iommu.h |  24 +++-
 xen/drivers/passthrough/vtd/vvtd.c  | 271 +++++++++++++++++++++++++++++++++++-
 2 files changed, 293 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index b71dab8..de9188b 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -47,7 +47,12 @@
 #define DMAR_IQH_REG 0x80 /* invalidation queue head */
 #define DMAR_IQT_REG 0x88 /* invalidation queue tail */
 #define DMAR_IQA_REG 0x90 /* invalidation queue addr */
+#define DMAR_IQUA_REG 0x94 /* invalidation queue upper addr */
+#define DMAR_ICS_REG 0x9c /* invalidation completion status */
 #define DMAR_IECTL_REG 0xa0 /* invalidation event control register */
+#define DMAR_IEDATA_REG 0xa4 /* invalidation event data register */
+#define DMAR_IEADDR_REG 0xa8 /* invalidation event address register */
+#define DMAR_IEUADDR_REG 0xac /* upper address register */
 #define DMAR_IRTA_REG 0xb8 /* base address of intr remap table */
 #define DMAR_IRTUA_REG 0xbc /* upper address of intr remap table */
 
@@ -175,6 +180,21 @@
 #define DMA_IRTA_S(val) (val & 0xf)
 #define DMA_IRTA_SIZE(val) (1UL << (DMA_IRTA_S(val) + 1))
 
+/* IQA_REG */
+#define DMA_IQA_ADDR(val) (val & ~0xfffULL)
+#define DMA_IQA_QS(val) (val & 0x7)
+#define DMA_IQA_RSVD 0xff8ULL
+
+/* IECTL_REG */
+#define DMA_IECTL_IM_SHIFT 31
+#define DMA_IECTL_IM (1U << DMA_IECTL_IM_SHIFT)
+#define DMA_IECTL_IP_SHIFT 30
+#define DMA_IECTL_IP (1U << DMA_IECTL_IP_SHIFT)
+
+/* ICS_REG */
+#define DMA_ICS_IWC_SHIFT 0
+#define DMA_ICS_IWC (1U << DMA_ICS_IWC_SHIFT)
+
 /* PMEN_REG */
 #define DMA_PMEN_EPM (((u32)1) << 31)
 #define DMA_PMEN_PRS (((u32)1) << 0)
@@ -205,13 +225,14 @@
 /* FSTS_REG */
 #define DMA_FSTS_PFO_SHIFT 0
 #define DMA_FSTS_PPF_SHIFT 1
+#define DMA_FSTS_IQE_SHIFT 4
 #define DMA_FSTS_PRO_SHIFT 7
 
 #define DMA_FSTS_PFO ((uint32_t)1 << DMA_FSTS_PFO_SHIFT)
 #define DMA_FSTS_PPF ((uint32_t)1 << DMA_FSTS_PPF_SHIFT)
 #define DMA_FSTS_AFO ((uint32_t)1 << 2)
 #define DMA_FSTS_APF ((uint32_t)1 << 3)
-#define DMA_FSTS_IQE ((uint32_t)1 << 4)
+#define DMA_FSTS_IQE ((uint32_t)1 << DMA_FSTS_IQE_SHIFT)
 #define DMA_FSTS_ICE ((uint32_t)1 << 5)
 #define DMA_FSTS_ITE ((uint32_t)1 << 6)
 #define DMA_FSTS_PRO ((uint32_t)1 << DMA_FSTS_PRO_SHIFT)
@@ -555,6 +576,7 @@ struct qinval_entry {
 
 /* Queue invalidation head/tail shift */
 #define QINVAL_INDEX_SHIFT 4
+#define QINVAL_INDEX_MASK 0x7fff0ULL
 
 #define qinval_present(v) ((v).lo & 1)
 #define qinval_fault_disable(v) (((v).lo >> 1) & 1)
diff --git a/xen/drivers/passthrough/vtd/vvtd.c b/xen/drivers/passthrough/vtd/vvtd.c
index a2fa64a..81170ec 100644
--- a/xen/drivers/passthrough/vtd/vvtd.c
+++ b/xen/drivers/passthrough/vtd/vvtd.c
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "iommu.h"
@@ -68,6 +69,9 @@ struct vvtd {
 
     struct hvm_hw_vvtd hw;
     void *irt_base;
+    void *inv_queue_base;
+    /* This lock protects invalidation related registers */
+    spinlock_t ie_lock;
 };
 
 /* Setting viommu_verbose enables debugging messages of vIOMMU */
@@ -284,6 +288,12 @@ static void vvtd_notify_fault(const struct vvtd *vvtd)
                             vvtd_get_reg(vvtd, DMAR_FEDATA_REG));
 }
 
+static void vvtd_notify_inv_completion(const struct vvtd *vvtd)
+{
+    vvtd_generate_interrupt(vvtd, vvtd_get_reg_quad(vvtd, DMAR_IEADDR_REG),
+                            vvtd_get_reg(vvtd, DMAR_IEDATA_REG));
+}
+
 /* Computing the IRTE index for a given interrupt request. When success, return
  * 0 and set index to reference the corresponding IRTE. Otherwise, return < 0,
  * i.e. -1 when the irq request isn't an remapping format.
@@ -478,6 +488,189 @@ static int vvtd_record_fault(struct vvtd *vvtd,
     return X86EMUL_OKAY;
 }
 
+/*
+ * Process an invalidation descriptor. Currently, only two types of
+ * descriptors, the Interrupt Entry Cache Invalidation Descriptor and the
+ * Invalidation Wait Descriptor, are handled.
+ * @vvtd: the virtual vtd instance
+ * @i: the index of the invalidation descriptor to be processed
+ *
+ * Returns 0 on success, non-zero on failure.
+ */
+static int process_iqe(struct vvtd *vvtd, uint32_t i)
+{
+    struct qinval_entry qinval;
+    int ret = 0;
+
+    if ( !vvtd->inv_queue_base )
+    {
+        gdprintk(XENLOG_ERR, "Invalidation queue base isn't set\n");
+        return -1;
+    }
+    qinval = ((struct qinval_entry *)vvtd->inv_queue_base)[i];
+
+    switch ( qinval.q.inv_wait_dsc.lo.type )
+    {
+    case TYPE_INVAL_WAIT:
+        if ( qinval.q.inv_wait_dsc.lo.sw )
+        {
+            uint32_t data = qinval.q.inv_wait_dsc.lo.sdata;
+            uint64_t addr = qinval.q.inv_wait_dsc.hi.saddr << 2;
+
+            ret = hvm_copy_to_guest_phys(addr, &data, sizeof(data), current);
+            if ( ret )
+                vvtd_info("Failed to write status address\n");
+        }
+
+        /*
+         * The following code generates an invalidation completion event
+         * indicating the invalidation wait descriptor completion. Note that
+         * the following code fragment is not tested properly.
+         */
+        if ( qinval.q.inv_wait_dsc.lo.iflag )
+        {
+            if ( !vvtd_test_and_set_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT) )
+            {
+                vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+                if ( !vvtd_test_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT) )
+                {
+                    vvtd_notify_inv_completion(vvtd);
+                    vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+                }
+            }
+        }
+        break;
+
+    case TYPE_INVAL_IEC:
+        /* No cache is preserved in vvtd, nothing needs to be flushed */
+        break;
+
+    default:
+        vvtd_debug("d%d: Invalidation type (%x) isn't supported\n",
+                   vvtd->domain->domain_id, qinval.q.inv_wait_dsc.lo.type);
+        return -1;
+    }
+
+    return ret;
+}
+
+/*
+ * Process all the pending descriptors in the Invalidation Queue.
+ */
+static void vvtd_process_iq(struct vvtd *vvtd)
+{
+    uint32_t max_entry, i, iqh, iqt;
+    int err = 0;
+
+    /* The trylock avoids more than one caller dealing with invalidation requests */
+    if ( !spin_trylock(&vvtd->ie_lock) )
+        return;
+
+    iqh = MASK_EXTR(vvtd_get_reg_quad(vvtd, DMAR_IQH_REG), QINVAL_INDEX_MASK);
+    iqt = MASK_EXTR(vvtd_get_reg_quad(vvtd, DMAR_IQT_REG), QINVAL_INDEX_MASK);
+    /*
+     * No new descriptor is fetched from the Invalidation Queue until
+     * software clears the IQE field in the Fault Status Register.
+     */
+    if ( vvtd_test_bit(vvtd, DMAR_FSTS_REG, DMA_FSTS_IQE_SHIFT) )
+    {
+        spin_unlock(&vvtd->ie_lock);
+        return;
+    }
+
+    max_entry = 1 << (QINVAL_ENTRY_ORDER +
+                      DMA_IQA_QS(vvtd_get_reg_quad(vvtd, DMAR_IQA_REG)));
+
+    ASSERT(iqt < max_entry);
+    if ( iqh == iqt )
+    {
+        spin_unlock(&vvtd->ie_lock);
+        return;
+    }
+
+    for ( i = iqh; i != iqt; i = (i + 1) % max_entry )
+    {
+        err = process_iqe(vvtd, i);
+        if ( err )
+            break;
+    }
+
+    /*
+     * Set IQH before checking for an error, because IQH should reference
+     * the descriptor associated with the error when an error is seen
+     * by the guest.
+     */
+    vvtd_set_reg_quad(vvtd, DMAR_IQH_REG, i << QINVAL_INDEX_SHIFT);
+
+    spin_unlock(&vvtd->ie_lock);
+    if ( err )
+    {
+        spin_lock(&vvtd->fe_lock);
+        vvtd_report_non_recoverable_fault(vvtd, DMA_FSTS_IQE_SHIFT);
+        spin_unlock(&vvtd->fe_lock);
+    }
+}
+
+static void vvtd_write_iqt(struct vvtd *vvtd, uint32_t val)
+{
+    uint32_t max_entry;
+
+    if ( val & ~QINVAL_INDEX_MASK )
+    {
+        vvtd_info("Attempt to set reserved bits in IQT\n");
+        return;
+    }
+
+    max_entry = 1U << (QINVAL_ENTRY_ORDER +
+                       DMA_IQA_QS(vvtd_get_reg_quad(vvtd, DMAR_IQA_REG)));
+    if ( MASK_EXTR(val, QINVAL_INDEX_MASK) >= max_entry )
+    {
+        vvtd_info("IQT: Value %x exceeds the supported max index.", val);
+        return;
+    }
+
+    vvtd_set_reg(vvtd, DMAR_IQT_REG, val);
+}
+
+static void vvtd_write_iqa(struct vvtd *vvtd, uint32_t val, bool high)
+{
+    uint64_t cap = vvtd_get_reg_quad(vvtd, DMAR_CAP_REG);
+    uint64_t old = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
+    uint64_t new;
+
+    if ( high )
+        new = ((uint64_t)val << 32) | (old & 0xffffffff);
+    else
+        new = ((old >> 32) << 32) | val;
+
+    if ( new & (~((1ULL << cap_mgaw(cap)) - 1) | DMA_IQA_RSVD) )
+    {
+        vvtd_info("Attempt to set reserved bits in IQA\n");
+        return;
+    }
+
+    vvtd_set_reg_quad(vvtd, DMAR_IQA_REG, new);
+    if ( high && !vvtd->inv_queue_base )
+        vvtd->inv_queue_base = map_guest_pages(vvtd->domain,
+                                               PFN_DOWN(DMA_IQA_ADDR(new)),
+                                               1 << DMA_IQA_QS(new));
+    else if ( !high && vvtd->inv_queue_base )
+    {
+        unmap_guest_pages(vvtd->inv_queue_base, 1 << DMA_IQA_QS(old));
+        vvtd->inv_queue_base = NULL;
+    }
+}
+
+static void vvtd_write_ics(struct vvtd *vvtd, uint32_t val)
+{
+    if ( val & DMA_ICS_IWC )
+    {
+        vvtd_clear_bit(vvtd, DMAR_ICS_REG, DMA_ICS_IWC_SHIFT);
+        /* When the IWC field is cleared, the IP field needs to be cleared */
+        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT);
+    }
+}
+
 static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
 {
     /* Writing a 1 means clear fault */
@@ -489,6 +682,20 @@ static int vvtd_write_frcd3(struct vvtd *vvtd, uint32_t val)
     return X86EMUL_OKAY;
 }
 
+static void vvtd_write_iectl(struct vvtd *vvtd, uint32_t val)
+{
+    /* Only the DMA_IECTL_IM bit is writable. Generate pending event when unmasked */
+    if ( !(val & DMA_IECTL_IM) )
+    {
+        /* Clear IM and clear IP */
+        vvtd_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
+        if ( vvtd_test_and_clear_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IP_SHIFT) )
+            vvtd_notify_inv_completion(vvtd);
+    }
+    else
+        vvtd_set_bit(vvtd, DMAR_IECTL_REG, DMA_IECTL_IM_SHIFT);
+}
+
 static void vvtd_write_fectl(struct vvtd *vvtd, uint32_t val)
 {
     /*
@@ -681,6 +888,48 @@ static void vvtd_write_fault_regs(struct vvtd *vvtd, unsigned long val,
     spin_unlock(&vvtd->fe_lock);
 }
 
+static void vvtd_write_invalidation_regs(struct vvtd *vvtd, unsigned long val,
+                                         unsigned int offset, unsigned int len)
+{
+    spin_lock(&vvtd->ie_lock);
+    for ( ; len ; len -= 4, offset += 4, val = val >> 32)
+    {
+        switch ( offset )
+        {
+        case DMAR_IECTL_REG:
+            vvtd_write_iectl(vvtd, val);
+            break;
+
+        case DMAR_ICS_REG:
+            vvtd_write_ics(vvtd, val);
+            break;
+
+        case DMAR_IQT_REG:
+            vvtd_write_iqt(vvtd, val);
+            break;
+
+        case DMAR_IQA_REG:
+            vvtd_write_iqa(vvtd, val, 0);
+            break;
+
+        case DMAR_IQUA_REG:
+            vvtd_write_iqa(vvtd, val, 1);
+            break;
+
+        case DMAR_IEDATA_REG:
+        case DMAR_IEADDR_REG:
+        case DMAR_IEUADDR_REG:
+            vvtd_set_reg(vvtd, offset, val);
+            break;
+
+        default:
+            break;
+        }
+    }
+    spin_unlock(&vvtd->ie_lock);
+
+}
+
 static int vvtd_write(struct vcpu *v, unsigned long addr,
                       unsigned int len, unsigned long val)
 {
@@ -719,6 +968,17 @@ static int vvtd_write(struct vcpu *v, unsigned long addr,
         vvtd_write_fault_regs(vvtd, val, offset, len);
         break;
 
+    case DMAR_IECTL_REG:
+    case DMAR_ICS_REG:
+    case DMAR_IQT_REG:
+    case DMAR_IQA_REG:
+    case DMAR_IQUA_REG:
+    case DMAR_IEDATA_REG:
+    case DMAR_IEADDR_REG:
+    case DMAR_IEUADDR_REG:
+        vvtd_write_invalidation_regs(vvtd, val, offset, len);
+        break;
+
     default:
         if ( (offset == (fault_offset + DMA_FRCD2_OFFSET)) ||
              (offset == (fault_offset + DMA_FRCD3_OFFSET)) )
@@ -840,7 +1100,8 @@ static int vvtd_handle_irq_request(const struct domain *d,
                                    irte.remap.tm);
 
  out:
-    atomic_dec(&vvtd->inflight_intr);
+    if ( !atomic_dec_and_test(&vvtd->inflight_intr) )
+        vvtd_process_iq(vvtd);
     return ret;
 }
 
@@ -911,6 +1172,7 @@ static int vvtd_create(struct domain *d, struct viommu *viommu)
     vvtd->domain = d;
     register_mmio_handler(d, &vvtd_mmio_ops);
     spin_lock_init(&vvtd->fe_lock);
+    spin_lock_init(&vvtd->ie_lock);
 
     viommu->priv = vvtd;
 
@@ -930,6 +1192,13 @@ static int vvtd_destroy(struct viommu *viommu)
                                             sizeof(struct iremap_entry)));
         vvtd->irt_base = NULL;
     }
+    if ( vvtd->inv_queue_base )
+    {
+        uint64_t old = vvtd_get_reg_quad(vvtd, DMAR_IQA_REG);
+
+        unmap_guest_pages(vvtd->inv_queue_base, 1 << DMA_IQA_QS(old));
+        vvtd->inv_queue_base = NULL;
+    }
 
     xfree(vvtd);
 }
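
Note (not part of the patch): as a reference for how a guest is expected to
drive the interface emulated above, here is a rough guest-side sketch. The
helpers rd64()/wr64(), the dmar_regs/iq/iq_entries variables and the busy-wait
are hypothetical stand-ins, not Xen or guest-kernel APIs; only the register
offsets and the Invalidation Wait Descriptor layout follow the VT-d spec that
vvtd models, and the bit positions shown should be checked against the spec.

#include <stdint.h>

#define DMAR_IQH_REG   0x80
#define DMAR_IQT_REG   0x88
#define QI_INDEX_SHIFT 4                 /* descriptors are 16 bytes */

struct qi_desc { uint64_t lo, hi; };

extern volatile uint8_t *dmar_regs;      /* mapped remapping register page */
extern struct qi_desc *iq;               /* Invalidation Queue at IQA */
extern uint32_t iq_entries;              /* 2^(8 + IQA.QS) descriptors */

static uint64_t rd64(unsigned int off)
{
    return *(volatile uint64_t *)(dmar_regs + off);
}

static void wr64(unsigned int off, uint64_t v)
{
    *(volatile uint64_t *)(dmar_regs + off) = v;
}

/* Submit one Invalidation Wait Descriptor and spin until its status write. */
void qi_sync(volatile uint32_t *status, uint64_t status_pa)
{
    uint32_t tail = rd64(DMAR_IQT_REG) >> QI_INDEX_SHIFT;

    *status = 0;
    /* lo: type 0x5 (inv. wait), SW bit set, Status Data = 1 in bits 63:32 */
    iq[tail].lo = 0x5 | (1u << 5) | ((uint64_t)1 << 32);
    /* hi: 4-byte aligned status address (process_iqe() reads it as saddr << 2) */
    iq[tail].hi = status_pa;

    /*
     * Publishing the new tail lands in vvtd_write_iqt(); the descriptor is
     * consumed later by vvtd_process_iq()/process_iqe(), which performs the
     * status write via hvm_copy_to_guest_phys().
     */
    wr64(DMAR_IQT_REG, (uint64_t)((tail + 1) % iq_entries) << QI_INDEX_SHIFT);

    while ( !*status )
        ;                                /* wait for the emulated status write */
}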
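
A quick sizing example for the IQA/IQT handling above (assuming
QINVAL_ENTRY_ORDER is 8, i.e. 256 16-byte descriptors per 4KiB page):
descriptors are 16 bytes, so IQH and IQT hold the descriptor index shifted
left by QINVAL_INDEX_SHIFT (4). IQA.QS encodes the queue length as 2^QS 4KiB
pages, i.e. 2^(QS+8) descriptors, which is where
max_entry = 1 << (QINVAL_ENTRY_ORDER + QS) comes from. For QS = 1 the queue
spans 2 pages and max_entry = 512, so any IQT write whose index is >= 512
(raw register value >= 0x2000) is rejected by vvtd_write_iqt().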