From patchwork Thu Aug 11 11:05:57 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 9275077
From: Yongji Xie
To: alex.williamson@redhat.com
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, aik@ozlabs.ru,
 zhong@linux.vnet.ibm.com, gwshan@linux.vnet.ibm.com
Subject: [Qemu-devel] [PATCH v2] vfio: Add support for mmapping sub-page
 MMIO BARs
Date: Thu, 11 Aug 2016 19:05:57 +0800
X-Mailer: git-send-email 1.7.9.5
Message-Id: <1470913557-4355-1-git-send-email-xyjxie@linux.vnet.ibm.com>
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO
BARs if the mmio page is exclusive") allows VFIO to mmap sub-page BARs.
This is the corresponding QEMU patch. With both patches applied,
sub-page BARs can be passed through to the guest, which can improve
I/O performance for some devices.

This patch expands the MemoryRegions of these sub-page MMIO BARs to
PAGE_SIZE in vfio_pci_write_config(), so that the BARs can be passed
to the KVM_SET_USER_MEMORY_REGION ioctl with a valid size. The original
size is restored when the guest changes the base address of a sub-page
BAR so that it is no longer page aligned. The priority of these BARs'
memory regions is also set to zero, in case they overlap with other
BARs that share the same page in the guest.

Signed-off-by: Yongji Xie
---
 hw/vfio/common.c |    3 +--
 hw/vfio/pci.c    |   76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index b313e7c..1a70307 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -661,8 +661,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
                               region, name, region->size);
 
         if (!vbasedev->no_mmap &&
-            region->flags & VFIO_REGION_INFO_FLAG_MMAP &&
-            !(region->size & ~qemu_real_host_page_mask)) {
+            region->flags & VFIO_REGION_INFO_FLAG_MMAP) {
 
             vfio_setup_region_sparse_mmaps(region, info);
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 7bfa17c..7035617 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1057,6 +1057,65 @@ static const MemoryRegionOps vfio_vga_ops = {
 };
 
 /*
+ * Expand memory region of sub-page(size < PAGE_SIZE) MMIO BAR to page
+ * size if the BAR is in an exclusive page in host so that we could map
+ * this BAR to guest. But this sub-page BAR may not occupy an exclusive
+ * page in guest. So we should set the priority of the expanded memory
+ * region to zero in case of overlap with BARs which share the same page
+ * with the sub-page BAR in guest. Besides, we should also recover the
+ * size of this sub-page BAR when its base address is changed in guest
+ * and not page aligned any more.
+ */
+static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
+{
+    VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
+    VFIORegion *region = &vdev->bars[bar].region;
+    MemoryRegion *mmap_mr, *mr;
+    PCIIORegion *r;
+    pcibus_t bar_addr;
+
+    /* Make sure that the whole region is allowed to be mmapped */
+    if (!(region->nr_mmaps == 1 &&
+        region->mmaps[0].size == region->size)) {
+        return;
+    }
+
+    r = &pdev->io_regions[bar];
+    bar_addr = r->addr;
+    if (bar_addr == PCI_BAR_UNMAPPED) {
+        return;
+    }
+
+    mr = region->mem;
+    mmap_mr = &region->mmaps[0].mem;
+    memory_region_transaction_begin();
+    if (memory_region_size(mr) < qemu_real_host_page_size) {
+        if (!(bar_addr & ~qemu_real_host_page_mask) &&
+            memory_region_is_mapped(mr) && region->mmaps[0].mmap) {
+            /* Expand memory region to page size and set priority */
+            memory_region_del_subregion(r->address_space, mr);
+            memory_region_set_size(mr, qemu_real_host_page_size);
+            memory_region_set_size(mmap_mr, qemu_real_host_page_size);
+            memory_region_add_subregion_overlap(r->address_space,
+                                                bar_addr, mr, 0);
+        }
+    } else {
+        /* This case would happen when guest rescan one PCI device */
+        if (bar_addr & ~qemu_real_host_page_mask) {
+            /* Recover the size of memory region */
+            memory_region_set_size(mr, r->size);
+            memory_region_set_size(mmap_mr, r->size);
+        } else if (memory_region_is_mapped(mr)) {
+            /* Set the priority of memory region to zero */
+            memory_region_del_subregion(r->address_space, mr);
+            memory_region_add_subregion_overlap(r->address_space,
+                                                bar_addr, mr, 0);
+        }
+    }
+    memory_region_transaction_commit();
+}
+
+/*
  * PCI config space
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
@@ -1139,6 +1198,23 @@ void vfio_pci_write_config(PCIDevice *pdev,
         } else if (was_enabled && !is_enabled) {
             vfio_msix_disable(vdev);
         }
+    } else if (ranges_overlap(addr, len, PCI_BASE_ADDRESS_0, 24) ||
+        range_covers_byte(addr, len, PCI_COMMAND)) {
+        pcibus_t old_addr[PCI_NUM_REGIONS - 1];
+        int bar;
+
+        for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
+            old_addr[bar] = pdev->io_regions[bar].addr;
+        }
+
+        pci_default_write_config(pdev, addr, val, len);
+
+        for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
+            if (old_addr[bar] != pdev->io_regions[bar].addr &&
+                pdev->io_regions[bar].size > 0 &&
+                pdev->io_regions[bar].size < qemu_real_host_page_size)
+                vfio_sub_page_bar_update_mapping(pdev, bar);
+        }
     } else {
         /* Write everything to QEMU to keep emulated bits correct */
         pci_default_write_config(pdev, addr, val, len);
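
Note (not part of the patch): the size and alignment decision made in
vfio_sub_page_bar_update_mapping() can be modelled on its own, outside
QEMU. The sketch below is only an illustration under stated assumptions.
The enum, the function name sub_page_bar_action() and its parameters are
hypothetical and do not exist in QEMU, and the model assumes the region is
already mapped and fully mmapped, conditions the real code checks with
memory_region_is_mapped() and region->mmaps[0].mmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome labels for illustration; QEMU has no such enum. */
enum bar_action {
    BAR_LEAVE,        /* sub-page region at an unaligned address: untouched */
    BAR_EXPAND,       /* sub-page region at an aligned address: grow to a page */
    BAR_RESTORE,      /* expanded region moved to an unaligned address: shrink */
    BAR_REPRIORITIZE  /* expanded region still aligned: re-add at priority 0 */
};

/*
 * Model of the decision only. bar_addr is the base address programmed by
 * the guest, cur_size is the current size of the memory region, and
 * page_size is the host page size (must be a power of two).
 * Assumption: the region is mapped and the whole BAR is mmapped.
 */
static enum bar_action sub_page_bar_action(uint64_t bar_addr,
                                           uint64_t cur_size,
                                           uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Same shape as qemu_real_host_page_mask: clears the in-page offset. */
    uint64_t page_mask = ~(page_size - 1);
    int aligned = (bar_addr & ~page_mask) == 0;

    if (cur_size < page_size) {
        /* Region still has its original sub-page size. */
        return aligned ? BAR_EXPAND : BAR_LEAVE;
    }
    /* Region was expanded earlier, e.g. before a guest PCI rescan. */
    return aligned ? BAR_REPRIORITIZE : BAR_RESTORE;
}
```

With an assumed 64 KiB host page, a 4 KiB BAR programmed at 0x10000 falls
in the expand case, while the same BAR at 0x11000 is left alone because its
offset within the page is non-zero.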