@@ -129,16 +129,22 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
struct intel_vgpu *vgpu = workload->vgpu;
struct intel_gvt *gvt = vgpu->gvt;
int ring_id = workload->ring_id;
- struct drm_i915_gem_object *ctx_obj =
- workload->req->hw_context->state->obj;
+ struct intel_context *ctx_ce = workload->req->hw_context;
struct execlist_ring_context *shadow_ring_context;
- struct page *page;
void *dst;
+ void *context_base;
unsigned long context_gpa, context_page_num;
+ unsigned long seq_gpa_base; /* first GPA of the current run of continuous GPAs */
+ int seq_page_cnt; /* number of continuous GPAs in the current run */
int i;
- page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
- shadow_ring_context = kmap(page);
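+ /* shadow context is pinned in intel_vgpu_setup_submission() and only
+  * unpinned in intel_vgpu_clean_submission(), so lrc_reg_state stays
+  * mapped and is safe to use here
+  */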
+ GEM_BUG_ON(!intel_context_is_pinned(ctx_ce));
+
+ context_base = (void *) ctx_ce->lrc_reg_state -
+ (LRC_STATE_PN << I915_GTT_PAGE_SHIFT);
+
+ shadow_ring_context = context_base +
+ (LRC_STATE_PN << I915_GTT_PAGE_SHIFT);
sr_oa_regs(workload, (u32 *)shadow_ring_context, true);
#define COPY_REG(name) \
@@ -170,7 +176,6 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
I915_GTT_PAGE_SIZE - sizeof(*shadow_ring_context));
sr_oa_regs(workload, (u32 *)shadow_ring_context, false);
- kunmap(page);
if (IS_RESTORE_INHIBIT(shadow_ring_context->ctx_ctrl.val))
return 0;
@@ -185,8 +190,12 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
if (IS_BROADWELL(gvt->dev_priv) && ring_id == RCS0)
context_page_num = 19;
- i = 2;
- while (i < context_page_num) {
+
+ /* find continuous GPAs from gma until the first discontinuous GPA,
+  * then read each run of continuous GPAs into the dst virtual address
+  */
+ seq_page_cnt = 0;
+ for (i = 2; i < context_page_num; i++) {
context_gpa = intel_vgpu_gma_to_gpa(vgpu->gtt.ggtt_mm,
(u32)((workload->ctx_desc.lrca + i) <<
I915_GTT_PAGE_SHIFT));
@@ -195,12 +204,26 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload)
return -EFAULT;
}
- page = i915_gem_object_get_page(ctx_obj, i);
- dst = kmap(page);
- intel_gvt_hypervisor_read_gpa(vgpu, context_gpa, dst,
- I915_GTT_PAGE_SIZE);
- kunmap(page);
- i++;
+ if (seq_page_cnt == 0) {
+ seq_gpa_base = context_gpa;
+ dst = context_base + (i << I915_GTT_PAGE_SHIFT);
+ } else if (context_gpa != seq_gpa_base +
+ (seq_page_cnt << I915_GTT_PAGE_SHIFT)) {
+ /* run broken: read the pending run, then restart at this page */
+ intel_gvt_hypervisor_read_gpa(vgpu, seq_gpa_base, dst,
+ seq_page_cnt << I915_GTT_PAGE_SHIFT);
+ seq_gpa_base = context_gpa;
+ dst = context_base + (i << I915_GTT_PAGE_SHIFT);
+ seq_page_cnt = 0;
+ }
+
+ seq_page_cnt++;
+
+ /* the last run always ends at the last page: read it here */
+ if (i == context_page_num - 1)
+ intel_gvt_hypervisor_read_gpa(vgpu, seq_gpa_base, dst,
+ seq_page_cnt << I915_GTT_PAGE_SHIFT);
}
return 0;
}
@@ -787,20 +810,24 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
struct i915_request *rq = workload->req;
struct intel_vgpu *vgpu = workload->vgpu;
struct intel_gvt *gvt = vgpu->gvt;
- struct drm_i915_gem_object *ctx_obj = rq->hw_context->state->obj;
+ struct intel_context *ctx_ce = workload->req->hw_context;
struct execlist_ring_context *shadow_ring_context;
- struct page *page;
- void *src;
unsigned long context_gpa, context_page_num;
+ unsigned long seq_gpa_base; /* first GPA of the current run of continuous GPAs */
+ int seq_page_cnt; /* number of continuous GPAs in the current run */
int i;
struct drm_i915_private *dev_priv = gvt->dev_priv;
u32 ring_base;
u32 head, tail;
u16 wrap_count;
+ void *src;
+ void *context_base;
gvt_dbg_sched("ring id %d workload lrca %x\n", rq->engine->id,
workload->ctx_desc.lrca);
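+ /* as in populate_shadow_context(), the pinned shadow context keeps
+  * lrc_reg_state valid here
+  */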
+ GEM_BUG_ON(!intel_context_is_pinned(ctx_ce));
+
head = workload->rb_head;
tail = workload->rb_tail;
wrap_count = workload->guest_rb_head >> RB_HEAD_WRAP_CNT_OFF;
@@ -824,9 +851,14 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
if (IS_BROADWELL(gvt->dev_priv) && rq->engine->id == RCS0)
context_page_num = 19;
- i = 2;
+ context_base = (void *) ctx_ce->lrc_reg_state -
+ (LRC_STATE_PN << I915_GTT_PAGE_SHIFT);
- while (i < context_page_num) {
+ /* find continuous GPAs from gma until the first discontinuous GPA,
+  * then write each run of continuous GPAs from the src virtual address
+  */
+ seq_page_cnt = 0;
+ for (i = 2; i < context_page_num; i++) {
context_gpa = intel_vgpu_gma_to_gpa(vgpu->gtt.ggtt_mm,
(u32)((workload->ctx_desc.lrca + i) <<
I915_GTT_PAGE_SHIFT));
@@ -835,19 +867,33 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
return;
}
- page = i915_gem_object_get_page(ctx_obj, i);
- src = kmap(page);
- intel_gvt_hypervisor_write_gpa(vgpu, context_gpa, src,
- I915_GTT_PAGE_SIZE);
- kunmap(page);
- i++;
+ if (seq_page_cnt == 0) {
+ seq_gpa_base = context_gpa;
+ src = context_base + (i << I915_GTT_PAGE_SHIFT);
+ } else if (context_gpa != seq_gpa_base +
+ (seq_page_cnt << I915_GTT_PAGE_SHIFT)) {
+ /* run broken: write the pending run, then restart at this page */
+ intel_gvt_hypervisor_write_gpa(vgpu, seq_gpa_base, src,
+ seq_page_cnt << I915_GTT_PAGE_SHIFT);
+ seq_gpa_base = context_gpa;
+ src = context_base + (i << I915_GTT_PAGE_SHIFT);
+ seq_page_cnt = 0;
+ }
+
+ seq_page_cnt++;
+
+ /* the last run always ends at the last page: write it here */
+ if (i == context_page_num - 1)
+ intel_gvt_hypervisor_write_gpa(vgpu, seq_gpa_base, src,
+ seq_page_cnt << I915_GTT_PAGE_SHIFT);
}
intel_gvt_hypervisor_write_gpa(vgpu, workload->ring_context_gpa +
RING_CTX_OFF(ring_header.val), &workload->rb_tail, 4);
- page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
- shadow_ring_context = kmap(page);
+ shadow_ring_context = context_base +
+ (LRC_STATE_PN << I915_GTT_PAGE_SHIFT);
#define COPY_REG(name) \
intel_gvt_hypervisor_write_gpa(vgpu, workload->ring_context_gpa + \
@@ -865,7 +911,6 @@ static void update_guest_context(struct intel_vgpu_workload *workload)
sizeof(*shadow_ring_context),
I915_GTT_PAGE_SIZE - sizeof(*shadow_ring_context));
- kunmap(page);
}
void intel_vgpu_clean_workloads(struct intel_vgpu *vgpu,
1. As the shadow context is pinned in intel_vgpu_setup_submission() and
unpinned in intel_vgpu_clean_submission(), its base virtual address can be
safely obtained from lrc_reg_state. There is no need to call kmap()/kunmap()
repeatedly.

2. The IOVAs (GPAs) of context pages are checked, and if they are continuous,
they are read/written together in a single intel_gvt_hypervisor_read_gpa()/
intel_gvt_hypervisor_write_gpa() call.

After the two changes in this patch, the average cycle count of
populate_shadow_context() + update_guest_context() is reduced by
~10000-20000 cycles, depending on the average number of continuous pages per
read/write.

(1) Comparison of cycles spent in populate_shadow_context() +
update_guest_context() while running different benchmarks:

---------------------------------------------------------------
| cycles            | glmark2     | lightsmark  | openarena   |
|-------------------------------------------------------------|
| before this patch | 65968       | 97852       | 61373       |
| after this patch  | 56017 (85%) | 73862 (75%) | 47463 (77%) |
---------------------------------------------------------------

(2) Average number of pages read/written at a time in
populate_shadow_context() and update_guest_context() for each benchmark:

--------------------------------------------------------
| page cnt          | glmark2 | lightsmark | openarena |
|------------------------------------------------------|
| before this patch | 1       | 1          | 1         |
| after this patch  | 5.25    | 19.99      | 20        |
--------------------------------------------------------

(3) Comparison of benchmark scores:

----------------------------------------------------------------------
| score             | glmark2       | lightsmark     | openarena     |
|--------------------------------------------------------------------|
| before this patch | 1244          | 222.18         | 114.4         |
| after this patch  | 1248 (100.3%) | 225.8 (101.6%) | 115.0 (100.9%)|
----------------------------------------------------------------------

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 drivers/gpu/drm/i915/gvt/scheduler.c | 101 +++++++++++++++++++--------
 1 file changed, 73 insertions(+), 28 deletions(-)
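For readers who want the coalescing idea outside the GVT-g context, below is a
small userspace sketch of the same pattern. read_gpa() only logs each batched
access and the page_gpa[] layout is invented for the example, so this is an
illustration of the approach rather than GVT-g code:

/*
 * gpa_coalesce_demo.c - standalone illustration of batching continuous
 * GPAs into one copy, as done by this patch; not GVT-g code.
 * Build: gcc -Wall gpa_coalesce_demo.c -o gpa_coalesce_demo
 */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* invented guest physical addresses of eight context pages: the first
 * four and the last three are continuous, the fifth breaks the run
 */
static const uint64_t page_gpa[] = {
        0x10000, 0x11000, 0x12000, 0x13000,
        0x80000,
        0x20000, 0x21000, 0x22000,
};

/* stand-in for intel_gvt_hypervisor_read_gpa(): only logs the access */
static void read_gpa(uint64_t gpa_base, size_t dst_off, size_t size)
{
        printf("read %zu page(s) from gpa 0x%llx into offset 0x%zx\n",
               size >> DEMO_PAGE_SHIFT, (unsigned long long)gpa_base, dst_off);
}

int main(void)
{
        size_t n = sizeof(page_gpa) / sizeof(page_gpa[0]);
        uint64_t run_base = 0;
        size_t run_size = 0, dst_off = 0;
        size_t i;

        for (i = 0; i < n; i++) {
                uint64_t gpa = page_gpa[i];

                if (run_size && gpa != run_base + run_size) {
                        /* run broken: flush it, then start a new one */
                        read_gpa(run_base, dst_off, run_size);
                        run_size = 0;
                }
                if (run_size == 0) {
                        run_base = gpa;
                        dst_off = i << DEMO_PAGE_SHIFT;
                }
                run_size += DEMO_PAGE_SIZE;
        }
        if (run_size)   /* flush the final run */
                read_gpa(run_base, dst_off, run_size);

        return 0;
}

With the invented layout above, the demo issues three accesses (a run of four
pages, a single page, and a run of three pages) instead of eight single-page
copies, which is the effect the benchmark tables quantify.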