Message ID | 20210728141249.357067-1-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915: Use Transparent Hugepages when IOMMU is enabled | expand |
On Wed, Jul 28, 2021 at 03:12:49PM +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Usage of Transparent Hugepages was disabled in 9987da4b5dcf > ("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it > appears majority of performance regressions reported with an enabled IOMMU > can be almost eliminated by turning them on, lets do that by adding a > couple of Kconfig options. > > To err on the side of safety we keep the current default in cases where > IOMMU is not active, and only when it is default to the "huge=within_size" > mode. Although there probably would be wins to enable them throughout, > more extensive testing across benchmarks and platforms would need to be > done. > > With the patch and IOMMU enabled my local testing on a small Skylake part > shows OglVSTangent regression being reduced from ~14% to ~2%. > > References: b901bb89324a ("drm/i915/gemfs: enable THP") > References: 9987da4b5dcf ("drm/i915: Disable THP until we have a GPU read BW W/A") > References: https://gitlab.freedesktop.org/drm/intel/-/issues/430 > Co-developed-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Matthew Auld <matthew.auld@intel.com> > Cc: Eero Tamminen <eero.t.tamminen@intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > --- > drivers/gpu/drm/i915/Kconfig.profile | 46 +++++++++++++++++++++++++++ > drivers/gpu/drm/i915/gem/i915_gemfs.c | 11 +++++-- > 2 files changed, 55 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile > index 39328567c200..c64c3d39a0f9 100644 > --- a/drivers/gpu/drm/i915/Kconfig.profile > +++ b/drivers/gpu/drm/i915/Kconfig.profile > @@ -119,3 +119,49 @@ config DRM_I915_TIMESLICE_DURATION > /sys/class/drm/card?/engine/*/timeslice_duration_ms > > May be 0 to disable timeslicing. > + > +choice > + prompt "Transparent Hugepage Support (native)" > + default DRM_I915_THP_NATIVE_NEVER > + help > + Select the preferred method for allocating from Transparent Hugepages > + when IOMMU is not enabled. > + > + config DRM_I915_THP_NATIVE_NEVER > + bool "Never" > + > + config DRM_I915_THP_NATIVE_WITHIN > + bool "Within" > + > + config DRM_I915_THP_NATIVE_ALWAYS > + bool "Always" > +endchoice > + > +config DRM_I915_THP_NATIVE > + string > + default "always" if DRM_I915_THP_NATIVE_ALWAYS > + default "within_size" if DRM_I915_THP_NATIVE_WITHIN > + default "never" if DRM_I915_THP_NATIVE_NEVER > + > +choice > + prompt "Transparent Hugepage Support (IOMMU)" > + default DRM_I915_THP_IOMMU_WITHIN > + help > + Select the preferred method for allocating from Transparent Hugepages > + with IOMMU active. > + > + config DRM_I915_THP_IOMMU_NEVER > + bool "Never" > + > + config DRM_I915_THP_IOMMU_WITHIN > + bool "Within" > + > + config DRM_I915_THP_IOMMU_ALWAYS > + bool "Always" > +endchoice > + > +config DRM_I915_THP_IOMMU > + string > + default "always" if DRM_I915_THP_IOMMU_ALWAYS > + default "within_size" if DRM_I915_THP_IOMMU_WITHIN > + default "never" if DRM_I915_THP_IOMMU_NEVER > diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c > index 5e6e8c91ab38..b71d2b2d2ada 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c > +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c > @@ -13,8 +13,11 @@ > > int i915_gemfs_init(struct drm_i915_private *i915) > { > + char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; > + char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; > struct file_system_type *type; > struct vfsmount *gemfs; > + char *opts; > > type = get_fs_type("tmpfs"); > if (!type) > @@ -26,15 +29,19 @@ int i915_gemfs_init(struct drm_i915_private *i915) > * > * One example, although it is probably better with a per-file > * control, is selecting huge page allocations ("huge=within_size"). > - * Currently unused due to bandwidth issues (slow reads) on Broadwell+. > + * However, we only do so to offset the overhead of iommu lookups > + * due to bandwidth issues (slow reads) on Broadwell+. > */ > + opts = intel_vtd_active() ? thp_iommu : thp_native; at first sight I got confused on why we where having to configs, then very few drivers using the mount option there, so it took me a few minutes to realize it... but I know understood the goal... Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > > - gemfs = kern_mount(type); > + gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); > if (IS_ERR(gemfs)) > return PTR_ERR(gemfs); > > i915->mm.gemfs = gemfs; > > + drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts); > + > return 0; > } > > -- > 2.30.2 >
Hi Tvrtko, Thank you for the patch! Yet something to improve: [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on v5.14-rc3 next-20210730] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Tvrtko-Ursulin/drm-i915-Use-Transparent-Hugepages-when-IOMMU-is-enabled/20210728-221515 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: i386-randconfig-a015-20210728 (attached as .config) compiler: gcc-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/26ad4b5023a428526c23ce544b593eced1916b49 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Tvrtko-Ursulin/drm-i915-Use-Transparent-Hugepages-when-IOMMU-is-enabled/20210728-221515 git checkout 26ad4b5023a428526c23ce544b593eced1916b49 # save the attached .config to linux build tree mkdir build_dir make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/gpu/drm/i915/gem/i915_gemfs.c: In function 'i915_gemfs_init': >> drivers/gpu/drm/i915/gem/i915_gemfs.c:16:30: error: expected ',' or ';' before 'CONFIG_DRM_I915_THP_NATIVE' 16 | char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; | ^~~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/gpu/drm/i915/gem/i915_gemfs.c:17:29: error: expected ',' or ';' before 'CONFIG_DRM_I915_THP_IOMMU' 17 | char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; | ^~~~~~~~~~~~~~~~~~~~~~~~~ vim +16 drivers/gpu/drm/i915/gem/i915_gemfs.c 13 14 int i915_gemfs_init(struct drm_i915_private *i915) 15 { > 16 char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; > 17 char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; 18 struct file_system_type *type; 19 struct vfsmount *gemfs; 20 char *opts; 21 22 type = get_fs_type("tmpfs"); 23 if (!type) 24 return -ENODEV; 25 26 /* 27 * By creating our own shmemfs mountpoint, we can pass in 28 * mount flags that better match our usecase. 29 * 30 * One example, although it is probably better with a per-file 31 * control, is selecting huge page allocations ("huge=within_size"). 32 * However, we only do so to offset the overhead of iommu lookups 33 * due to bandwidth issues (slow reads) on Broadwell+. 34 */ 35 opts = intel_vtd_active() ? thp_iommu : thp_native; 36 37 gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); 38 if (IS_ERR(gemfs)) 39 return PTR_ERR(gemfs); 40 41 i915->mm.gemfs = gemfs; 42 43 drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts); 44 45 return 0; 46 } 47 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 39328567c200..c64c3d39a0f9 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -119,3 +119,49 @@ config DRM_I915_TIMESLICE_DURATION /sys/class/drm/card?/engine/*/timeslice_duration_ms May be 0 to disable timeslicing. + +choice + prompt "Transparent Hugepage Support (native)" + default DRM_I915_THP_NATIVE_NEVER + help + Select the preferred method for allocating from Transparent Hugepages + when IOMMU is not enabled. + + config DRM_I915_THP_NATIVE_NEVER + bool "Never" + + config DRM_I915_THP_NATIVE_WITHIN + bool "Within" + + config DRM_I915_THP_NATIVE_ALWAYS + bool "Always" +endchoice + +config DRM_I915_THP_NATIVE + string + default "always" if DRM_I915_THP_NATIVE_ALWAYS + default "within_size" if DRM_I915_THP_NATIVE_WITHIN + default "never" if DRM_I915_THP_NATIVE_NEVER + +choice + prompt "Transparent Hugepage Support (IOMMU)" + default DRM_I915_THP_IOMMU_WITHIN + help + Select the preferred method for allocating from Transparent Hugepages + with IOMMU active. + + config DRM_I915_THP_IOMMU_NEVER + bool "Never" + + config DRM_I915_THP_IOMMU_WITHIN + bool "Within" + + config DRM_I915_THP_IOMMU_ALWAYS + bool "Always" +endchoice + +config DRM_I915_THP_IOMMU + string + default "always" if DRM_I915_THP_IOMMU_ALWAYS + default "within_size" if DRM_I915_THP_IOMMU_WITHIN + default "never" if DRM_I915_THP_IOMMU_NEVER diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c index 5e6e8c91ab38..b71d2b2d2ada 100644 --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c @@ -13,8 +13,11 @@ int i915_gemfs_init(struct drm_i915_private *i915) { + char thp_native[] = "huge=" CONFIG_DRM_I915_THP_NATIVE; + char thp_iommu[] = "huge=" CONFIG_DRM_I915_THP_IOMMU; struct file_system_type *type; struct vfsmount *gemfs; + char *opts; type = get_fs_type("tmpfs"); if (!type) @@ -26,15 +29,19 @@ int i915_gemfs_init(struct drm_i915_private *i915) * * One example, although it is probably better with a per-file * control, is selecting huge page allocations ("huge=within_size"). - * Currently unused due to bandwidth issues (slow reads) on Broadwell+. + * However, we only do so to offset the overhead of iommu lookups + * due to bandwidth issues (slow reads) on Broadwell+. */ + opts = intel_vtd_active() ? thp_iommu : thp_native; - gemfs = kern_mount(type); + gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts); if (IS_ERR(gemfs)) return PTR_ERR(gemfs); i915->mm.gemfs = gemfs; + drm_info(&i915->drm, "Transparent Hugepage mode '%s'", opts); + return 0; }