From patchwork Thu Mar 3 22:37:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12768217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A56C7C433F5 for ; Thu, 3 Mar 2022 22:37:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 31B7810E3AA; Thu, 3 Mar 2022 22:37:40 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 824DD10E38E; Thu, 3 Mar 2022 22:37:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646347058; x=1677883058; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KSF7CMpi5AlZE2XAyqc7iF8DrwVsfOp+HReq8WXU0as=; b=hAab3LuXRHzcwxbI2Hapymcl61/KL7mXRzicCme39ldFuu10oNDAXH/n T8eh70GEYqB3oB/Ded2p9Zcg6d3JFioulvkZTGqisvVKejEh9/8wZYaZK ViMjYFTl5VMaMscA1v1f00JUqdTHcqvYNDzTn2/9Zcd9u+In2Oozjz/mP FyaalMg5VOm9wI68ulUUb/pBwhY7Xc5n0LSeU+2iqcLX4f7cAL+SPr/z4 ZOAR+6nWRf3Bfz1ZIQ7K6RwnTkupiUkd2dJlaibdONJLz5ThdXinkNAYO mKCn0o1oT6XOf9HnFG66QSePUx7McjSHw2Qu5SVtMJaw/YJAEuVjbjoqa Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10275"; a="233794763" X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="233794763" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Mar 2022 14:37:38 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="609745283" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by fmsmga004.fm.intel.com with ESMTP; 03 Mar 2022 14:37:37 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Thu, 3 Mar 2022 14:37:34 -0800 Message-Id: <20220303223737.708659-2-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303223737.708659-1-John.C.Harrison@Intel.com> References: <20220303223737.708659-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH v3 1/4] drm/i915/guc: Limit scheduling properties to avoid overflow X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison GuC converts the pre-emption timeout and timeslice quantum values into clock ticks internally. That significantly reduces the point of 32bit overflow. On current platforms, worst case scenario is approximately 110 seconds. Rather than allowing the user to set higher values and then get confused by early timeouts, add limits when setting these values. v2: Add helper functins for clamping (review feedback from Tvrtko). Signed-off-by: John Harrison Reviewed-by: Daniele Ceraolo Spurio (v1) Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_engine.h | 6 ++ drivers/gpu/drm/i915/gt/intel_engine_cs.c | 69 +++++++++++++++++++++ drivers/gpu/drm/i915/gt/sysfs_engines.c | 25 +++++--- drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 9 +++ 4 files changed, 99 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 1c0ab05c3c40..d7044c4e526e 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -351,4 +351,10 @@ intel_engine_get_hung_context(struct intel_engine_cs *engine) return engine->hung_ce; } +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value); +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value); +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value); +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value); +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value); + #endif /* _INTEL_RINGBUFFER_H_ */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 7447411a5b26..22e70e4e007c 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,26 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, engine->flags |= I915_ENGINE_HAS_EU_PRIORITY; } + /* Cap properties according to any system limits */ +#define CLAMP_PROP(field) \ + do { \ + u64 clamp = intel_clamp_##field(engine, engine->props.field); \ + if (clamp != engine->props.field) { \ + drm_notice(&engine->i915->drm, \ + "Warning, clamping %s to %lld to prevent overflow\n", \ + #field, clamp); \ + engine->props.field = clamp; \ + } \ + } while (0) + + CLAMP_PROP(heartbeat_interval_ms); + CLAMP_PROP(max_busywait_duration_ns); + CLAMP_PROP(preempt_timeout_ms); + CLAMP_PROP(stop_timeout_ms); + CLAMP_PROP(timeslice_duration_ms); + +#undef CLAMP_PROP + engine->defaults = engine->props; /* never to change again */ engine->context_size = intel_engine_context_size(gt, engine->class); @@ -464,6 +484,55 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, return 0; } +u64 intel_clamp_heartbeat_interval_ms(struct intel_engine_cs *engine, u64 value) +{ + value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)); + + return value; +} + +u64 intel_clamp_max_busywait_duration_ns(struct intel_engine_cs *engine, u64 value) +{ + value = min(value, jiffies_to_nsecs(2)); + + return value; +} + +u64 intel_clamp_preempt_timeout_ms(struct intel_engine_cs *engine, u64 value) +{ + /* + * NB: The GuC API only supports 32bit values. However, the limit is further + * reduced due to internal calculations which would otherwise overflow. + */ + if (intel_guc_submission_is_wanted(&engine->gt->uc.guc)) + value = min_t(u64, value, GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS); + + value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)); + + return value; +} + +u64 intel_clamp_stop_timeout_ms(struct intel_engine_cs *engine, u64 value) +{ + value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)); + + return value; +} + +u64 intel_clamp_timeslice_duration_ms(struct intel_engine_cs *engine, u64 value) +{ + /* + * NB: The GuC API only supports 32bit values. However, the limit is further + * reduced due to internal calculations which would otherwise overflow. + */ + if (intel_guc_submission_is_wanted(&engine->gt->uc.guc)) + value = min_t(u64, value, GUC_POLICY_MAX_EXEC_QUANTUM_MS); + + value = min_t(u64, value, jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)); + + return value; +} + static void __setup_engine_capabilities(struct intel_engine_cs *engine) { struct drm_i915_private *i915 = engine->i915; diff --git a/drivers/gpu/drm/i915/gt/sysfs_engines.c b/drivers/gpu/drm/i915/gt/sysfs_engines.c index 967031056202..f2d9858d827c 100644 --- a/drivers/gpu/drm/i915/gt/sysfs_engines.c +++ b/drivers/gpu/drm/i915/gt/sysfs_engines.c @@ -144,7 +144,7 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { struct intel_engine_cs *engine = kobj_to_engine(kobj); - unsigned long long duration; + unsigned long long duration, clamped; int err; /* @@ -168,7 +168,8 @@ max_spin_store(struct kobject *kobj, struct kobj_attribute *attr, if (err) return err; - if (duration > jiffies_to_nsecs(2)) + clamped = intel_clamp_max_busywait_duration_ns(engine, duration); + if (duration != clamped) return -EINVAL; WRITE_ONCE(engine->props.max_busywait_duration_ns, duration); @@ -203,7 +204,7 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { struct intel_engine_cs *engine = kobj_to_engine(kobj); - unsigned long long duration; + unsigned long long duration, clamped; int err; /* @@ -218,7 +219,8 @@ timeslice_store(struct kobject *kobj, struct kobj_attribute *attr, if (err) return err; - if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)) + clamped = intel_clamp_timeslice_duration_ms(engine, duration); + if (duration != clamped) return -EINVAL; WRITE_ONCE(engine->props.timeslice_duration_ms, duration); @@ -256,7 +258,7 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { struct intel_engine_cs *engine = kobj_to_engine(kobj); - unsigned long long duration; + unsigned long long duration, clamped; int err; /* @@ -272,7 +274,8 @@ stop_store(struct kobject *kobj, struct kobj_attribute *attr, if (err) return err; - if (duration > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)) + clamped = intel_clamp_stop_timeout_ms(engine, duration); + if (duration != clamped) return -EINVAL; WRITE_ONCE(engine->props.stop_timeout_ms, duration); @@ -306,7 +309,7 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { struct intel_engine_cs *engine = kobj_to_engine(kobj); - unsigned long long timeout; + unsigned long long timeout, clamped; int err; /* @@ -322,7 +325,8 @@ preempt_timeout_store(struct kobject *kobj, struct kobj_attribute *attr, if (err) return err; - if (timeout > jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)) + clamped = intel_clamp_preempt_timeout_ms(engine, timeout); + if (timeout != clamped) return -EINVAL; WRITE_ONCE(engine->props.preempt_timeout_ms, timeout); @@ -362,7 +366,7 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) { struct intel_engine_cs *engine = kobj_to_engine(kobj); - unsigned long long delay; + unsigned long long delay, clamped; int err; /* @@ -379,7 +383,8 @@ heartbeat_store(struct kobject *kobj, struct kobj_attribute *attr, if (err) return err; - if (delay >= jiffies_to_msecs(MAX_SCHEDULE_TIMEOUT)) + clamped = intel_clamp_heartbeat_interval_ms(engine, delay); + if (delay != clamped) return -EINVAL; err = intel_engine_set_heartbeat(engine, delay); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h index 4b300b6cc0f9..a2d574f2fdd5 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h @@ -262,6 +262,15 @@ struct guc_lrc_desc { #define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000 +/* + * GuC converts the timeout to clock ticks internally. Different platforms have + * different GuC clocks. Thus, the maximum value before overflow is platform + * dependent. Current worst case scenario is about 110s. So, limit to 100s to be + * safe. + */ +#define GUC_POLICY_MAX_EXEC_QUANTUM_MS (100 * 1000) +#define GUC_POLICY_MAX_PREEMPT_TIMEOUT_MS (100 * 1000) + struct guc_policies { u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES]; /* In micro seconds. How much time to allow before DPC processing is From patchwork Thu Mar 3 22:37:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12768216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 46923C433F5 for ; Thu, 3 Mar 2022 22:37:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D69E210E397; Thu, 3 Mar 2022 22:37:39 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id A6EF310E38C; Thu, 3 Mar 2022 22:37:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646347058; x=1677883058; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vbMQmDdPfuVsr5zjC8+pTrfY7ZfaPiZ1RbUUMT6Lfoo=; b=lZtxVt4UzBZnPjt3tDj9imyu4ZzXwyNPQ/Vv2RAZHjnrENwpFUc4KWpe 4SM0OGIPCSaWdW3fT/qbQ8CRb9961rBX1A8HamuLJ4Nb+uCbix1V5Hi2m lyHsl6adIkbA3j1MJ7n+4cXYAFAM3t2yGdo7a3ghvzQrs1nuHdy+zZo8W hDwPkrS/kx2aGBpYFh+GWBCNpVmEyRDbDFi10N36RnKcZ9pBi+1MgOp0s 1mkgSCBofe84fv1KsyCAB7hR+tUZaOtZ/jHicV5LvUMR+rGqXfTm8ZlAL MItc21stfF8bW+5ryrmHaYxj+FJ9ZlHDlwsw1eodM5VDAVVbfv06tc37w Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10275"; a="233794764" X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="233794764" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Mar 2022 14:37:38 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="609745288" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by fmsmga004.fm.intel.com with ESMTP; 03 Mar 2022 14:37:37 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Thu, 3 Mar 2022 14:37:35 -0800 Message-Id: <20220303223737.708659-3-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303223737.708659-1-John.C.Harrison@Intel.com> References: <20220303223737.708659-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH v3 2/4] drm/i915: Fix compute pre-emption w/a to apply to compute engines X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Vetter , DRI-Devel@Lists.FreeDesktop.Org, Chris Wilson , Matthew Auld , =?utf-8?q?Thomas_Hellstr=C3=B6m?= , Jani Nikula , Lucas De Marchi , =?utf-8?q?Micha=C5=82_Winiarski?= , Tejas Upadhyay Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison An earlier patch added support for compute engines. However, it missed enabling the anti-pre-emption w/a for the new engine class. So move the 'compute capable' flag earlier and use it for the pre-emption w/a test. Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions") Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Cc: Aravind Iddamsetty Cc: Matt Roper Cc: Tvrtko Ursulin Cc: Daniel Vetter Cc: Maarten Lankhorst Cc: Lucas De Marchi Cc: John Harrison Cc: Jason Ekstrand Cc: "Michał Winiarski" Cc: Matthew Brost Cc: Chris Wilson Cc: Tejas Upadhyay Cc: Umesh Nerlige Ramappa Cc: "Thomas Hellström" Cc: Stuart Summers Cc: Matthew Auld Cc: Jani Nikula Cc: Ramalingam C Cc: Akeem G Abodunrin Signed-off-by: John Harrison Reviewed-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 22e70e4e007c..4185c7338581 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -421,6 +421,12 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, engine->logical_mask = BIT(logical_instance); __sprint_engine_name(engine); + /* features common between engines sharing EUs */ + if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) { + engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE; + engine->flags |= I915_ENGINE_HAS_EU_PRIORITY; + } + engine->props.heartbeat_interval_ms = CONFIG_DRM_I915_HEARTBEAT_INTERVAL; engine->props.max_busywait_duration_ns = @@ -433,15 +439,9 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, CONFIG_DRM_I915_TIMESLICE_DURATION; /* Override to uninterruptible for OpenCL workloads. */ - if (GRAPHICS_VER(i915) == 12 && engine->class == RENDER_CLASS) + if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE)) engine->props.preempt_timeout_ms = 0; - /* features common between engines sharing EUs */ - if (engine->class == RENDER_CLASS || engine->class == COMPUTE_CLASS) { - engine->flags |= I915_ENGINE_HAS_RCS_REG_STATE; - engine->flags |= I915_ENGINE_HAS_EU_PRIORITY; - } - /* Cap properties according to any system limits */ #define CLAMP_PROP(field) \ do { \ From patchwork Thu Mar 3 22:37:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12768218 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 00C99C433FE for ; Thu, 3 Mar 2022 22:37:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A00AF10E3BF; Thu, 3 Mar 2022 22:37:41 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id CE2C710E38E; Thu, 3 Mar 2022 22:37:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646347058; x=1677883058; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/+ZaC5eBzBDrJItv3YIuvFo1Ck8vEL6cCpX5y8GOfbE=; b=UKZip2Fr+YfweOCFW5/YPOe664sPsKiXnwUyieBHl5Zx97y6Bg5f0tFp E8X1FTfgDJUtdG/Rk3Qtrq1MymbteyNTkQeol9Q3T5NNkmRv/6GlNuu8I KyhneNardNph+ooW8UcxshhfpkKPreWdHfbLvWbLp/55W/ACoCcSj+Akb y9tTYYmZ01StjU1YXq9i0EYhhvZJZrEb5MFpuX5QrLMO6o+bl3hAr3tyc DZwmLx0toL+V8ygJfSxrZXdL7umb55aeJZyuWyJ9Tf5Us9uU0rA+fB6+M 1kOg6TnhMOrzO6UCC080TjFPX++wxa1zKR23mZANKK6PzvUTZDcNtlAsX Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10275"; a="233794765" X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="233794765" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Mar 2022 14:37:38 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="609745291" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by fmsmga004.fm.intel.com with ESMTP; 03 Mar 2022 14:37:38 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Thu, 3 Mar 2022 14:37:36 -0800 Message-Id: <20220303223737.708659-4-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303223737.708659-1-John.C.Harrison@Intel.com> References: <20220303223737.708659-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH v3 3/4] drm/i915: Make the heartbeat play nice with long pre-emption timeouts X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison Compute workloads are inherently not pre-emptible for long periods on current hardware. As a workaround for this, the pre-emption timeout for compute capable engines was disabled. This is undesirable with GuC submission as it prevents per engine reset of hung contexts. Hence the next patch will re-enable the timeout but bumped up by an order of magnitude. However, the heartbeat might not respect that. Depending upon current activity, a pre-emption to the heartbeat pulse might not even be attempted until the last heartbeat period. Which means that only one period is granted for the pre-emption to occur. With the aforesaid bump, the pre-emption timeout could be significantly larger than this heartbeat period. So adjust the heartbeat code to take the pre-emption timeout into account. When it reaches the final (high priority) period, it now ensures the delay before hitting reset is bigger than the pre-emption timeout. v2: Fix for selftests which adjust the heartbeat period manually. Signed-off-by: John Harrison --- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index a3698f611f45..0dc53def8e42 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -22,9 +22,27 @@ static bool next_heartbeat(struct intel_engine_cs *engine) { + struct i915_request *rq; long delay; delay = READ_ONCE(engine->props.heartbeat_interval_ms); + + rq = engine->heartbeat.systole; + + if (rq && rq->sched.attr.priority >= I915_PRIORITY_BARRIER && + delay == engine->defaults.heartbeat_interval_ms) { + long longer; + + /* + * The final try is at the highest priority possible. Up until now + * a pre-emption might not even have been attempted. So make sure + * this last attempt allows enough time for a pre-emption to occur. + */ + longer = READ_ONCE(engine->props.preempt_timeout_ms) * 2; + if (longer > delay) + delay = longer; + } + if (!delay) return false; From patchwork Thu Mar 3 22:37:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Harrison X-Patchwork-Id: 12768219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A3087C433F5 for ; Thu, 3 Mar 2022 22:37:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7770F10E3B7; Thu, 3 Mar 2022 22:37:42 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id F387110E38C; Thu, 3 Mar 2022 22:37:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646347058; x=1677883058; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=11J2CUQdpabO4ZfG+3yEoif29r2jD1GKLWPchj+jM2w=; b=e2fyq0GvYBcB7t/XUfwGFtwE5e04uq32K1lksVhDv1qy03LSUEaBy4IS RiEITqgsnRBXCxKkAn+nc2zpa7J5wEMMDastGT6BWRdqcMto7vxvhYybN Es0NLnYq2eAtBFEQksidv6kG10WHgDlhrGhlOmBL2thIHMwMnINclGvwn 08FmG6Q+kQwF39UXXnUIz336HNHm7GPN9hy6fUvEihjaNY/Zgc6L4WISk F3AMQMz3xf879dIssVE8sSbCOROIktz05txGq63mdzx7JtOZjgR/IEUNd RvFO2bCImuJg+8NsHFdZ0gNPFiUWffe/yzond6mAy5pjP+Bkn5Zh1Q2zk Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10275"; a="233794766" X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="233794766" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Mar 2022 14:37:38 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,153,1643702400"; d="scan'208";a="609745295" Received: from relo-linux-5.jf.intel.com ([10.165.21.134]) by fmsmga004.fm.intel.com with ESMTP; 03 Mar 2022 14:37:38 -0800 From: John.C.Harrison@Intel.com To: Intel-GFX@Lists.FreeDesktop.Org Date: Thu, 3 Mar 2022 14:37:37 -0800 Message-Id: <20220303223737.708659-5-John.C.Harrison@Intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303223737.708659-1-John.C.Harrison@Intel.com> References: <20220303223737.708659-1-John.C.Harrison@Intel.com> MIME-Version: 1.0 Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ Subject: [Intel-gfx] [PATCH v3 4/4] drm/i915: Improve long running OCL w/a for GuC submission X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Mrozek , DRI-Devel@Lists.FreeDesktop.Org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: John Harrison A workaround was added to the driver to allow OpenCL workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12. It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine. However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism. The purpose of the workaround is actually to give OpenCL tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and OpenCL can have long running threads. So, rather than disabling the timeout completely, just set it to a 'long' value. v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists. Signed-off-by: John Harrison Reviewed-by: Daniele Ceraolo Spurio (v1) Acked-by: Michal Mrozek Acked-by: Michal Mrozek Signed-off-by: John Harrison Reviewed-by: Daniele Ceraolo Spurio (v1) Acked-by: Michal Mrozek --- drivers/gpu/drm/i915/Kconfig.profile | 26 +++++++++++++++++++---- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 ++++++-- 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 39328567c200..7cc38d25ee5c 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -57,10 +57,28 @@ config DRM_I915_PREEMPT_TIMEOUT default 640 # milliseconds help How long to wait (in milliseconds) for a preemption event to occur - when submitting a new context via execlists. If the current context - does not hit an arbitration point and yield to HW before the timer - expires, the HW will be reset to allow the more important context - to execute. + when submitting a new context. If the current context does not hit + an arbitration point and yield to HW before the timer expires, the + HW will be reset to allow the more important context to execute. + + This is adjustable via + /sys/class/drm/card?/engine/*/preempt_timeout_ms + + May be 0 to disable the timeout. + + The compiled in default may get overridden at driver probe time on + certain platforms and certain engines which will be reflected in the + sysfs control. + +config DRM_I915_PREEMPT_TIMEOUT_COMPUTE + int "Preempt timeout for compute engines (ms, jiffy granularity)" + default 7500 # milliseconds + help + How long to wait (in milliseconds) for a preemption event to occur + when submitting a new context to a compute capable engine. If the + current context does not hit an arbitration point and yield to HW + before the timer expires, the HW will be reset to allow the more + important context to execute. This is adjustable via /sys/class/drm/card?/engine/*/preempt_timeout_ms diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4185c7338581..cc0954ad836a 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -438,9 +438,14 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id, engine->props.timeslice_duration_ms = CONFIG_DRM_I915_TIMESLICE_DURATION; - /* Override to uninterruptible for OpenCL workloads. */ + /* + * Mid-thread pre-emption is not available in Gen12. Unfortunately, + * some OpenCL workloads run quite long threads. That means they get + * reset due to not pre-empting in a timely manner. So, bump the + * pre-emption timeout value to be much higher for compute engines. + */ if (GRAPHICS_VER(i915) == 12 && (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE)) - engine->props.preempt_timeout_ms = 0; + engine->props.preempt_timeout_ms = CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE; /* Cap properties according to any system limits */ #define CLAMP_PROP(field) \