Message ID | 20230427234843.2886921-7-alan.previn.teres.alexis@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/pxp: Add MTL PXP Support | expand |
On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote: > Because of the additional firmware, component-driver and > initialization depedencies required on MTL platform before a > PXP context can be created, UMD calling for PXP creation as a > way to get-caps can take a long time. An actual real world > customer stack has seen this happen in the 4-to-8 second range > after the kernel starts (which sees MESA's init appear in the > middle of this range as the compositor comes up). To avoid > unncessary delays experienced by the UMD for get-caps purposes, > add a GET_PARAM for I915_PARAM_PXP_SUPPORT. > alan:snip. Progress update on the UMD side - I'm working on patch for PR here: https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57 but taking time to verify certain code paths
On 2023-05-04 22:30:07, Teres Alexis, Alan Previn wrote: > On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote: > > Because of the additional firmware, component-driver and > > initialization depedencies required on MTL platform before a > > PXP context can be created, UMD calling for PXP creation as a > > way to get-caps can take a long time. An actual real world > > customer stack has seen this happen in the 4-to-8 second range > > after the kernel starts (which sees MESA's init appear in the > > middle of this range as the compositor comes up). To avoid > > unncessary delays experienced by the UMD for get-caps purposes, > > add a GET_PARAM for I915_PARAM_PXP_SUPPORT. > > > alan:snip. > Progress update on the UMD side - I'm working on patch for PR here: > https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57 > but taking time to verify certain code paths Just to confirm, if I915_PARAM_PXP_STATUS returns 2 ("will be ready soon"), then it is basically certain that in a production environment, then it will eventually return 1 meaning it's ready, right? If this is correct, then I think that the change in i915_gem_supports_protected_context() is good, and probably we can skip the change in iris_create_hw_context(). Basically, we're timing out for multiple seconds either way. And, I'm hoping that the kernel will eventually get the PXP init done and create the context. I think there's 2 cases of interest here. First, and most likely, the application running doesn't care about protected content. In this case we can quickly advertise the support, but there will be no consequence because the application won't use the feature. Second, the application does care about protected content. In this case we can quickly advertise the support, but if the feature is used too quickly, then the context create might take a long time. If I915_PARAM_PXP_STATUS returning 2 has a reasonable chance in a production environment of eventually finding out that pxp can't work, then perhaps more disussion is needed. I guess the worst case scenario is no worse than today though. (Except it is still somewhat better, because the common case would not involve protected content being used by the application.) Another option besides from the timeout loop in iris_create_hw_context() might be to check I915_PARAM_PXP_STATUS after the context create fails to tweak the debug message. -Jordan
...fixing some Cc email addresses I somehow mangled. On 2023-05-10 12:40:07, Souza, Jose wrote: > On Mon, 2023-05-08 at 17:49 +0000, Teres Alexis, Alan Previn wrote: > > On Fri, 2023-05-05 at 00:39 -0700, Justen, Jordan L wrote: > > > On 2023-05-04 22:30:07, Teres Alexis, Alan Previn wrote: > > > > On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote: > > > > > Because of the additional firmware, component-driver and > > > > > initialization depedencies required on MTL platform before a > > > > > PXP context can be created, UMD calling for PXP creation as a > > > > > way to get-caps can take a long time. An actual real world > > > > > customer stack has seen this happen in the 4-to-8 second range > > > > > after the kernel starts (which sees MESA's init appear in the > > > > > middle of this range as the compositor comes up). To avoid > > > > > unncessary delays experienced by the UMD for get-caps purposes, > > > > > add a GET_PARAM for I915_PARAM_PXP_SUPPORT. > > > > > > > > > alan:snip. > > > > Progress update on the UMD side - I'm working on patch for PR here: > > > > https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57 > > > > but taking time to verify certain code paths > > > > > > Just to confirm, if I915_PARAM_PXP_STATUS returns 2 ("will be ready > > > soon"), then it is basically certain that in a production environment, > > > then it will eventually return 1 meaning it's ready, right? > > alan: yes - but not 100%. non-platform-state-machine could still > > cause unexpected failures for example, [1] if hw was fused in a way > > that doesnt permit PXP or [2] enabling certain BIOS debug knobs > > doesnt allow PXP. However at the moment, there is no way for us > > to know for sure without actually creating a protected context. > > Of course having hw-fusing + bios-config that supports PXP have > > always 100% succeeded for me. > > In my opinion it is problematic mark that we support protected > context but then it fails to create protected context. This is why I asked if it was it was "basically certain that in a production environment, then it will eventually return 1 meaning it's ready". Alan's response was a little ambiguous on this point. But, considering the number of applications that will need this feature is low, combined with an extremely low chance that the kernel will end up going from 2 (will be ready soon) to -ENODEV (will never work), well, it seems worth considering advertising it with no delay even if it later fails if used. Arguably a transition from 2 to -ENODEV could be considered a kernel bug, but it doesn't sound like Alan would agree. :) Maybe Alan would agree to saying it's either a kernel, or system integration bug. I think it'd also be ok if we didn't advertise support if an application starts when the kernel is still in the 2 (will be ready soon) state. But, some environments might prefer to wait, so I think the kernel uapi should stay as Alan has it now so we have the flexibility to potentially accommodate this. (Perhaps with driconf, or a build flag, or an env-var.) -Jordan
alan:snip > This is why I asked if it was it was "basically certain that in a > production environment, then it will eventually return 1 meaning it's > ready". Alan's response was a little ambiguous on this point. alan: if we get a '2' and never transition to '1' - thats a kernel bug or firmware / system issue. > Arguably a transition from 2 to -ENODEV could be considered a kernel > bug, but it doesn't sound like Alan would agree. :) Maybe Alan would > agree to saying it's either a kernel, or system integration bug. alan: agreed - that would be a kernel bug or a system integration bug. I also agreed on the init-flow vs app-usage thoughts Jordan had. That said MESA has many ways it can use this UAPI and we can discuss that on the MESA patch. hmmm.. so... ack anyone? [insert big hopeful smiley here] apologies if I am asking too often.
On 2023-05-10 15:00:03, Teres Alexis, Alan Previn wrote: > > alan:snip > > > This is why I asked if it was it was "basically certain that in a > > production environment, then it will eventually return 1 meaning it's > > ready". Alan's response was a little ambiguous on this point. > alan: if we get a '2' and never transition to '1' - thats a kernel bug or > firmware / system issue. > > > Arguably a transition from 2 to -ENODEV could be considered a kernel > > bug, but it doesn't sound like Alan would agree. :) Maybe Alan would > > agree to saying it's either a kernel, or system integration bug. > alan: agreed - that would be a kernel bug or a system integration bug. > > I also agreed on the init-flow vs app-usage thoughts Jordan had. > That said MESA has many ways it can use this UAPI and we can discuss > that on the MESA patch. > > > hmmm.. so... ack anyone? [insert big hopeful smiley here] > apologies if I am asking too often. Assuming that: 2 = PXP feature is supported but should be ready soon (pending initialization of non-i915 system dependencies). really means, "it'll be ready soon or there is a bug somewhere", Acked-by: Jordan Justen <jordan.l.justen@intel.com> If Mesa finds that it can't really rely on that, we may have to decide on a different approach in how to use that return value. Is it possible to hold on for another ~12 hours to see if Lionel wants to Ack as well? -Jordan
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 2238e096c957..6f11d7eaa91a 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -5,6 +5,8 @@ #include "gem/i915_gem_mman.h" #include "gt/intel_engine_user.h" +#include "pxp/intel_pxp.h" + #include "i915_cmd_parser.h" #include "i915_drv.h" #include "i915_getparam.h" @@ -102,6 +104,11 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, if (value < 0) return value; break; + case I915_PARAM_PXP_STATUS: + value = intel_pxp_get_readiness_status(i915->pxp); + if (value < 0) + return value; + break; case I915_PARAM_MMAP_GTT_VERSION: /* Though we've started our numbering from 1, and so class all * earlier versions as 0, in effect their value is undefined as diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c index b600d68de2a4..f143eadbc253 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c @@ -358,23 +358,38 @@ void intel_pxp_end(struct intel_pxp *pxp) } /* - * the arb session is restarted from the irq work when we receive the - * termination completion interrupt + * this helper is used by both intel_pxp_start and by + * the GET_PARAM IOCTL that user space calls. Thus, the + * return values here should match the UAPI spec. */ -int intel_pxp_start(struct intel_pxp *pxp) +int intel_pxp_get_readiness_status(struct intel_pxp *pxp) { - int ret = 0; - if (!intel_pxp_is_enabled(pxp)) return -ENODEV; if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) { if (wait_for(intel_pxp_gsccs_is_ready_for_sessions(pxp), 250)) - return -ENXIO; + return 2; } else { if (wait_for(pxp_component_bound(pxp), 250)) - return -ENXIO; + return 2; } + return 1; +} + +/* + * the arb session is restarted from the irq work when we receive the + * termination completion interrupt + */ +int intel_pxp_start(struct intel_pxp *pxp) +{ + int ret = 0; + + ret = intel_pxp_get_readiness_status(pxp); + if (ret < 0) + return ret; + else if (ret > 1) + return -EIO; /* per UAPI spec, user may retry later */ mutex_lock(&pxp->arb_mutex); diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h index 3ded0890cd27..17e6dbc86b38 100644 --- a/drivers/gpu/drm/i915/pxp/intel_pxp.h +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h @@ -26,6 +26,7 @@ void intel_pxp_fini_hw(struct intel_pxp *pxp); void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp); void intel_pxp_tee_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id); +int intel_pxp_get_readiness_status(struct intel_pxp *pxp); int intel_pxp_start(struct intel_pxp *pxp); void intel_pxp_end(struct intel_pxp *pxp); diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index b15dd9cc2ffb..f4efbb2db0c5 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -771,6 +771,25 @@ typedef struct drm_i915_irq_wait { */ #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57 +/* + * Query the status of PXP support in i915. + * + * The query can fail in the following scenarios with the listed error codes: + * -ENODEV = PXP support is not available on the GPU device or in the + * kernel due to missing component drivers or kernel configs. + * + * If the IOCTL is successful, the returned parameter will be set to one of + * the following values: + * 1 = PXP feature is supported and is ready for use. + * 2 = PXP feature is supported but should be ready soon (pending + * initialization of non-i915 system dependencies). + * + * NOTE: When param is supported (positive return values), user space should + * still refer to the GEM PXP context-creation UAPI header specs to be + * aware of possible failure due to system state machine at the time. + */ +#define I915_PARAM_PXP_STATUS 58 + /* Must be kept compact -- no holes and well documented */ /**
Because of the additional firmware, component-driver and initialization depedencies required on MTL platform before a PXP context can be created, UMD calling for PXP creation as a way to get-caps can take a long time. An actual real world customer stack has seen this happen in the 4-to-8 second range after the kernel starts (which sees MESA's init appear in the middle of this range as the compositor comes up). To avoid unncessary delays experienced by the UMD for get-caps purposes, add a GET_PARAM for I915_PARAM_PXP_SUPPORT. However, some failures can still occur after all the depedencies are met (such as firmware init flow failure, bios configurations or SOC fusing not allowing PXP enablement). Those scenarios will only be known to user space when it attempts creating a PXP context and is documented in the GEM UAPI headers. While making this change, create a helper that is common to both GET_PARAM caller and intel_pxp_start since the latter does similiar checks. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> --- drivers/gpu/drm/i915/i915_getparam.c | 7 +++++++ drivers/gpu/drm/i915/pxp/intel_pxp.c | 29 +++++++++++++++++++++------- drivers/gpu/drm/i915/pxp/intel_pxp.h | 1 + include/uapi/drm/i915_drm.h | 19 ++++++++++++++++++ 4 files changed, 49 insertions(+), 7 deletions(-)