Message ID | 20200330113302.1670-1-michal.wajdeczko@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/huc: Add more errors for I915_PARAM_HUC_STATUS | expand |
Quoting Michal Wajdeczko (2020-03-30 12:33:02) > There might be many reasons why we failed to successfully > load and authenticate HuC firmware, but today we only use > single error in case of no HuC hardware. Add some more > error codes for most common cases (disabled, not installed, > corrupted or mismatched firmware). > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > Cc: Tony Ye <tony.ye@intel.com> > Cc: Robert M. Fosha <robert.m.fosha@intel.com> > --- > drivers/gpu/drm/i915/gt/uc/intel_huc.c | 22 ++++++++++++++++++---- > 1 file changed, 18 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > index d6097b46600c..1e8073ec343f 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > @@ -200,9 +200,13 @@ int intel_huc_auth(struct intel_huc *huc) > * This function reads status register to verify if HuC > * firmware was successfully loaded. > * > - * Returns: 1 if HuC firmware is loaded and verified, > - * 0 if HuC firmware is not loaded and -ENODEV if HuC > - * is not present on this platform. > + * Returns: > + * * -ENODEV if HuC is not present on this platform, > + * * -EOPNOTSUPP if HuC firmware is disabled, > + * * -ENOPKG if HuC firmware was not installed, > + * * -ENOEXEC if HuC firmware is invalid or mismatched, > + * * 0 if HuC firmware is not running, > + * * 1 if HuC firmware is authenticated and running. > */ > int intel_huc_check_status(struct intel_huc *huc) > { > @@ -210,8 +214,18 @@ int intel_huc_check_status(struct intel_huc *huc) > intel_wakeref_t wakeref; > u32 status = 0; > > - if (!intel_huc_is_supported(huc)) > + switch (__intel_uc_fw_status(&huc->fw)) { > + case INTEL_UC_FIRMWARE_NOT_SUPPORTED: > return -ENODEV; No HW support. > + case INTEL_UC_FIRMWARE_DISABLED: > + return -EOPNOTSUPP; Override by user [sysadmin] > + case INTEL_UC_FIRMWARE_MISSING: > + return -ENOPKG; FILENOTFOUND. > + case INTEL_UC_FIRMWARE_ERROR: > + return -ENOEXEC; File corruption. There's nothing else between us loading the fw and the huc rejecting it? FIRMWARE_FAIL? That's set as the opposite of FIRMWARE_TRANSFERRED in that we failed to upload the image to the HW. The firmware itself hasn't had a chance to run. case INTEL_UC_FIRMWARE_FAIL: return -ENXIO; Or is that being overridden to FIRMWARE_ERROR? Other than the question of whether there's one more step before the fw is being run [and then able to set HUC_STATUS as it determines for itself], Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> -Chris
On 30.03.2020 14:28, Chris Wilson wrote: > Quoting Michal Wajdeczko (2020-03-30 12:33:02) >> There might be many reasons why we failed to successfully >> load and authenticate HuC firmware, but today we only use >> single error in case of no HuC hardware. Add some more >> error codes for most common cases (disabled, not installed, >> corrupted or mismatched firmware). >> >> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> >> Cc: Chris Wilson <chris@chris-wilson.co.uk> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> Cc: Tony Ye <tony.ye@intel.com> >> Cc: Robert M. Fosha <robert.m.fosha@intel.com> >> --- >> drivers/gpu/drm/i915/gt/uc/intel_huc.c | 22 ++++++++++++++++++---- >> 1 file changed, 18 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c >> index d6097b46600c..1e8073ec343f 100644 >> --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c >> @@ -200,9 +200,13 @@ int intel_huc_auth(struct intel_huc *huc) >> * This function reads status register to verify if HuC >> * firmware was successfully loaded. >> * >> - * Returns: 1 if HuC firmware is loaded and verified, >> - * 0 if HuC firmware is not loaded and -ENODEV if HuC >> - * is not present on this platform. >> + * Returns: >> + * * -ENODEV if HuC is not present on this platform, >> + * * -EOPNOTSUPP if HuC firmware is disabled, >> + * * -ENOPKG if HuC firmware was not installed, >> + * * -ENOEXEC if HuC firmware is invalid or mismatched, >> + * * 0 if HuC firmware is not running, >> + * * 1 if HuC firmware is authenticated and running. >> */ >> int intel_huc_check_status(struct intel_huc *huc) >> { >> @@ -210,8 +214,18 @@ int intel_huc_check_status(struct intel_huc *huc) >> intel_wakeref_t wakeref; >> u32 status = 0; >> >> - if (!intel_huc_is_supported(huc)) >> + switch (__intel_uc_fw_status(&huc->fw)) { >> + case INTEL_UC_FIRMWARE_NOT_SUPPORTED: >> return -ENODEV; > > No HW support. > >> + case INTEL_UC_FIRMWARE_DISABLED: >> + return -EOPNOTSUPP; > > Override by user [sysadmin] > >> + case INTEL_UC_FIRMWARE_MISSING: >> + return -ENOPKG; > > FILENOTFOUND. > >> + case INTEL_UC_FIRMWARE_ERROR: >> + return -ENOEXEC; > > File corruption. > > There's nothing else between us loading the fw and the huc rejecting > it? > > FIRMWARE_FAIL? That's set as the opposite of FIRMWARE_TRANSFERRED in > that we failed to upload the image to the HW. The firmware itself hasn't > had a chance to run. > > case INTEL_UC_FIRMWARE_FAIL: > return -ENXIO; > > Or is that being overridden to FIRMWARE_ERROR? No, it's not overridden by FIRMWARE_ERROR (since we use FIRMWARE_ERROR as final state, while with FIRMWARE_FAIL there is a chance for recovery during reset) Also note that FIRMWARE_FAIL case is covered by the register check that we have below, which provides HuC runtime status. And if we decide to use FIRMWARE_FAIL to report -ENXIO, then it is unlikely that we will ever report 0 again for any other fw error that could prevent fw from successful load (now recall your and Joonas position that this param shall stay as reflection of register read). Michal ps. on other hand, if we trust our uc_fw_status() then we can drop that register read, finally decouple GET_PARAM from MMIO_READ and fully rely on cached status: case INTEL_UC_FIRMWARE_RUNNING: return 1; default: return 0; see [1] for my earlier attempt, before uc_fw.status was added [1] https://patchwork.freedesktop.org/patch/306179/?series=60928&rev=1 > > Other than the question of whether there's one more step before the fw > is being run [and then able to set HUC_STATUS as it determines for > itself], > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > -Chris >
Quoting Michal Wajdeczko (2020-03-30 15:02:53) > > > On 30.03.2020 14:28, Chris Wilson wrote: > > There's nothing else between us loading the fw and the huc rejecting > > it? > > > > FIRMWARE_FAIL? That's set as the opposite of FIRMWARE_TRANSFERRED in > > that we failed to upload the image to the HW. The firmware itself hasn't > > had a chance to run. > > > > case INTEL_UC_FIRMWARE_FAIL: > > return -ENXIO; > > > > Or is that being overridden to FIRMWARE_ERROR? > > No, it's not overridden by FIRMWARE_ERROR (since we use FIRMWARE_ERROR > as final state, while with FIRMWARE_FAIL there is a chance for recovery > during reset) > > Also note that FIRMWARE_FAIL case is covered by the register check that > we have below, which provides HuC runtime status. Yes, if it only reports on the auth failure. > And if we decide to use FIRMWARE_FAIL to report -ENXIO, then it is > unlikely that we will ever report 0 again for any other fw error that > could prevent fw from successful load (now recall your and Joonas > position that this param shall stay as reflection of register read). > > Michal > > ps. on other hand, if we trust our uc_fw_status() then we can drop that > register read, finally decouple GET_PARAM from MMIO_READ and fully rely > on cached status: imo, that register read is the icing on the cake. We can tell whether the FW got to the HW, but we can't tell if the HW was truly happy with the FW without asking it. I look at it as exposing an interface for the final capability bits to userspace that the kernel does not have to understand, that go above and beyond the kernel loading the firmware and confirming execution. -Chris
On 30.03.2020 16:12, Chris Wilson wrote: > Quoting Michal Wajdeczko (2020-03-30 15:02:53) >> >> >> On 30.03.2020 14:28, Chris Wilson wrote: >>> There's nothing else between us loading the fw and the huc rejecting >>> it? >>> >>> FIRMWARE_FAIL? That's set as the opposite of FIRMWARE_TRANSFERRED in >>> that we failed to upload the image to the HW. The firmware itself hasn't >>> had a chance to run. >>> >>> case INTEL_UC_FIRMWARE_FAIL: >>> return -ENXIO; >>> >>> Or is that being overridden to FIRMWARE_ERROR? >> >> No, it's not overridden by FIRMWARE_ERROR (since we use FIRMWARE_ERROR >> as final state, while with FIRMWARE_FAIL there is a chance for recovery >> during reset) >> >> Also note that FIRMWARE_FAIL case is covered by the register check that >> we have below, which provides HuC runtime status. > > Yes, if it only reports on the auth failure. > >> And if we decide to use FIRMWARE_FAIL to report -ENXIO, then it is >> unlikely that we will ever report 0 again for any other fw error that >> could prevent fw from successful load (now recall your and Joonas >> position that this param shall stay as reflection of register read). >> >> Michal >> >> ps. on other hand, if we trust our uc_fw_status() then we can drop that >> register read, finally decouple GET_PARAM from MMIO_READ and fully rely >> on cached status: > > imo, that register read is the icing on the cake. We can tell whether > the FW got to the HW, but we can't tell if the HW was truly happy with > the FW without asking it. > > I look at it as exposing an interface for the final capability bits to > userspace that the kernel does not have to understand, that go above and > beyond the kernel loading the firmware and confirming execution. note that kernel already asked HW in intel_huc_auth() for FW status and based on that info changed our cached fw status to RUNNING if and only if HW was happy with that FW (and that shall not change until reset) Michal
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c index d6097b46600c..1e8073ec343f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c @@ -200,9 +200,13 @@ int intel_huc_auth(struct intel_huc *huc) * This function reads status register to verify if HuC * firmware was successfully loaded. * - * Returns: 1 if HuC firmware is loaded and verified, - * 0 if HuC firmware is not loaded and -ENODEV if HuC - * is not present on this platform. + * Returns: + * * -ENODEV if HuC is not present on this platform, + * * -EOPNOTSUPP if HuC firmware is disabled, + * * -ENOPKG if HuC firmware was not installed, + * * -ENOEXEC if HuC firmware is invalid or mismatched, + * * 0 if HuC firmware is not running, + * * 1 if HuC firmware is authenticated and running. */ int intel_huc_check_status(struct intel_huc *huc) { @@ -210,8 +214,18 @@ int intel_huc_check_status(struct intel_huc *huc) intel_wakeref_t wakeref; u32 status = 0; - if (!intel_huc_is_supported(huc)) + switch (__intel_uc_fw_status(&huc->fw)) { + case INTEL_UC_FIRMWARE_NOT_SUPPORTED: return -ENODEV; + case INTEL_UC_FIRMWARE_DISABLED: + return -EOPNOTSUPP; + case INTEL_UC_FIRMWARE_MISSING: + return -ENOPKG; + case INTEL_UC_FIRMWARE_ERROR: + return -ENOEXEC; + default: + break; + } with_intel_runtime_pm(gt->uncore->rpm, wakeref) status = intel_uncore_read(gt->uncore, huc->status.reg);
There might be many reasons why we failed to successfully load and authenticate HuC firmware, but today we only use single error in case of no HuC hardware. Add some more error codes for most common cases (disabled, not installed, corrupted or mismatched firmware). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tony Ye <tony.ye@intel.com> Cc: Robert M. Fosha <robert.m.fosha@intel.com> --- drivers/gpu/drm/i915/gt/uc/intel_huc.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-)