Message ID | 1502458107-28852-1-git-send-email-shashank.sharma@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 2017-08-11 at 18:58 +0530, Shashank Sharma wrote: > It's an observation during some CI tests that few LSPCON chips > respond slow while system is under load, and need some delay Thanks for the patch. Can you please explain what you mean by load here? I can't follow why an external chip would respond slow if the system is under load. > while reading current mode status using i2c-over-aux channel. > Is there a fdo bug where I can see more CI logs? The only log I saw has this pattern around the time where this error occurs. I am not sure how GPU hangs affect AUX transactions. But, I think we should investigate/fix the underlying problem instead of adding delays. [ 78.758510] i915 0000:00:02.0: Resetting vcs0 after gpu hang [ 78.758922] [drm:i915_gem_reset_engine [i915]] context kms_busy[1372]/0 marked guilty (score 19) banned? no [ 78.758941] [drm:i915_gem_reset_engine [i915]] resetting vcs0 to restart from tail of request 0x62 [ 78.758990] [drm:gen8_init_common_ring [i915]] Execlists enabled for vcs0 [ 78.759014] [drm:gen8_init_common_ring [i915]] Restarting vcs0:0 from 0x62 [ 78.759482] [drm:intel_power_well_enable [i915]] enabling always-on [ 78.759499] [drm:intel_power_well_enable [i915]] enabling DC off [ 78.759908] [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00 [ 78.759932] [drm:intel_power_well_enable [i915]] enabling power well 2 [ 78.759993] [drm:intel_atomic_commit_tail [i915]] [ENCODER:57:DDI B] [ 78.760021] [drm:intel_atomic_commit_tail [i915]] [ENCODER:59:DP-MST A] [ 78.760079] [drm:intel_atomic_commit_tail [i915]] [ENCODER:60:DP-MST B] [ 78.760107] [drm:intel_atomic_commit_tail [i915]] [ENCODER:61:DP-MST C] [ 78.760136] [drm:intel_atomic_commit_tail [i915]] [ENCODER:64:DDI C] [ 78.760163] [drm:intel_atomic_commit_tail [i915]] [ENCODER:66:DP-MST A] [ 78.760191] [drm:intel_atomic_commit_tail [i915]] [ENCODER:67:DP-MST B] [ 78.760218] [drm:intel_atomic_commit_tail [i915]] [ENCODER:68:DP-MST C] [ 78.760246] [drm:verify_single_dpll_state.isra.70 [i915]] DPLL 0 [ 78.760275] [drm:verify_single_dpll_state.isra.70 [i915]] DPLL 1 [ 78.760304] [drm:verify_single_dpll_state.isra.70 [i915]] DPLL 2 [ 78.760332] [drm:verify_single_dpll_state.isra.70 [i915]] DPLL 3 [ 78.760362] [drm:intel_enable_shared_dpll [i915]] enable DPLL 1 (active 1, on? 0) for crtc 36 [ 78.760390] [drm:intel_enable_shared_dpll [i915]] enabling DPLL 1 [ 78.762529] [drm:intel_power_well_enable [i915]] enabling DDI B IO power well [ 78.763146] [drm:drm_dp_i2c_do_msg] native defer [ 78.764268] [drm:drm_dp_i2c_do_msg] native defer [ 78.765735] [drm:drm_dp_i2c_do_msg] native defer [ 78.766794] [drm:drm_dp_i2c_do_msg] native defer [ 78.767853] [drm:drm_dp_i2c_do_msg] native defer [ 78.768982] [drm:drm_dp_i2c_do_msg] native defer [ 78.770112] [drm:drm_dp_i2c_do_msg] native defer [ 78.771240] [drm:drm_dp_i2c_do_msg] native defer [ 78.772371] [drm:drm_dp_i2c_do_msg] native defer [ 78.773068] [drm:drm_dp_i2c_do_msg] too many retries, giving up [ 78.773507] [drm:drm_dp_i2c_do_msg] native defer [ 78.774531] [drm:drm_dp_i2c_do_msg] native defer [ 78.775629] [drm:drm_dp_i2c_do_msg] native defer [ 78.776668] [drm:drm_dp_i2c_do_msg] native defer [ 78.777710] [drm:drm_dp_i2c_do_msg] native defer [ 78.778810] [drm:drm_dp_i2c_do_msg] native defer [ 78.779907] [drm:drm_dp_i2c_do_msg] native defer [ 78.780585] [drm:drm_dp_i2c_do_msg] too many retries, giving up [ 78.780589] [drm:drm_lspcon_get_mode] *ERROR* LSPCON read(0x80, 0x41) failed [ 78.780621] [drm:lspcon_wait_mode [i915]] *ERROR* Error reading LSPCON mode [ 78.780646] [drm:lspcon_wait_mode [i915]] Current LSPCON mode INVALID > This patch: > - Adds few retries and delays before declaring a read > failure from LSPCON hardware. > - Changes the debug level of the print from ERROR->DEBUG > whereas another patch in I915 will add an ERROR message > from the caller when we have timed out all our limits. > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > Cc: Imre Deak <imre.deak@intel.com> > Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> > --- > drivers/gpu/drm/drm_dp_dual_mode_helper.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > index 80e62f6..c63eac8 100644 > --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c > +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > enum drm_lspcon_mode *mode) > { > u8 data; > + u8 retry = 5; > int ret = 0; > > if (!mode) { > @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > } > > /* Read Status: i2c over aux */ > - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, > - &data, sizeof(data)); > + do { > + ret = drm_dp_dual_mode_read(adapter, > + DP_DUAL_MODE_LSPCON_CURRENT_MODE, > + &data, sizeof(data)); > + if (!ret || !retry--) > + break; > + usleep_range(500, 1000); Are these numbers from a spec? I think it'd be pertinent to mention the source. > + } while (1); > + > if (ret < 0) { > - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); > + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); > return -EFAULT; > } >
On Fri, Aug 11, 2017 at 06:58:26PM +0530, Shashank Sharma wrote: > It's an observation during some CI tests that few LSPCON chips > respond slow while system is under load, and need some delay > while reading current mode status using i2c-over-aux channel. > > This patch: > - Adds few retries and delays before declaring a read > failure from LSPCON hardware. > - Changes the debug level of the print from ERROR->DEBUG > whereas another patch in I915 will add an ERROR message > from the caller when we have timed out all our limits. Hm yea, I was wondering if this is the same issue we saw earlier due to HPD not being asserted. But looking at the logs it's not the case. After HPD is asserted it really takes this much time (~36ms) for the adaptor to respond. This is against the DP 1.4 spec which specifies a 20ms maximum wake up time, but what can you do. > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > Cc: Imre Deak <imre.deak@intel.com> > Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> > --- > drivers/gpu/drm/drm_dp_dual_mode_helper.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > index 80e62f6..c63eac8 100644 > --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c > +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > enum drm_lspcon_mode *mode) > { > u8 data; > + u8 retry = 5; Nitpick: no reason for a specific type so just int? > int ret = 0; > > if (!mode) { > @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > } > > /* Read Status: i2c over aux */ > - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, > - &data, sizeof(data)); > + do { Nitpick: a for loop whenever possible is generally clearer. With the above optionally changed: Reviewed-by: Imre Deak <imre.deak@intel.com> > + ret = drm_dp_dual_mode_read(adapter, > + DP_DUAL_MODE_LSPCON_CURRENT_MODE, > + &data, sizeof(data)); > + if (!ret || !retry--) > + break; > + usleep_range(500, 1000); > + } while (1); > + > if (ret < 0) { > - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); > + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); > return -EFAULT; > } > > -- > 2.7.4 >
Thanks for the review, Imre. My comments, inline. Regards Shashank On 8/16/2017 7:35 PM, Imre Deak wrote: > On Fri, Aug 11, 2017 at 06:58:26PM +0530, Shashank Sharma wrote: >> It's an observation during some CI tests that few LSPCON chips >> respond slow while system is under load, and need some delay >> while reading current mode status using i2c-over-aux channel. >> >> This patch: >> - Adds few retries and delays before declaring a read >> failure from LSPCON hardware. >> - Changes the debug level of the print from ERROR->DEBUG >> whereas another patch in I915 will add an ERROR message >> from the caller when we have timed out all our limits. > Hm yea, I was wondering if this is the same issue we saw earlier due to > HPD not being asserted. But looking at the logs it's not the case. After > HPD is asserted it really takes this much time (~36ms) for the adaptor > to respond. This is against the DP 1.4 spec which specifies a 20ms > maximum wake up time, but what can you do. > >> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> >> Cc: Imre Deak <imre.deak@intel.com> >> Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> >> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> >> --- >> drivers/gpu/drm/drm_dp_dual_mode_helper.c | 14 +++++++++++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c >> index 80e62f6..c63eac8 100644 >> --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c >> +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c >> @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, >> enum drm_lspcon_mode *mode) >> { >> u8 data; >> + u8 retry = 5; > Nitpick: no reason for a specific type so just int? Actually, was trying to save 3 bytes, as I knew max value for retry would be 5 < 255, but might be too much optimization ? >> int ret = 0; >> >> if (!mode) { >> @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, >> } >> >> /* Read Status: i2c over aux */ >> - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, >> - &data, sizeof(data)); >> + do { > Nitpick: a for loop whenever possible is generally clearer. Sorry, I dint understand this comment, little more help required :) ? - Shashank > > With the above optionally changed: > Reviewed-by: Imre Deak <imre.deak@intel.com> > >> + ret = drm_dp_dual_mode_read(adapter, >> + DP_DUAL_MODE_LSPCON_CURRENT_MODE, >> + &data, sizeof(data)); >> + if (!ret || !retry--) >> + break; >> + usleep_range(500, 1000); >> + } while (1); >> + >> if (ret < 0) { >> - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); >> + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); >> return -EFAULT; >> } >> >> -- >> 2.7.4 >>
On Wed, Aug 16, 2017 at 09:18:58PM +0530, Sharma, Shashank wrote: > Thanks for the review, Imre. > > My comments, inline. > > Regards > Shashank > On 8/16/2017 7:35 PM, Imre Deak wrote: > > On Fri, Aug 11, 2017 at 06:58:26PM +0530, Shashank Sharma wrote: > > > It's an observation during some CI tests that few LSPCON chips > > > respond slow while system is under load, and need some delay > > > while reading current mode status using i2c-over-aux channel. > > > > > > This patch: > > > - Adds few retries and delays before declaring a read > > > failure from LSPCON hardware. > > > - Changes the debug level of the print from ERROR->DEBUG > > > whereas another patch in I915 will add an ERROR message > > > from the caller when we have timed out all our limits. > > Hm yea, I was wondering if this is the same issue we saw earlier due to > > HPD not being asserted. But looking at the logs it's not the case. After > > HPD is asserted it really takes this much time (~36ms) for the adaptor > > to respond. This is against the DP 1.4 spec which specifies a 20ms > > maximum wake up time, but what can you do. > > > > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > > > Cc: Imre Deak <imre.deak@intel.com> > > > Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> > > > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> > > > --- > > > drivers/gpu/drm/drm_dp_dual_mode_helper.c | 14 +++++++++++--- > > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > > > index 80e62f6..c63eac8 100644 > > > --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c > > > +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c > > > @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > > > enum drm_lspcon_mode *mode) > > > { > > > u8 data; > > > + u8 retry = 5; > > Nitpick: no reason for a specific type so just int? > Actually, was trying to save 3 bytes, as I knew max value for retry would be > 5 < 255, but might be too much optimization ? Yes, imo it's odd to use u8 for simple loop counter. The compiler may even generate some extra code to zero extend or truncate the result. > > > int ret = 0; > > > if (!mode) { > > > @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, > > > } > > > /* Read Status: i2c over aux */ > > > - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, > > > - &data, sizeof(data)); > > > + do { > > Nitpick: a for loop whenever possible is generally clearer. > Sorry, I dint understand this comment, little more help required :) ? for (retry = 0; retry < 5; retry++) ... would be clearer imo. --Imre > - Shashank > > > > With the above optionally changed: > > Reviewed-by: Imre Deak <imre.deak@intel.com> > > > > > + ret = drm_dp_dual_mode_read(adapter, > > > + DP_DUAL_MODE_LSPCON_CURRENT_MODE, > > > + &data, sizeof(data)); > > > + if (!ret || !retry--) > > > + break; > > > + usleep_range(500, 1000); > > > + } while (1); > > > + > > > if (ret < 0) { > > > - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); > > > + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); > > > return -EFAULT; > > > } > > > -- > > > 2.7.4 > > > >
Regards Shashank On 8/16/2017 9:42 PM, Imre Deak wrote: > On Wed, Aug 16, 2017 at 09:18:58PM +0530, Sharma, Shashank wrote: >> Thanks for the review, Imre. >> >> My comments, inline. >> >> Regards >> Shashank >> On 8/16/2017 7:35 PM, Imre Deak wrote: >>> On Fri, Aug 11, 2017 at 06:58:26PM +0530, Shashank Sharma wrote: >>>> It's an observation during some CI tests that few LSPCON chips >>>> respond slow while system is under load, and need some delay >>>> while reading current mode status using i2c-over-aux channel. >>>> >>>> This patch: >>>> - Adds few retries and delays before declaring a read >>>> failure from LSPCON hardware. >>>> - Changes the debug level of the print from ERROR->DEBUG >>>> whereas another patch in I915 will add an ERROR message >>>> from the caller when we have timed out all our limits. >>> Hm yea, I was wondering if this is the same issue we saw earlier due to >>> HPD not being asserted. But looking at the logs it's not the case. After >>> HPD is asserted it really takes this much time (~36ms) for the adaptor >>> to respond. This is against the DP 1.4 spec which specifies a 20ms >>> maximum wake up time, but what can you do. >>> >>>> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> >>>> Cc: Imre Deak <imre.deak@intel.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@intel.com> >>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com> >>>> --- >>>> drivers/gpu/drm/drm_dp_dual_mode_helper.c | 14 +++++++++++--- >>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c >>>> index 80e62f6..c63eac8 100644 >>>> --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c >>>> +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c >>>> @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, >>>> enum drm_lspcon_mode *mode) >>>> { >>>> u8 data; >>>> + u8 retry = 5; >>> Nitpick: no reason for a specific type so just int? >> Actually, was trying to save 3 bytes, as I knew max value for retry would be >> 5 < 255, but might be too much optimization ? > Yes, imo it's odd to use u8 for simple loop counter. The compiler may > even generate some extra code to zero extend or truncate the result. cool, so int it is. >>>> int ret = 0; >>>> if (!mode) { >>>> @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, >>>> } >>>> /* Read Status: i2c over aux */ >>>> - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, >>>> - &data, sizeof(data)); >>>> + do { >>> Nitpick: a for loop whenever possible is generally clearer. >> Sorry, I dint understand this comment, little more help required :) ? > for (retry = 0; retry < 5; retry++) ... > > would be clearer imo. Sure, will do it. > --Imre > >> - Shashank >>> With the above optionally changed: >>> Reviewed-by: Imre Deak <imre.deak@intel.com> >>> >>>> + ret = drm_dp_dual_mode_read(adapter, >>>> + DP_DUAL_MODE_LSPCON_CURRENT_MODE, >>>> + &data, sizeof(data)); >>>> + if (!ret || !retry--) >>>> + break; >>>> + usleep_range(500, 1000); >>>> + } while (1); >>>> + >>>> if (ret < 0) { >>>> - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); >>>> + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); >>>> return -EFAULT; >>>> } >>>> -- >>>> 2.7.4 >>>>
diff --git a/drivers/gpu/drm/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/drm_dp_dual_mode_helper.c index 80e62f6..c63eac8 100644 --- a/drivers/gpu/drm/drm_dp_dual_mode_helper.c +++ b/drivers/gpu/drm/drm_dp_dual_mode_helper.c @@ -409,6 +409,7 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, enum drm_lspcon_mode *mode) { u8 data; + u8 retry = 5; int ret = 0; if (!mode) { @@ -417,10 +418,17 @@ int drm_lspcon_get_mode(struct i2c_adapter *adapter, } /* Read Status: i2c over aux */ - ret = drm_dp_dual_mode_read(adapter, DP_DUAL_MODE_LSPCON_CURRENT_MODE, - &data, sizeof(data)); + do { + ret = drm_dp_dual_mode_read(adapter, + DP_DUAL_MODE_LSPCON_CURRENT_MODE, + &data, sizeof(data)); + if (!ret || !retry--) + break; + usleep_range(500, 1000); + } while (1); + if (ret < 0) { - DRM_ERROR("LSPCON read(0x80, 0x41) failed\n"); + DRM_DEBUG_KMS("LSPCON read(0x80, 0x41) failed\n"); return -EFAULT; }