Message ID | 20210216161924.1687-1-diego.viola@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drm/nouveau/pmu: fix timeout on GP108 | expand |
On Wed, Feb 17, 2021 at 1:20 AM Diego Viola <diego.viola@gmail.com> wrote: > > This code times out on GP108, probably because the BIOS puts it into a > bad state. > > Since we reset the PMU on driver load anyway, we are at no risk from > missing a response from it since we are not waiting for one to begin > with. This looks safe to me, provided indeed that the PMU's reset is not called outside of initialization (which for GP108 is shouldn't be IIRC?). > > Signed-off-by: Diego Viola <diego.viola@gmail.com> > --- > drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > index a0fe607c9c07..5c802f2d00cb 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) > if (!pmu->func->enabled(pmu)) > return 0; > > - /* Inhibit interrupts, and wait for idle. */ > + /* Inhibit interrupts. */ > nvkm_wr32(device, 0x10a014, 0x0000ffff); > - nvkm_msec(device, 2000, > - if (!nvkm_rd32(device, 0x10a04c)) > - break; > - ); > > /* Reset. */ > if (pmu->func->reset) > -- > 2.30.1 >
On Wed, 17 Feb 2021 at 13:30, Alexandre Courbot <gnurou@gmail.com> wrote: > > On Wed, Feb 17, 2021 at 1:20 AM Diego Viola <diego.viola@gmail.com> wrote: > > > > This code times out on GP108, probably because the BIOS puts it into a > > bad state. > > > > Since we reset the PMU on driver load anyway, we are at no risk from > > missing a response from it since we are not waiting for one to begin > > with. > > This looks safe to me, provided indeed that the PMU's reset is not > called outside of initialization (which for GP108 is shouldn't be > IIRC?). ISTR that the PMU FW we use prior to GM200 might depend on that being there. I've posted a proposed alternate fix here[1], as we probably shouldn't have been touching PMU there anyway on those GPUs. Ben. [1] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 > > > > > Signed-off-by: Diego Viola <diego.viola@gmail.com> > > --- > > drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > index a0fe607c9c07..5c802f2d00cb 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) > > if (!pmu->func->enabled(pmu)) > > return 0; > > > > - /* Inhibit interrupts, and wait for idle. */ > > + /* Inhibit interrupts. */ > > nvkm_wr32(device, 0x10a014, 0x0000ffff); > > - nvkm_msec(device, 2000, > > - if (!nvkm_rd32(device, 0x10a04c)) > > - break; > > - ); > > > > /* Reset. */ > > if (pmu->func->reset) > > -- > > 2.30.1 > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi Ben, On Wed, Feb 24, 2021 at 6:50 AM Ben Skeggs <skeggsb@gmail.com> wrote: > > On Wed, 17 Feb 2021 at 13:30, Alexandre Courbot <gnurou@gmail.com> wrote: > > > > On Wed, Feb 17, 2021 at 1:20 AM Diego Viola <diego.viola@gmail.com> wrote: > > > > > > This code times out on GP108, probably because the BIOS puts it into a > > > bad state. > > > > > > Since we reset the PMU on driver load anyway, we are at no risk from > > > missing a response from it since we are not waiting for one to begin > > > with. > > > > This looks safe to me, provided indeed that the PMU's reset is not > > called outside of initialization (which for GP108 is shouldn't be > > IIRC?). > ISTR that the PMU FW we use prior to GM200 might depend on that being there. > > I've posted a proposed alternate fix here[1], as we probably shouldn't > have been touching PMU there anyway on those GPUs. > > Ben. > > [1] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 > > > > > > > > > Signed-off-by: Diego Viola <diego.viola@gmail.com> > > > --- > > > drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- > > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > index a0fe607c9c07..5c802f2d00cb 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) > > > if (!pmu->func->enabled(pmu)) > > > return 0; > > > > > > - /* Inhibit interrupts, and wait for idle. */ > > > + /* Inhibit interrupts. */ > > > nvkm_wr32(device, 0x10a014, 0x0000ffff); > > > - nvkm_msec(device, 2000, > > > - if (!nvkm_rd32(device, 0x10a04c)) > > > - break; > > > - ); > > > > > > /* Reset. */ > > > if (pmu->func->reset) > > > -- > > > 2.30.1 > > > > > _______________________________________________ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel I tested your patch and can confirm that the timeout is gone after booting my system, but unfortunately it's back after doing a suspend/resume. Any ideas about that? Thanks, Diego
Hi Ben, I can confirm that your last two patches[0][1] fix the timeout issues (those from a normal boot and from suspend/resume). [0] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 [1] https://github.com/skeggsb/linux/commit/0ee6dc49601359042fd254bbd8ba6b4685b4d0d7 Tested-by: Diego Viola <diego.viola@gmail.com> on both patches. Thanks, I appreciate it a lot. Diego On Wed, Feb 24, 2021 at 6:50 AM Ben Skeggs <skeggsb@gmail.com> wrote: > > On Wed, 17 Feb 2021 at 13:30, Alexandre Courbot <gnurou@gmail.com> wrote: > > > > On Wed, Feb 17, 2021 at 1:20 AM Diego Viola <diego.viola@gmail.com> wrote: > > > > > > This code times out on GP108, probably because the BIOS puts it into a > > > bad state. > > > > > > Since we reset the PMU on driver load anyway, we are at no risk from > > > missing a response from it since we are not waiting for one to begin > > > with. > > > > This looks safe to me, provided indeed that the PMU's reset is not > > called outside of initialization (which for GP108 is shouldn't be > > IIRC?). > ISTR that the PMU FW we use prior to GM200 might depend on that being there. > > I've posted a proposed alternate fix here[1], as we probably shouldn't > have been touching PMU there anyway on those GPUs. > > Ben. > > [1] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 > > > > > > > > > Signed-off-by: Diego Viola <diego.viola@gmail.com> > > > --- > > > drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- > > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > index a0fe607c9c07..5c802f2d00cb 100644 > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) > > > if (!pmu->func->enabled(pmu)) > > > return 0; > > > > > > - /* Inhibit interrupts, and wait for idle. */ > > > + /* Inhibit interrupts. */ > > > nvkm_wr32(device, 0x10a014, 0x0000ffff); > > > - nvkm_msec(device, 2000, > > > - if (!nvkm_rd32(device, 0x10a04c)) > > > - break; > > > - ); > > > > > > /* Reset. */ > > > if (pmu->func->reset) > > > -- > > > 2.30.1 > > > > > _______________________________________________ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, Feb 25, 2021 at 2:22 AM Diego Viola <diego.viola@gmail.com> wrote: > > Hi Ben, > > I can confirm that your last two patches[0][1] fix the timeout issues > (those from a normal boot and from suspend/resume). > > [0] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 > [1] https://github.com/skeggsb/linux/commit/0ee6dc49601359042fd254bbd8ba6b4685b4d0d7 > > Tested-by: Diego Viola <diego.viola@gmail.com> > > on both patches. > > Thanks, I appreciate it a lot. > > Diego > > On Wed, Feb 24, 2021 at 6:50 AM Ben Skeggs <skeggsb@gmail.com> wrote: > > > > On Wed, 17 Feb 2021 at 13:30, Alexandre Courbot <gnurou@gmail.com> wrote: > > > > > > On Wed, Feb 17, 2021 at 1:20 AM Diego Viola <diego.viola@gmail.com> wrote: > > > > > > > > This code times out on GP108, probably because the BIOS puts it into a > > > > bad state. > > > > > > > > Since we reset the PMU on driver load anyway, we are at no risk from > > > > missing a response from it since we are not waiting for one to begin > > > > with. > > > > > > This looks safe to me, provided indeed that the PMU's reset is not > > > called outside of initialization (which for GP108 is shouldn't be > > > IIRC?). > > ISTR that the PMU FW we use prior to GM200 might depend on that being there. > > > > I've posted a proposed alternate fix here[1], as we probably shouldn't > > have been touching PMU there anyway on those GPUs. > > > > Ben. > > > > [1] https://github.com/skeggsb/linux/commit/90224a17437b1f39dbecbb385567c1fce958f992 > > > > > > > > > > > > > Signed-off-by: Diego Viola <diego.viola@gmail.com> > > > > --- > > > > drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- > > > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > > index a0fe607c9c07..5c802f2d00cb 100644 > > > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c > > > > @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) > > > > if (!pmu->func->enabled(pmu)) > > > > return 0; > > > > > > > > - /* Inhibit interrupts, and wait for idle. */ > > > > + /* Inhibit interrupts. */ > > > > nvkm_wr32(device, 0x10a014, 0x0000ffff); > > > > - nvkm_msec(device, 2000, > > > > - if (!nvkm_rd32(device, 0x10a04c)) > > > > - break; > > > > - ); > > > > > > > > /* Reset. */ > > > > if (pmu->func->reset) > > > > -- > > > > 2.30.1 > > > > > > > _______________________________________________ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel Ugh, sorry for breaking the regular email flow/order. Damn you gmail! Regards, Diego
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c index a0fe607c9c07..5c802f2d00cb 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c @@ -102,12 +102,8 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu) if (!pmu->func->enabled(pmu)) return 0; - /* Inhibit interrupts, and wait for idle. */ + /* Inhibit interrupts. */ nvkm_wr32(device, 0x10a014, 0x0000ffff); - nvkm_msec(device, 2000, - if (!nvkm_rd32(device, 0x10a04c)) - break; - ); /* Reset. */ if (pmu->func->reset)
This code times out on GP108, probably because the BIOS puts it into a bad state. Since we reset the PMU on driver load anyway, we are at no risk from missing a response from it since we are not waiting for one to begin with. Signed-off-by: Diego Viola <diego.viola@gmail.com> --- drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)