Message ID | 1596871502-3432-1-git-send-email-yangtiezhu@loongson.cn (mailing list archive) |
---|---|
State | Not Applicable |
Series | gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64 |
On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
> Loongson processors have a writecombine issue that may fail to
> write back the framebuffer used with ATI Radeon or AMD GPU at times.
> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
> mapping for MIPS"), there are errors such as a blurred screen
> and lockups.
>
> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
> influence on the other platforms.

Well, it's not my call to take or reject this patch, but I already
indicated it might be better to disable writecombine on the CPU
detection side (or do you have other devices where writecombining
works?). Something like below will disable it for all loongson64 CPUs.
If you now find out where it works and where it doesn't, you can even
reduce it to the required minimum of affected CPUs.

Thomas.

diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index def1659fe262..cdd87009e931 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -2043,7 +2043,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 			set_isa(c, MIPS_CPU_ISA_M64R2);
 			break;
 		}
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		c->ases |= (MIPS_ASE_LOONGSON_MMI | MIPS_ASE_LOONGSON_EXT |
 			MIPS_ASE_LOONGSON_EXT2);
 		break;
@@ -2073,7 +2072,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 		 * register, we correct it here.
 		 */
 		c->options |= MIPS_CPU_FTLB | MIPS_CPU_TLBINV | MIPS_CPU_LDPTE;
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		c->ases |= (MIPS_ASE_LOONGSON_MMI | MIPS_ASE_LOONGSON_CAM |
 			MIPS_ASE_LOONGSON_EXT | MIPS_ASE_LOONGSON_EXT2);
 		c->ases &= ~MIPS_ASE_VZ; /* VZ of Loongson-3A2000/3000 is incomplete */
@@ -2084,7 +2082,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 		set_elf_platform(cpu, "loongson3a");
 		set_isa(c, MIPS_CPU_ISA_M64R2);
 		decode_cpucfg(c);
-		c->writecombine = _CACHE_UNCACHED_ACCELERATED;
 		break;
 	default:
 		panic("Unknown Loongson Processor ID!");
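
The effect of the diff above can be modeled outside the kernel. The following is only a sketch under stated assumptions: the enum values, `struct cpuinfo_model`, `probe_cpu()` and `writecombine_attr()` are hypothetical stand-ins, and it assumes the per-CPU `writecombine` field otherwise defaults to a plain uncached attribute. It shows why removing the assignment in the probe code disables write-combining for every driver at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MIPS cache-attribute encodings. */
enum cache_attr {
	CACHE_UNCACHED,             /* plain uncached accesses */
	CACHE_UNCACHED_ACCELERATED, /* write-combining ("uncached accelerated") */
};

/* Simplified model of the per-CPU data filled in at probe time. */
struct cpuinfo_model {
	enum cache_attr writecombine;
};

/*
 * Model of the probe step (assumption: the default is plain uncached).
 * A CPU type opts in to write-combining only when it is known to
 * handle it correctly; dropping the opt-in is what the diff does.
 */
static void probe_cpu(struct cpuinfo_model *c, bool wc_is_reliable)
{
	assert(c != NULL);

	c->writecombine = CACHE_UNCACHED;
	if (wc_is_reliable)
		c->writecombine = CACHE_UNCACHED_ACCELERATED;
}

/* Every write-combine mapping request resolves through the probed value. */
static enum cache_attr writecombine_attr(const struct cpuinfo_model *c)
{
	assert(c != NULL);

	return c->writecombine;
}
```

With this shape, no driver needs to know about the CPU: a driver asks for a write-combined mapping and transparently gets an uncached one on affected processors.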
On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>> Loongson processors have a writecombine issue that may fail to
>> write back the framebuffer used with ATI Radeon or AMD GPU at times.
>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>> mapping for MIPS"), there are errors such as a blurred screen
>> and lockups.
>>
>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
>> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
>> influence on the other platforms.
> Well, it's not my call to take or reject this patch, but I already
> indicated it might be better to disable writecombine on the CPU
> detection side (or do you have other devices where writecombining
> works?). Something like below will disable it for all loongson64 CPUs.
> If you now find out where it works and where it doesn't, you can even
> reduce it to the required minimum of affected CPUs.

Hi Tiezhu, Thomas,

Yes, writecombine works well on LS7A's internal GPU.
It even works well with some AMD GPUs (in my case, RX550).

Tiezhu, is it possible to investigate the issue more deeply within Loongson?
Probably we just need to add some barrier to maintain data coherency,
or disable writecombine for the AMD GPU's command buffer and leave the
texture/frame buffer WC accelerated.

Thanks.

- Jiaxun
On 08.08.20 at 15:50, Jiaxun Yang wrote:
>
>
> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>> Loongson processors have a writecombine issue that may fail to
>>> write back the framebuffer used with ATI Radeon or AMD GPU at times.
>>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>> mapping for MIPS"), there are errors such as a blurred screen
>>> and lockups.
>>>
>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
>>> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
>>> influence on the other platforms.
>> Well, it's not my call to take or reject this patch, but I already
>> indicated it might be better to disable writecombine on the CPU
>> detection side (or do you have other devices where writecombining
>> works?). Something like below will disable it for all loongson64 CPUs.
>> If you now find out where it works and where it doesn't, you can even
>> reduce it to the required minimum of affected CPUs.
> Hi Tiezhu, Thomas,
>
> Yes, writecombine works well on LS7A's internal GPU.
> It even works well with some AMD GPUs (in my case, RX550).

In this case the patch is a clear NAK since you haven't root caused the
issue and are just working around it in a very questionable manner.

>
> Tiezhu, is it possible to investigate the issue more deeply within Loongson?
> Probably we just need to add some barrier to maintain data coherency,
> or disable writecombine for the AMD GPU's command buffer and leave the
> texture/frame buffer WC accelerated.

Have you moved any buffer to VRAM and forgotten to add an HDP
flush/invalidate?

The acceleration is not much of a problem, but if WC doesn't work in
general you need to disable it for the whole CPU and not for individual
drivers.

Regards,
Christian.

>
> Thanks.
>
> - Jiaxun
On 08/09/2020 08:13 PM, Christian König wrote:
> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>>
>>
>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that may fail to
>>>> write back the framebuffer used with ATI Radeon or AMD GPU at times.
>>>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there are errors such as a blurred screen
>>>> and lockups.
>>>>
>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
>>>> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
>>>> influence on the other platforms.
>>> Well, it's not my call to take or reject this patch, but I already
>>> indicated it might be better to disable writecombine on the CPU
>>> detection side (or do you have other devices where writecombining
>>> works?). Something like below will disable it for all loongson64 CPUs.
>>> If you now find out where it works and where it doesn't, you can even
>>> reduce it to the required minimum of affected CPUs.
>> Hi Tiezhu, Thomas,
>>
>> Yes, writecombine works well on LS7A's internal GPU.
>> It even works well with some AMD GPUs (in my case, RX550).
>
> In this case the patch is a clear NAK since you haven't root caused
> the issue and are just working around it in a very questionable manner.
>
>>
>> Tiezhu, is it possible to investigate the issue more deeply within Loongson?
>> Probably we just need to add some barrier to maintain data coherency,
>> or disable writecombine for the AMD GPU's command buffer and leave the
>> texture/frame buffer WC accelerated.
>
> Have you moved any buffer to VRAM and forgotten to add an HDP
> flush/invalidate?
>
> The acceleration is not much of a problem, but if WC doesn't work in
> general you need to disable it for the whole CPU and not for
> individual drivers.

Hi Thomas, Jiaxun and Christian,

Thank you very much for your suggestions.

Actually, this patch is a temporary solution that just makes it work;
it is not a proper and final solution.

I understand your opinions. It will take some time to find the root cause.

Thanks,
Tiezhu

>
> Regards,
> Christian.
>
>>
>> Thanks.
>>
>> - Jiaxun
On 2020-08-09 2:13 p.m., Christian König wrote:
> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>> Loongson processors have a writecombine issue that may fail to
>>>> write back the framebuffer used with ATI Radeon or AMD GPU at times.
>>>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>> mapping for MIPS"), there are errors such as a blurred screen
>>>> and lockups.
>>>>
>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
>>>> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
>>>> influence on the other platforms.
>>> Well, it's not my call to take or reject this patch, but I already
>>> indicated it might be better to disable writecombine on the CPU
>>> detection side (or do you have other devices where writecombining
>>> works?). Something like below will disable it for all loongson64 CPUs.
>>> If you now find out where it works and where it doesn't, you can even
>>> reduce it to the required minimum of affected CPUs.
>> Hi Tiezhu, Thomas,
>>
>> Yes, writecombine works well on LS7A's internal GPU.
>> It even works well with some AMD GPUs (in my case, RX550).
>
> In this case the patch is a clear NAK since you haven't root caused the
> issue and are just working around it in a very questionable manner.

To be fair though, amdgpu & radeon are already disabling write-combining
for system memory pages in 32-bit x86 kernels for similar reasons.
On 10.08.20 at 12:50, Michel Dänzer wrote:
> On 2020-08-09 2:13 p.m., Christian König wrote:
>> On 08.08.20 at 15:50, Jiaxun Yang wrote:
>>> On 2020/8/8 at 9:41 PM, Thomas Bogendoerfer wrote:
>>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
>>>>> Loongson processors have a writecombine issue that may fail to
>>>>> write back the framebuffer used with ATI Radeon or AMD GPU at times.
>>>>> After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
>>>>> mapping for MIPS"), there are errors such as a blurred screen
>>>>> and lockups.
>>>>>
>>>>> Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
>>>>> Loongson64 to work well with ATI Radeon or AMD GPU; it has no
>>>>> influence on the other platforms.
>>>> Well, it's not my call to take or reject this patch, but I already
>>>> indicated it might be better to disable writecombine on the CPU
>>>> detection side (or do you have other devices where writecombining
>>>> works?). Something like below will disable it for all loongson64 CPUs.
>>>> If you now find out where it works and where it doesn't, you can even
>>>> reduce it to the required minimum of affected CPUs.
>>> Hi Tiezhu, Thomas,
>>>
>>> Yes, writecombine works well on LS7A's internal GPU.
>>> It even works well with some AMD GPUs (in my case, RX550).
>> In this case the patch is a clear NAK since you haven't root caused the
>> issue and are just working around it in a very questionable manner.
> To be fair though, amdgpu & radeon are already disabling write-combining
> for system memory pages in 32-bit x86 kernels for similar reasons.

Yeah, well that is USWC for system memory. But this is about WC for the
VRAM BAR.

When we don't understand or don't correctly implement something on the
platform for USWC, then this is annoying, but not a serious issue.

But when the hardware doesn't correctly implement WC for PCIe BARs, then
this is a violation of the PCIe spec and a somewhat more serious issue
for the whole platform.

We can work around that by disabling WC for PCIe BARs on the whole
platform, or behind specific bridges, and so on, but patching each
individual driver so that it works is not really the right approach.

Cheers,
Christian.
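
The alternatives named above (disable WC for the whole platform, or only behind specific bridges) amount to one central policy that every driver's mapping request passes through. The following is only a sketch of that idea; `struct platform_wc_policy`, `bar_map_mode()` and the integer bridge IDs are hypothetical and do not correspond to any existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum map_mode { MAP_UNCACHED, MAP_WRITE_COMBINE };

/* Hypothetical description of what the platform knows about WC support. */
struct platform_wc_policy {
	bool wc_broken_everywhere;  /* WC to PCIe BARs is unreliable platform-wide */
	const int *broken_bridges;  /* IDs of bridges behind which WC misbehaves */
	size_t nr_broken_bridges;
};

/*
 * Decide how a PCIe BAR gets mapped. Drivers only state whether they
 * would like write-combining; the platform policy has the final say.
 */
static enum map_mode bar_map_mode(const struct platform_wc_policy *p,
				  int bridge_id, bool want_wc)
{
	size_t i;

	assert(p != NULL);

	if (!want_wc || p->wc_broken_everywhere)
		return MAP_UNCACHED;

	for (i = 0; i < p->nr_broken_bridges; i++)
		if (p->broken_bridges[i] == bridge_id)
			return MAP_UNCACHED;

	return MAP_WRITE_COMBINE;
}
```

Keeping the decision in one place means a later, narrower fix (for example, once the root cause is known) changes a single policy rather than every driver.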
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 5ac7b55..9f785f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -136,8 +136,11 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
 
 		places[c].fpfn = 0;
 		places[c].lpfn = 0;
-		places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
-			TTM_PL_FLAG_VRAM;
+		if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+			places[c].flags = TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
+		else
+			places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED |
+				TTM_PL_FLAG_VRAM;
 
 		if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 			places[c].lpfn = visible_pfn;
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index f3dee01..c6cede6 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -112,15 +112,23 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 		    rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size) {
 			rbo->placements[c].fpfn =
 				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
-			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
-						     TTM_PL_FLAG_UNCACHED |
-						     TTM_PL_FLAG_VRAM;
+			if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+				rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
+							     TTM_PL_FLAG_VRAM;
+			else
+				rbo->placements[c++].flags = TTM_PL_FLAG_WC |
+							     TTM_PL_FLAG_UNCACHED |
+							     TTM_PL_FLAG_VRAM;
 		}
 
 		rbo->placements[c].fpfn = 0;
-		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
-					     TTM_PL_FLAG_UNCACHED |
-					     TTM_PL_FLAG_VRAM;
+		if (IS_ENABLED(CONFIG_MACH_LOONGSON64))
+			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
+		else
+			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
+						     TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
 	}
 
 	if (domain & RADEON_GEM_DOMAIN_GTT) {
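
The patch repeats the same conditional at three call sites. As an illustration only (this is not what was posted, and the thread rejected the per-driver approach as a whole), the selection logic can be written once. The `PL_FLAG_*` bit values and `vram_placement_flags()` below are hypothetical; the real `TTM_PL_FLAG_*` constants are defined in the kernel's TTM headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values standing in for the kernel's TTM_PL_FLAG_* constants. */
#define PL_FLAG_VRAM     (1u << 0)
#define PL_FLAG_UNCACHED (1u << 1)
#define PL_FLAG_WC       (1u << 2)

/*
 * Compute the caching flags for a VRAM placement. VRAM is always
 * mappable uncached; write-combining is added only when the platform
 * is not the one the patch special-cases.
 */
static uint32_t vram_placement_flags(bool is_loongson64)
{
	uint32_t flags = PL_FLAG_UNCACHED | PL_FLAG_VRAM;

	if (!is_loongson64)
		flags |= PL_FLAG_WC;

	/* Postcondition: the result is always a VRAM placement. */
	assert(flags & PL_FLAG_VRAM);
	return flags;
}
```

In the patch itself the condition is `IS_ENABLED(CONFIG_MACH_LOONGSON64)`, which is resolved at build time, so other platforms keep exactly the flags they had before.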
Loongson processors have a writecombine issue that may fail to
write back the framebuffer used with ATI Radeon or AMD GPU at times.
After commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), there are errors such as a blurred screen
and lockups.

Remove the flag TTM_PL_FLAG_WC of VRAM to fix the writecombine issue for
Loongson64 to work well with ATI Radeon or AMD GPU; it has no
influence on the other platforms.

[ 60.958721] radeon 0000:03:00.0: ring 0 stalled for more than 10079msec
[ 60.965315] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000112 last fence id 0x000000000000011d on ring 0)
[ 60.976525] radeon 0000:03:00.0: ring 3 stalled for more than 10086msec
[ 60.983156] radeon 0000:03:00.0: GPU lockup (current fence id 0x0000000000000374 last fence id 0x00000000000003a8 on ring 3)

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  7 +++++--
 drivers/gpu/drm/radeon/radeon_object.c     | 20 ++++++++++++++------
 2 files changed, 19 insertions(+), 8 deletions(-)