Message ID | 1438252085-4773-1-git-send-email-pure.logic@nexus-software.ie (mailing list archive) |
---|---|
State | New, archived |
On 30/07/15 15:52, Bryan O'Donoghue wrote:
> On 30/07/15 15:49, Peter Hurley wrote:
>> On 07/30/2015 10:12 AM, Ilia Mirkin wrote:
>>> Is this happening with libdrm 2.4.60? If so, that's a known
>>> (user-side) issue and should be fixed by using any version but that
>>> one.
>>
>> What's the freedesktop bugzilla # for reference?
>>
>> Regards,
>> Peter Hurley
>
> I believe it's this one
>
> https://bugs.freedesktop.org/show_bug.cgi?id=89842#c19
>

Not really a world of choice on ubuntu to fix it though...

deckard@aineko:~/Development/projectara$ apt-show-versions libdrm2
libdrm2:amd64/trusty-updates 2.4.60-2~ubuntu14.04.1 uptodate
libdrm2:i386/trusty-updates 2.4.60-2~ubuntu14.04.1 uptodate

:(
On 30/07/15 16:02, Ilia Mirkin wrote:
>
> That's unfortunate. I know next to nothing about debian/ubuntu or how
> they do versions or how to even build packages for them. But they're
> big distros, presumably they have support teams of some sort, perhaps
> they can help you.
>
> Assuming that switching away does resolve the issue for you, perhaps
> you can also recommend that they avoid shipping that version, or
> include this nouveau fix in it:
>
> http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294
>
> This whole libdrm thing is a bit of a cluster%@#$ unfortunately --
> 2.4.60 is broken for nouveau, building even the latest released
> xf86-video-intel against 2.4.61+ causes it to not start ("fixed" in
> xf86-video-intel git), and newer mesa requires libdrm 2.4.60+.
>
> -ilia
>

Matter of fact

apt-cache show libdrm2
sudo apt-get install libdrm2=2.4.56-1~ubuntu2
#sudo echo "package libdrm2" | sudo dpkg --set-selections

I'll give it a go at the end of the working day - should give enough
time to recover if it all goes spectacularly wrong :)
On 31/07/15 01:03, Bryan O'Donoghue wrote:
> On 30/07/15 22:45, Peter Hurley wrote:
>> [ +cc Debian maintainer ]
>>
>> On 07/30/2015 11:26 AM, Emil Velikov wrote:
>>> On 30 July 2015 at 16:02, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
>>>> Assuming that switching away does resolve the issue for you, perhaps
>>>> you can also recommend that they avoid shipping that version, or
>>>> include this nouveau fix in it:
>>>>
>>>> http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294
>>>>
>>> Fwiw debian has been tracking this as #789759, and they are shipping
>>> 2.4.62 which includes the fix.
>>
>> Unfortunately the LTS version of Ubuntu (trusty) was updated to 2.4.60
>> several days ago without this fix.
>>
>> I repackaged libdrm 2.4.60 with only the bug fix above and confirm the
>> patch above fixes the observed behavior in freedesktop bug# 89842/
>> debian bug# 789759.
>>
>> I pushed the repackage to Launchpad PPA @ ppa:phurley/libdrm
>>
>> Hopefully the Debian maintainer grabs this fix and updates the official
>> distribution version soon.
>>
>> Regards,
>> Peter Hurley
>
> Yep.
>
> Dropping down to 2.4.56-1~ubuntu2 definitely removes the
>
> nouveau E[chrome[2737]] multiple instances of buffer 33 on validation list
> nouveau E[chrome[2737]] validate_init
> nouveau E[chrome[2737]] validate: -22
> nouveau E[chrome[2737]] multiple instances of buffer 18 on validation list
> nouveau E[chrome[2737]] validate_init
> nouveau E[chrome[2737]] validate: -22
> nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at
> 0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum
> 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000
> [unknown]
>
> and hard lock-up of X. I'll update these guys with the fix
>
> http://tinyurl.com/orvbzf3

Hmm.

Interesting - I spoke way too soon on that.

Left my machine up overnight and lo and behold now that I do a dmesg I
see...

nouveau E[chrome[2870]] multiple instances of buffer 16 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 24 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 167 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 251 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 248 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 249 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 230 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 253 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 255 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 230 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 257 on validation list
nouveau E[chrome[2870]] multiple instances of buffer 230 on validation list

deckard@aineko:~$ dpkg -s libdrm2
Package: libdrm2
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 106
Maintainer: Ubuntu X-SWAT <ubuntu-x@lists.ubuntu.com>
Architecture: amd64
Multi-Arch: same
Source: libdrm
Version: 2.4.56-1~ubuntu2
Depends: libc6 (>= 2.17)
Pre-Depends: multiarch-support
Description: Userspace interface to kernel DRM services -- runtime
 This library implements the userspace interface to the kernel DRM
 services. DRM stands for "Direct Rendering Manager", which is the
 kernelspace portion of the "Direct Rendering Infrastructure" (DRI).
 The DRI is currently used on Linux to provide hardware-accelerated
 OpenGL drivers.
 .
 This package provides the runtime environment for libdrm.
Orig-Maintainer: Debian X Strike Force <debian-x@lists.debian.org>

deckard@aineko:~$ uname -a
Linux aineko 4.2.0-rc4+ #50 SMP Thu Jul 30 01:22:01 IST 2015 x86_64
x86_64 x86_64 GNU/Linux

Please note - this machine has the fix I proposed in the original patch
applied so X is not locking up when the multiple instances message happens.

In any case to answer the original question - I don't believe switching
away from 2.4.60 will resolve this issue.

Similarly, Ilia - do you happen to know if 2.4.56 should have the bug?
Should I now be using a version of libdrm2 which is good?

Bryan
On 31/07/15 10:53, Bryan O'Donoghue wrote:
> In any case to answer the original question - I don't believe switching
> away from 2.4.60 will resolve this issue

2.40.6 I mean :)
On 31/07/15 10:58, Bryan O'Donoghue wrote:
> On 31/07/15 10:53, Bryan O'Donoghue wrote:
>> In any case to answer the original question - I don't believe switching
>> away from 2.4.60 will resolve this issue
>
> 2.40.6 I mean :)
>

ah no... 2.4.60 is right...

Yes so Ilia - I've switched out 2.4.60 as per your suggestion to 2.4.56
(getting the version numbers right :) ) and it's still definitely giving
me the multiple instances message.
On 30/07/15 22:45, Peter Hurley wrote:
> [ +cc Debian maintainer ]
>
> On 07/30/2015 11:26 AM, Emil Velikov wrote:
>> On 30 July 2015 at 16:02, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
>>> Assuming that switching away does resolve the issue for you, perhaps
>>> you can also recommend that they avoid shipping that version, or
>>> include this nouveau fix in it:
>>>
>>> http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294
>>>
>> Fwiw debian has been tracking this as #789759, and they are shipping
>> 2.4.62 which includes the fix.
>
> Unfortunately the LTS version of Ubuntu (trusty) was updated to 2.4.60
> several days ago without this fix.
>
> I repackaged libdrm 2.4.60 with only the bug fix above and confirm the
> patch above fixes the observed behavior in freedesktop bug# 89842/
> debian bug# 789759.
>
> I pushed the repackage to Launchpad PPA @ ppa:phurley/libdrm
>
> Hopefully the Debian maintainer grabs this fix and updates the official
> distribution version soon.
>
> Regards,
> Peter Hurley
>

Peter.

I tried your suggested fix too as uploaded and it doesn't fix it either.

I think that this is a separate bug or the fix as applied to libdrm
doesn't actually fix it.

deckard@aineko:~$ dmesg | tail -n 1
nouveau E[chrome[3176]] multiple instances of buffer 413 on validation list

deckard@aineko:~$ dpkg -s libdrm2
Package: libdrm2
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 106
Maintainer: Debian X Strike Force <debian-x@lists.debian.org>
Architecture: amd64
Multi-Arch: same
Source: libdrm
Version: 2.4.60-2ppa1~trusty1

deckard@aineko:~$ uname -a
Linux aineko 4.2.0-rc4+ #50 SMP Thu Jul 30 01:22:01 IST 2015 x86_64
x86_64 x86_64 GNU/Linux
On Fri, Jul 31, 2015 at 6:27 AM, Bryan O'Donoghue
<pure.logic@nexus-software.ie> wrote:
> ah no... 2.4.60 is right...
>
> Yes so Ilia - I've switched out 2.4.60 as per your suggestion to 2.4.56
> (getting the version numbers right :) ) and it's still definitely giving me
> the multiple instances message.

This is going to sound like a stupid question, but I'll ask anyways --
you *did* restart chrome after changing libdrm versions, right?

I was going to mention that there were a handful of fixes in libdrm,
potentially since 2.4.56 (I forget the exact versions), but if 2.4.60
also fails, then that would have them.

There was a final assert() added in 2.4.62, but that was to better
isolate the cause of weirdo crashes (i.e. crash when the thing going
wrong happens rather than stashing bad pointers for later very
confusing dereference). Not GPU crashes.

Just for your information,

nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at
0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum
0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000
[unknown]

means that there was VM fault from an unknown gpu unit (???) when
reading some resource by the GPU. (The GPU has its own MMU.)
Unfortunately this can happen for one of a million reasons, the
biggest one being "unknown", but mesa definitely doesn't handle
command submission failures particularly well... should probably add a
"fail 1% of the time" thing to help fix that up.

Do you have a reproducible way of achieving the multiple buffer on
validation list thing? What GPU do you have? (Looking for a codename,
not a marketing name... lspci should have it... GFxxx or GKxxx or
Gxx.)

-ilia
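The "fail 1% of the time" idea mentioned above is a form of fault injection. A minimal sketch of it in C follows, purely for illustration: `struct fault_injector`, `fi_submit` and `fi_count_submit` are hypothetical names and are not Mesa or libdrm APIs, and the xorshift generator is just a convenient deterministic source of pseudo-random numbers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical injector state; nothing here is a Mesa or libdrm API. */
struct fault_injector {
	uint32_t state;        /* xorshift32 state, must be non-zero */
	uint32_t fail_per_10k; /* 100 means "fail roughly 1% of calls" */
};

/* Classic xorshift32 step (shifts 13, 17, 5). */
static uint32_t fi_next(struct fault_injector *fi)
{
	uint32_t x = fi->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	fi->state = x;
	return x;
}

typedef int (*submit_fn)(void *ctx);

/*
 * Call real_submit(ctx) unless the injector decides this call should
 * fail, in which case return -EINVAL without calling it at all.
 */
static int fi_submit(struct fault_injector *fi, submit_fn real_submit,
		     void *ctx)
{
	assert(fi->state != 0);
	if (fi_next(fi) % 10000u < fi->fail_per_10k)
		return -EINVAL;
	return real_submit(ctx);
}

/* Stand-in for a real submission: counts how often it was reached. */
static int fi_count_submit(void *ctx)
{
	(*(int *)ctx)++;
	return 0;
}
```

Routing submissions through a wrapper like this would make the error paths run often enough to be tested, which is the point of the suggestion. Where such a hook would actually live in mesa is not something this thread settles.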
On 31/07/15 17:36, Ilia Mirkin wrote:
> On Fri, Jul 31, 2015 at 6:27 AM, Bryan O'Donoghue
> <pure.logic@nexus-software.ie> wrote:
>> ah no... 2.4.60 is right...
>>
>> Yes so Ilia - I've switched out 2.4.60 as per your suggestion to 2.4.56
>> (getting the version numbers right :) ) and it's still definitely giving me
>> the multiple instances message.
>
> This is going to sound like a stupid question, but I'll ask anyways --
> you *did* restart chrome after changing libdrm versions, right?

There are no stupid questions - just stupid answers like 'whaddya mean
restart chrome'

Seriously though, I've restarted the machine each time I've tried to
switch out those libraries, so it's definitely not that.

> I was going to mention that there were a handful of fixes in libdrm,
> potentially since 2.4.56 (I forget the exact versions), but if 2.4.60
> also fails, then that would have them.
>
> There was a final assert() added in 2.4.62, but that was to better
> isolate the cause of weirdo crashes (i.e. crash when the thing going
> wrong happens rather than stashing bad pointers for later very
> confusing dereference). Not GPU crashes.
>
> Just for your information,
>
> nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at
> 0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum
> 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000
> [unknown]
>
> means that there was VM fault from an unknown gpu unit (???) when
> reading some resource by the GPU.

OK, I was assuming it was a side effect of the -EINVAL when we get the
multiple instances message.

> (The GPU has its own MMU.)
> Unfortunately this can happen for one of a million reasons, the
> biggest one being "unknown", but mesa definitely doesn't handle
> command submission failures particularly well... should probably add a
> "fail 1% of the time" thing to help fix that up.
>
> Do you have a reproducible way of achieving the multiple buffer on
> validation list thing? What GPU do you have? (Looking for a codename,
> not a marketing name... lspci should have it... GFxxx or GKxxx or

01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 750M Mac Edition] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Apple Inc. Device 0130
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 80000000 (64-bit, prefetchable) [size=256M]
	Memory at 90000000 (64-bit, prefetchable) [size=32M]
	I/O ports at 1000 [size=128]
	Expansion ROM at c1000000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100] Virtual Channel
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nouveau

Macbook pro retina 2014
On 31/07/15 17:43, Bryan O'Donoghue wrote:
> On 31/07/15 17:36, Ilia Mirkin wrote:
>> Do you have a reproducible way of achieving the multiple buffer on
>> validation list thing?

Reliable enough.

Start Chrome, then get Chrome to open a menu on top of its own screen -
for example click the top right menu bar - the thing with the three
horizontal bars, scroll down to 'recent tabs' and let the mouse hover.

You'll get a menu that opens up over the main chrome screen and at that
point you'll also get a 'multiple instances of buffer' message.

Basically drawing one window on top of another inside of the same Chrome
tab. I guess the same PID is mapping the same piece of memory twice
because if I open a separate Chrome window (which will have a separate
PID) and drag one window over the other we don't see a repeat.

If it helps

deckard@aineko:~$ dmesg | tail -n 5
[ 6900.249427] nouveau E[chrome[3176]] multiple instances of buffer 456 on validation list
[ 6920.992475] nouveau E[chrome[3176]] multiple instances of buffer 458 on validation list
[ 6934.277352] nouveau E[chrome[3176]] multiple instances of buffer 458 on validation list
[ 6994.303600] nouveau E[chrome[3176]] multiple instances of buffer 458 on validation list
[ 7067.436049] nouveau E[chrome[3176]] multiple instances of buffer 456 on validation list

deckard@aineko:~$ ps -ax | grep chrome | grep 3176
3176 pts/6 Sl+ 0:29 /opt/google/chrome/chrome --type=gpu-process --channel=3143.0.1295591 ives-passed-by-fd --v8-snapshot-passed-by-fd --supports-dual-gpus=false --gpu-driver-bug-workarounds=2,29,32,45,55,57 --disable-accelerated-video-decode --gpu-vendor-id=0x10de --gpu-device-id=0x0fe9 --gpu-driver-vendor --gpu-driver-version --v8-natives-passed-by-fd --v8-snapshot-passed-by-fd

>> What GPU do you have? (Looking for a codename,
>> not a marketing name... lspci should have it... GFxxx or GKxxx or
>
> 01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT
> 750M Mac Edition] (rev a1) (prog-if 00 [VGA controller])
>     Subsystem: Apple Inc. Device 0130
>     Flags: bus master, fast devsel, latency 0, IRQ 45
>     Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
>     Memory at 80000000 (64-bit, prefetchable) [size=256M]
>     Memory at 90000000 (64-bit, prefetchable) [size=32M]
>     I/O ports at 1000 [size=128]
>     Expansion ROM at c1000000 [disabled] [size=512K]
>     Capabilities: [60] Power Management version 3
>     Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>     Capabilities: [78] Express Endpoint, MSI 00
>     Capabilities: [b4] Vendor Specific Information: Len=14 <?>
>     Capabilities: [100] Virtual Channel
>     Capabilities: [128] Power Budgeting <?>
>     Capabilities: [420] Advanced Error Reporting
>     Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1
> Len=024 <?>
>     Capabilities: [900] #19
>     Kernel driver in use: nouveau
>
> Macbook pro retina 2014
On 31/07/15 19:11, Bryan O'Donoghue wrote:
> On 31/07/15 17:43, Bryan O'Donoghue wrote:
>> On 31/07/15 17:36, Ilia Mirkin wrote:
>>> Do you have a reproducible way of achieving the multiple buffer on
>>> validation list thing?

Ilia, Peter.

I've filed a bug for you here : https://bugs.freedesktop.org/show_bug.cgi?id=91535

I've verified that Peter's PPA library, when installed, fixes the race
condition you guys were talking about by running the test program
tests/nouveau/threaded, so the issue we're discussing here is a
separate one.

Cheers,
Bryan
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index af1ee51..a9694faad 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -401,9 +401,7 @@ retry:
 		if (nvbo->reserved_by && nvbo->reserved_by == file_priv) {
 			NV_PRINTK(error, cli, "multiple instances of buffer %d on "
 				      "validation list\n", b->handle);
-			drm_gem_object_unreference_unlocked(gem);
-			ret = -EINVAL;
-			break;
+			continue;
 		}
 
 		ret = ttm_bo_reserve(&nvbo->bo, true, false, true, &op->ticket);
Ubuntu is shipping Chrome Version 44.0.2403.125 (64-bit). With this version
of the browser and current tip-of-tree 86ea07ca846a I get the following
error message followed by a lock-up of X.

nouveau E[chrome[2737]] multiple instances of buffer 33 on validation list
nouveau E[chrome[2737]] validate_init
nouveau E[chrome[2737]] validate: -22
nouveau E[chrome[2737]] multiple instances of buffer 18 on validation list
nouveau E[chrome[2737]] validate_init
nouveau E[chrome[2737]] validate: -22
nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at 0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000 [unknown]

This patch suggests a fix for this with the kernel simply tolerating an
application such as chrome requesting the same buffer more than once.

With the version of chrome given above, you can elicit this behaviour by
clicking on the bookmarks drop down. This will open another window on top
of the current window. Without the fix included here, this will lead to a
hard lockup of all windows on the desktop.

Chrome Version 44.0.2403.125 (64-bit)
Linux 4.2.0-rc4+ 86ea07ca846a

People are suggesting running chrome with --disable-gpu; however, it is
possible to run Chrome in its default mode, so long as we tolerate the
above behaviour.

http://tinyurl.com/orvbzf3

Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
---
 drivers/gpu/drm/nouveau/nouveau_gem.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
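As a rough illustration of the behaviour change this patch proposes, here is a userspace C sketch (it is not kernel code). `struct sketch_bo`, `sketch_validate_init`, `sketch_validate_fini` and the `strict` flag are hypothetical names invented for the example; only the `reserved_by` ownership check and the choice between returning `-EINVAL` and `continue` are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object; not the real struct nouveau_bo. */
struct sketch_bo {
	uint32_t handle;
	const void *reserved_by; /* owner while on a validation list, else NULL */
};

/* Drop every reservation held by owner, as an error path would. */
static void sketch_validate_fini(struct sketch_bo *pool, size_t pool_len,
				 const void *owner)
{
	for (size_t i = 0; i < pool_len; i++)
		if (pool[i].reserved_by == owner)
			pool[i].reserved_by = NULL;
}

/*
 * Reserve each requested handle for owner.
 * strict == true:  a repeated handle fails the whole submission with
 *                  -EINVAL (the behaviour before the patch).
 * strict == false: a repeated handle is skipped (what the patch proposes).
 * Returns the number of distinct buffers reserved, or a negative errno.
 * Simplification: contention between different owners is ignored.
 */
static int sketch_validate_init(struct sketch_bo *pool, size_t pool_len,
				const uint32_t *handles, size_t nr,
				const void *owner, bool strict)
{
	int reserved = 0;

	assert(owner != NULL);
	for (size_t i = 0; i < nr; i++) {
		struct sketch_bo *bo = NULL;

		for (size_t j = 0; j < pool_len; j++) {
			if (pool[j].handle == handles[i]) {
				bo = &pool[j];
				break;
			}
		}
		if (!bo) {
			sketch_validate_fini(pool, pool_len, owner);
			return -ENOENT;
		}
		if (bo->reserved_by == owner) {
			if (strict) {
				sketch_validate_fini(pool, pool_len, owner);
				return -EINVAL;
			}
			continue; /* tolerate the duplicate */
		}
		bo->reserved_by = owner;
		reserved++;
	}
	return reserved;
}
```

With handles {33, 18, 33}, the strict variant returns -EINVAL and releases everything it had reserved, while the tolerant variant reserves two buffers and carries on. The sketch only shows the control flow; whether skipping a duplicate is actually safe in the kernel is the open question in this thread.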