Message ID | 1493370666-14461-1-git-send-email-oscar.mateo@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: > The new batchbuffer for CNL surpasses the 4096 byte mark. > > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Ben Widawsky <ben@bwidawsk.net> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Evil, 4k+ of nothing-ness that userspace then has to configure for itself for correctness anyway. Patch looks ok, but still question the sanity. -Chris
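Editorial sketch: the patch body is not reproduced in this archive view, so the following only illustrates the arithmetic behind "surpasses the 4096 byte mark". It assumes a 4096-byte page and that the render state batch was previously assumed to fit in one page; both are assumptions, and the `sketch_*` names are hypothetical and do not come from the driver.

```c
#include <stddef.h>

/* Assumed page size; the thread only mentions "the 4096 byte mark". */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Hypothetical helper: number of whole pages needed to hold `bytes`. */
static size_t sketch_pages_needed(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Hypothetical helper: allocation size rounded up to a page multiple. */
static size_t sketch_alloc_size(size_t bytes)
{
	return sketch_pages_needed(bytes) * SKETCH_PAGE_SIZE;
}
```

Under these assumptions, any batch even one byte past 4096 needs a second page, which is why a size derived from the batch length is safer than a hard-coded single page.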
Chris Wilson <chris@chris-wilson.co.uk> writes: > On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >> The new batchbuffer for CNL surpasses the 4096 byte mark. >> >> Cc: Mika Kuoppala <mika.kuoppala@intel.com> >> Cc: Ben Widawsky <ben@bwidawsk.net> >> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> > > Evil, 4k+ of nothing-ness that userspace then has to configure for itself > for correctness anyway. > > Patch looks ok, but still question the sanity. Is there a requirement for CNL to init the renderstate? I would like to drop the render state init from CNL if we can't find evidence that it needs it. Bspec indicates that it doesnt. -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre
On 05/02/2017 09:17 AM, Mika Kuoppala wrote: > Chris Wilson <chris@chris-wilson.co.uk> writes: > >> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>> >>> Cc: Mika Kuoppala <mika.kuoppala@intel.com> >>> Cc: Ben Widawsky <ben@bwidawsk.net> >>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> >> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >> for correctness anyway. >> >> Patch looks ok, but still question the sanity. > Is there a requirement for CNL to init the renderstate? > > I would like to drop the render state init from CNL if > we can't find evidence that it needs it. Bspec indicates > that it doesnt. > > -Mika Hi Mika, I can double-check with the hardware architects, but word around here is that render state init has never stopped being a requirement. Where did you see in the BSpec that it is not required for CNL? Thanks
Oscar Mateo <oscar.mateo@intel.com> writes: > On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >> Chris Wilson <chris@chris-wilson.co.uk> writes: >> >>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>> >>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com> >>>> Cc: Ben Widawsky <ben@bwidawsk.net> >>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> >>> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >>> for correctness anyway. >>> >>> Patch looks ok, but still question the sanity. >> Is there a requirement for CNL to init the renderstate? >> >> I would like to drop the render state init from CNL if >> we can't find evidence that it needs it. Bspec indicates >> that it doesnt. >> >> -Mika > > Hi Mika, > > I can double-check with the hardware architects, but word around here is > that render state init has never stopped being a requirement. Where did > you see in the BSpec that it is not required for CNL? > It would be great if you could refresh the answer and perhaps even get some answers to the 'why' parts. In the "Context Descriptor Format" section, it says: "Render CS Only: Render state need not be initialized; the Render Context Restore Inhibit bit in the Context/Save image in memory should be set to prevent restoring garbage render context." -Mika
On 05/03/2017 08:52 AM, Mika Kuoppala wrote: > Oscar Mateo <oscar.mateo@intel.com> writes: > >> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>> Chris Wilson <chris@chris-wilson.co.uk> writes: >>> >>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>>> >>>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com> >>>>> Cc: Ben Widawsky <ben@bwidawsk.net> >>>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> >>>> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >>>> for correctness anyway. >>>> >>>> Patch looks ok, but still question the sanity. >>> Is there a requirement for CNL to init the renderstate? >>> >>> I would like to drop the render state init from CNL if >>> we can't find evidence that it needs it. Bspec indicates >>> that it doesnt. >>> >>> -Mika >> Hi Mika, >> >> I can double-check with the hardware architects, but word around here is >> that render state init has never stopped being a requirement. Where did >> you see in the BSpec that it is not required for CNL? >> > It would be great if you could refresh the answer and perhaps > even get some answers to the 'why' parts. > > In the "Context Descriptor Format" section, it says: > "Render CS Only: Render state need not be initialized; the Render > Context Restore Inhibit bit in the Context/Save image in memory should > be set to prevent restoring garbage render context." > > -Mika :_( The same section also says: “See the Logical Ring Context Format section for details.” And then “Logical Ring Context Format” section goes on to say: “It is tedious for software to populate the engine context as per the requirements, it is recommended to implicitly use engine to populate this portion of the context. […] Software must program all the state required to initialize the engine in the ring buffer which would initialize the hardware state.” I’ll try to clarify it...
On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: > On 05/03/2017 08:52 AM, Mika Kuoppala wrote: > > Oscar Mateo [1]<oscar.mateo@intel.com> writes: > > > On 05/02/2017 09:17 AM, Mika Kuoppala wrote: > > Chris Wilson [2]<chris@chris-wilson.co.uk> writes: > > > On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: > > The new batchbuffer for CNL surpasses the 4096 byte mark. > > Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> > Cc: Ben Widawsky [4]<ben@bwidawsk.net> > Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> > > Evil, 4k+ of nothing-ness that userspace then has to configure for itself > for correctness anyway. > > Patch looks ok, but still question the sanity. > > Is there a requirement for CNL to init the renderstate? > > I would like to drop the render state init from CNL if > we can't find evidence that it needs it. Bspec indicates > that it doesnt. > > -Mika > > Hi Mika, > > I can double-check with the hardware architects, but word around here is > that render state init has never stopped being a requirement. Where did > you see in the BSpec that it is not required for CNL? > > > It would be great if you could refresh the answer and perhaps > even get some answers to the 'why' parts. > > In the "Context Descriptor Format" section, it says: > "Render CS Only: Render state need not be initialized; the Render > Context Restore Inhibit bit in the Context/Save image in memory should > be set to prevent restoring garbage render context." > > -Mika > > :_( > > The same section also says: > > “See the Logical Ring Context Format section for details.” > > And then “Logical Ring Context Format” section goes on to say: > > “It is tedious for software to populate the engine context as per the > requirements, it is recommended to implicitly use engine to populate this > portion of the context. 
[…] Software must program all the state required > to initialize the engine in the ring buffer which would initialize the > hardware state.” Yet what the kernel programs is completely garbage for the user, so the user still has to program the initial GPU state to their own specifications. Just say no to policy in the kernel. We need a stronger reason than this, and if that was the only reason the original render state was merged, I am very angry. -Chris
On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >> >> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >> >> >> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >> >> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >> >> >> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >> >> The new batchbuffer for CNL surpasses the 4096 byte mark. >> >> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >> >> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >> for correctness anyway. >> >> Patch looks ok, but still question the sanity. >> >> Is there a requirement for CNL to init the renderstate? >> >> I would like to drop the render state init from CNL if >> we can't find evidence that it needs it. Bspec indicates >> that it doesnt. I'd like to drop as well, and I was hearing people around telling we didn't need anymore, however without this during power on I had bad failures... >> >> -Mika >> >> Hi Mika, >> >> I can double-check with the hardware architects, but word around here is >> that render state init has never stopped being a requirement. Where did >> you see in the BSpec that it is not required for CNL? >> >> >> It would be great if you could refresh the answer and perhaps >> even get some answers to the 'why' parts. >> >> In the "Context Descriptor Format" section, it says: >> "Render CS Only: Render state need not be initialized; the Render >> Context Restore Inhibit bit in the Context/Save image in memory should >> be set to prevent restoring garbage render context." 
>> >> -Mika >> >> :_( >> >> The same section also says: >> >> “See the Logical Ring Context Format section for details.” >> >> And then “Logical Ring Context Format” section goes on to say: >> >> “It is tedious for software to populate the engine context as per the >> requirements, it is recommended to implicitly use engine to populate this >> portion of the context. […] Software must program all the state required >> to initialize the engine in the ring buffer which would initialize the >> hardware state.” > > Yet what the kernel programs is completely garbage for the user, so the > user still has to program the initial GPU state to their own > specifications. Just say no to policy in the kernel. We need a stronger > reason than this, and if that was the only reason the original render > state was merged, I am very angry. so... based on what I saw we need this, I agree the justification is not good because I could never actually understand or make any sense out of this golden context.... But we need a solution to this impasse, to be able to move forward... > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: > On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote: >> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >>> >>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >>> >>> >>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>> >>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >>> >>> >>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>> >>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>> >>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >>> >>> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >>> for correctness anyway. >>> >>> Patch looks ok, but still question the sanity. >>> >>> Is there a requirement for CNL to init the renderstate? >>> >>> I would like to drop the render state init from CNL if >>> we can't find evidence that it needs it. Bspec indicates >>> that it doesnt. > I'd like to drop as well, and I was hearing people around telling we > didn't need anymore, > however without this during power on I had bad failures... > The best I could get from architecture (+Raf) is that setting valid and coherent values for the whole render state is required as soon as the context is created, no matter who does it. If you see failures when the KMD does not do it, that means the UMD must be missing something, right? >>> -Mika >>> >>> Hi Mika, >>> >>> I can double-check with the hardware architects, but word around here is >>> that render state init has never stopped being a requirement. Where did >>> you see in the BSpec that it is not required for CNL? >>> >>> >>> It would be great if you could refresh the answer and perhaps >>> even get some answers to the 'why' parts. 
>>> >>> In the "Context Descriptor Format" section, it says: >>> "Render CS Only: Render state need not be initialized; the Render >>> Context Restore Inhibit bit in the Context/Save image in memory should >>> be set to prevent restoring garbage render context." >>> >>> -Mika >>> >>> :_( >>> >>> The same section also says: >>> >>> “See the Logical Ring Context Format section for details.” >>> >>> And then “Logical Ring Context Format” section goes on to say: >>> >>> “It is tedious for software to populate the engine context as per the >>> requirements, it is recommended to implicitly use engine to populate this >>> portion of the context. […] Software must program all the state required >>> to initialize the engine in the ring buffer which would initialize the >>> hardware state.” >> Yet what the kernel programs is completely garbage for the user, so the >> user still has to program the initial GPU state to their own >> specifications. Just say no to policy in the kernel. We need a stronger >> reason than this, and if that was the only reason the original render >> state was merged, I am very angry. > so... based on what I saw we need this, > I agree the justification is not good because I could never actually > understand or make any sense out of this golden context.... > But we need a solution to this impasse, to be able to move forward... > >> -Chris >> >> -- >> Chris Wilson, Intel Open Source Technology Centre
Quoting Oscar Mateo (2017-07-14 15:52:59) > > > > On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: > > On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > >> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: > >>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: > >>> > >>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: > >>> > >>> > >>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: > >>> > >>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: > >>> > >>> > >>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: > >>> > >>> The new batchbuffer for CNL surpasses the 4096 byte mark. > >>> > >>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> > >>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> > >>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> > >>> > >>> Evil, 4k+ of nothing-ness that userspace then has to configure for itself > >>> for correctness anyway. > >>> > >>> Patch looks ok, but still question the sanity. > >>> > >>> Is there a requirement for CNL to init the renderstate? > >>> > >>> I would like to drop the render state init from CNL if > >>> we can't find evidence that it needs it. Bspec indicates > >>> that it doesnt. > > I'd like to drop as well, and I was hearing people around telling we > > didn't need anymore, > > however without this during power on I had bad failures... > > > > The best I could get from architecture (+Raf) is that setting valid and > coherent values for the whole render state is required as soon as the > context is created, no matter who does it. If you see failures when the > KMD does not do it, that means the UMD must be missing something, right? That is my initial response as well. The kernel does load one context, just so that the hardware always has space to write to on power saving. The only batch executed for it is the golden render state. 
Easy enough to only initialise that kernel context to isolate whether it is self-inflicted or that userspace overlooked something in its state management. (I have the view that even if userspace doesn't think it needs to use a particular bit of state today, tomorrow it will so will need it anyway!) -Chris >
On 07/14/2017 08:08 AM, Chris Wilson wrote: > Quoting Oscar Mateo (2017-07-14 15:52:59) >> >> >> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: >>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote: >>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >>>>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >>>>> >>>>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >>>>> >>>>> >>>>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>>>> >>>>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >>>>> >>>>> >>>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>>> >>>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>>> >>>>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >>>>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >>>>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >>>>> >>>>> Evil, 4k+ of nothing-ness that userspace then has to configure for itself >>>>> for correctness anyway. >>>>> >>>>> Patch looks ok, but still question the sanity. >>>>> >>>>> Is there a requirement for CNL to init the renderstate? >>>>> >>>>> I would like to drop the render state init from CNL if >>>>> we can't find evidence that it needs it. Bspec indicates >>>>> that it doesnt. >>> I'd like to drop as well, and I was hearing people around telling we >>> didn't need anymore, >>> however without this during power on I had bad failures... >>> >> The best I could get from architecture (+Raf) is that setting valid and >> coherent values for the whole render state is required as soon as the >> context is created, no matter who does it. If you see failures when the >> KMD does not do it, that means the UMD must be missing something, right? > That is my initial response as well. The kernel does load one context, > just so that the hardware always has space to write to on power saving. > The only batch executed for it is the golden render state. 
Easy enough > to only initialise that kernel context to isolate whether it is > self-inflicted or that userspace overlooked something in its state > management. (I have the view that even if userspace doesn't think it > needs to use a particular bit of state today, tomorrow it will so will > need it anyway!) > -Chris Rodrigo, you have access to a CNL: can you make this test? The idea is to find out if the root cause for the failures you were seeing is the kernel default context or in the UMD-created contexts. Thanks, Oscar
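Editorial sketch of the experiment being requested above: emit the golden render state only for the kernel's own context and for nothing else, then check whether the failures persist. This is a minimal illustration under stated assumptions, not driver code. The structure, the enum, and every `sketch_*` name are hypothetical and exist only to make the three configurations in the discussion explicit.

```c
#include <stdbool.h>

/* Hypothetical context descriptor; not the driver's real structure. */
struct sketch_ctx {
	bool is_kernel_ctx; /* the single context the kernel loads itself */
};

/* The three configurations discussed in the thread. */
enum sketch_mode {
	SKETCH_GOLDEN_ALL,         /* golden state for every new context */
	SKETCH_GOLDEN_KERNEL_ONLY, /* the proposed isolation experiment */
	SKETCH_GOLDEN_NONE,        /* no golden state at all */
};

/* Decide whether the golden render state batch is emitted for `ctx`. */
static bool sketch_emit_golden_state(const struct sketch_ctx *ctx,
				     enum sketch_mode mode)
{
	switch (mode) {
	case SKETCH_GOLDEN_ALL:
		return true;
	case SKETCH_GOLDEN_KERNEL_ONLY:
		return ctx->is_kernel_ctx;
	case SKETCH_GOLDEN_NONE:
		return false;
	}
	return false;
}
```

If failures disappear in the kernel-only mode, the kernel context was the culprit; if they remain, the contexts created by userspace are missing some state of their own.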
On Tue, Jul 18, 2017 at 8:15 AM, Oscar Mateo <oscar.mateo@intel.com> wrote: > > > > On 07/14/2017 08:08 AM, Chris Wilson wrote: >> >> Quoting Oscar Mateo (2017-07-14 15:52:59) >>> >>> >>> >>> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: >>>> >>>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> >>>> wrote: >>>>> >>>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >>>>>> >>>>>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >>>>>> >>>>>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >>>>>> >>>>>> >>>>>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>>>>> >>>>>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >>>>>> >>>>>> >>>>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>>>> >>>>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>>>> >>>>>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >>>>>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >>>>>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >>>>>> >>>>>> Evil, 4k+ of nothing-ness that userspace then has to configure for >>>>>> itself >>>>>> for correctness anyway. >>>>>> >>>>>> Patch looks ok, but still question the sanity. >>>>>> >>>>>> Is there a requirement for CNL to init the renderstate? >>>>>> >>>>>> I would like to drop the render state init from CNL if >>>>>> we can't find evidence that it needs it. Bspec indicates >>>>>> that it doesnt. >>>> >>>> I'd like to drop as well, and I was hearing people around telling we >>>> didn't need anymore, >>>> however without this during power on I had bad failures... >>>> >>> The best I could get from architecture (+Raf) is that setting valid and >>> coherent values for the whole render state is required as soon as the >>> context is created, no matter who does it. If you see failures when the >>> KMD does not do it, that means the UMD must be missing something, right? >> >> That is my initial response as well. 
The kernel does load one context, >> just so that the hardware always has space to write to on power saving. >> The only batch executed for it is the golden render state. Easy enough >> to only initialise that kernel context to isolate whether it is >> self-inflicted or that userspace overlooked something in its state >> management. (I have the view that even if userspace doesn't think it >> needs to use a particular bit of state today, tomorrow it will so will >> need it anyway!) >> -Chris > > > Rodrigo, you have access to a CNL: can you make this test? The idea is to > find out if the root cause for the failures you were seeing is the kernel > default context or in the UMD-created contexts. I'm sorry for the delay on this one. On the parts I have now I couldn't reproduce the issues I saw during power-on where null context helped. But anyways apparently we need this right?! What about the 4k+ sanity that Chris raised? Anything we should address first? > > Thanks, > Oscar >
On 08/23/2017 05:01 PM, Rodrigo Vivi wrote: > On Tue, Jul 18, 2017 at 8:15 AM, Oscar Mateo <oscar.mateo@intel.com> wrote: >> >> >> On 07/14/2017 08:08 AM, Chris Wilson wrote: >>> Quoting Oscar Mateo (2017-07-14 15:52:59) >>>> >>>> >>>> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: >>>>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson <chris@chris-wilson.co.uk> >>>>> wrote: >>>>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >>>>>>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >>>>>>> >>>>>>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >>>>>>> >>>>>>> >>>>>>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>>>>>> >>>>>>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >>>>>>> >>>>>>> >>>>>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>>>>> >>>>>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>>>>> >>>>>>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >>>>>>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >>>>>>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >>>>>>> >>>>>>> Evil, 4k+ of nothing-ness that userspace then has to configure for >>>>>>> itself >>>>>>> for correctness anyway. >>>>>>> >>>>>>> Patch looks ok, but still question the sanity. >>>>>>> >>>>>>> Is there a requirement for CNL to init the renderstate? >>>>>>> >>>>>>> I would like to drop the render state init from CNL if >>>>>>> we can't find evidence that it needs it. Bspec indicates >>>>>>> that it doesnt. >>>>> I'd like to drop as well, and I was hearing people around telling we >>>>> didn't need anymore, >>>>> however without this during power on I had bad failures... >>>>> >>>> The best I could get from architecture (+Raf) is that setting valid and >>>> coherent values for the whole render state is required as soon as the >>>> context is created, no matter who does it. If you see failures when the >>>> KMD does not do it, that means the UMD must be missing something, right? >>> That is my initial response as well. 
The kernel does load one context, >>> just so that the hardware always has space to write to on power saving. >>> The only batch executed for it is the golden render state. Easy enough >>> to only initialise that kernel context to isolate whether it is >>> self-inflicted or that userspace overlooked something in its state >>> management. (I have the view that even if userspace doesn't think it >>> needs to use a particular bit of state today, tomorrow it will so will >>> need it anyway!) >>> -Chris >> >> Rodrigo, you have access to a CNL: can you make this test? The idea is to >> find out if the root cause for the failures you were seeing is the kernel >> default context or in the UMD-created contexts. > I'm sorry for the delay on this one. > > On the parts I have now I couldn't reproduce the issues I saw during power-on > where null context helped. > > But anyways apparently we need this right?! > > What about the 4k+ sanity that Chris raised? Anything we should address first? I don't think Chris had any problem with the batchbuffer being bigger than 4k per se. His concern was: "why do we need to send this batchbuffer from the KMD at all if the UMD has to send something very similar anyway?". Even if this was true (I haven't found anybody to confirm or deny it) there is still the question of the kernel context (which would never get initialized to valid values by the UMD). The test was to only send the golden state for the kernel context (and nothing else) and see if your issues went away. Since your issues went away on their own without any golden state whatsoever... does that mean Mesa fixed something they were missing during the PO?
On Thu, Aug 24, 2017 at 3:39 PM, Oscar Mateo <oscar.mateo@intel.com> wrote: > > > On 08/23/2017 05:01 PM, Rodrigo Vivi wrote: >> >> On Tue, Jul 18, 2017 at 8:15 AM, Oscar Mateo <oscar.mateo@intel.com> >> wrote: >>> >>> >>> >>> On 07/14/2017 08:08 AM, Chris Wilson wrote: >>>> >>>> Quoting Oscar Mateo (2017-07-14 15:52:59) >>>>> >>>>> >>>>> >>>>> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote: >>>>>> >>>>>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson >>>>>> <chris@chris-wilson.co.uk> >>>>>> wrote: >>>>>>> >>>>>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote: >>>>>>>> >>>>>>>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote: >>>>>>>> >>>>>>>> Oscar Mateo [1]<oscar.mateo@intel.com> writes: >>>>>>>> >>>>>>>> >>>>>>>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote: >>>>>>>> >>>>>>>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes: >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote: >>>>>>>> >>>>>>>> The new batchbuffer for CNL surpasses the 4096 byte mark. >>>>>>>> >>>>>>>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com> >>>>>>>> Cc: Ben Widawsky [4]<ben@bwidawsk.net> >>>>>>>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com> >>>>>>>> >>>>>>>> Evil, 4k+ of nothing-ness that userspace then has to configure >>>>>>>> for >>>>>>>> itself >>>>>>>> for correctness anyway. >>>>>>>> >>>>>>>> Patch looks ok, but still question the sanity. >>>>>>>> >>>>>>>> Is there a requirement for CNL to init the renderstate? >>>>>>>> >>>>>>>> I would like to drop the render state init from CNL if >>>>>>>> we can't find evidence that it needs it. Bspec indicates >>>>>>>> that it doesnt. >>>>>> >>>>>> I'd like to drop as well, and I was hearing people around telling we >>>>>> didn't need anymore, >>>>>> however without this during power on I had bad failures... 
>>>>>> >>>>> The best I could get from architecture (+Raf) is that setting valid and >>>>> coherent values for the whole render state is required as soon as the >>>>> context is created, no matter who does it. If you see failures when the >>>>> KMD does not do it, that means the UMD must be missing something, >>>>> right? >>>> >>>> That is my initial response as well. The kernel does load one context, >>>> just so that the hardware always has space to write to on power saving. >>>> The only batch executed for it is the golden render state. Easy enough >>>> to only initialise that kernel context to isolate whether it is >>>> self-inflicted or that userspace overlooked something in its state >>>> management. (I have the view that even if userspace doesn't think it >>>> needs to use a particular bit of state today, tomorrow it will so will >>>> need it anyway!) >>>> -Chris >>> >>> >>> Rodrigo, you have access to a CNL: can you make this test? The idea is to >>> find out if the root cause for the failures you were seeing is the kernel >>> default context or in the UMD-created contexts. >> >> I'm sorry for the delay on this one. >> >> On the parts I have now I couldn't reproduce the issues I saw during >> power-on >> where null context helped. >> >> But anyways apparently we need this right?! >> >> What about the 4k+ sanity that Chris raised? Anything we should address >> first? > > > I don't think Chris had any problem with the batchbuffer being bigger than > 4k per se. His concern was: "why do we need to send this batchbuffer from > the KMD at all if the UMD has to send something very similar anyway?". > Even if this was true (I haven't found anybody to confirm or deny it) there > is still the question of the kernel context (which would never get > initialized to valid values by the UMD). so, chris, rv-b? acked-by? > The test was to only send the > golden state for the kernel context (and nothing else) and see if your > issues went away. 
> > Since your issues went away on their own without any golden state > whatsoever... does that mean Mesa fixed something they were missing during > the PO? not sure what it was anymore > >
On Thu, Aug 24, 2017 at 11:00:27PM +0000, Rodrigo Vivi wrote:
> On Thu, Aug 24, 2017 at 3:39 PM, Oscar Mateo <oscar.mateo@intel.com> wrote:
> >
> > On 08/23/2017 05:01 PM, Rodrigo Vivi wrote:
> >>
> >> On Tue, Jul 18, 2017 at 8:15 AM, Oscar Mateo <oscar.mateo@intel.com>
> >> wrote:
> >>>
> >>> On 07/14/2017 08:08 AM, Chris Wilson wrote:
> >>>>
> >>>> Quoting Oscar Mateo (2017-07-14 15:52:59)
> >>>>>
> >>>>> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote:
> >>>>>>
> >>>>>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson
> >>>>>> <chris@chris-wilson.co.uk> wrote:
> >>>>>>>
> >>>>>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote:
> >>>>>>>>
> >>>>>>>> On 05/03/2017 08:52 AM, Mika Kuoppala wrote:
> >>>>>>>>
> >>>>>>>> Oscar Mateo [1]<oscar.mateo@intel.com> writes:
> >>>>>>>>
> >>>>>>>> On 05/02/2017 09:17 AM, Mika Kuoppala wrote:
> >>>>>>>>
> >>>>>>>> Chris Wilson [2]<chris@chris-wilson.co.uk> writes:
> >>>>>>>>
> >>>>>>>> On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote:
> >>>>>>>>
> >>>>>>>> The new batchbuffer for CNL surpasses the 4096 byte mark.
> >>>>>>>>
> >>>>>>>> Cc: Mika Kuoppala [3]<mika.kuoppala@intel.com>
> >>>>>>>> Cc: Ben Widawsky [4]<ben@bwidawsk.net>
> >>>>>>>> Signed-off-by: Oscar Mateo [5]<oscar.mateo@intel.com>
> >>>>>>>>
> >>>>>>>> Evil, 4k+ of nothing-ness that userspace then has to configure
> >>>>>>>> for itself for correctness anyway.
> >>>>>>>>
> >>>>>>>> Patch looks ok, but still question the sanity.
> >>>>>>>>
> >>>>>>>> Is there a requirement for CNL to init the renderstate?
> >>>>>>>>
> >>>>>>>> I would like to drop the render state init from CNL if
> >>>>>>>> we can't find evidence that it needs it. Bspec indicates
> >>>>>>>> that it doesnt.
> >>>>>>
> >>>>>> I'd like to drop as well, and I was hearing people around telling we
> >>>>>> didn't need anymore,
> >>>>>> however without this during power on I had bad failures...
> >>>>>>
> >>>>> The best I could get from architecture (+Raf) is that setting valid and
> >>>>> coherent values for the whole render state is required as soon as the
> >>>>> context is created, no matter who does it. If you see failures when the
> >>>>> KMD does not do it, that means the UMD must be missing something,
> >>>>> right?
> >>>>
> >>>> That is my initial response as well. The kernel does load one context,
> >>>> just so that the hardware always has space to write to on power saving.
> >>>> The only batch executed for it is the golden render state. Easy enough
> >>>> to only initialise that kernel context to isolate whether it is
> >>>> self-inflicted or that userspace overlooked something in its state
> >>>> management. (I have the view that even if userspace doesn't think it
> >>>> needs to use a particular bit of state today, tomorrow it will so will
> >>>> need it anyway!)
> >>>> -Chris
> >>>
> >>> Rodrigo, you have access to a CNL: can you make this test? The idea is to
> >>> find out if the root cause for the failures you were seeing is the kernel
> >>> default context or in the UMD-created contexts.
> >>
> >> I'm sorry for the delay on this one.
> >>
> >> On the parts I have now I couldn't reproduce the issues I saw during
> >> power-on where null context helped.
> >>
> >> But anyways apparently we need this right?!
> >>
> >> What about the 4k+ sanity that Chris raised? Anything we should address
> >> first?
> >
> > I don't think Chris had any problem with the batchbuffer being bigger than
> > 4k per se. His concern was: "why do we need to send this batchbuffer from
> > the KMD at all if the UMD has to send something very similar anyway?".
> > Even if this was true (I haven't found anybody to confirm or deny it) there
> > is still the question of the kernel context (which would never get
> > initialized to valid values by the UMD).
>
> so, chris, rv-b? acked-by?

chris, mika, oscar...
what should we do with this?
just discard, ignore and move on without the null context for gen10+?

> > The test was to only send the
> > golden state for the kernel context (and nothing else) and see if your
> > issues went away.
> >
> > Since your issues went away on their own without any golden state
> > whatsoever... does that mean Mesa fixed something they were missing during
> > the PO?
>
> not sure what it was anymore
>
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br
Quoting Rodrigo Vivi (2017-10-05 05:34:02)
> chris, mika, oscar...
> what should we do with this?
> just discard, ignore and move on without the null context for gen10+?

If there's no requirement for us to have it, then let's break the cargo
cult. Certainly userspace does not expect 3DSTATE to have any default
value, unlike the defaults specified for mmio state (which is currently
causing a huge upset). It's only if the bspec has wording that makes
certain valid 3DSTATE (or GPGPU or MEDIA) mandatory for powercontext etc
do we have to worry.
-Chris
Quoting Chris Wilson (2017-10-10 11:25:38)
> Quoting Rodrigo Vivi (2017-10-05 05:34:02)
> > chris, mika, oscar...
> > what should we do with this?
> > just discard, ignore and move on without the null context for gen10+?
>
> If there's no requirement for us to have it, then let's break the cargo
> cult. Certainly userspace does not expect 3DSTATE to have any default
> value, unlike the defaults specified for mmio state (which is currently
> causing a huge upset). It's only if the bspec has wording that makes
> certain valid 3DSTATE (or GPGPU or MEDIA) mandatory for powercontext etc
> do we have to worry.

The other angle is that the proto context is entirely defined by us. New
userspace contexts should not see any state that is outside of the
context construction (either directly specified inside the image or
implicitly from priv registers). In essence for lrc, we already define
the golden render state but call it a context image instead.
-Chris
On Tue, Oct 10, 2017 at 10:29:41AM +0000, Chris Wilson wrote:
> Quoting Chris Wilson (2017-10-10 11:25:38)
> > If there's no requirement for us to have it, then let's break the cargo
> > cult. Certainly userspace does not expect 3DSTATE to have any default
> > value, unlike the defaults specified for mmio state (which is currently
> > causing a huge upset). It's only if the bspec has wording that makes
> > certain valid 3DSTATE (or GPGPU or MEDIA) mandatory for powercontext etc
> > do we have to worry.
>
> The other angle is that the proto context is entirely defined by us. New
> userspace contexts should not see any state that is outside of the
> context construction (either directly specified inside the image or
> implicitly from priv registers). In essence for lrc, we already define
> the golden render state but call it a context image instead.
> -Chris

So, are you saying there is absolutely no risk of one userspace component
leaving garbage in any of these registers, and another component assuming
it is null or valid, doing some RMW, and ending up with the wrong setup?
I believe in the past there were cases like this between Mesa and Libva.
And if issues like this start to reappear, debugging will apparently be
harder, because the garbage left behind would be random.

I understand the cargo cult part, but with many different userspaces out
there using the GPU, the cost of the kernel ensuring the null state is
really low compared with the stability it can bring without relying on
userspace.

I understand your point about breaking the cargo cult. But my doubt is: if
we stop doing this after many years of clearing this state, do we expect all
userspaces to go ahead and modify their code to not make any assumptions on
CNL+ regarding those states?

Thanks,
Rodrigo.
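The read-modify-write concern raised above can be illustrated with a small standalone sketch. It is not taken from i915, Mesa or libva: the field mask, the helper name `rmw_set_mode` and the register value are all hypothetical, chosen only to show that an RMW preserves whatever bits a previous user left behind.

```c
#include <stdint.h>

/* Hypothetical 4-bit "mode" field; not a real hardware layout. */
#define SKETCH_MODE_MASK 0x0000000fu

/*
 * Read-modify-write: replace only the mode field and keep every other
 * bit of the previous value, whatever it happened to be.
 */
static uint32_t rmw_set_mode(uint32_t reg, uint32_t mode)
{
	return (reg & ~SKETCH_MODE_MASK) | (mode & SKETCH_MODE_MASK);
}
```

Starting from a zeroed value, `rmw_set_mode(0, 3)` yields 3. Starting from leftover bits such as `0xdead0000`, the same call yields `0xdead0003`, so the leftover bits survive the update; that is the kind of setup the question above is worried about.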
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 12d7036..07f9bd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -62,12 +62,12 @@ struct intel_render_state {
  * this is sufficient as the null state generator makes the final batch
  * with two passes to build command and state separately. At this point
  * the size of both are known and it compacts them by relocating the state
- * right after the commands taking care of alignment so we should sufficient
- * space below them for adding new commands.
+ * right after the commands taking care of alignment so we should have
+ * sufficient space below them for adding new commands.
  */
-#define OUT_BATCH(batch, i, val)				\
+#define OUT_BATCH(batch, size, i, val)				\
 	do {							\
-		if ((i) >= PAGE_SIZE / sizeof(u32))		\
+		if ((i) >= size / sizeof(u32))			\
 			goto err;				\
 		(batch)[(i)++] = (val);				\
 	} while(0)
@@ -86,7 +86,11 @@ static int render_state_setup(struct intel_render_state *so,
 	if (ret)
 		return ret;
 
-	d = kmap_atomic(i915_gem_object_get_dirty_page(obj, 0));
+	d = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(d)) {
+		ret = PTR_ERR(d);
+		goto out;
+	}
 
 	while (i < rodata->batch_items) {
 		u32 s = rodata->batch[i];
@@ -118,7 +122,7 @@ static int render_state_setup(struct intel_render_state *so,
 	so->batch_size = rodata->batch_items * sizeof(u32);
 
 	while (i % CACHELINE_DWORDS)
-		OUT_BATCH(d, i, MI_NOOP);
+		OUT_BATCH(d, obj->base.size, i, MI_NOOP);
 
 	so->aux_offset = i * sizeof(u32);
 
@@ -141,15 +145,15 @@ static int render_state_setup(struct intel_render_state *so,
 		 */
 		u32 eu_pool_config = 0x00777000;
 
-		OUT_BATCH(d, i, GEN9_MEDIA_POOL_STATE);
-		OUT_BATCH(d, i, GEN9_MEDIA_POOL_ENABLE);
-		OUT_BATCH(d, i, eu_pool_config);
-		OUT_BATCH(d, i, 0);
-		OUT_BATCH(d, i, 0);
-		OUT_BATCH(d, i, 0);
+		OUT_BATCH(d, obj->base.size, i, GEN9_MEDIA_POOL_STATE);
+		OUT_BATCH(d, obj->base.size, i, GEN9_MEDIA_POOL_ENABLE);
+		OUT_BATCH(d, obj->base.size, i, eu_pool_config);
+		OUT_BATCH(d, obj->base.size, i, 0);
+		OUT_BATCH(d, obj->base.size, i, 0);
+		OUT_BATCH(d, obj->base.size, i, 0);
 	}
 
-	OUT_BATCH(d, i, MI_BATCH_BUFFER_END);
+	OUT_BATCH(d, obj->base.size, i, MI_BATCH_BUFFER_END);
 	so->aux_size = i * sizeof(u32) - so->aux_offset;
 	so->aux_offset += so->batch_offset;
 	/*
@@ -160,7 +164,7 @@ static int render_state_setup(struct intel_render_state *so,
 
 	if (needs_clflush)
 		drm_clflush_virt_range(d, i * sizeof(u32));
-	kunmap_atomic(d);
+	i915_gem_object_unpin_map(obj);
 
 	ret = i915_gem_object_set_to_gtt_domain(obj, false);
 out:
@@ -168,7 +172,7 @@ static int render_state_setup(struct intel_render_state *so,
 	return ret;
 
 err:
-	kunmap_atomic(d);
+	i915_gem_object_unpin_map(obj);
 	ret = -EINVAL;
 	goto out;
 }
@@ -189,14 +193,12 @@ int i915_gem_render_state_init(struct intel_engine_cs *engine)
 	if (!rodata)
 		return 0;
 
-	if (rodata->batch_items * 4 > PAGE_SIZE)
-		return -EINVAL;
-
 	so = kmalloc(sizeof(*so), GFP_KERNEL);
 	if (!so)
 		return -ENOMEM;
 
-	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
+	obj = i915_gem_object_create_internal(engine->i915,
+				PAGE_ALIGN(rodata->batch_items * 4));
 	if (IS_ERR(obj)) {
 		ret = PTR_ERR(obj);
 		goto err_free;
The new batchbuffer for CNL surpasses the 4096 byte mark.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 40 +++++++++++++++-------------
 1 file changed, 21 insertions(+), 19 deletions(-)
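As a rough userspace illustration of the two ideas in the patch (size the object from the batch length rounded up to whole pages, and bound every emitted dword by the object size instead of a fixed PAGE_SIZE), here is a minimal sketch. It is not the kernel code: `SKETCH_PAGE_SIZE`, `sketch_page_align` and `sketch_out_batch` are hypothetical stand-ins for PAGE_SIZE, PAGE_ALIGN() and the OUT_BATCH() macro, and a 4096-byte page is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, stand-in for PAGE_SIZE */

/* Round a byte count up to a whole number of pages, in the spirit of PAGE_ALIGN(). */
static size_t sketch_page_align(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) & ~((size_t)SKETCH_PAGE_SIZE - 1);
}

/*
 * Append one dword at index *i, bounded by the buffer size in bytes.
 * Returns 0 on success and -1 when the buffer is already full, where the
 * kernel macro would instead jump to its error label.
 */
static int sketch_out_batch(uint32_t *batch, size_t size_bytes,
			    size_t *i, uint32_t val)
{
	if (*i >= size_bytes / sizeof(uint32_t))
		return -1;
	batch[(*i)++] = val;
	return 0;
}
```

With these assumptions, a batch of 1025 dwords (4100 bytes) would get an 8192-byte object rather than being rejected, and the padding and trailing commands are checked against that 8192-byte limit.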