Message ID | 1395943218-7708-18-git-send-email-oscar.mateo@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Mar 27, 2014 at 05:59:46PM +0000, oscar.mateo@intel.com wrote: > --- a/drivers/gpu/drm/i915/i915_lrc.c > +++ b/drivers/gpu/drm/i915/i915_lrc.c > @@ -41,7 +41,45 @@ > #include <drm/i915_drm.h> > #include "i915_drv.h" > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) I'm a bit puzzled by that number: - I found a sentence saying: "the Context Image for the rendering engine consists of 20 4K pages", which seems that it includes the HWS page (on the same page it says context layout = HWS Page + register state context). - When looking at the register state context for the render engine: 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add the HWS Page) - Clearly I must be missing something :) - That's only for the render engine, other engines have a much smaller context, smaller enough that it's worth looking at their exact size. - It'd be nice to work out the real size from the *CXT_*SIZE registers. All of this can be refinement patches on top I guess, might have a look at it.
> > --- a/drivers/gpu/drm/i915/i915_lrc.c > > +++ b/drivers/gpu/drm/i915/i915_lrc.c > > @@ -41,7 +41,45 @@ > > #include <drm/i915_drm.h> > > #include "i915_drv.h" > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) > > I'm a bit puzzled by that number: > - I found a sentence saying: "the Context Image for the rendering > engine consists of 20 4K pages", which seems that it includes the > HWS page (on the same page it says context layout = HWS Page + > register state context). > - When looking at the register state context for the render engine: > 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add > the HWS Page) > - Clearly I must be missing something :) > - That's only for the render engine, other engines have a much smaller > context, smaller enough that it's worth looking at their exact size. > - It'd be nice to work out the real size from the *CXT_*SIZE > registers. Hmmmm... I´ll try to get the real context sizes from the registers and compare. At least for RCS, VCS and BCS since there doesn´t seem to be a register for VECS? -- Oscar
On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote: > > > --- a/drivers/gpu/drm/i915/i915_lrc.c > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c > > > @@ -41,7 +41,45 @@ > > > #include <drm/i915_drm.h> > > > #include "i915_drv.h" > > > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) > > > > I'm a bit puzzled by that number: > > - I found a sentence saying: "the Context Image for the rendering > > engine consists of 20 4K pages", which seems that it includes the > > HWS page (on the same page it says context layout = HWS Page + > > register state context). > > - When looking at the register state context for the render engine: > > 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add > > the HWS Page) > > - Clearly I must be missing something :) > > - That's only for the render engine, other engines have a much smaller > > context, smaller enough that it's worth looking at their exact size. > > - It'd be nice to work out the real size from the *CXT_*SIZE > > registers. > > Hmmmm... I´ll try to get the real context sizes from the registers and > compare. At least for RCS, VCS and BCS since there doesn´t seem to be > a register for VECS? Couldn't find it either. I guess we'll need to ask the help of a friend. Or the 50/50 joker maybe.
On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote: > On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote: > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c > > > > @@ -41,7 +41,45 @@ > > > > #include <drm/i915_drm.h> > > > > #include "i915_drv.h" > > > > > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) > > > > > > I'm a bit puzzled by that number: > > > - I found a sentence saying: "the Context Image for the rendering > > > engine consists of 20 4K pages", which seems that it includes the > > > HWS page (on the same page it says context layout = HWS Page + > > > register state context). > > > - When looking at the register state context for the render engine: > > > 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add > > > the HWS Page) > > > - Clearly I must be missing something :) > > > - That's only for the render engine, other engines have a much smaller > > > context, smaller enough that it's worth looking at their exact size. > > > - It'd be nice to work out the real size from the *CXT_*SIZE > > > registers. > > > > Hmmmm... I´ll try to get the real context sizes from the registers and > > compare. At least for RCS, VCS and BCS since there doesn´t seem to be > > a register for VECS? > > Couldn't find it either. I guess we'll need to ask the help of a friend. > Or the 50/50 joker maybe. > > -- > Damien CXT_SIZE is total garbage on anything past Ivybridge. That's why we don't use it for HSW either... I know, right? We should request the spec get updated. I have no excuse for not requesting that sooner.
On Tue, Apr 01, 2014 at 12:18:24PM -0700, Ben Widawsky wrote: > On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote: > > On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote: > > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c > > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c > > > > > @@ -41,7 +41,45 @@ > > > > > #include <drm/i915_drm.h> > > > > > #include "i915_drv.h" > > > > > > > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) > > > > > > > > I'm a bit puzzled by that number: > > > > - I found a sentence saying: "the Context Image for the rendering > > > > engine consists of 20 4K pages", which seems that it includes the > > > > HWS page (on the same page it says context layout = HWS Page + > > > > register state context). > > > > - When looking at the register state context for the render engine: > > > > 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add > > > > the HWS Page) > > > > - Clearly I must be missing something :) > > > > - That's only for the render engine, other engines have a much smaller > > > > context, smaller enough that it's worth looking at their exact size. > > > > - It'd be nice to work out the real size from the *CXT_*SIZE > > > > registers. > > > > > > Hmmmm... I´ll try to get the real context sizes from the registers and > > > compare. At least for RCS, VCS and BCS since there doesn´t seem to be > > > a register for VECS? > > > > Couldn't find it either. I guess we'll need to ask the help of a friend. > > Or the 50/50 joker maybe. > > > > -- > > Damien > > CXT_SIZE is total garbage on anything past Ivybridge. That's why we > don't use it for HSW either... I know, right? We should request the spec > get updated. I have no excuse for not requesting that sooner. (talking about BDW only) For the render ring: HWSP: 4KB Ring context: CTX_SIZE[26:24] 5 cache lines -> offsets (in DW) 0x0 to 0x4f (= 5 * 64 / 4) Render context: CTX_SIZE[23:16] -> 0x65 caches lines -> offets (in DW) 0x50 to 0x69f (= 0x50 + 0x65 * 64 / 4 - 1) VF/VFE context CTX_SIZE[7:0] -> 0x82 cache lines -> offsets (in DW) 0x6A0 to 0xebf (= 0x6a0 + 0x82*64/4 - 1) Atomic storage is the max that you can allocate, 32KB ie 8192 DWords So we're almost there. What's missing here is the RS context size, couldn't find it in the spec :/ Maybe because that is a "well known" value. Note that I don't actually know what we read back from hw. Considering that the BCS context size seems to be 2 pages, I think it's worth digging a bit more to save ~66KB per BCS context (for instance). Even if we have to hardcode the different context sizes.
On Tue, Apr 01, 2014 at 10:05:12PM +0100, Damien Lespiau wrote: > On Tue, Apr 01, 2014 at 12:18:24PM -0700, Ben Widawsky wrote: > > On Tue, Apr 01, 2014 at 02:51:27PM +0100, Damien Lespiau wrote: > > > On Tue, Apr 01, 2014 at 02:47:19PM +0100, Mateo Lozano, Oscar wrote: > > > > > > --- a/drivers/gpu/drm/i915/i915_lrc.c > > > > > > +++ b/drivers/gpu/drm/i915/i915_lrc.c > > > > > > @@ -41,7 +41,45 @@ > > > > > > #include <drm/i915_drm.h> > > > > > > #include "i915_drv.h" > > > > > > > > > > > > +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) > > > > > > > > > > I'm a bit puzzled by that number: > > > > > - I found a sentence saying: "the Context Image for the rendering > > > > > engine consists of 20 4K pages", which seems that it includes the > > > > > HWS page (on the same page it says context layout = HWS Page + > > > > > register state context). > > > > > - When looking at the register state context for the render engine: > > > > > 18096 dwords -> 18 pages, so in total it'd be 19 pages (need to add > > > > > the HWS Page) > > > > > - Clearly I must be missing something :) > > > > > - That's only for the render engine, other engines have a much smaller > > > > > context, smaller enough that it's worth looking at their exact size. > > > > > - It'd be nice to work out the real size from the *CXT_*SIZE > > > > > registers. > > > > > > > > Hmmmm... I´ll try to get the real context sizes from the registers and > > > > compare. At least for RCS, VCS and BCS since there doesn´t seem to be > > > > a register for VECS? > > > > > > Couldn't find it either. I guess we'll need to ask the help of a friend. > > > Or the 50/50 joker maybe. > > > > > > -- > > > Damien > > > > CXT_SIZE is total garbage on anything past Ivybridge. That's why we > > don't use it for HSW either... I know, right? We should request the spec > > get updated. I have no excuse for not requesting that sooner. > > (talking about BDW only) > > For the render ring: > > HWSP: 4KB > Ring context: CTX_SIZE[26:24] 5 cache lines -> offsets (in DW) 0x0 to 0x4f (= 5 * 64 / 4) > Render context: CTX_SIZE[23:16] -> 0x65 caches lines -> offets (in DW) 0x50 to 0x69f (= 0x50 + 0x65 * 64 / 4 - 1) > VF/VFE context CTX_SIZE[7:0] -> 0x82 cache lines -> offsets (in DW) 0x6A0 to 0xebf (= 0x6a0 + 0x82*64/4 - 1) > Atomic storage is the max that you can allocate, 32KB ie 8192 DWords > > So we're almost there. What's missing here is the RS context size, couldn't find > it in the spec :/ Maybe because that is a "well known" value. > > Note that I don't actually know what we read back from hw. > > Considering that the BCS context size seems to be 2 pages, I think it's worth > digging a bit more to save ~66KB per BCS context (for instance). Even if we > have to hardcode the different context sizes. > > -- > Damien I guess I should have checked first. Looks like there are actually quite a few changes since I wrote the code originally. Carry on.
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 264ea67..ff6a33c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2316,6 +2316,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, /* i915_lrc.c */ int gen8_gem_context_init(struct drm_device *dev); +void gen8_gem_context_fini(struct drm_device *dev); /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct drm_device *dev, diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index e92b9c5..4a6f1b0 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -440,6 +440,11 @@ void i915_gem_context_fini(struct drm_device *dev) * other code, leading to spurious errors. */ intel_gpu_reset(dev); + if (dev_priv->lrc_enabled) { + gen8_gem_context_fini(dev); + return; + } + /* When default context is created and switched to, base object refcount * will be 2 (+1 from object creation and +1 from do_switch()). * i915_gem_context_fini() will be called after gpu_idle() has switched diff --git a/drivers/gpu/drm/i915/i915_lrc.c b/drivers/gpu/drm/i915/i915_lrc.c index 3a93e99..10e6dbc 100644 --- a/drivers/gpu/drm/i915/i915_lrc.c +++ b/drivers/gpu/drm/i915/i915_lrc.c @@ -41,7 +41,45 @@ #include <drm/i915_drm.h> #include "i915_drv.h" +#define GEN8_LR_CONTEXT_SIZE (21 * PAGE_SIZE) + +void gen8_gem_context_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + struct intel_engine *ring; + int unused; + + for_each_ring(ring, dev_priv, unused) { + if (ring->default_context) { + i915_gem_object_ggtt_unpin(ring->default_context->obj); + i915_gem_context_unreference(ring->default_context); + ring->default_context = NULL; + } + } + + dev_priv->mm.aliasing_ppgtt = NULL; +} + int gen8_gem_context_init(struct drm_device *dev) { - return -ENOSYS; + struct drm_i915_private *dev_priv = dev->dev_private; + struct intel_engine *ring; + int ret = -ENOSYS, ring_id; + + dev_priv->hw_context_size = round_up(GEN8_LR_CONTEXT_SIZE, 4096); + + for_each_ring(ring, dev_priv, ring_id) { + ring->default_context = i915_gem_create_context(dev, + NULL, (ring_id == RCS)); + if (IS_ERR_OR_NULL(ring->default_context)) { + ret = PTR_ERR(ring->default_context); + DRM_DEBUG_DRIVER("Create ctx failed: %d\n", ret); + ring->default_context = NULL; + goto err_out; + } + } + +err_out: + gen8_gem_context_fini(dev); + return ret; }