[12/12] drm/msm: gpu: Use the zap shader on 5XX if we can

Message ID	1480361317-9937-13-git-send-email-jcrouse@codeaurora.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> sender: jcrouse@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 6F5FE613E7; Mon, 28 Nov 2016 19:28:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.1 smtp.codeaurora.org 6F5FE613E7 From: Jordan Crouse <jcrouse@codeaurora.org> To: freedreno@lists.freedesktop.org Subject: [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can Date: Mon, 28 Nov 2016 12:28:37 -0700 Message-Id: <1480361317-9937-13-git-send-email-jcrouse@codeaurora.org> In-Reply-To: <1480361317-9937-1-git-send-email-jcrouse@codeaurora.org> References: <1480361317-9937-1-git-send-email-jcrouse@codeaurora.org> Cc: linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Jordan Crouse Nov. 28, 2016, 7:28 p.m. UTC

The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
only render to buffers that are marked as secure and inaccessible
to the kernel and user through a series of hardware protections. In
practice secure mode is used to draw things like a UI on a secure
video frame.

In order to switch out of secure mode the GPU executes a special
shader that clears out the GMEM and other sensitve registers and
then writes a register. Because the kernel can't be trusted the
shader binary is signed and verified and programmed by the
secure world. To do this we need to read the MDT header and the
segments from the firmware location and put them in memory and
present them for approval.

For targets without secure support there is an out: if the
secure world doesn't support secure then there are no hardware
protections and we can freely write the SECVID_TRUST register from
the CPU. We don't have 100% confidence that we can query the
secure capabilities at run time but we have enough calls that
need to go right to give us some confidence that we're at least doing
something useful.

Of course if we guess wrong you trigger a permissions violation
which usually ends up in a system crash but thats a problem
that shows up immediately.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
 1 file changed, 70 insertions(+), 2 deletions(-)

Bjorn Andersson Dec. 5, 2016, 7:57 p.m. UTC | #1

On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:

> The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> only render to buffers that are marked as secure and inaccessible
> to the kernel and user through a series of hardware protections. In
> practice secure mode is used to draw things like a UI on a secure
> video frame.
> 
> In order to switch out of secure mode the GPU executes a special
> shader that clears out the GMEM and other sensitve registers and
> then writes a register. Because the kernel can't be trusted the
> shader binary is signed and verified and programmed by the
> secure world. To do this we need to read the MDT header and the
> segments from the firmware location and put them in memory and
> present them for approval.
> 
> For targets without secure support there is an out: if the
> secure world doesn't support secure then there are no hardware
> protections and we can freely write the SECVID_TRUST register from
> the CPU. We don't have 100% confidence that we can query the
> secure capabilities at run time but we have enough calls that
> need to go right to give us some confidence that we're at least doing
> something useful.
> 
> Of course if we guess wrong you trigger a permissions violation
> which usually ends up in a system crash but thats a problem
> that shows up immediately.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 70 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index eefe197..a7a58ec 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
>  	return 0;
>  }
>  
> +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> +{
> +	int ret;
> +
> +	ret = qcom_scm_gpu_zap_resume();
> +	if (ret)
> +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> +			gpu->name, ret);
> +
> +	return ret;
> +}
> +
> +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> +{
> +	static bool loaded;
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> +	struct platform_device *pdev = a5xx_gpu->pdev;
> +	struct device_node *node;
> +	int ret;
> +
> +	/*
> +	 * If the zap shader is already loaded into memory we just need to kick
> +	 * the remote processor to reinitialize it
> +	 */
> +	if (loaded)

Why is this handling needed? Why can init be called multiple times?

> +		return a5xx_zap_shader_resume(gpu);
> +
> +	/* Populate the sub-nodes if they haven't already been done */
> +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);

I haven't been able to find the qcom,zap-shader platform driver, but I
presume you have something like:

adreno {
	qcom,zap-shader {
		compatible = "qcom,zap-shader";

		firmware = "zapfw";
		memory-region = <&zap_region>;
	};
};

I presume this is done to not "taint" the adreno device's with the zap
memory region, but I don't think you should (ab)use a platform driver
for this.

You should rather add a struct device zap_dev to your adreno context, do
minimal initialization (name and a parent I think is enough), call
device_register(&zap_dev);, of_reserved_mem_device_init() and then use
that for your dma allocation.

This saves you from creating a platform_driver, instantiating a
platform_device and the worry of the race between the creation of that
device and the of_find_device_by_node() below.

> +
> +	/* Find the sub-node for the zap shader */
> +	node = of_find_node_by_name(pdev->dev.of_node, "qcom,zap-shader");

If you're looking for immediate children use of_get_child_by_name()

And no "qcom," in node names please.

> +	if (!node) {
> +		DRM_ERROR("%s: qcom,zap-shader not found in device tree\n",
> +			gpu->name);
> +		return -ENODEV;
> +	}
> +
> +	ret = _pil_tz_load_image(of_find_device_by_node(node));
> +	if (ret)
> +		DRM_ERROR("%s: Unable to load the zap shader\n",
> +			gpu->name);
> +
> +	loaded = !ret;
> +
> +	return ret;
> +}

Regards,
Bjorn

Bjorn Andersson Dec. 5, 2016, 8:10 p.m. UTC | #2

On Mon 05 Dec 11:57 PST 2016, Bjorn Andersson wrote:

[..]
> You should rather add a struct device zap_dev to your adreno context, do
> minimal initialization (name and a parent I think is enough), call
> device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> that for your dma allocation.

You would of course need to set the of_node on the zap_dev for
of_reserved_mem_device_init() to work. Sorry for missing that.

Regards,
Bjorn

Jordan Crouse Dec. 6, 2016, 3:35 p.m. UTC | #3

On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
> On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
> 
> > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> > only render to buffers that are marked as secure and inaccessible
> > to the kernel and user through a series of hardware protections. In
> > practice secure mode is used to draw things like a UI on a secure
> > video frame.
> > 
> > In order to switch out of secure mode the GPU executes a special
> > shader that clears out the GMEM and other sensitve registers and
> > then writes a register. Because the kernel can't be trusted the
> > shader binary is signed and verified and programmed by the
> > secure world. To do this we need to read the MDT header and the
> > segments from the firmware location and put them in memory and
> > present them for approval.
> > 
> > For targets without secure support there is an out: if the
> > secure world doesn't support secure then there are no hardware
> > protections and we can freely write the SECVID_TRUST register from
> > the CPU. We don't have 100% confidence that we can query the
> > secure capabilities at run time but we have enough calls that
> > need to go right to give us some confidence that we're at least doing
> > something useful.
> > 
> > Of course if we guess wrong you trigger a permissions violation
> > which usually ends up in a system crash but thats a problem
> > that shows up immediately.
> > 
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
> >  1 file changed, 70 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index eefe197..a7a58ec 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
> >  	return 0;
> >  }
> >  
> > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> > +{
> > +	int ret;
> > +
> > +	ret = qcom_scm_gpu_zap_resume();
> > +	if (ret)
> > +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> > +			gpu->name, ret);
> > +
> > +	return ret;
> > +}
> > +
> > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> > +{
> > +	static bool loaded;
> > +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > +	struct platform_device *pdev = a5xx_gpu->pdev;
> > +	struct device_node *node;
> > +	int ret;
> > +
> > +	/*
> > +	 * If the zap shader is already loaded into memory we just need to kick
> > +	 * the remote processor to reinitialize it
> > +	 */
> > +	if (loaded)
> 
> Why is this handling needed? Why can init be called multiple times?

This is for resume - if we suspend and resume the device without losing state
the secure zone we can't load it again, so we have to call a different
operation to "resume" it. This will be much more heavily used when we have more
aggressive power management.

> > +		return a5xx_zap_shader_resume(gpu);
> > +
> > +	/* Populate the sub-nodes if they haven't already been done */
> > +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
> 
> I haven't been able to find the qcom,zap-shader platform driver, but I
> presume you have something like:
> 
> adreno {
> 	qcom,zap-shader {
> 		compatible = "qcom,zap-shader";
> 
> 		firmware = "zapfw";
> 		memory-region = <&zap_region>;
> 	};
> };
> 
> I presume this is done to not "taint" the adreno device's with the zap
> memory region, but I don't think you should (ab)use a platform driver
> for this.
> 
> You should rather add a struct device zap_dev to your adreno context, do
> minimal initialization (name and a parent I think is enough), call
> device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> that for your dma allocation.
> 
> This saves you from creating a platform_driver, instantiating a
> platform_device and the worry of the race between the creation of that
> device and the of_find_device_by_node() below.

As far as I know, of_platform_populate() just fleshed out the platform devices
for the sub-nodes. We are not creating a new platform driver or doing any sort
of probe code, we're just setting up the useful memory to attach the
memory-region to. As far as I can tell, using of_platform_populate() +
of_find_device_by_node() does a lot of the heavy lifting of what you
describe.

> > +
> > +	/* Find the sub-node for the zap shader */
> > +	node = of_find_node_by_name(pdev->dev.of_node, "qcom,zap-shader");
> 
> If you're looking for immediate children use of_get_child_by_name()
> 
> And no "qcom," in node names please.

Okay, can do.

> > +	if (!node) {
> > +		DRM_ERROR("%s: qcom,zap-shader not found in device tree\n",
> > +			gpu->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	ret = _pil_tz_load_image(of_find_device_by_node(node));
> > +	if (ret)
> > +		DRM_ERROR("%s: Unable to load the zap shader\n",
> > +			gpu->name);
> > +
> > +	loaded = !ret;
> > +
> > +	return ret;
> > +}
> 
> Regards,
> Bjorn

Jordan

Rob Clark Dec. 6, 2016, 4:37 p.m. UTC | #4

On Tue, Dec 6, 2016 at 10:35 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote:
> On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
>> On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
>>
>> > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
>> > only render to buffers that are marked as secure and inaccessible
>> > to the kernel and user through a series of hardware protections. In
>> > practice secure mode is used to draw things like a UI on a secure
>> > video frame.
>> >
>> > In order to switch out of secure mode the GPU executes a special
>> > shader that clears out the GMEM and other sensitve registers and
>> > then writes a register. Because the kernel can't be trusted the
>> > shader binary is signed and verified and programmed by the
>> > secure world. To do this we need to read the MDT header and the
>> > segments from the firmware location and put them in memory and
>> > present them for approval.
>> >
>> > For targets without secure support there is an out: if the
>> > secure world doesn't support secure then there are no hardware
>> > protections and we can freely write the SECVID_TRUST register from
>> > the CPU. We don't have 100% confidence that we can query the
>> > secure capabilities at run time but we have enough calls that
>> > need to go right to give us some confidence that we're at least doing
>> > something useful.
>> >
>> > Of course if we guess wrong you trigger a permissions violation
>> > which usually ends up in a system crash but thats a problem
>> > that shows up immediately.
>> >
>> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
>> > ---
>> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
>> >  1 file changed, 70 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > index eefe197..a7a58ec 100644
>> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
>> >     return 0;
>> >  }
>> >
>> > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
>> > +{
>> > +   int ret;
>> > +
>> > +   ret = qcom_scm_gpu_zap_resume();
>> > +   if (ret)
>> > +           DRM_ERROR("%s: zap-shader resume failed: %d\n",
>> > +                   gpu->name, ret);
>> > +
>> > +   return ret;
>> > +}
>> > +
>> > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
>> > +{
>> > +   static bool loaded;
>> > +   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>> > +   struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>> > +   struct platform_device *pdev = a5xx_gpu->pdev;
>> > +   struct device_node *node;
>> > +   int ret;
>> > +
>> > +   /*
>> > +    * If the zap shader is already loaded into memory we just need to kick
>> > +    * the remote processor to reinitialize it
>> > +    */
>> > +   if (loaded)
>>
>> Why is this handling needed? Why can init be called multiple times?
>
> This is for resume - if we suspend and resume the device without losing state
> the secure zone we can't load it again, so we have to call a different
> operation to "resume" it. This will be much more heavily used when we have more
> aggressive power management.

also for recovery when I crash the gpu ;-)

BR,
-R

Bjorn Andersson Dec. 6, 2016, 5:18 p.m. UTC | #5

On Tue 06 Dec 07:35 PST 2016, Jordan Crouse wrote:

> On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
> > On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
> > 
> > > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> > > only render to buffers that are marked as secure and inaccessible
> > > to the kernel and user through a series of hardware protections. In
> > > practice secure mode is used to draw things like a UI on a secure
> > > video frame.
> > > 
> > > In order to switch out of secure mode the GPU executes a special
> > > shader that clears out the GMEM and other sensitve registers and
> > > then writes a register. Because the kernel can't be trusted the
> > > shader binary is signed and verified and programmed by the
> > > secure world. To do this we need to read the MDT header and the
> > > segments from the firmware location and put them in memory and
> > > present them for approval.
> > > 
> > > For targets without secure support there is an out: if the
> > > secure world doesn't support secure then there are no hardware
> > > protections and we can freely write the SECVID_TRUST register from
> > > the CPU. We don't have 100% confidence that we can query the
> > > secure capabilities at run time but we have enough calls that
> > > need to go right to give us some confidence that we're at least doing
> > > something useful.
> > > 
> > > Of course if we guess wrong you trigger a permissions violation
> > > which usually ends up in a system crash but thats a problem
> > > that shows up immediately.
> > > 
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 70 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > index eefe197..a7a58ec 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
> > >  	return 0;
> > >  }
> > >  
> > > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> > > +{
> > > +	int ret;
> > > +
> > > +	ret = qcom_scm_gpu_zap_resume();
> > > +	if (ret)
> > > +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> > > +			gpu->name, ret);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> > > +{
> > > +	static bool loaded;
> > > +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > +	struct platform_device *pdev = a5xx_gpu->pdev;
> > > +	struct device_node *node;
> > > +	int ret;
> > > +
> > > +	/*
> > > +	 * If the zap shader is already loaded into memory we just need to kick
> > > +	 * the remote processor to reinitialize it
> > > +	 */
> > > +	if (loaded)
> > 
> > Why is this handling needed? Why can init be called multiple times?
> 
> This is for resume - if we suspend and resume the device without
> losing state the secure zone we can't load it again, so we have to
> call a different operation to "resume" it. This will be much more
> heavily used when we have more aggressive power management.
> 

I presume then that the rest of the resume path is the same as the
initialization.

> > > +		return a5xx_zap_shader_resume(gpu);
> > > +
> > > +	/* Populate the sub-nodes if they haven't already been done */
> > > +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
> > 
> > I haven't been able to find the qcom,zap-shader platform driver, but I
> > presume you have something like:
> > 
> > adreno {
> > 	qcom,zap-shader {
> > 		compatible = "qcom,zap-shader";
> > 
> > 		firmware = "zapfw";
> > 		memory-region = <&zap_region>;
> > 	};
> > };
> > 
> > I presume this is done to not "taint" the adreno device's with the zap
> > memory region, but I don't think you should (ab)use a platform driver
> > for this.
> > 
> > You should rather add a struct device zap_dev to your adreno context, do
> > minimal initialization (name and a parent I think is enough), call
> > device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> > that for your dma allocation.
> > 
> > This saves you from creating a platform_driver, instantiating a
> > platform_device and the worry of the race between the creation of that
> > device and the of_find_device_by_node() below.
> 
> As far as I know, of_platform_populate() just fleshed out the platform
> devices for the sub-nodes.

Correct, it will scan all subnodes and try to match their compatible
with platform_drivers. For each match it finds it probes that driver.

> We are not creating a new platform driver
> or doing any sort of probe code, we're just setting up the useful
> memory to attach the memory-region to. As far as I can tell, using
> of_platform_populate() + of_find_device_by_node() does a lot of the
> heavy lifting of what you describe.

But as far as I know the compatible of node that you name
"qcom,zap-shader" must actually match a platform_driver for there to be
a device that of_find_device_by_node() will be able to match against.

The heavy lifting I'm talking about is (untested):

	struct device zap_dev;

	zap_dev.of_node = of_get_child_by_name("zap-shader");
	zap_dev.parent = &pdev->dev;
	dev_set_name(&zap_dev, "zap-shader");

	ret = device_register(&zap_dev);
	if (ret < 0) {
		dev_err();
		return ret;
	}

	of_reserved_mem_device_init(&zap_dev);


And then you need a device_put(&zap_dev) when you're done with the
device and if you set zap_dev.release you can have that clean things
up...

Regards,
Bjorn

[12/12] drm/msm: gpu: Use the zap shader on 5XX if we can

Commit Message

Comments

Patch