diff mbox series

[06/14] ASoC: SOF: add a power status IPC

Message ID 20200312144429.17959-7-guennadi.liakhovetski@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series ASoC: SOF DSP virtualisation | expand

Commit Message

Guennadi Liakhovetski March 12, 2020, 2:44 p.m. UTC
In a virtualised configuration the runtime PM of the host and any
guests aren't synchronised. But guests have to be able to tell the
host when they suspend and resume and know, whether the host has been
runtime suspended since that guests's topology had been sent to the
host last time. This is needed to decide whether to re-send the
topology again. To support this we add a new PM IPC message
SOF_IPC_PM_VFE_POWER_STATUS and a reset counter to track the state of
the DSP.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
---
 include/sound/sof/header.h | 1 +
 sound/soc/sof/core.c       | 2 ++
 sound/soc/sof/ipc.c        | 2 ++
 sound/soc/sof/loader.c     | 4 ++++
 sound/soc/sof/sof-priv.h   | 4 ++++
 5 files changed, 13 insertions(+)

Comments

Mark Brown March 13, 2020, 2:39 p.m. UTC | #1
On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote:

>  #endif
> +	atomic_set(&sdev->reset_count, 0);
>  	dev_set_drvdata(dev, sdev);

Do we really need to use atomics for this?  They are hard to use
correctly.

>  #include "ops.h"
> @@ -617,6 +618,9 @@ int snd_sof_run_firmware(struct snd_sof_dev *sdev)
>  	/* fw boot is complete. Update the active cores mask */
>  	sdev->enabled_cores_mask = init_core_mask;
>  
> +	/* increment reset count */
> +	atomic_add(1, &sdev->reset_count);
> +

We at no point seem to read from this reset counter?  I can't figure out
from this commit what it's doing.
Guennadi Liakhovetski March 20, 2020, 11:52 a.m. UTC | #2
Hi Mark,

On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote:
> On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote:
> 
> >  #endif
> > +	atomic_set(&sdev->reset_count, 0);
> >  	dev_set_drvdata(dev, sdev);
> 
> Do we really need to use atomics for this?  They are hard to use
> correctly.

This variable is accessed from 2 contexts: it's incremented by the SOF 
driver, when the firmware has booted and it's read by the SOF
VirtIO backend vhost-be.c when receiving a resume request from the guest. 
Timewise the variable will only be incremented during the DSP resume / 
power up, while the VirtIO back end is waiting for the resume to complete in 
pm_runtime_get_sync(). And only after that it reads the variable. But that 
can happen on different CPUs. Whereas I think that runtime PM will sync 
caches somewhere during the process, I think it is better to access the 
variable in an SMP-safe way, e.g. using atomic operations.

> >  #include "ops.h"
> > @@ -617,6 +618,9 @@ int snd_sof_run_firmware(struct snd_sof_dev *sdev)
> >  	/* fw boot is complete. Update the active cores mask */
> >  	sdev->enabled_cores_mask = init_core_mask;
> >  
> > +	/* increment reset count */
> > +	atomic_add(1, &sdev->reset_count);
> > +
> 
> We at no point seem to read from this reset counter?  I can't figure out
> from this commit what it's doing.

It's used in vhost-be.c (patch 10/14).

Thanks
Guennadi
Mark Brown March 20, 2020, 12:19 p.m. UTC | #3
On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote:
> On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote:
> > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote:

> > >  #endif
> > > +	atomic_set(&sdev->reset_count, 0);
> > >  	dev_set_drvdata(dev, sdev);

> > Do we really need to use atomics for this?  They are hard to use
> > correctly.

> This variable is accessed from 2 contexts: it's incremented by the SOF 
> driver, when the firmware has booted and it's read by the SOF
> VirtIO backend vhost-be.c when receiving a resume request from the guest. 
> Timewise the variable will only be incremented during the DSP resume / 
> power up, while the VirtIO back end is waiting for the resume to complete in 
> pm_runtime_get_sync(). And only after that it reads the variable. But that 
> can happen on different CPUs. Whereas I think that runtime PM will sync 
> caches somewhere during the process, I think it is better to access the 
> variable in an SMP-safe way, e.g. using atomic operations.

That doesn't address my concern - to repeat, my concern is that atomics
are hard to use correctly.  Is there no other concurrency primitive (for
example this sounds like a completion) which can be used?
Guennadi Liakhovetski March 20, 2020, 1:27 p.m. UTC | #4
On Fri, Mar 20, 2020 at 12:19:52PM +0000, Mark Brown wrote:
> On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote:
> > On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote:
> > > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote:
> 
> > > >  #endif
> > > > +	atomic_set(&sdev->reset_count, 0);
> > > >  	dev_set_drvdata(dev, sdev);
> 
> > > Do we really need to use atomics for this?  They are hard to use
> > > correctly.
> 
> > This variable is accessed from 2 contexts: it's incremented by the SOF 
> > driver, when the firmware has booted and it's read by the SOF
> > VirtIO backend vhost-be.c when receiving a resume request from the guest. 
> > Timewise the variable will only be incremented during the DSP resume / 
> > power up, while the VirtIO back end is waiting for the resume to complete in 
> > pm_runtime_get_sync(). And only after that it reads the variable. But that 
> > can happen on different CPUs. Whereas I think that runtime PM will sync 
> > caches somewhere during the process, I think it is better to access the 
> > variable in an SMP-safe way, e.g. using atomic operations.
> 
> That doesn't address my concern - to repeat, my concern is that atomics
> are hard to use correctly.  Is there no other concurrency primitive (for
> example this sounds like a completion) which can be used?

No, this isn't a completion - it's a counter. I've used atomic variables 
before, I cannot remember seeing any difficulties with their correct use 
described. Do you have a pointer?

Thinking about it, one problem I see is wrapping, it isn't currently 
handled, but that would happen after quite a few PM suspend / resume 
cycles... Still it can and should be fixed. But this isn't the concern, 
that you have?

Thanks
Guennadi
Guennadi Liakhovetski March 20, 2020, 3:07 p.m. UTC | #5
On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote:
> On Fri, Mar 20, 2020 at 12:19:52PM +0000, Mark Brown wrote:
> > On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote:
> > > On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote:
> > > > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote:
> > 
> > > > >  #endif
> > > > > +	atomic_set(&sdev->reset_count, 0);
> > > > >  	dev_set_drvdata(dev, sdev);
> > 
> > > > Do we really need to use atomics for this?  They are hard to use
> > > > correctly.
> > 
> > > This variable is accessed from 2 contexts: it's incremented by the SOF 
> > > driver, when the firmware has booted and it's read by the SOF
> > > VirtIO backend vhost-be.c when receiving a resume request from the guest. 
> > > Timewise the variable will only be incremented during the DSP resume / 
> > > power up, while the VirtIO back end is waiting for the resume to complete in 
> > > pm_runtime_get_sync(). And only after that it reads the variable. But that 
> > > can happen on different CPUs. Whereas I think that runtime PM will sync 
> > > caches somewhere during the process, I think it is better to access the 
> > > variable in an SMP-safe way, e.g. using atomic operations.
> > 
> > That doesn't address my concern - to repeat, my concern is that atomics
> > are hard to use correctly.  Is there no other concurrency primitive (for
> > example this sounds like a completion) which can be used?
> 
> No, this isn't a completion - it's a counter. I've used atomic variables 
> before, I cannot remember seeing any difficulties with their correct use 
> described. Do you have a pointer?
> 
> Thinking about it, one problem I see is wrapping, it isn't currently 
> handled, but that would happen after quite a few PM suspend / resume 
> cycles... Still it can and should be fixed. But this isn't the concern, 
> that you have?

Actually I'd even say this isn't a problem. I think it's safe to say, you 
won't suspend and resume your audio interface more often than every 10 
seconds. That makes under 3200000 cycles per year. Even with 31 bits for a 
signed integer that makes more than 600 years. I think we're safe.

Thanks
Guennadi
Mark Brown March 20, 2020, 4:39 p.m. UTC | #6
On Fri, Mar 20, 2020 at 04:07:27PM +0100, Guennadi Liakhovetski wrote:
> On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote:

> > No, this isn't a completion - it's a counter. I've used atomic variables 
> > before, I cannot remember seeing any difficulties with their correct use 
> > described. Do you have a pointer?

> Actually I'd even say this isn't a problem. I think it's safe to say, you 
> won't suspend and resume your audio interface more often than every 10 
> seconds. That makes under 3200000 cycles per year. Even with 31 bits for a 
> signed integer that makes more than 600 years. I think we're safe.

The problem is that atomics are just incredibly error prone - for
example they're just a plain number, they're usually counting something
which is not being locked so you have to be careful any time you do
anything around them.  Their lack of structure makes them harder to
reason about than most other locking primitives, often harder than it's
worth.
Guennadi Liakhovetski March 23, 2020, 9:31 a.m. UTC | #7
On Fri, Mar 20, 2020 at 04:39:48PM +0000, Mark Brown wrote:
> On Fri, Mar 20, 2020 at 04:07:27PM +0100, Guennadi Liakhovetski wrote:
> > On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote:
> 
> > > No, this isn't a completion - it's a counter. I've used atomic variables 
> > > before, I cannot remember seeing any difficulties with their correct use 
> > > described. Do you have a pointer?
> 
> > Actually I'd even say this isn't a problem. I think it's safe to say, you 
> > won't suspend and resume your audio interface more often than every 10 
> > seconds. That makes under 3200000 cycles per year. Even with 31 bits for a 
> > signed integer that makes more than 600 years. I think we're safe.
> 
> The problem is that atomics are just incredibly error prone - for
> example they're just a plain number, they're usually counting something
> which is not being locked so you have to be careful any time you do
> anything around them.  Their lack of structure makes them harder to
> reason about than most other locking primitives, often harder than it's
> worth.

Ok, understand, but do you see any problems with this specific use case? 
Thinking about possible replacements - it isn't a case for a ref-count, 
because it isn't a get-put scenario. We really just need to count a 
specific event - DSP reboots. It can be the case that this counter doesn't 
need to be atomic at all. When it is read, the DSP is guaranteed to be up 
and running - I think. So no race would be possible. I can try to think 
about this more carefully and maybe make it a normal unprotected counter. 
But I don't think it has to be protected even better than what these 
patches are doing.

Thanks
Guennadi
Mark Brown March 23, 2020, 12:37 p.m. UTC | #8
On Mon, Mar 23, 2020 at 10:31:02AM +0100, Guennadi Liakhovetski wrote:

> Ok, understand, but do you see any problems with this specific use case? 
> Thinking about possible replacements - it isn't a case for a ref-count, 
> because it isn't a get-put scenario. We really just need to count a 
> specific event - DSP reboots. It can be the case that this counter doesn't 
> need to be atomic at all. When it is read, the DSP is guaranteed to be up 
> and running - I think. So no race would be possible. I can try to think 
> about this more carefully and maybe make it a normal unprotected counter. 
> But I don't think it has to be protected even better than what these 
> patches are doing.

I think at the very least it probably needs more documentation in the
code since I don't recall being able to work out what it was supposed to
be doing quickly.
diff mbox series

Patch

diff --git a/include/sound/sof/header.h b/include/sound/sof/header.h
index b794795..1aaccb5a 100644
--- a/include/sound/sof/header.h
+++ b/include/sound/sof/header.h
@@ -77,6 +77,7 @@ 
 #define SOF_IPC_PM_CLK_REQ			SOF_CMD_TYPE(0x006)
 #define SOF_IPC_PM_CORE_ENABLE			SOF_CMD_TYPE(0x007)
 #define SOF_IPC_PM_GATE				SOF_CMD_TYPE(0x008)
+#define SOF_IPC_PM_VFE_POWER_STATUS		SOF_CMD_TYPE(0x010)
 
 /* component runtime config - multiple different types */
 #define SOF_IPC_COMP_SET_VALUE			SOF_CMD_TYPE(0x001)
diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
index ca30d67..42dd72e 100644
--- a/sound/soc/sof/core.c
+++ b/sound/soc/sof/core.c
@@ -8,6 +8,7 @@ 
 // Author: Liam Girdwood <liam.r.girdwood@linux.intel.com>
 //
 
+#include <linux/atomic.h>
 #include <linux/firmware.h>
 #include <linux/module.h>
 #include <sound/soc.h>
@@ -311,6 +312,7 @@  int snd_sof_device_probe(struct device *dev, struct snd_sof_pdata *plat_data)
 #if IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG_PROBES)
 	sdev->extractor_stream_tag = SOF_PROBE_INVALID_NODE_ID;
 #endif
+	atomic_set(&sdev->reset_count, 0);
 	dev_set_drvdata(dev, sdev);
 
 	/* check all mandatory ops */
diff --git a/sound/soc/sof/ipc.c b/sound/soc/sof/ipc.c
index 6fa501c1..b0b38ca 100644
--- a/sound/soc/sof/ipc.c
+++ b/sound/soc/sof/ipc.c
@@ -105,6 +105,8 @@  static void ipc_log_header(struct device *dev, u8 *text, u32 cmd)
 			str2 = "CLK_REQ"; break;
 		case SOF_IPC_PM_CORE_ENABLE:
 			str2 = "CORE_ENABLE"; break;
+		case SOF_IPC_PM_VFE_POWER_STATUS:
+			str2 = "VFE_POWER_STATUS"; break;
 		default:
 			str2 = "unknown type"; break;
 		}
diff --git a/sound/soc/sof/loader.c b/sound/soc/sof/loader.c
index fc4ab51..3b2f788 100644
--- a/sound/soc/sof/loader.c
+++ b/sound/soc/sof/loader.c
@@ -10,6 +10,7 @@ 
 // Generic firmware loader.
 //
 
+#include <linux/atomic.h>
 #include <linux/firmware.h>
 #include <sound/sof.h>
 #include "ops.h"
@@ -617,6 +618,9 @@  int snd_sof_run_firmware(struct snd_sof_dev *sdev)
 	/* fw boot is complete. Update the active cores mask */
 	sdev->enabled_cores_mask = init_core_mask;
 
+	/* increment reset count */
+	atomic_add(1, &sdev->reset_count);
+
 	return 0;
 }
 EXPORT_SYMBOL(snd_sof_run_firmware);
diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
index e029640..9695107 100644
--- a/sound/soc/sof/sof-priv.h
+++ b/sound/soc/sof/sof-priv.h
@@ -11,6 +11,7 @@ 
 #ifndef __SOUND_SOC_SOF_PRIV_H
 #define __SOUND_SOC_SOF_PRIV_H
 
+#include <linux/atomic.h>
 #include <linux/device.h>
 #include <sound/hdaudio.h>
 #include <sound/sof.h>
@@ -430,6 +431,9 @@  struct snd_sof_dev {
 	unsigned int extractor_stream_tag;
 #endif
 
+	/* VirtIO fields for host and guest */
+	atomic_t reset_count;
+
 	/* DMA for Trace */
 	struct snd_dma_buffer dmatb;
 	struct snd_dma_buffer dmatp;