Message ID | 20200312144429.17959-7-guennadi.liakhovetski@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ASoC: SOF DSP virtualisation | expand |
On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote: > #endif > + atomic_set(&sdev->reset_count, 0); > dev_set_drvdata(dev, sdev); Do we really need to use atomics for this? They are hard to use correctly. > #include "ops.h" > @@ -617,6 +618,9 @@ int snd_sof_run_firmware(struct snd_sof_dev *sdev) > /* fw boot is complete. Update the active cores mask */ > sdev->enabled_cores_mask = init_core_mask; > > + /* increment reset count */ > + atomic_add(1, &sdev->reset_count); > + We at no point seem to read from this reset counter? I can't figure out from this commit what it's doing.
Hi Mark, On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote: > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote: > > > #endif > > + atomic_set(&sdev->reset_count, 0); > > dev_set_drvdata(dev, sdev); > > Do we really need to use atomics for this? They are hard to use > correctly. This variable is accessed from 2 contexts: it's incremented by the SOF driver, when the firmware has booted and it's read by the SOF VirtIO backend vhost-be.c when receiving a resume request from the guest. Timewise the variable will only be incremented during the DSP resume / power up, while the VirtIO back end is waiting for the resume to complete in pm_runtime_get_sync(). And only after that it reads the variable. But that can happen on different CPUs. Whereas I think that runtime PM will sync caches somewhere during the process, I think it is better to access the variable in an SMP-safe way, e.g. using atomic operations. > > #include "ops.h" > > @@ -617,6 +618,9 @@ int snd_sof_run_firmware(struct snd_sof_dev *sdev) > > /* fw boot is complete. Update the active cores mask */ > > sdev->enabled_cores_mask = init_core_mask; > > > > + /* increment reset count */ > > + atomic_add(1, &sdev->reset_count); > > + > > We at no point seem to read from this reset counter? I can't figure out > from this commit what it's doing. It's used in vhost-be.c (patch 10/14). Thanks Guennadi
On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote: > On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote: > > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote: > > > #endif > > > + atomic_set(&sdev->reset_count, 0); > > > dev_set_drvdata(dev, sdev); > > Do we really need to use atomics for this? They are hard to use > > correctly. > This variable is accessed from 2 contexts: it's incremented by the SOF > driver, when the firmware has booted and it's read by the SOF > VirtIO backend vhost-be.c when receiving a resume request from the guest. > Timewise the variable will only be incremented during the DSP resume / > power up, while the VirtIO back end is waiting for the resume to complete in > pm_runtime_get_sync(). And only after that it reads the variable. But that > can happen on different CPUs. Whereas I think that runtime PM will sync > caches somewhere during the process, I think it is better to access the > variable in an SMP-safe way, e.g. using atomic operations. That doesn't address my concern - to repeat, my concern is that atomics are hard to use correctly. Is there no other concurrency primitive (for example this sounds like a completion) which can be used?
On Fri, Mar 20, 2020 at 12:19:52PM +0000, Mark Brown wrote: > On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote: > > On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote: > > > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote: > > > > > #endif > > > > + atomic_set(&sdev->reset_count, 0); > > > > dev_set_drvdata(dev, sdev); > > > > Do we really need to use atomics for this? They are hard to use > > > correctly. > > > This variable is accessed from 2 contexts: it's incremented by the SOF > > driver, when the firmware has booted and it's read by the SOF > > VirtIO backend vhost-be.c when receiving a resume request from the guest. > > Timewise the variable will only be incremented during the DSP resume / > > power up, while the VirtIO back end is waiting for the resume to complete in > > pm_runtime_get_sync(). And only after that it reads the variable. But that > > can happen on different CPUs. Whereas I think that runtime PM will sync > > caches somewhere during the process, I think it is better to access the > > variable in an SMP-safe way, e.g. using atomic operations. > > That doesn't address my concern - to repeat, my concern is that atomics > are hard to use correctly. Is there no other concurrency primitive (for > example this sounds like a completion) which can be used? No, this isn't a completion - it's a counter. I've used atomic variables before, I cannot remember seeing any difficulties with their correct use described. Do you have a pointer? Thinking about it, one problem I see is wrapping, it isn't currently handled, but that would happen after quite a few PM suspend / resume cycles... Still it can and should be fixed. But this isn't the concern, that you have? Thanks Guennadi
On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote: > On Fri, Mar 20, 2020 at 12:19:52PM +0000, Mark Brown wrote: > > On Fri, Mar 20, 2020 at 12:52:03PM +0100, Guennadi Liakhovetski wrote: > > > On Fri, Mar 13, 2020 at 02:39:56PM +0000, Mark Brown wrote: > > > > On Thu, Mar 12, 2020 at 03:44:21PM +0100, Guennadi Liakhovetski wrote: > > > > > > > #endif > > > > > + atomic_set(&sdev->reset_count, 0); > > > > > dev_set_drvdata(dev, sdev); > > > > > > Do we really need to use atomics for this? They are hard to use > > > > correctly. > > > > > This variable is accessed from 2 contexts: it's incremented by the SOF > > > driver, when the firmware has booted and it's read by the SOF > > > VirtIO backend vhost-be.c when receiving a resume request from the guest. > > > Timewise the variable will only be incremented during the DSP resume / > > > power up, while the VirtIO back end is waiting for the resume to complete in > > > pm_runtime_get_sync(). And only after that it reads the variable. But that > > > can happen on different CPUs. Whereas I think that runtime PM will sync > > > caches somewhere during the process, I think it is better to access the > > > variable in an SMP-safe way, e.g. using atomic operations. > > > > That doesn't address my concern - to repeat, my concern is that atomics > > are hard to use correctly. Is there no other concurrency primitive (for > > example this sounds like a completion) which can be used? > > No, this isn't a completion - it's a counter. I've used atomic variables > before, I cannot remember seeing any difficulties with their correct use > described. Do you have a pointer? > > Thinking about it, one problem I see is wrapping, it isn't currently > handled, but that would happen after quite a few PM suspend / resume > cycles... Still it can and should be fixed. But this isn't the concern, > that you have? Actually I'd even say this isn't a problem. I think it's safe to say, you won't suspend and resume your audio interface more often than every 10 seconds. That makes under 3200000 cycles per year. Even with 31 bits for a signed integer that makes more than 600 years. I think we're safe. Thanks Guennadi
On Fri, Mar 20, 2020 at 04:07:27PM +0100, Guennadi Liakhovetski wrote: > On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote: > > No, this isn't a completion - it's a counter. I've used atomic variables > > before, I cannot remember seeing any difficulties with their correct use > > described. Do you have a pointer? > Actually I'd even say this isn't a problem. I think it's safe to say, you > won't suspend and resume your audio interface more often than every 10 > seconds. That makes under 3200000 cycles per year. Even with 31 bits for a > signed integer that makes more than 600 years. I think we're safe. The problem is that atomics are just incredibly error prone - for example they're just a plain number, they're usually counting something which is not being locked so you have to be careful any time you do anything around them. Their lack of structure makes them harder to reason about than most other locking primitives, often harder than it's worth.
On Fri, Mar 20, 2020 at 04:39:48PM +0000, Mark Brown wrote: > On Fri, Mar 20, 2020 at 04:07:27PM +0100, Guennadi Liakhovetski wrote: > > On Fri, Mar 20, 2020 at 02:27:32PM +0100, Guennadi Liakhovetski wrote: > > > > No, this isn't a completion - it's a counter. I've used atomic variables > > > before, I cannot remember seeing any difficulties with their correct use > > > described. Do you have a pointer? > > > Actually I'd even say this isn't a problem. I think it's safe to say, you > > won't suspend and resume your audio interface more often than every 10 > > seconds. That makes under 3200000 cycles per year. Even with 31 bits for a > > signed integer that makes more than 600 years. I think we're safe. > > The problem is that atomics are just incredibly error prone - for > example they're just a plain number, they're usually counting something > which is not being locked so you have to be careful any time you do > anything around them. Their lack of structure makes them harder to > reason about than most other locking primitives, often harder than it's > worth. Ok, understand, but do you see any problems with this specific use case? Thinking about possible replacements - it isn't a case for a ref-count, because it isn't a get-put scenario. We really just need to count a specific event - DSP reboots. It can be the case that this counter doesn't need to be atomic at all. When it is read, the DSP is guaranteed to be up and running - I think. So no race would be possible. I can try to think about this more carefully and maybe make it a normal unprotected counter. But I don't think it has to be protected even better than what these patches are doing. Thanks Guennadi
On Mon, Mar 23, 2020 at 10:31:02AM +0100, Guennadi Liakhovetski wrote: > Ok, understand, but do you see any problems with this specific use case? > Thinking about possible replacements - it isn't a case for a ref-count, > because it isn't a get-put scenario. We really just need to count a > specific event - DSP reboots. It can be the case that this counter doesn't > need to be atomic at all. When it is read, the DSP is guaranteed to be up > and running - I think. So no race would be possible. I can try to think > about this more carefully and maybe make it a normal unprotected counter. > But I don't think it has to be protected even better than what these > patches are doing. I think at the very least it probably needs more documentation in the code since I don't recall being able to work out what it was supposed to be doing quickly.
diff --git a/include/sound/sof/header.h b/include/sound/sof/header.h index b794795..1aaccb5a 100644 --- a/include/sound/sof/header.h +++ b/include/sound/sof/header.h @@ -77,6 +77,7 @@ #define SOF_IPC_PM_CLK_REQ SOF_CMD_TYPE(0x006) #define SOF_IPC_PM_CORE_ENABLE SOF_CMD_TYPE(0x007) #define SOF_IPC_PM_GATE SOF_CMD_TYPE(0x008) +#define SOF_IPC_PM_VFE_POWER_STATUS SOF_CMD_TYPE(0x010) /* component runtime config - multiple different types */ #define SOF_IPC_COMP_SET_VALUE SOF_CMD_TYPE(0x001) diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c index ca30d67..42dd72e 100644 --- a/sound/soc/sof/core.c +++ b/sound/soc/sof/core.c @@ -8,6 +8,7 @@ // Author: Liam Girdwood <liam.r.girdwood@linux.intel.com> // +#include <linux/atomic.h> #include <linux/firmware.h> #include <linux/module.h> #include <sound/soc.h> @@ -311,6 +312,7 @@ int snd_sof_device_probe(struct device *dev, struct snd_sof_pdata *plat_data) #if IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG_PROBES) sdev->extractor_stream_tag = SOF_PROBE_INVALID_NODE_ID; #endif + atomic_set(&sdev->reset_count, 0); dev_set_drvdata(dev, sdev); /* check all mandatory ops */ diff --git a/sound/soc/sof/ipc.c b/sound/soc/sof/ipc.c index 6fa501c1..b0b38ca 100644 --- a/sound/soc/sof/ipc.c +++ b/sound/soc/sof/ipc.c @@ -105,6 +105,8 @@ static void ipc_log_header(struct device *dev, u8 *text, u32 cmd) str2 = "CLK_REQ"; break; case SOF_IPC_PM_CORE_ENABLE: str2 = "CORE_ENABLE"; break; + case SOF_IPC_PM_VFE_POWER_STATUS: + str2 = "VFE_POWER_STATUS"; break; default: str2 = "unknown type"; break; } diff --git a/sound/soc/sof/loader.c b/sound/soc/sof/loader.c index fc4ab51..3b2f788 100644 --- a/sound/soc/sof/loader.c +++ b/sound/soc/sof/loader.c @@ -10,6 +10,7 @@ // Generic firmware loader. // +#include <linux/atomic.h> #include <linux/firmware.h> #include <sound/sof.h> #include "ops.h" @@ -617,6 +618,9 @@ int snd_sof_run_firmware(struct snd_sof_dev *sdev) /* fw boot is complete. Update the active cores mask */ sdev->enabled_cores_mask = init_core_mask; + /* increment reset count */ + atomic_add(1, &sdev->reset_count); + return 0; } EXPORT_SYMBOL(snd_sof_run_firmware); diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h index e029640..9695107 100644 --- a/sound/soc/sof/sof-priv.h +++ b/sound/soc/sof/sof-priv.h @@ -11,6 +11,7 @@ #ifndef __SOUND_SOC_SOF_PRIV_H #define __SOUND_SOC_SOF_PRIV_H +#include <linux/atomic.h> #include <linux/device.h> #include <sound/hdaudio.h> #include <sound/sof.h> @@ -430,6 +431,9 @@ struct snd_sof_dev { unsigned int extractor_stream_tag; #endif + /* VirtIO fields for host and guest */ + atomic_t reset_count; + /* DMA for Trace */ struct snd_dma_buffer dmatb; struct snd_dma_buffer dmatp;
In a virtualised configuration the runtime PM of the host and any guests aren't synchronised. But guests have to be able to tell the host when they suspend and resume and know, whether the host has been runtime suspended since that guests's topology had been sent to the host last time. This is needed to decide whether to re-send the topology again. To support this we add a new PM IPC message SOF_IPC_PM_VFE_POWER_STATUS and a reset counter to track the state of the DSP. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com> --- include/sound/sof/header.h | 1 + sound/soc/sof/core.c | 2 ++ sound/soc/sof/ipc.c | 2 ++ sound/soc/sof/loader.c | 4 ++++ sound/soc/sof/sof-priv.h | 4 ++++ 5 files changed, 13 insertions(+)