diff mbox

[v2,2/6] ALSA: compress: Add function to indicate the stream has gone bad

Message ID 1459938115-25456-3-git-send-email-ckeepax@opensource.wolfsonmicro.com (mailing list archive)
State New, archived
Headers show

Commit Message

Charles Keepax April 6, 2016, 10:21 a.m. UTC
Currently, the avail IOCTL doesn't pass any error status, which
means typically on error it simply shows no data available. This
can lead to situations where user-space is waiting indefinitely
for data that will never come as the DSP has suffered an
unrecoverable error.

Add snd_compr_stop_xrun which end drivers can call to indicate
the stream has suffered an unrecoverable error and stop it. The
avail and poll IOCTLs are then updated to report if the stream is
in an error state to user-space. Allowing the error to propagate
out. Processing of the actual snd_compr_stop needs to be deferred
to a worker thread as the end driver may detect the errors during
an existing operation callback.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
---
 include/sound/compress_driver.h |  3 +++
 sound/core/compress_offload.c   | 58 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 60 insertions(+), 1 deletion(-)

Comments

Vinod Koul April 7, 2016, 12:40 a.m. UTC | #1
On Wed, 2016-04-06 at 11:21 +0100, Charles Keepax wrote:

> +int snd_compr_stop_xrun(struct snd_compr_stream *stream)
> +{
> +	if (stream->runtime->state == SNDRV_PCM_STATE_XRUN)
> +		return 0;
> +
> +	stream->runtime->state = SNDRV_PCM_STATE_XRUN;
> +
> +	queue_delayed_work(system_power_efficient_wq, &stream
> ->xrun_work, 0);

why do we want to do this in workqueue and not stop the compress stream
immediately.

Also if we do this, then why should pointer return error?
Charles Keepax April 7, 2016, 8:28 a.m. UTC | #2
On Thu, Apr 07, 2016 at 12:40:03AM +0000, Koul, Vinod wrote:
> On Wed, 2016-04-06 at 11:21 +0100, Charles Keepax wrote:
> 
> > +int snd_compr_stop_xrun(struct snd_compr_stream *stream)
> > +{
> > +	if (stream->runtime->state == SNDRV_PCM_STATE_XRUN)
> > +		return 0;
> > +
> > +	stream->runtime->state = SNDRV_PCM_STATE_XRUN;
> > +
> > +	queue_delayed_work(system_power_efficient_wq, &stream
> > ->xrun_work, 0);
> 
> why do we want to do this in workqueue and not stop the compress stream
> immediately.

We need to defer this to a work queue because it is very likely
we will detect the errors whilst already in a callback in the
driver. For example we notice the stream is bad whilst processing
a read or a pointer callback in the driver. Because this call by
definition goes right back to the top of the stack rather than
unwinding the stack nicely as returning an error would do, we
need to be careful of all the locks that are likely to be held in
between.

snd_compr_ioctl - takes stream->device->lock
snd_compr_tstamp
soc_compr_pointer - takes rtd->pcm_mutex
wm_adsp_compr_pointer - takes dsp->pwr_lock
snd_compr_stop_xrun
snd_compr_stop
soc_compr_trigger - Deadlock as we take rtd->pcm_mutex again

> 
> Also if we do this, then why should pointer return error?

The first patch in the chain could indeed be changed to have
pointer calls not return an error status. But I feel that would
be making the code worse. Ok the situation I am most interested
here indicates a failure of the stream, but its a very small leap
to imagine situations where pointer fails temporarily and the
stream is still good.

Thanks,
Charles
Vinod Koul April 7, 2016, 11:07 p.m. UTC | #3
On Thu, 2016-04-07 at 09:28 +0100, Charles Keepax wrote:
> On Thu, Apr 07, 2016 at 12:40:03AM +0000, Koul, Vinod wrote:
> > On Wed, 2016-04-06 at 11:21 +0100, Charles Keepax wrote:
> > 
> > > +int snd_compr_stop_xrun(struct snd_compr_stream *stream)
> > > +{
> > > +	if (stream->runtime->state == SNDRV_PCM_STATE_XRUN)
> > > +		return 0;
> > > +
> > > +	stream->runtime->state = SNDRV_PCM_STATE_XRUN;
> > > +
> > > +	queue_delayed_work(system_power_efficient_wq, &stream
> > > ->xrun_work, 0);
> > 
> > why do we want to do this in workqueue and not stop the compress
> > stream
> > immediately.
> 
> We need to defer this to a work queue because it is very likely
> we will detect the errors whilst already in a callback in the
> driver. For example we notice the stream is bad whilst processing
> a read or a pointer callback in the driver. Because this call by
> definition goes right back to the top of the stack rather than
> unwinding the stack nicely as returning an error would do, we
> need to be careful of all the locks that are likely to be held in
> between.
> 
> snd_compr_ioctl - takes stream->device->lock
> snd_compr_tstamp
> soc_compr_pointer - takes rtd->pcm_mutex
> wm_adsp_compr_pointer - takes dsp->pwr_lock
> snd_compr_stop_xrun
> snd_compr_stop
> soc_compr_trigger - Deadlock as we take rtd->pcm_mutex again

This is what I suspected as well :D so this should be fine. I will take
a detailed look at the changes once am back home.

Also should this be made a generic stop rather than xrun. Perhaps the
reason can be specified as an argument.

Btw Takashi you okay with this approach?


> 
> > 
> > Also if we do this, then why should pointer return error?
> 
> The first patch in the chain could indeed be changed to have
> pointer calls not return an error status. But I feel that would
> be making the code worse. Ok the situation I am most interested
> here indicates a failure of the stream, but its a very small leap
> to imagine situations where pointer fails temporarily and the
> stream is still good.

The point here is that we are anyway propagating error by invoking the
new API so why return error here.
Btw can you please explain how this makes code worse?

I think on PCM we don't do that.
Takashi Iwai April 8, 2016, 4:49 a.m. UTC | #4
On Fri, 08 Apr 2016 01:07:12 +0200,
Vinod Koul wrote:
> 
> On Thu, 2016-04-07 at 09:28 +0100, Charles Keepax wrote:
> > On Thu, Apr 07, 2016 at 12:40:03AM +0000, Koul, Vinod wrote:
> > > On Wed, 2016-04-06 at 11:21 +0100, Charles Keepax wrote:
> > > 
> > > > +int snd_compr_stop_xrun(struct snd_compr_stream *stream)
> > > > +{
> > > > +	if (stream->runtime->state == SNDRV_PCM_STATE_XRUN)
> > > > +		return 0;
> > > > +
> > > > +	stream->runtime->state = SNDRV_PCM_STATE_XRUN;
> > > > +
> > > > +	queue_delayed_work(system_power_efficient_wq, &stream
> > > > ->xrun_work, 0);
> > > 
> > > why do we want to do this in workqueue and not stop the compress
> > > stream
> > > immediately.
> > 
> > We need to defer this to a work queue because it is very likely
> > we will detect the errors whilst already in a callback in the
> > driver. For example we notice the stream is bad whilst processing
> > a read or a pointer callback in the driver. Because this call by
> > definition goes right back to the top of the stack rather than
> > unwinding the stack nicely as returning an error would do, we
> > need to be careful of all the locks that are likely to be held in
> > between.
> > 
> > snd_compr_ioctl - takes stream->device->lock
> > snd_compr_tstamp
> > soc_compr_pointer - takes rtd->pcm_mutex
> > wm_adsp_compr_pointer - takes dsp->pwr_lock
> > snd_compr_stop_xrun
> > snd_compr_stop
> > soc_compr_trigger - Deadlock as we take rtd->pcm_mutex again
> 
> This is what I suspected as well :D so this should be fine. I will take
> a detailed look at the changes once am back home.
> 
> Also should this be made a generic stop rather than xrun. Perhaps the
> reason can be specified as an argument.
> 
> Btw Takashi you okay with this approach?

Yes, it looks good to me.

> > > Also if we do this, then why should pointer return error?
> > 
> > The first patch in the chain could indeed be changed to have
> > pointer calls not return an error status. But I feel that would
> > be making the code worse. Ok the situation I am most interested
> > here indicates a failure of the stream, but its a very small leap
> > to imagine situations where pointer fails temporarily and the
> > stream is still good.
> 
> The point here is that we are anyway propagating error by invoking the
> new API so why return error here.
> Btw can you please explain how this makes code worse?
> 
> I think on PCM we don't do that.

PCM stops the stream from the lock context, so there is no leap.


thanks,

Takashi
Charles Keepax April 8, 2016, 10:58 a.m. UTC | #5
On Thu, Apr 07, 2016 at 11:07:12PM +0000, Koul, Vinod wrote:
> On Thu, 2016-04-07 at 09:28 +0100, Charles Keepax wrote:
> > On Thu, Apr 07, 2016 at 12:40:03AM +0000, Koul, Vinod wrote:
> > > On Wed, 2016-04-06 at 11:21 +0100, Charles Keepax wrote:
> > > Also if we do this, then why should pointer return error?
> > 
> > The first patch in the chain could indeed be changed to have
> > pointer calls not return an error status. But I feel that would
> > be making the code worse. Ok the situation I am most interested
> > here indicates a failure of the stream, but its a very small leap
> > to imagine situations where pointer fails temporarily and the
> > stream is still good.
> 
> The point here is that we are anyway propagating error by invoking the
> new API so why return error here.
> Btw can you please explain how this makes code worse?

So if I make pointer return void, the two possible error paths
are either return zero available data or call xrun. Xrun will
stop the stream so I only want to do that if I think the stream
is dead. But that likely leaves various catagories of errors I
don't want to do that for. Returning zero on the other hand hides
the error so no one will ever know an error occurs (well except by
inspecting the kernel log). It feels to me like someone is going
to hit a case where they need to do that at some point. So
removing the return type seems wrong, but at the moment you have
two functions that return ints but don't propogate their values
though which just looks odd in the code.

Thanks,
Charles
diff mbox

Patch

diff --git a/include/sound/compress_driver.h b/include/sound/compress_driver.h
index c0abcdc..968fe2e 100644
--- a/include/sound/compress_driver.h
+++ b/include/sound/compress_driver.h
@@ -78,6 +78,7 @@  struct snd_compr_stream {
 	struct snd_compr_ops *ops;
 	struct snd_compr_runtime *runtime;
 	struct snd_compr *device;
+	struct delayed_work xrun_work;
 	enum snd_compr_direction direction;
 	bool metadata_set;
 	bool next_track;
@@ -187,4 +188,6 @@  static inline void snd_compr_drain_notify(struct snd_compr_stream *stream)
 	wake_up(&stream->runtime->sleep);
 }
 
+int snd_compr_stop_xrun(struct snd_compr_stream *stream);
+
 #endif
diff --git a/sound/core/compress_offload.c b/sound/core/compress_offload.c
index a9933c0..0b93121 100644
--- a/sound/core/compress_offload.c
+++ b/sound/core/compress_offload.c
@@ -67,6 +67,8 @@  struct snd_compr_file {
 	struct snd_compr_stream stream;
 };
 
+static void xrun_delayed_work(struct work_struct *work);
+
 /*
  * a note on stream states used:
  * we use following states in the compressed core
@@ -123,6 +125,9 @@  static int snd_compr_open(struct inode *inode, struct file *f)
 		snd_card_unref(compr->card);
 		return -ENOMEM;
 	}
+
+	INIT_DELAYED_WORK(&data->stream.xrun_work, xrun_delayed_work);
+
 	data->stream.ops = compr->ops;
 	data->stream.direction = dirn;
 	data->stream.private_data = compr->private_data;
@@ -153,6 +158,8 @@  static int snd_compr_free(struct inode *inode, struct file *f)
 	struct snd_compr_file *data = f->private_data;
 	struct snd_compr_runtime *runtime = data->stream.runtime;
 
+	cancel_delayed_work_sync(&data->stream.xrun_work);
+
 	switch (runtime->state) {
 	case SNDRV_PCM_STATE_RUNNING:
 	case SNDRV_PCM_STATE_DRAINING:
@@ -237,6 +244,14 @@  snd_compr_ioctl_avail(struct snd_compr_stream *stream, unsigned long arg)
 	avail = snd_compr_calc_avail(stream, &ioctl_avail);
 	ioctl_avail.avail = avail;
 
+	switch (stream->runtime->state) {
+	case SNDRV_PCM_STATE_OPEN:
+	case SNDRV_PCM_STATE_XRUN:
+		return -EBADFD;
+	default:
+		break;
+	}
+
 	if (copy_to_user((__u64 __user *)arg,
 				&ioctl_avail, sizeof(ioctl_avail)))
 		return -EFAULT;
@@ -397,10 +412,16 @@  static unsigned int snd_compr_poll(struct file *f, poll_table *wait)
 		return -EFAULT;
 
 	mutex_lock(&stream->device->lock);
-	if (stream->runtime->state == SNDRV_PCM_STATE_OPEN) {
+
+	switch (stream->runtime->state) {
+	case SNDRV_PCM_STATE_OPEN:
+	case SNDRV_PCM_STATE_XRUN:
 		retval = -EBADFD;
 		goto out;
+	default:
+		break;
 	}
+
 	poll_wait(f, &stream->runtime->sleep, wait);
 
 	avail = snd_compr_get_avail(stream);
@@ -420,6 +441,9 @@  static unsigned int snd_compr_poll(struct file *f, poll_table *wait)
 		if (avail >= stream->runtime->fragment_size)
 			retval = snd_compr_get_poll(stream);
 		break;
+	case SNDRV_PCM_STATE_XRUN:
+		retval = -EBADFD;
+		break;
 	default:
 		if (stream->direction == SND_COMPRESS_PLAYBACK)
 			retval = POLLOUT | POLLWRNORM | POLLERR;
@@ -698,6 +722,38 @@  static int snd_compr_stop(struct snd_compr_stream *stream)
 	return retval;
 }
 
+static void xrun_delayed_work(struct work_struct *work)
+{
+	struct snd_compr_stream *stream;
+
+	stream = container_of(work, struct snd_compr_stream, xrun_work.work);
+
+	mutex_lock(&stream->device->lock);
+	snd_compr_stop(stream);
+	mutex_unlock(&stream->device->lock);
+}
+
+/*
+ * snd_compr_stop_xrun: Report a fatal error on a stream
+ * @stream: pointer to stream
+ *
+ * Stop the stream and set its state to XRUN.
+ *
+ * Should be called with compressed device lock held.
+ */
+int snd_compr_stop_xrun(struct snd_compr_stream *stream)
+{
+	if (stream->runtime->state == SNDRV_PCM_STATE_XRUN)
+		return 0;
+
+	stream->runtime->state = SNDRV_PCM_STATE_XRUN;
+
+	queue_delayed_work(system_power_efficient_wq, &stream->xrun_work, 0);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(snd_compr_stop_xrun);
+
 static int snd_compress_wait_for_drain(struct snd_compr_stream *stream)
 {
 	int ret;