diff mbox series

[1/8] soundwire: fix enumeration completion

Message ID 20230705123018.30903-2-johan+linaro@kernel.org (mailing list archive)
State Accepted
Commit c40d6b3249b11d60e09d81530588f56233d9aa44
Headers show
Series ASoC/soundwire/qdsp6/wcd: fix leaks and probe deferral | expand

Commit Message

Johan Hovold July 5, 2023, 12:30 p.m. UTC
The soundwire subsystem uses two completion structures that allow
drivers to wait for soundwire device to become enumerated on the bus and
initialised by their drivers, respectively.

The code implementing the signalling is currently broken as it does not
signal all current and future waiters and also uses the wrong
reinitialisation function, which can potentially lead to memory
corruption if there are still waiters on the queue.

Not signalling future waiters specifically breaks sound card probe
deferrals as codec drivers can not tell that the soundwire device is
already attached when being reprobed. Some codec runtime PM
implementations suffer from similar problems as waiting for enumeration
during resume can also timeout despite the device already having been
enumerated.

Fixes: fb9469e54fa7 ("soundwire: bus: fix race condition with enumeration_complete signaling")
Fixes: a90def068127 ("soundwire: bus: fix race condition with initialization_complete signaling")
Cc: stable@vger.kernel.org      # 5.7
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Rander Wang <rander.wang@linux.intel.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
 drivers/soundwire/bus.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Pierre-Louis Bossart July 5, 2023, 2:44 p.m. UTC | #1
On 7/5/23 16:30, Johan Hovold wrote:
> On Wed, Jul 05, 2023 at 02:53:17PM +0200, Pierre-Louis Bossart wrote:
>> On 7/5/23 14:30, Johan Hovold wrote:
>>> The soundwire subsystem uses two completion structures that allow
>>> drivers to wait for soundwire device to become enumerated on the bus and
>>> initialised by their drivers, respectively.
>>>
>>> The code implementing the signalling is currently broken as it does not
>>> signal all current and future waiters and also uses the wrong
>>> reinitialisation function, which can potentially lead to memory
>>> corruption if there are still waiters on the queue.
>>
>> That change sounds good, but I am not following the two paragraphs below.
>>
>>> Not signalling future waiters specifically breaks sound card probe
>>> deferrals as codec drivers can not tell that the soundwire device is
>>> already attached when being reprobed. 
>>
>> What makes you say that? There is a test in the probe and the codec
>> driver will absolutely be notified, see bus_type.c
>>
>> 	if (drv->ops && drv->ops->update_status) {
>> 		ret = drv->ops->update_status(slave, slave->status);
>> 		if (ret < 0)
>> 			dev_warn(dev, "%s: update_status failed with status %d\n", __func__,
>> ret);
>> 	}
> 
> I'm talking about signalling the codec driver using the soundwire device
> via the completion structs. Unless the underling device is detached and
> reattached, trying to wait for completion a second time will currently
> timeout instead of returning immediately.
> 
> This affects codecs like rt5682, which wait for completion in component
> probe (see rt5682_probe()).
> 
>>> Some codec runtime PM
>>> implementations suffer from similar problems as waiting for enumeration
>>> during resume can also timeout despite the device already having been
>>> enumerated.
>>
>> I am not following this either. Are you saying the wait_for_completion()
>> times out because of the init_completion/reinit_completion confusion, or
>> something else.
> 
> It times out because the completion counter is not saturated unless you
> use complete_all().
> 
> Drivers that wait unconditionally in resume, will time out the second
> time they are runtime resumed unless the underlying device has been
> detached and reattached in the meantime (e.g. wsa881x_runtime_resume()).

Makes sense. The default on Intel platforms is to reset the bus in all
resume cases, that forces the attachment so we never saw the issue.

For this patch:

Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
diff mbox series

Patch

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 1ea6a64f8c4a..66e5dba919fa 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -908,8 +908,8 @@  static void sdw_modify_slave_status(struct sdw_slave *slave,
 			"initializing enumeration and init completion for Slave %d\n",
 			slave->dev_num);
 
-		init_completion(&slave->enumeration_complete);
-		init_completion(&slave->initialization_complete);
+		reinit_completion(&slave->enumeration_complete);
+		reinit_completion(&slave->initialization_complete);
 
 	} else if ((status == SDW_SLAVE_ATTACHED) &&
 		   (slave->status == SDW_SLAVE_UNATTACHED)) {
@@ -917,7 +917,7 @@  static void sdw_modify_slave_status(struct sdw_slave *slave,
 			"signaling enumeration completion for Slave %d\n",
 			slave->dev_num);
 
-		complete(&slave->enumeration_complete);
+		complete_all(&slave->enumeration_complete);
 	}
 	slave->status = status;
 	mutex_unlock(&bus->bus_lock);
@@ -1941,7 +1941,7 @@  int sdw_handle_slave_status(struct sdw_bus *bus,
 				"signaling initialization completion for Slave %d\n",
 				slave->dev_num);
 
-			complete(&slave->initialization_complete);
+			complete_all(&slave->initialization_complete);
 
 			/*
 			 * If the manager became pm_runtime active, the peripherals will be