[v1,3/7] dmaengine: tegra-apb: Prevent race conditions on channel's freeing

Message ID 20191228204640.25163-4-digetx@gmail.com (mailing list archive)
State Superseded
Series NVIDIA Tegra APB DMA driver fixes and improvements

Commit Message

Dmitry Osipenko Dec. 28, 2019, 8:46 p.m. UTC
It's unsafe to check the channel's "busy" state without taking a lock,
it is also unsafe to assume that tasklet isn't in-fly.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/dma/tegra20-apb-dma.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
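
The first hazard named in the commit message, reading the "busy" flag without holding the lock, can be illustrated with a small userspace model. This is only a sketch: `lock_chan` and the `model_*` functions are hypothetical stand-ins built on pthreads, not driver code, and the model assumes that the terminate routine takes the channel lock itself and therefore can be called unconditionally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a DMA channel (not the driver's struct). */
struct lock_chan {
	pthread_mutex_t lock;
	bool busy;
	int terminate_calls;
};

void model_chan_init(struct lock_chan *ch, bool busy)
{
	int err = pthread_mutex_init(&ch->lock, NULL);

	assert(err == 0);
	ch->busy = busy;
	ch->terminate_calls = 0;
}

/*
 * Analogue of a terminate-all routine. It takes the lock itself, so the
 * state it inspects and clears cannot change underneath it.
 */
void model_terminate_all(struct lock_chan *ch)
{
	pthread_mutex_lock(&ch->lock);
	ch->terminate_calls++;
	ch->busy = false;
	pthread_mutex_unlock(&ch->lock);
}

/*
 * Racy pattern: "busy" is read with no lock held, so a concurrent writer
 * may flip it between this check and the decision based on it.
 */
void model_free_racy(struct lock_chan *ch)
{
	if (ch->busy)
		model_terminate_all(ch);
}

/* Pattern used by the patch: call unconditionally, let the callee lock. */
void model_free_unconditional(struct lock_chan *ch)
{
	model_terminate_all(ch);
}
```

In a single thread both variants behave the same; the difference only matters when another context can change `busy` concurrently, which is the situation the commit message describes.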

Comments

Michał Mirosław Dec. 30, 2019, 8:45 p.m. UTC | #1
On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
> It's unsafe to check the channel's "busy" state without taking a lock,
> it is also unsafe to assume that tasklet isn't in-fly.

'in-flight'. Also, the patch seems to have two independent bug-fixes
in it. Second one doesn't look right, at least not without an explanation.

First:

> -	if (tdc->busy)
> -		tegra_dma_terminate_all(dc);
> +	tegra_dma_terminate_all(dc);

Second:

> +	tasklet_kill(&tdc->tasklet);
>  
>  	spin_lock_irqsave(&tdc->lock, flags);
>  	list_splice_init(&tdc->pending_sg_req, &sg_req_list);
> @@ -1543,7 +1543,6 @@ static int tegra_dma_probe(struct platform_device *pdev)
>  		struct tegra_dma_channel *tdc = &tdma->channels[i];
>  
>  		free_irq(tdc->irq, tdc);
> -		tasklet_kill(&tdc->tasklet);
>  	}
>  
>  	pm_runtime_disable(&pdev->dev);
> @@ -1563,7 +1562,6 @@ static int tegra_dma_remove(struct platform_device *pdev)
>  	for (i = 0; i < tdma->chip_data->nr_channels; ++i) {
>  		tdc = &tdma->channels[i];
>  		free_irq(tdc->irq, tdc);
> -		tasklet_kill(&tdc->tasklet);
>  	}
>  
>  	pm_runtime_disable(&pdev->dev);

Best Regards,
Michał Mirosław
Michał Mirosław Dec. 30, 2019, 8:50 p.m. UTC | #2
On Mon, Dec 30, 2019 at 09:45:55PM +0100, Michał Mirosław wrote:
> On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
> > It's unsafe to check the channel's "busy" state without taking a lock,
> > it is also unsafe to assume that tasklet isn't in-fly.
> 
> 'in-flight'. Also, the patch seems to have two independent bug-fixes
> in it. Second one doesn't look right, at least not without an explanation.
> 
> First:
> 
> > -	if (tdc->busy)
> > -		tegra_dma_terminate_all(dc);
> > +	tegra_dma_terminate_all(dc);
> 
> Second:
> 
> > +	tasklet_kill(&tdc->tasklet);

BTW, maybe you can convert the code to threaded interrupt handler and
just get rid of the tasklet instead of fixing it?

Best Regards,
Michał Mirosław
Dmitry Osipenko Jan. 2, 2020, 3:03 p.m. UTC | #3
30.12.2019 23:45, Michał Mirosław wrote:
> On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
>> It's unsafe to check the channel's "busy" state without taking a lock,
>> it is also unsafe to assume that tasklet isn't in-fly.
> 
> 'in-flight'. Also, the patch seems to have two independent bug-fixes
> in it. Second one doesn't look right, at least not without an explanation.

Technically, none of this should be needed, since it should be the
responsibility of the DMA client drivers to make sure that the channel is
idle before releasing it. But, AFAIK, the behavior on channel release
isn't strictly defined by the DMA API, so it is better to keep the
original behavior in place.

> First:
> 
>> -	if (tdc->busy)
>> -		tegra_dma_terminate_all(dc);
>> +	tegra_dma_terminate_all(dc);
> 
> Second:
> 
>> +	tasklet_kill(&tdc->tasklet);

Yes, it could be a separate change. Actually, this is not a fix, but a
clean-up change that simply stops the tasklet instead of trying to work
around the fact that the tasklet could be scheduled at the time of the
channel's freeing.

>>  	spin_lock_irqsave(&tdc->lock, flags);

I now see that I missed removing this locking; it's not needed now,
given that the tasklet is already stopped once the above change kills it.

I'll update and split this patch into two in v3.

>>  	list_splice_init(&tdc->pending_sg_req, &sg_req_list);
>> @@ -1543,7 +1543,6 @@ static int tegra_dma_probe(struct platform_device *pdev)
>>  		struct tegra_dma_channel *tdc = &tdma->channels[i];
>>  
>>  		free_irq(tdc->irq, tdc);
>> -		tasklet_kill(&tdc->tasklet);
>>  	}
>>  
>>  	pm_runtime_disable(&pdev->dev);
>> @@ -1563,7 +1562,6 @@ static int tegra_dma_remove(struct platform_device *pdev)
>>  	for (i = 0; i < tdma->chip_data->nr_channels; ++i) {
>>  		tdc = &tdma->channels[i];
>>  		free_irq(tdc->irq, tdc);
>> -		tasklet_kill(&tdc->tasklet);
>>  	}
>>  
>>  	pm_runtime_disable(&pdev->dev);

Thank you very much for taking a look at it!
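
The teardown order discussed in this reply (stop the hardware, kill the tasklet, then release requests without the lock) can be modelled in userspace. The sketch below is an assumption-laden analogue, not the driver: a pthread stands in for the tasklet, `pthread_join()` stands in for tasklet_kill(), and `teardown_chan` and every `model_*` name are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a channel with deferred completion work. */
struct teardown_chan {
	pthread_mutex_t lock;
	pthread_t worker;   /* stands in for the tasklet */
	bool hw_running;    /* stands in for "hardware may still raise IRQs" */
	int pending;        /* stands in for the pending request list */
	int completed;
};

/* Deferred work: completes requests until stopped or out of work. */
static void *model_worker(void *arg)
{
	struct teardown_chan *ch = arg;

	for (;;) {
		pthread_mutex_lock(&ch->lock);
		if (!ch->hw_running || ch->pending == 0) {
			pthread_mutex_unlock(&ch->lock);
			return NULL;
		}
		ch->pending--;
		ch->completed++;
		pthread_mutex_unlock(&ch->lock);
	}
}

int model_chan_start(struct teardown_chan *ch, int requests)
{
	if (pthread_mutex_init(&ch->lock, NULL))
		return -1;
	ch->hw_running = true;
	ch->pending = requests;
	ch->completed = 0;
	return pthread_create(&ch->worker, NULL, model_worker, ch);
}

/* Returns the number of requests dropped at free time. */
int model_chan_free(struct teardown_chan *ch)
{
	int dropped;

	/* Step 1: stop the source of new work (terminate_all analogue). */
	pthread_mutex_lock(&ch->lock);
	ch->hw_running = false;
	pthread_mutex_unlock(&ch->lock);

	/* Step 2: wait for in-flight deferred work (tasklet_kill analogue). */
	pthread_join(ch->worker, NULL);

	/* Step 3: nothing else can touch the channel now, so no lock here. */
	dropped = ch->pending;
	ch->pending = 0;
	pthread_mutex_destroy(&ch->lock);
	return dropped;
}
```

The point of the ordering is that step 3 is only safe without the lock because steps 1 and 2 have already ruled out any concurrent user of the channel state.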
Dmitry Osipenko Jan. 2, 2020, 3:09 p.m. UTC | #4
30.12.2019 23:50, Michał Mirosław wrote:
> On Mon, Dec 30, 2019 at 09:45:55PM +0100, Michał Mirosław wrote:
>> On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
>>> It's unsafe to check the channel's "busy" state without taking a lock,
>>> it is also unsafe to assume that tasklet isn't in-fly.
>>
>> 'in-flight'. Also, the patch seems to have two independent bug-fixes
>> in it. Second one doesn't look right, at least not without an explanation.
>>
>> First:
>>
>>> -	if (tdc->busy)
>>> -		tegra_dma_terminate_all(dc);
>>> +	tegra_dma_terminate_all(dc);
>>
>> Second:
>>
>>> +	tasklet_kill(&tdc->tasklet);
> 
> BTW, maybe you can convert the code to threaded interrupt handler and
> just get rid of the tasklet instead of fixing it?

This shouldn't bring much benefit because the code's logic won't
change: we will still have to use the threaded ISR part as the
bottom half, and the IRQ API doesn't provide a nice way to synchronize
an interrupt's execution, while tasklet_kill() is a nice way to sync it.
Michał Mirosław Jan. 3, 2020, 8:16 a.m. UTC | #5
On Thu, Jan 02, 2020 at 06:09:45PM +0300, Dmitry Osipenko wrote:
> 30.12.2019 23:50, Michał Mirosław wrote:
> > On Mon, Dec 30, 2019 at 09:45:55PM +0100, Michał Mirosław wrote:
> >> On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
> >>> It's unsafe to check the channel's "busy" state without taking a lock,
> >>> it is also unsafe to assume that tasklet isn't in-fly.
> >>
> >> 'in-flight'. Also, the patch seems to have two independent bug-fixes
> >> in it. Second one doesn't look right, at least not without an explanation.
> >>
> >> First:
> >>
> >>> -	if (tdc->busy)
> >>> -		tegra_dma_terminate_all(dc);
> >>> +	tegra_dma_terminate_all(dc);
> >>
> >> Second:
> >>
> >>> +	tasklet_kill(&tdc->tasklet);
> > 
> > BTW, maybe you can convert the code to threaded interrupt handler and
> > just get rid of the tasklet instead of fixing it?
> 
> This shouldn't bring much benefit because the code's logic won't
> change: we will still have to use the threaded ISR part as the
> bottom half, and the IRQ API doesn't provide a nice way to synchronize
> an interrupt's execution, while tasklet_kill() is a nice way to sync it.

What about synchronize_irq()?

BTW, does tegra_dma_terminate_all() prevent further interrupts that might
cause the tasklet to be scheduled again?

Best Regards,
Michał Mirosław
Dmitry Osipenko Jan. 4, 2020, 12:27 a.m. UTC | #6
03.01.2020 11:16, Michał Mirosław wrote:
> On Thu, Jan 02, 2020 at 06:09:45PM +0300, Dmitry Osipenko wrote:
>> 30.12.2019 23:50, Michał Mirosław wrote:
>>> On Mon, Dec 30, 2019 at 09:45:55PM +0100, Michał Mirosław wrote:
>>>> On Sat, Dec 28, 2019 at 11:46:36PM +0300, Dmitry Osipenko wrote:
>>>>> It's unsafe to check the channel's "busy" state without taking a lock,
>>>>> it is also unsafe to assume that tasklet isn't in-fly.
>>>>
>>>> 'in-flight'. Also, the patch seems to have two independent bug-fixes
>>>> in it. Second one doesn't look right, at least not without an explanation.
>>>>
>>>> First:
>>>>
>>>>> -	if (tdc->busy)
>>>>> -		tegra_dma_terminate_all(dc);
>>>>> +	tegra_dma_terminate_all(dc);
>>>>
>>>> Second:
>>>>
>>>>> +	tasklet_kill(&tdc->tasklet);
>>>
>>> BTW, maybe you can convert the code to threaded interrupt handler and
>>> just get rid of the tasklet instead of fixing it?
>>
>> This shouldn't bring much benefit because the code's logic won't
>> change: we will still have to use the threaded ISR part as the
>> bottom half, and the IRQ API doesn't provide a nice way to synchronize
>> an interrupt's execution, while tasklet_kill() is a nice way to sync it.
> 
> What about synchronize_irq()?

Good point! I totally forgot about it.

The only difference between a tasklet and a threaded ISR should be that
the hardware interrupt is masked during the threaded ISR's execution, but
at a quick glance that shouldn't be a problem.

BTW, I'm now thinking that the current code is wrong to accumulate the
callback count in the ISR if a callback's execution takes too much time.
I'm not sure that this is what DMA clients expect to happen; I will try
to verify that.

It would also be nice to get rid of the free list, since it only
complicates the code without any real benefit; I actually checked that
kmalloc doesn't introduce any noticeable latency.

I'll probably defer the above changes for now, leaving them for 5.7,
otherwise it could be a bit too many changes for this patchset
(hopefully it will get into 5.6).

> BTW, does tegra_dma_terminate_all() prevent further interrupts that might
> cause the tasklet to be scheduled again?

Yes, it should prevent further interrupts because it stops the hardware
and clears the interrupt status; thus, in the worst case, the ISR could
emit the "Interrupt already served status" message.
Patch

diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 664e9c5df3ba..28aff0b9763e 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -1294,8 +1294,8 @@  static void tegra_dma_free_chan_resources(struct dma_chan *dc)
 
 	dev_dbg(tdc2dev(tdc), "Freeing channel %d\n", tdc->id);
 
-	if (tdc->busy)
-		tegra_dma_terminate_all(dc);
+	tegra_dma_terminate_all(dc);
+	tasklet_kill(&tdc->tasklet);
 
 	spin_lock_irqsave(&tdc->lock, flags);
 	list_splice_init(&tdc->pending_sg_req, &sg_req_list);
@@ -1543,7 +1543,6 @@  static int tegra_dma_probe(struct platform_device *pdev)
 		struct tegra_dma_channel *tdc = &tdma->channels[i];
 
 		free_irq(tdc->irq, tdc);
-		tasklet_kill(&tdc->tasklet);
 	}
 
 	pm_runtime_disable(&pdev->dev);
@@ -1563,7 +1562,6 @@  static int tegra_dma_remove(struct platform_device *pdev)
 	for (i = 0; i < tdma->chip_data->nr_channels; ++i) {
 		tdc = &tdma->channels[i];
 		free_irq(tdc->irq, tdc);
-		tasklet_kill(&tdc->tasklet);
 	}
 
 	pm_runtime_disable(&pdev->dev);