Message ID | 20130517123423.GR14863@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Vinod, Thanks for the reply. On Fri, May 17, 2013 at 01:34:23PM +0100, Vinod Koul wrote: > On Thu, May 16, 2013 at 04:35:53PM +0100, Will Deacon wrote: > > Right, so I think I understand what's causing this, but I'll leave it to > > Andriy to suggest a fix. The problem comes about because the dmatest > > module is now driven from debugfs, making it possible to unload the module > > whilst a test run is in progress. In this case: > > > > - The DMA threads will return from wait_event_freezable_timeout(...) > > due to kthread_should_stop() returning true, and subsequently > > report failure because done.done is false. > > > > - The DMA engines may not be idle, so the asynchronous callback can > > be invoked after we've started cleaning up, explaining the NULL > > dereference I'm seeing. > > > > The solutions are either fixing the module exit code to cope with concurrent > > DMA transfers or to revert 77101ce578bb and not allow the channel threads to > > return mid-transfer. > We need to properly abort the channels on removal. This is already handled in > the code but the kthread_stop is called after the transactions are aborted. It > should be the other way round. Can you try with below patch Unfortunately, I can trigger the exact same panic with this patch applied. Isn't there a race between terminating the dmaengine transfers (dmaengine_terminate_all) and killing the test threads (kthread_stop) where a new transfer could be kicked off by dmatest_func? Will
On Fri, 2013-05-17 at 15:18 +0100, Will Deacon wrote: > On Fri, May 17, 2013 at 01:34:23PM +0100, Vinod Koul wrote: > > On Thu, May 16, 2013 at 04:35:53PM +0100, Will Deacon wrote: > > > Right, so I think I understand what's causing this, but I'll leave it to > > > Andriy to suggest a fix. Thanks for the report. Unfortunately neither me nor other guys in the team could not reproduce mentioned issue on our hardware. Nevertheless I will try to allocate time and suggest possible fix if any comes to my mind. So, your testing and analysis is appreciated.
Hi Andy, On Mon, May 20, 2013 at 08:52:34AM +0100, Andy Shevchenko wrote: > On Fri, 2013-05-17 at 15:18 +0100, Will Deacon wrote: > > On Fri, May 17, 2013 at 01:34:23PM +0100, Vinod Koul wrote: > > > On Thu, May 16, 2013 at 04:35:53PM +0100, Will Deacon wrote: > > > > Right, so I think I understand what's causing this, but I'll leave it to > > > > Andriy to suggest a fix. > > Thanks for the report. > > Unfortunately neither me nor other guys in the team could not reproduce > mentioned issue on our hardware. > Nevertheless I will try to allocate time and suggest possible fix if any > comes to my mind. So, your testing and analysis is appreciated. Sure. They way I trigger it is to unload the module whilst the DMA is ongoing (I'm using a software model, so I can make the DMA nice and slow -- I guess you could try using some large buffers). My script (I just run from busybox) is: #!/bin/sh mount proc -t proc /proc mount sysfs -t sysfs /sys mount debugfs -t debugfs /sys/kernel/debug echo 9 > /proc/sysrq-trigger modprobe dmatest echo 8 > /sys/kernel/debug/dmatest/iterations echo 8 > /sys/kernel/debug/dmatest/threads_per_chan echo 131072 > /sys/kernel/debug/dmatest/test_buf_size echo 1 > /sys/kernel/debug/dmatest/run modprobe -r dmatest Will
On Mon, 2013-05-20 at 10:58 +0100, Will Deacon wrote: > Sure. They way I trigger it is to unload the module whilst the DMA is > ongoing (I'm using a software model, so I can make the DMA nice and slow -- > I guess you could try using some large buffers). Thank you for the script. It looks like I managed to reproduce it with help of your script. I'm about to send the patch. I will appreciate to collect a Tested-by tags in case it solves the issue.
diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c index d8ce4ec..4e8d581 100644 --- a/drivers/dma/dmatest.c +++ b/drivers/dma/dmatest.c @@ -822,6 +822,9 @@ static void dmatest_cleanup_channel(struct dmatest_chan *dtc) struct dmatest_thread *_thread; int ret; + /* terminate all transfers on specified channels */ + dmaengine_terminate_all(dtc->chan); + list_for_each_entry_safe(thread, _thread, &dtc->threads, node) { ret = kthread_stop(thread->task); pr_debug("dmatest: thread %s exited with status %d\n", @@ -830,9 +833,6 @@ static void dmatest_cleanup_channel(struct dmatest_chan *dtc) kfree(thread); } - /* terminate all transfers on specified channels */ - dmaengine_terminate_all(dtc->chan); - kfree(dtc); }