[v4] ARM: omap: edma: add suspend suspend/resume hooks

Message ID	1383164468-4610-1-git-send-email-zonque@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Daniel Mack <zonque@gmail.com> To: linux-omap@vger.kernel.org, joelf@ti.com, gururaja.hebbar@ti.com, balajitk@ti.com Subject: [PATCH v4] ARM: omap: edma: add suspend suspend/resume hooks Date: Wed, 30 Oct 2013 21:21:08 +0100 Message-Id: <1383164468-4610-1-git-send-email-zonque@gmail.com> Cc: mporter@ti.com, nsekhar@ti.com, s.neumann@raumfeld.com, Daniel Mack <zonque@gmail.com>, Russ.Dill@ti.com, linux-arm-kernel@lists.infradead.org Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Daniel Mack Oct. 30, 2013, 8:21 p.m. UTC

This patch makes the edma driver resume correctly after suspend. Tested
on an AM33xx platform with cyclic audio streams and omap_hsmmc.

All information can be reconstructed by already known runtime
information.

As we now use some functions that were previously only used from __init
context, annotations had to be dropped.

Signed-off-by: Daniel Mack <zonque@gmail.com>
---
There was actually only a v3 ever, I made a mistake when formating the
first version of this patch. To prevent confusion though, I named this
one v4.

v3 -> v4:
	* dropped extra allocations, and reconstruct register values
	  from already known driver states.



Hi Joel, Gururaja, Balaji,

thanks a lot for your feedback. I successfully tested this version with
davinci mcasp as well as omap_hsmmc. I'd appreciate another round of
reviews :)


Thanks,
Daniel

 arch/arm/common/edma.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 3 deletions(-)

Vaibhav Bedia Oct. 31, 2013, 10:25 p.m. UTC | #1

Hi Daniel,

On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
[...]
> +
> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
> +
>  static struct platform_driver edma_driver = {
>         .driver = {
>                 .name   = "edma",
> +               .pm     = &edma_pm_ops,
>                 .of_match_table = edma_of_ids,
>         },

A while back we discovered a nasty race condition here that had us move the EDMA
PM callbacks to the noirq phase. IIRC the MMC driver was resuming
before the EDMA
driver had a chance to run and that was leading to a deadlock. I am
not sure how to force
this scenario but i do remember spending time debugging this on a
random codebase.
Maybe some else has some better ideas on how to force this race condition...

I think logically it makes sense to have the DMA controllers in any
system resume before
any of the client drivers. Maybe a set of syscore_ops similar to the
timekeeping stuff needs
to be added (in the dmaengine framework?) to get this right.

Regards,
Vaibhav

Joel Fernandes Nov. 6, 2013, 5:36 p.m. UTC | #2

Hi Vaibhav,

On 10/31/2013 05:25 PM, Vaibhav Bedia wrote:
> Hi Daniel,
> 
> On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
> [...]
>> +
>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>> +
>>  static struct platform_driver edma_driver = {
>>         .driver = {
>>                 .name   = "edma",
>> +               .pm     = &edma_pm_ops,
>>                 .of_match_table = edma_of_ids,
>>         },
> 
> A while back we discovered a nasty race condition here that had us move the EDMA
> PM callbacks to the noirq phase. IIRC the MMC driver was resuming
> before the EDMA
> driver had a chance to run and that was leading to a deadlock. I am
> not sure how to force
> this scenario but i do remember spending time debugging this on a
> random codebase.
> Maybe some else has some better ideas on how to force this race condition...

I think you're talking about the patch at [1] which is not upstream. A quick
question with my limited knowledge of suspend/resume- How can there be pending
I/O operations between suspend/resume cycles? The sync is done before suspend,
so I don't understand how one is receiving a response from the card after resume
before EDMA can resume? I'd imagine that until all devices are resumed, there
will be no I/O operation issued. Let me know your thoughts.

thanks,

-Joel

[1]
https://www.gitorious.org/x0148406-public/linux-kernel/commit/b81bf04091986fa3893f31955564594567be3b61

Hebbar, Gururaja Nov. 7, 2013, 1:30 p.m. UTC | #3

On Wednesday 06 November 2013 11:06 PM, Joel Fernandes wrote:
> Hi Vaibhav,
> 
> On 10/31/2013 05:25 PM, Vaibhav Bedia wrote:
>> Hi Daniel,
>>
>> On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
>> [...]
>>> +
>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>> +
>>>  static struct platform_driver edma_driver = {
>>>         .driver = {
>>>                 .name   = "edma",
>>> +               .pm     = &edma_pm_ops,
>>>                 .of_match_table = edma_of_ids,
>>>         },
>>
>> A while back we discovered a nasty race condition here that had us move the EDMA
>> PM callbacks to the noirq phase. IIRC the MMC driver was resuming
>> before the EDMA
>> driver had a chance to run and that was leading to a deadlock. I am
>> not sure how to force
>> this scenario but i do remember spending time debugging this on a
>> random codebase.
>> Maybe some else has some better ideas on how to force this race condition...
> 
> I think you're talking about the patch at [1] which is not upstream. A quick
> question with my limited knowledge of suspend/resume- How can there be pending
> I/O operations between suspend/resume cycles? 

AFAIK, MMC framework started talking to cards immediately after resume.
Due to race condition, EDMA resume callback had not yet completed and
HSMMC driver had initiated a DMA operation. This resulted in Deadlock.

regards
Gururaja

> The sync is done before suspend,
> so I don't understand how one is receiving a response from the card after resume
> before EDMA can resume? I'd imagine that until all devices are resumed, there
> will be no I/O operation issued. Let me know your thoughts.
> 
> thanks,
> 
> -Joel
> 
> [1]
> https://www.gitorious.org/x0148406-public/linux-kernel/commit/b81bf04091986fa3893f31955564594567be3b61
>

Daniel Mack Nov. 7, 2013, 1:32 p.m. UTC | #4

On 11/07/2013 02:30 PM, Gururaja Hebbar wrote:
> On Wednesday 06 November 2013 11:06 PM, Joel Fernandes wrote:
>> Hi Vaibhav,
>>
>> On 10/31/2013 05:25 PM, Vaibhav Bedia wrote:
>>> Hi Daniel,
>>>
>>> On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
>>> [...]
>>>> +
>>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>>> +
>>>>  static struct platform_driver edma_driver = {
>>>>         .driver = {
>>>>                 .name   = "edma",
>>>> +               .pm     = &edma_pm_ops,
>>>>                 .of_match_table = edma_of_ids,
>>>>         },
>>>
>>> A while back we discovered a nasty race condition here that had us move the EDMA
>>> PM callbacks to the noirq phase. IIRC the MMC driver was resuming
>>> before the EDMA
>>> driver had a chance to run and that was leading to a deadlock. I am
>>> not sure how to force
>>> this scenario but i do remember spending time debugging this on a
>>> random codebase.
>>> Maybe some else has some better ideas on how to force this race condition...
>>
>> I think you're talking about the patch at [1] which is not upstream. A quick
>> question with my limited knowledge of suspend/resume- How can there be pending
>> I/O operations between suspend/resume cycles? 
> 
> AFAIK, MMC framework started talking to cards immediately after resume.
> Due to race condition, EDMA resume callback had not yet completed and
> HSMMC driver had initiated a DMA operation. This resulted in Deadlock.

Hmm. At least in my case, that doesn't happen. And my debug logs also
show that the calls are in expected order. Which tree were you on when
you saw this?


Thanks,
Daniel

Nishanth Menon Nov. 7, 2013, 3:18 p.m. UTC | #5

On 11/07/2013 07:32 AM, Daniel Mack wrote:
> On 11/07/2013 02:30 PM, Gururaja Hebbar wrote:
>> On Wednesday 06 November 2013 11:06 PM, Joel Fernandes wrote:
>>> Hi Vaibhav,
>>>
>>> On 10/31/2013 05:25 PM, Vaibhav Bedia wrote:
>>>> Hi Daniel,
>>>>
>>>> On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
>>>> [...]
>>>>> +
>>>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>>>> +
>>>>>  static struct platform_driver edma_driver = {
>>>>>         .driver = {
>>>>>                 .name   = "edma",
>>>>> +               .pm     = &edma_pm_ops,
>>>>>                 .of_match_table = edma_of_ids,
>>>>>         },
>>>>
>>>> A while back we discovered a nasty race condition here that had us move the EDMA
>>>> PM callbacks to the noirq phase. IIRC the MMC driver was resuming
>>>> before the EDMA
>>>> driver had a chance to run and that was leading to a deadlock. I am
>>>> not sure how to force
>>>> this scenario but i do remember spending time debugging this on a
>>>> random codebase.
>>>> Maybe some else has some better ideas on how to force this race condition...
>>>
>>> I think you're talking about the patch at [1] which is not upstream. A quick
>>> question with my limited knowledge of suspend/resume- How can there be pending
>>> I/O operations between suspend/resume cycles? 
>>
>> AFAIK, MMC framework started talking to cards immediately after resume.
>> Due to race condition, EDMA resume callback had not yet completed and
>> HSMMC driver had initiated a DMA operation. This resulted in Deadlock.
> 
> Hmm. At least in my case, that doesn't happen. And my debug logs also
> show that the calls are in expected order. Which tree were you on when
> you saw this?

Tested this on a vendor V3.12 tag based kernel:

Test patch: http://pastebin.com/AmnktQ7B
test: echo -n "1">/sys/power/pm_print_times; rtcwake -d /dev/rtc0 -m
mem -s 5


with the current patch: http://pastebin.com/RujarRLV
suspend_late and resume_early: http://pastebin.com/RujarRLV
suspend_noirq and resume_noirq: http://pastebin.com/nKfbm7Mj

one needs to be careful of the sequence - donot forget that
omap_device also does stuff in the background to every SoC device in
noirq - sequence is paramount. you would want to ensure edma is saving
after every single dependent device is done with it's stuff and
guarenteed to never request any further transaction, and resume is
done before any of the dependent devices need edma. but edma is also a
peripheral that omap_device and generic runtime pm framework deals
with - so ensure sequences consider that as well.

Nishanth Menon Nov. 7, 2013, 3:34 p.m. UTC | #6

On 10/30/2013 03:21 PM, Daniel Mack wrote:
[...]
> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
> index 8e1a024..f15cdb9 100644
> --- a/arch/arm/common/edma.c
> +++ b/arch/arm/common/edma.c

[...]

> +static int edma_pm_suspend(struct device *dev)
> +{
> +	int j;
> +
> +	pm_runtime_get_sync(dev);
> +
error checks?
> +	for (j = 0; j < arch_num_cc; j++) {
> +		struct edma *ecc = edma_cc[j];
> +
> +		disable_irq(ecc->irq_res_start);
> +		disable_irq(ecc->irq_res_end);
> +	}
> +
> +	pm_runtime_put_sync(dev);
> +
> +	return 0;
> +}
> +
> +static int edma_pm_resume(struct device *dev)
> +{
> +	int i, j;
> +
> +	pm_runtime_get_sync(dev);
error checks?

[...]

Daniel Mack Nov. 7, 2013, 3:36 p.m. UTC | #7

On 11/07/2013 04:18 PM, Nishanth Menon wrote:
> Tested this on a vendor V3.12 tag based kernel:
> 
> Test patch: http://pastebin.com/AmnktQ7B
> test: echo -n "1">/sys/power/pm_print_times; rtcwake -d /dev/rtc0 -m
> mem -s 5
> 
> 
> with the current patch: http://pastebin.com/RujarRLV
> suspend_late and resume_early: http://pastebin.com/RujarRLV

These two are identical.

> suspend_noirq and resume_noirq: http://pastebin.com/nKfbm7Mj

And I can't see any difference between this one and the first one,
except for slightly different timings. Am I missing anything?

> one needs to be careful of the sequence - donot forget that
> omap_device also does stuff in the background to every SoC device in
> noirq - sequence is paramount. you would want to ensure edma is saving
> after every single dependent device is done with it's stuff and
> guarenteed to never request any further transaction, and resume is
> done before any of the dependent devices need edma. but edma is also a
> peripheral that omap_device and generic runtime pm framework deals
> with - so ensure sequences consider that as well.

So, what would you say which sequence is correct then? :)


Thanks,
Daniel

Nishanth Menon Nov. 7, 2013, 3:48 p.m. UTC | #8

On 11/07/2013 09:36 AM, Daniel Mack wrote:
> On 11/07/2013 04:18 PM, Nishanth Menon wrote:
>> Tested this on a vendor V3.12 tag based kernel:
>>
>> Test patch: http://pastebin.com/AmnktQ7B
>> test: echo -n "1">/sys/power/pm_print_times; rtcwake -d /dev/rtc0 -m
>> mem -s 5
>>
>>
>> with the current patch: http://pastebin.com/RujarRLV
>> suspend_late and resume_early: http://pastebin.com/RujarRLV
> 
> These two are identical.
> 
>> suspend_noirq and resume_noirq: http://pastebin.com/nKfbm7Mj
> 
> And I can't see any difference between this one and the first one,
> except for slightly different timings. Am I missing anything?

aah, that happens to be a little key :)
if you see the current patch, it happens around line 417,
with suspend_late, it moves deeper(or later) into suspend around 738
with noirq - it is as late as it can get - around line 823 just before
the last set of arch suspend calls are done.

>> one needs to be careful of the sequence - donot forget that
>> omap_device also does stuff in the background to every SoC device in
>> noirq - sequence is paramount. you would want to ensure edma is saving
>> after every single dependent device is done with it's stuff and
>> guarenteed to never request any further transaction, and resume is
>> done before any of the dependent devices need edma. but edma is also a
>> peripheral that omap_device and generic runtime pm framework deals
>> with - so ensure sequences consider that as well.
> 
> So, what would you say which sequence is correct then? :)

Disclaimer: I have not dug deeper, other than a cursory look. With
proper error handling, proper split between suspend and suspend_late
seems appropriate to me(at noirq, runtime_get could fail as pmruntime
is disabled as part of suspend sequence and could race with
omap_device handling in a future cleanup there) - with the assumption
that all drivers that are using things have cleaned up prior to that.
edma is a generic engine that many drivers may choose to use - example
of MMC used in this discussion is just one of other potential users -
for a generic driver like dma, you'd want to stay as deep in the
suspend as possible. you may also want to ensure that further calls
will not succeed until resume is invoked.

Grygorii Strashko Nov. 7, 2013, 4:27 p.m. UTC | #9

On 11/07/2013 05:34 PM, Nishanth Menon wrote:
> On 10/30/2013 03:21 PM, Daniel Mack wrote:
> [...]
>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
>> index 8e1a024..f15cdb9 100644
>> --- a/arch/arm/common/edma.c
>> +++ b/arch/arm/common/edma.c
>
> [...]
>
>> +static int edma_pm_suspend(struct device *dev)
>> +{
>> +	int j;
>> +
>> +	pm_runtime_get_sync(dev);
>> +
> error checks?
>> +	for (j = 0; j < arch_num_cc; j++) {
>> +		struct edma *ecc = edma_cc[j];
>> +
>> +		disable_irq(ecc->irq_res_start);
>> +		disable_irq(ecc->irq_res_end);
>> +	}
>> +
>> +	pm_runtime_put_sync(dev);

Also, it make no sense to call pm_runtime_putXXX() in suspend handlers
- it will not work because PM core calls pm_runtime_get_noresume()
from device_prepare().

Now, for OMAP the edma Device will be finally disabled by OMAP device 
core at suspend_noirq stage (if it's connected to HW_mode).
But, in other cases it need to be handled somehow :)

>> +
>> +	return 0;
>> +}
>> +
>> +static int edma_pm_resume(struct device *dev)
>> +{
>> +	int i, j;
>> +
>> +	pm_runtime_get_sync(dev);
> error checks?
>
> [...]
>
>

Joel Fernandes Nov. 7, 2013, 4:34 p.m. UTC | #10

Hi Daniel,

Thanks for your followup patch on this. It looks much better now using existing
functions to save/restore the state.

On 10/30/2013 03:21 PM, Daniel Mack wrote:
> This patch makes the edma driver resume correctly after suspend. Tested
> on an AM33xx platform with cyclic audio streams and omap_hsmmc.
> 
> All information can be reconstructed by already known runtime
> information.
> 
> As we now use some functions that were previously only used from __init
> context, annotations had to be dropped.
> 
> Signed-off-by: Daniel Mack <zonque@gmail.com>
> ---
> There was actually only a v3 ever, I made a mistake when formating the
> first version of this patch. To prevent confusion though, I named this
> one v4.
> 
> v3 -> v4:
> 	* dropped extra allocations, and reconstruct register values
> 	  from already known driver states.
> 
> 
> 
> Hi Joel, Gururaja, Balaji,
> 
> thanks a lot for your feedback. I successfully tested this version with
> davinci mcasp as well as omap_hsmmc. I'd appreciate another round of
> reviews :)
> 
> 
> Thanks,
> Daniel
> 
>  arch/arm/common/edma.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 79 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
[..]
> +static int edma_pm_resume(struct device *dev)
> +{
> +	int i, j;
> +
> +	pm_runtime_get_sync(dev);
> +
> +	for (j = 0; j < arch_num_cc; j++) {
> +		struct edma *cc = edma_cc[j];
> +
> +		s8 (*queue_priority_mapping)[2];
> +		s8 (*queue_tc_mapping)[2];
> +
> +		queue_tc_mapping = cc->info->queue_tc_mapping;
> +		queue_priority_mapping = cc->info->queue_priority_mapping;
> +
> +		/* Event queue to TC mapping */
> +		for (i = 0; queue_tc_mapping[i][0] != -1; i++)
> +			map_queue_tc(j, queue_tc_mapping[i][0],
> +					queue_tc_mapping[i][1]);
> +
> +		/* Event queue priority mapping */
> +		for (i = 0; queue_priority_mapping[i][0] != -1; i++)
> +			assign_priority_to_queue(j,
> +						queue_priority_mapping[i][0],
> +						queue_priority_mapping[i][1]);

I know ti,edma-regions property is not currently being used, but we should
future proof this by setting up DRAE for like done in probe:

                for (i = 0; i < info[j]->n_region; i++) {
                        edma_write_array2(j, EDMA_DRAE, i, 0, 0x0);
                        edma_write_array2(j, EDMA_DRAE, i, 1, 0x0);
                        edma_write_array(j, EDMA_QRAE, i, 0x0);
                }


> +
> +		/* Map the channel to param entry if channel mapping logic
> +		 * exist
> +		 */
> +		if (edma_read(j, EDMA_CCCFG) & CHMAP_EXIST)
> +			map_dmach_param(j);
> +
> +		for (i = 0; i < cc->num_channels; i++)
> +			if (test_bit(i, cc->edma_inuse)) {
> +				/* ensure access through shadow region 0 */
> +				edma_or_array2(j, EDMA_DRAE, 0, i >> 5,
> +						BIT(i & 0x1f));
> +
> +				setup_dma_interrupt(i,
> +						    cc->intr_data[i].callback,
> +						    cc->intr_data[i].data);
> +			}
> +
> +		enable_irq(cc->irq_res_start);
> +		enable_irq(cc->irq_res_end);
> +	}
> +
> +	pm_runtime_put_sync(dev);
> +
> +	return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);

I agree with Nishanth here, it is better to do this in .suspend/resume _noirq
stage to rule out any ordering bugs that may show up in the future, since such
an issue already showed up in earlier testing.

I would appreciate it if you can make these 2 changes and post a v5. Thanks for
a lot for all the hardwork.

Acked-by: Joel Fernandes <joelf@ti.com>

regards,

-Joel

Joel Fernandes Nov. 7, 2013, 4:49 p.m. UTC | #11

On 11/07/2013 10:34 AM, Joel Fernandes wrote:
> Hi Daniel,
> 
> Thanks for your followup patch on this. It looks much better now using existing
> functions to save/restore the state.
> 
> On 10/30/2013 03:21 PM, Daniel Mack wrote:
>> This patch makes the edma driver resume correctly after suspend. Tested
>> on an AM33xx platform with cyclic audio streams and omap_hsmmc.
>>
>> All information can be reconstructed by already known runtime
>> information.
>>
>> As we now use some functions that were previously only used from __init
>> context, annotations had to be dropped.
>>
>> Signed-off-by: Daniel Mack <zonque@gmail.com>
>> ---
>> There was actually only a v3 ever, I made a mistake when formating the
>> first version of this patch. To prevent confusion though, I named this
>> one v4.
>>
>> v3 -> v4:
>> 	* dropped extra allocations, and reconstruct register values
>> 	  from already known driver states.
>>
>>
>>
>> Hi Joel, Gururaja, Balaji,
>>
>> thanks a lot for your feedback. I successfully tested this version with
>> davinci mcasp as well as omap_hsmmc. I'd appreciate another round of
>> reviews :)
>>
>>
>> Thanks,
>> Daniel
>>
>>  arch/arm/common/edma.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 79 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
> [..]
>> +static int edma_pm_resume(struct device *dev)
>> +{
>> +	int i, j;
>> +
>> +	pm_runtime_get_sync(dev);
>> +
>> +	for (j = 0; j < arch_num_cc; j++) {
>> +		struct edma *cc = edma_cc[j];
>> +
>> +		s8 (*queue_priority_mapping)[2];
>> +		s8 (*queue_tc_mapping)[2];
>> +
>> +		queue_tc_mapping = cc->info->queue_tc_mapping;
>> +		queue_priority_mapping = cc->info->queue_priority_mapping;
>> +
>> +		/* Event queue to TC mapping */
>> +		for (i = 0; queue_tc_mapping[i][0] != -1; i++)
>> +			map_queue_tc(j, queue_tc_mapping[i][0],
>> +					queue_tc_mapping[i][1]);
>> +
>> +		/* Event queue priority mapping */
>> +		for (i = 0; queue_priority_mapping[i][0] != -1; i++)
>> +			assign_priority_to_queue(j,
>> +						queue_priority_mapping[i][0],
>> +						queue_priority_mapping[i][1]);
> 
> I know ti,edma-regions property is not currently being used, but we should
> future proof this by setting up DRAE for like done in probe:
> 
>                 for (i = 0; i < info[j]->n_region; i++) {
>                         edma_write_array2(j, EDMA_DRAE, i, 0, 0x0);
>                         edma_write_array2(j, EDMA_DRAE, i, 1, 0x0);
>                         edma_write_array(j, EDMA_QRAE, i, 0x0);
>                 }

Please ignore this comment I posted earlier. That is not all required to do.
Should've looked closer. Sorry about it.

cheers,

-Joel

Daniel Mack Nov. 7, 2013, 5:37 p.m. UTC | #12

Hi Joel,

On 11/07/2013 05:34 PM, Joel Fernandes wrote:
> Thanks for your followup patch on this. It looks much better now using existing
> functions to save/restore the state.

Yes, thanks for the suggesting it in the first place.

> On 10/30/2013 03:21 PM, Daniel Mack wrote:
>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
> [..]
>> +static int edma_pm_resume(struct device *dev)
>> +{
>> +	int i, j;
>> +
>> +	pm_runtime_get_sync(dev);
>> +
>> +	for (j = 0; j < arch_num_cc; j++) {
>> +		struct edma *cc = edma_cc[j];
>> +
>> +		s8 (*queue_priority_mapping)[2];
>> +		s8 (*queue_tc_mapping)[2];
>> +
>> +		queue_tc_mapping = cc->info->queue_tc_mapping;
>> +		queue_priority_mapping = cc->info->queue_priority_mapping;
>> +
>> +		/* Event queue to TC mapping */
>> +		for (i = 0; queue_tc_mapping[i][0] != -1; i++)
>> +			map_queue_tc(j, queue_tc_mapping[i][0],
>> +					queue_tc_mapping[i][1]);
>> +
>> +		/* Event queue priority mapping */
>> +		for (i = 0; queue_priority_mapping[i][0] != -1; i++)
>> +			assign_priority_to_queue(j,
>> +						queue_priority_mapping[i][0],
>> +						queue_priority_mapping[i][1]);
> 
> I know ti,edma-regions property is not currently being used, but we should
> future proof this by setting up DRAE for like done in probe:
> 
>                 for (i = 0; i < info[j]->n_region; i++) {
>                         edma_write_array2(j, EDMA_DRAE, i, 0, 0x0);
>                         edma_write_array2(j, EDMA_DRAE, i, 1, 0x0);
>                         edma_write_array(j, EDMA_QRAE, i, 0x0);
>                 }

That doesn't work for me. I'm running long-time tests here on a device
which has a mwifiex connected to omap_hsmmc. The test procedure includes:

a) a script on the device that puts the device to sleep some seconds
after it has been woken up

b) a script on a host that wakes up the device with wake-on-lan every 10
seconds

c) a flood ping that checks whether the device is responding


That precedure is running since a couple of hourse here, and it works
well with both by v3 and v4 patches. Moving the functions to
.suspend/resume _noirq doesn't seem to break anything.

Setting QRAE to 0 as you mentioned above, however, makes the device fail
at resume.

>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
> 
> I agree with Nishanth here, it is better to do this in .suspend/resume _noirq
> stage to rule out any ordering bugs that may show up in the future, since such
> an issue already showed up in earlier testing.

Alright, I already did that.

> I would appreciate it if you can make these 2 changes and post a v5. Thanks for
> a lot for all the hardwork.

No problem at all :)

> Acked-by: Joel Fernandes <joelf@ti.com>

Still sure about that? What about your follow-up to your own reply?


Many thanks for all the feedback!

Daniel

Vaibhav Bedia Nov. 7, 2013, 8:34 p.m. UTC | #13

Hi Joel,

On Wed, Nov 6, 2013 at 12:36 PM, Joel Fernandes <joelf@ti.com> wrote:
> Hi Vaibhav,
>
> On 10/31/2013 05:25 PM, Vaibhav Bedia wrote:
>> Hi Daniel,
>>
>> On Wed, Oct 30, 2013 at 4:21 PM, Daniel Mack <zonque@gmail.com> wrote:
>> [...]
>>> +
>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>> +
>>>  static struct platform_driver edma_driver = {
>>>         .driver = {
>>>                 .name   = "edma",
>>> +               .pm     = &edma_pm_ops,
>>>                 .of_match_table = edma_of_ids,
>>>         },
>>
>> A while back we discovered a nasty race condition here that had us move the EDMA
>> PM callbacks to the noirq phase. IIRC the MMC driver was resuming
>> before the EDMA
>> driver had a chance to run and that was leading to a deadlock. I am
>> not sure how to force
>> this scenario but i do remember spending time debugging this on a
>> random codebase.
>> Maybe some else has some better ideas on how to force this race condition...
>
> I think you're talking about the patch at [1] which is not upstream. A quick

Yeah that's the one.

> question with my limited knowledge of suspend/resume- How can there be pending
> I/O operations between suspend/resume cycles? The sync is done before suspend,
> so I don't understand how one is receiving a response from the card after resume
> before EDMA can resume? I'd imagine that until all devices are resumed, there
> will be no I/O operation issued. Let me know your thoughts.
>

Sadly the commit message does not capture the details to the level it
should have.

From what i remember, during the resume operation the driver was waiting for the
response of the first cmd that it sends and that never happened since the driver
was setup to use DMA mode and the EDMA driver hadn't resumed. Ideally such
issues should have been noticed a long time back. One more thing to note is that
the MMC controller on AMx series does not have the internal DMA like OMAPx.
However i am not sure how much that, if at all, plays a role here.

Maybe someone internally still has a copy of the analysis done and you could
try to hunt that down for more details. In case you do it would be
good to have it
mentioned here.

Regards,
Vaibhav

Vaibhav Bedia Nov. 7, 2013, 8:42 p.m. UTC | #14

Hi Nishanth :)
On Thu, Nov 7, 2013 at 10:48 AM, Nishanth Menon <nm@ti.com> wrote:
> On 11/07/2013 09:36 AM, Daniel Mack wrote:
>> On 11/07/2013 04:18 PM, Nishanth Menon wrote:
>>> Tested this on a vendor V3.12 tag based kernel:
>>>
>>> Test patch: http://pastebin.com/AmnktQ7B
>>> test: echo -n "1">/sys/power/pm_print_times; rtcwake -d /dev/rtc0 -m
>>> mem -s 5
>>>
>>>
>>> with the current patch: http://pastebin.com/RujarRLV
>>> suspend_late and resume_early: http://pastebin.com/RujarRLV
>>
>> These two are identical.
>>
>>> suspend_noirq and resume_noirq: http://pastebin.com/nKfbm7Mj
>>
>> And I can't see any difference between this one and the first one,
>> except for slightly different timings. Am I missing anything?
>
> aah, that happens to be a little key :)
> if you see the current patch, it happens around line 417,
> with suspend_late, it moves deeper(or later) into suspend around 738
> with noirq - it is as late as it can get - around line 823 just before
> the last set of arch suspend calls are done.
>

That's some nifty analysis overnight ;)

Yeah, the intention was to move the EDMA ops as late as possible.
I am not sure if noirq thing takes care of the late i2c stuff to talk to the
PMICs that happens on some OMAPs. Maybe the generic PM code
handles that already but i am a bit rusty on the details right now so
that would just mention that scenario to be considered.

>>> one needs to be careful of the sequence - donot forget that
>>> omap_device also does stuff in the background to every SoC device in
>>> noirq - sequence is paramount. you would want to ensure edma is saving
>>> after every single dependent device is done with it's stuff and
>>> guarenteed to never request any further transaction, and resume is
>>> done before any of the dependent devices need edma. but edma is also a
>>> peripheral that omap_device and generic runtime pm framework deals
>>> with - so ensure sequences consider that as well.
>>
>> So, what would you say which sequence is correct then? :)
>
> Disclaimer: I have not dug deeper, other than a cursory look. With
> proper error handling, proper split between suspend and suspend_late
> seems appropriate to me(at noirq, runtime_get could fail as pmruntime
> is disabled as part of suspend sequence and could race with
> omap_device handling in a future cleanup there) - with the assumption
> that all drivers that are using things have cleaned up prior to that.
> edma is a generic engine that many drivers may choose to use - example
> of MMC used in this discussion is just one of other potential users -
> for a generic driver like dma, you'd want to stay as deep in the
> suspend as possible. you may also want to ensure that further calls
> will not succeed until resume is invoked.
>

Hmm doesn't this highlight the problem that issues like this could happen on
any platform. If so, shouldn't we have a generic callback (dmaengine layer?) to
do this right? For timers we have syscore_ops, is something like that
for a DMA controller an overkill?

Regards,
Vaibhav

Vaibhav Bedia Nov. 7, 2013, 8:46 p.m. UTC | #15

Hi Grygorii :)

On Thu, Nov 7, 2013 at 11:27 AM, Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
> On 11/07/2013 05:34 PM, Nishanth Menon wrote:
>>
>> On 10/30/2013 03:21 PM, Daniel Mack wrote:
>> [...]
>>>
>>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
>>> index 8e1a024..f15cdb9 100644
>>> --- a/arch/arm/common/edma.c
>>> +++ b/arch/arm/common/edma.c
>>
>>
>> [...]
>>
>>> +static int edma_pm_suspend(struct device *dev)
>>> +{
>>> +       int j;
>>> +
>>> +       pm_runtime_get_sync(dev);
>>> +
>>
>> error checks?
>>>
>>> +       for (j = 0; j < arch_num_cc; j++) {
>>> +               struct edma *ecc = edma_cc[j];
>>> +
>>> +               disable_irq(ecc->irq_res_start);
>>> +               disable_irq(ecc->irq_res_end);
>>> +       }
>>> +
>>> +       pm_runtime_put_sync(dev);
>
>
> Also, it make no sense to call pm_runtime_putXXX() in suspend handlers
> - it will not work because PM core calls pm_runtime_get_noresume()
> from device_prepare().
>
> Now, for OMAP the edma Device will be finally disabled by OMAP device core
> at suspend_noirq stage (if it's connected to HW_mode).
> But, in other cases it need to be handled somehow :)
>

This PM stuff just gets more interesting ;)

Joel Fernandes Nov. 7, 2013, 9:39 p.m. UTC | #16

On 11/07/2013 11:37 AM, Daniel Mack wrote:
[..]
>> I know ti,edma-regions property is not currently being used, but we should>>
future proof this by setting up DRAE for like done in probe:
>>
>>                 for (i = 0; i < info[j]->n_region; i++) {
>>                         edma_write_array2(j, EDMA_DRAE, i, 0, 0x0);
>>                         edma_write_array2(j, EDMA_DRAE, i, 1, 0x0);
>>                         edma_write_array(j, EDMA_QRAE, i, 0x0);
>>                 }
>
> That doesn't work for me. I'm running long-time tests here on a device
> which has a mwifiex connected to omap_hsmmc. The test procedure includes:
>
> a) a script on the device that puts the device to sleep some seconds
> after it has been woken up
>
> b) a script on a host that wakes up the device with wake-on-lan every 10
> seconds
>
> c) a flood ping that checks whether the device is responding
>
>
> That precedure is running since a couple of hourse here, and it works
> well with both by v3 and v4 patches. Moving the functions to
> .suspend/resume _noirq doesn't seem to break anything.
>
> Setting QRAE to 0 as you mentioned above, however, makes the device fail
> at resume.

Yes, I recall that. It definitely shouldn't be done.

>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>
>> I agree with Nishanth here, it is better to do this in .suspend/resume _noirq
>> stage to rule out any ordering bugs that may show up in the future, since such
>> an issue already showed up in earlier testing.
>
> Alright, I already did that.

Cool, can you post the updated patch? I'll run some more tests too..

>> I would appreciate it if you can make these 2 changes and post a v5. Thanks for
>> a lot for all the hardwork.
>
> No problem at all :)
>
>> Acked-by: Joel Fernandes <joelf@ti.com>
>
> Still sure about that? What about your follow-up to your own reply?
>

Yes, the Ack definitely stands for the changes you made since the last revision.
Only thing left is to fix the ordering, Let's do that (you seem to have already
done it) and test this a bit more.

Glad you're working on this, thanks!

-Joel

Hebbar, Gururaja Nov. 8, 2013, 4:07 a.m. UTC | #17

On Thursday 07 November 2013 11:07 PM, Daniel Mack wrote:
> Hi Joel,
> 
> On 11/07/2013 05:34 PM, Joel Fernandes wrote:
>> Thanks for your followup patch on this. It looks much better now using existing
>> functions to save/restore the state.
> 
> Yes, thanks for the suggesting it in the first place.
> 
>> On 10/30/2013 03:21 PM, Daniel Mack wrote:
>>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
>> [..]
>>> +static int edma_pm_resume(struct device *dev)
>>> +{
>>> +	int i, j;
>>> +
>>> +	pm_runtime_get_sync(dev);
>>> +
>>> +	for (j = 0; j < arch_num_cc; j++) {
>>> +		struct edma *cc = edma_cc[j];
>>> +
>>> +		s8 (*queue_priority_mapping)[2];
>>> +		s8 (*queue_tc_mapping)[2];
>>> +
>>> +		queue_tc_mapping = cc->info->queue_tc_mapping;
>>> +		queue_priority_mapping = cc->info->queue_priority_mapping;
>>> +
>>> +		/* Event queue to TC mapping */
>>> +		for (i = 0; queue_tc_mapping[i][0] != -1; i++)
>>> +			map_queue_tc(j, queue_tc_mapping[i][0],
>>> +					queue_tc_mapping[i][1]);
>>> +
>>> +		/* Event queue priority mapping */
>>> +		for (i = 0; queue_priority_mapping[i][0] != -1; i++)
>>> +			assign_priority_to_queue(j,
>>> +						queue_priority_mapping[i][0],
>>> +						queue_priority_mapping[i][1]);
>>
>> I know ti,edma-regions property is not currently being used, but we should
>> future proof this by setting up DRAE for like done in probe:
>>
>>                 for (i = 0; i < info[j]->n_region; i++) {
>>                         edma_write_array2(j, EDMA_DRAE, i, 0, 0x0);
>>                         edma_write_array2(j, EDMA_DRAE, i, 1, 0x0);
>>                         edma_write_array(j, EDMA_QRAE, i, 0x0);
>>                 }
> 
> That doesn't work for me. I'm running long-time tests here on a device
> which has a mwifiex connected to omap_hsmmc. The test procedure includes:
> 
> a) a script on the device that puts the device to sleep some seconds
> after it has been woken up



> 
> b) a script on a host that wakes up the device with wake-on-lan every 10
> seconds
> 
> c) a flood ping that checks whether the device is responding

can you share above 2 (b & C) test scripts? (pastebin or inline if small)

thanks in advance

regards
Gururaja


> 
> 
> That precedure is running since a couple of hourse here, and it works
> well with both by v3 and v4 patches. Moving the functions to
> .suspend/resume _noirq doesn't seem to break anything.
> 
> Setting QRAE to 0 as you mentioned above, however, makes the device fail
> at resume.
> 
>>> +static SIMPLE_DEV_PM_OPS(edma_pm_ops, edma_pm_suspend, edma_pm_resume);
>>
>> I agree with Nishanth here, it is better to do this in .suspend/resume _noirq
>> stage to rule out any ordering bugs that may show up in the future, since such
>> an issue already showed up in earlier testing.
> 
> Alright, I already did that.
> 
>> I would appreciate it if you can make these 2 changes and post a v5. Thanks for
>> a lot for all the hardwork.
> 
> No problem at all :)
> 
>> Acked-by: Joel Fernandes <joelf@ti.com>
> 
> Still sure about that? What about your follow-up to your own reply?
> 
> 
> Many thanks for all the feedback!
> 
> Daniel
> 
> 
> 
>

Daniel Mack Nov. 8, 2013, 7:51 a.m. UTC | #18

On 11/08/2013 05:07 AM, Gururaja Hebbar wrote:
> On Thursday 07 November 2013 11:07 PM, Daniel Mack wrote:

>> I'm running long-time tests here on a device
>> which has a mwifiex connected to omap_hsmmc. The test procedure includes:
>>
>> a) a script on the device that puts the device to sleep some seconds
>> after it has been woken up
>> 
>> b) a script on a host that wakes up the device with wake-on-lan every 10
>> seconds
>>
>> c) a flood ping that checks whether the device is responding
> 
> can you share above 2 (b & C) test scripts? (pastebin or inline if small)

Of course. It's very hand-baked and simple:

a)

# wpa_supplicant is started and and the ip address ist set

while [ true ]; do
        ethtool -s eth0 wol g
        echo mem >/sys/power/state
        ethtool -s eth0 wol d
        sleep 7
done

b)

while [ true ]; do
	sudo etherwake -i em1 BC:6A:29:84:0A:62
	sleep 5
done

c)

ping -s 32768 -i 0.1 192.168.178.222



Daniel

Nishanth Menon Nov. 15, 2013, 2:39 p.m. UTC | #19

On 11/07/2013 02:42 PM, Vaibhav Bedia wrote:
> Hi Nishanth :)
> On Thu, Nov 7, 2013 at 10:48 AM, Nishanth Menon <nm@ti.com> wrote:
>> On 11/07/2013 09:36 AM, Daniel Mack wrote:
>>> On 11/07/2013 04:18 PM, Nishanth Menon wrote:
>>>> Tested this on a vendor V3.12 tag based kernel:
>>>>
>>>> Test patch: http://pastebin.com/AmnktQ7B
>>>> test: echo -n "1">/sys/power/pm_print_times; rtcwake -d /dev/rtc0 -m
>>>> mem -s 5
>>>>
>>>>
>>>> with the current patch: http://pastebin.com/RujarRLV
>>>> suspend_late and resume_early: http://pastebin.com/RujarRLV
>>>
>>> These two are identical.
>>>
>>>> suspend_noirq and resume_noirq: http://pastebin.com/nKfbm7Mj
>>>
>>> And I can't see any difference between this one and the first one,
>>> except for slightly different timings. Am I missing anything?
>>
>> aah, that happens to be a little key :)
>> if you see the current patch, it happens around line 417,
>> with suspend_late, it moves deeper(or later) into suspend around 738
>> with noirq - it is as late as it can get - around line 823 just before
>> the last set of arch suspend calls are done.
>>
> 
> That's some nifty analysis overnight ;)
> 
> Yeah, the intention was to move the EDMA ops as late as possible.
> I am not sure if noirq thing takes care of the late i2c stuff to talk to the
> PMICs that happens on some OMAPs. Maybe the generic PM code
> handles that already but i am a bit rusty on the details right now so
> that would just mention that scenario to be considered.
> 

To trigger the fail, I created a custom Test case on TI vendor kernel
based on v3.12 tag:
On beagleBone-Black
test scenario (BBB):

    Boot from SD card
    ensure firmware is loaded (rev 0x182)
    run LTP fsstress [1] in background on EMMC
        mkdir -p /tmp/testing
        mke2fs /dev/mmcblk1p1
        mount /dev/mmcblk1p1 /tmp/testing
        fsstress -d /tmp/testing p 4 -n 1000 -l0 -v >/dev/null &
    run ping in the background (to add yet another interface)
    run memtester[2] (70% of free memory)
        memtester 343M >/dev/null &
        sleep 10 (to allow memtester to start up properly)
    start=`date`;i=0; while [ 1 ]; do rtcwake -d /dev/rtc0 -m mem -s
2; i=$((i + 1)); echo "$i: start =$start, now="`date`; sleep 2; done

Eventual hang log (using the regular suspend-resume): [3] - took close
to two days of testing to trigger this.

Moving to a suspend_late and resume_early such as in [4], it passed
the test for over 4 days now.

Daniel,
will be good if you could post [4] for comments if you think that is
the right thing to do and helps solve the issue you saw as  well.

[1]
https://github.com/linux-test-project/ltp/tree/master/testcases/kernel/fs/fsstress

[2] http://pyropus.ca/software/memtester
[3] http://pastebin.pandaboard.org/index.php/view/18307529
[4]
https://git.ti.com/ti-linux-kernel/ti-linux-kernel/commit/f8f9a8c38644d27dc8671009209922531b072110

Daniel Mack Nov. 17, 2013, 10:09 p.m. UTC | #20

Hi Nishanth,

On 11/15/2013 03:39 PM, Nishanth Menon wrote:
> To trigger the fail, I created a custom Test case on TI vendor kernel
> based on v3.12 tag:
> On beagleBone-Black
> test scenario (BBB):
> 
>     Boot from SD card
>     ensure firmware is loaded (rev 0x182)
>     run LTP fsstress [1] in background on EMMC
>         mkdir -p /tmp/testing
>         mke2fs /dev/mmcblk1p1
>         mount /dev/mmcblk1p1 /tmp/testing
>         fsstress -d /tmp/testing p 4 -n 1000 -l0 -v >/dev/null &
>     run ping in the background (to add yet another interface)
>     run memtester[2] (70% of free memory)
>         memtester 343M >/dev/null &
>         sleep 10 (to allow memtester to start up properly)
>     start=`date`;i=0; while [ 1 ]; do rtcwake -d /dev/rtc0 -m mem -s
> 2; i=$((i + 1)); echo "$i: start =$start, now="`date`; sleep 2; done
> 
> Eventual hang log (using the regular suspend-resume): [3] - took close
> to two days of testing to trigger this.
> 
> Moving to a suspend_late and resume_early such as in [4], it passed
> the test for over 4 days now.

Wow, that's an intense test that you have there :)

> Daniel,
> will be good if you could post [4] for comments if you think that is
> the right thing to do and helps solve the issue you saw as  well.

Alright, will do!


Thanks a lot,
Daniel

[v4] ARM: omap: edma: add suspend suspend/resume hooks

Commit Message

Comments

Patch