Message ID | 1383863549-7438-1-git-send-email-zonque@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Daniel Mack <zonque@gmail.com> writes: > This patch makes the edma driver resume correctly after suspend. Tested > on an AM33xx platform with cyclic audio streams and omap_hsmmc. > > All information can be reconstructed by already known runtime > information. > > As we now use some functions that were previously only used from __init > context, annotations had to be dropped. > > Signed-off-by: Daniel Mack <zonque@gmail.com> > --- > Ok, here is v5. > > v4 -> v5: > > * dropped pm_runtime_* function calls entirely > * moved the function pointers to .suspend/resume _noirq [...] > +static const struct dev_pm_ops edma_pm_ops = { > + .suspend = edma_pm_suspend, I suspect you intended to use the _noirq version like the changelog says? > + .resume_noirq = edma_pm_resume, > +}; Also, I believe it was already suggested by Nishanth, but the late/early callbacks are probably more appropriate here than the noirq callbacks. Unless there's a *really* good reason to use the noirq callbacks, they should be avoided. That being said, I wonder if the whole approach here is the right one. I know you're basing your stuff on some TI tree, but that doesn't make it the right way (usually, it's the opposite, but I digress...) ;) IMO, EDMA should be done like we currently do I2C and not implement suspend/resume at all. Instead, the driver should do runtime PM done on a per xfer basis. Then when suspend comes along, all that needs to be done is ensure all in-flight xfers are done, then runtime PM will kick in. Kevin
Hi Kevin, On 11/07/2013 06:02 PM, Kevin Hilman wrote: > Daniel Mack <zonque@gmail.com> writes: [..] >> + .resume_noirq = edma_pm_resume, >> +}; > > Also, I believe it was already suggested by Nishanth, but the late/early > callbacks are probably more appropriate here than the noirq callbacks. > Unless there's a *really* good reason to use the noirq callbacks, they > should be avoided. > > That being said, I wonder if the whole approach here is the right one. > I know you're basing your stuff on some TI tree, but that doesn't make > it the right way (usually, it's the opposite, but I digress...) ;) > > IMO, EDMA should be done like we currently do I2C and not implement > suspend/resume at all. Instead, the driver should do runtime PM done on But a potential problem with this is powering edma on or off between xfers may slow things down quite a bit because xfers happen so much often and there is some overhead in the pm_runtime calls which adds up overtime when dealing with something as frequent as EDMA. Also we would lose the global EDMA context right so we'd have to restore the context every time during runtime PM (?). I had a patch that also made AES driver not pm_runtime_get/put specially when the next crypto request was imminent. But instead adopted a softer approach where the pm_runtime_put call would happen at a time when we are sure we're at the end of all requests. This resulted in quite a speed up. thanks, -Joel
Hi Joel, On Thu, Nov 7, 2013 at 8:58 PM, Joel Fernandes <joelf@ti.com> wrote: > On 11/07/2013 06:02 PM, Kevin Hilman wrote: >> Daniel Mack <zonque@gmail.com> writes: > [..] >>> + .resume_noirq = edma_pm_resume, >>> +}; >> >> Also, I believe it was already suggested by Nishanth, but the late/early >> callbacks are probably more appropriate here than the noirq callbacks. >> Unless there's a *really* good reason to use the noirq callbacks, they >> should be avoided. >> >> That being said, I wonder if the whole approach here is the right one. >> I know you're basing your stuff on some TI tree, but that doesn't make >> it the right way (usually, it's the opposite, but I digress...) ;) >> >> IMO, EDMA should be done like we currently do I2C and not implement >> suspend/resume at all. Instead, the driver should do runtime PM done on > > But a potential problem with this is powering edma on or off between xfers may > slow things down quite a bit because xfers happen so much often and there is > some overhead in the pm_runtime calls which adds up overtime when dealing with > something as frequent as EDMA. Also we would lose the global EDMA context right > so we'd have to restore the context every time during runtime PM (?). Have a look at the autosuspend feature of runtime PM. (c.f. Documentation/power/pm_runtime.txt) > I had a patch that also made AES driver not pm_runtime_get/put specially when > the next crypto request was imminent. But instead adopted a softer approach > where the pm_runtime_put call would happen at a time when we are sure we're at > the end of all requests. This resulted in quite a speed up. Sounds like you partially re-invented autosuspend. Kevin
On 11/08/2013 01:02 AM, Kevin Hilman wrote: > Daniel Mack <zonque@gmail.com> writes: >> + .resume_noirq = edma_pm_resume, >> +}; > > Also, I believe it was already suggested by Nishanth, but the late/early > callbacks are probably more appropriate here than the noirq callbacks. > Unless there's a *really* good reason to use the noirq callbacks, they > should be avoided. Ok, no problem to change that. Regarding the function ordering, late/early didn't show any lockup in my tests yet. > That being said, I wonder if the whole approach here is the right one. > I know you're basing your stuff on some TI tree, but that doesn't make > it the right way (usually, it's the opposite, but I digress...) ;) :) Well, I was primarily looking for a solution to bring the edma back to life after suspend. As I wrote in the first version of this patch that I sent, I didn't even intent that patch to go mainline at all, given that arch/arm/common/edma.c woill go away soon. > IMO, EDMA should be done like we currently do I2C and not implement > suspend/resume at all. Instead, the driver should do runtime PM done on > a per xfer basis. Then when suspend comes along, all that needs to be > done is ensure all in-flight xfers are done, then runtime PM will kick > in. I see your point in general, but isn't runtime PM there to actually save power at runtime? Does disabling unused channels in the EDMA really reduce the consumption of that IP block? Daniel
On 11/08/2013 12:28 AM, Kevin Hilman wrote: [..] >>> Also, I believe it was already suggested by Nishanth, but the late/early >>> callbacks are probably more appropriate here than the noirq callbacks. >>> Unless there's a *really* good reason to use the noirq callbacks, they >>> should be avoided. >>> >>> That being said, I wonder if the whole approach here is the right one. >>> I know you're basing your stuff on some TI tree, but that doesn't make >>> it the right way (usually, it's the opposite, but I digress...) ;) >>> >>> IMO, EDMA should be done like we currently do I2C and not implement >>> suspend/resume at all. Instead, the driver should do runtime PM done on >> >> But a potential problem with this is powering edma on or off between xfers may >> slow things down quite a bit because xfers happen so much often and there is >> some overhead in the pm_runtime calls which adds up overtime when dealing with >> something as frequent as EDMA. Also we would lose the global EDMA context right >> so we'd have to restore the context every time during runtime PM (?). > > Have a look at the autosuspend feature of runtime PM. (c.f. > Documentation/power/pm_runtime.txt) Sure, will do that. Thanks :) Just one more silly question, in very frequent operations is there no over-head even when using autosuspend? Because when I traced last time (without autosuspend), the depth of pm runtime calls were quite a lot but this could also be because of the OMAP implementation. >> I had a patch that also made AES driver not pm_runtime_get/put specially when >> the next crypto request was imminent. But instead adopted a softer approach >> where the pm_runtime_put call would happen at a time when we are sure we're at >> the end of all requests. This resulted in quite a speed up. > > Sounds like you partially re-invented autosuspend. Hehe. :) regards, -Joel
On 11/08/2013 10:14 AM, Joel Fernandes wrote: > On 11/08/2013 12:28 AM, Kevin Hilman wrote: > [..] >>>> Also, I believe it was already suggested by Nishanth, but the late/early >>>> callbacks are probably more appropriate here than the noirq callbacks. >>>> Unless there's a *really* good reason to use the noirq callbacks, they >>>> should be avoided. >>>> >>>> That being said, I wonder if the whole approach here is the right one. >>>> I know you're basing your stuff on some TI tree, but that doesn't make >>>> it the right way (usually, it's the opposite, but I digress...) ;) >>>> >>>> IMO, EDMA should be done like we currently do I2C and not implement >>>> suspend/resume at all. Instead, the driver should do runtime PM done on >>> >>> But a potential problem with this is powering edma on or off between xfers may >>> slow things down quite a bit because xfers happen so much often and there is >>> some overhead in the pm_runtime calls which adds up overtime when dealing with >>> something as frequent as EDMA. Also we would lose the global EDMA context right >>> so we'd have to restore the context every time during runtime PM (?). >> >> Have a look at the autosuspend feature of runtime PM. (c.f. >> Documentation/power/pm_runtime.txt) > > Sure, will do that. Thanks :) > > Just one more silly question, in very frequent operations is there no over-head > even when using autosuspend? Because when I traced last time (without > autosuspend), the depth of pm runtime calls were quite a lot but this could also > be because of the OMAP implementation. that is the entire purpose of autosuspend - the back2back calls have almost no overhead (other than to see that device suspend was not invoked) - this is pretty fast. you can tweak autosuspend timeout for the right configuration to meet up with what you need.
Joel Fernandes <joelf@ti.com> writes: > On 11/08/2013 12:28 AM, Kevin Hilman wrote: > [..] >>>> Also, I believe it was already suggested by Nishanth, but the late/early >>>> callbacks are probably more appropriate here than the noirq callbacks. >>>> Unless there's a *really* good reason to use the noirq callbacks, they >>>> should be avoided. >>>> >>>> That being said, I wonder if the whole approach here is the right one. >>>> I know you're basing your stuff on some TI tree, but that doesn't make >>>> it the right way (usually, it's the opposite, but I digress...) ;) >>>> >>>> IMO, EDMA should be done like we currently do I2C and not implement >>>> suspend/resume at all. Instead, the driver should do runtime PM done on >>> >>> But a potential problem with this is powering edma on or off between xfers may >>> slow things down quite a bit because xfers happen so much often and there is >>> some overhead in the pm_runtime calls which adds up overtime when dealing with >>> something as frequent as EDMA. Also we would lose the global EDMA context right >>> so we'd have to restore the context every time during runtime PM (?). >> >> Have a look at the autosuspend feature of runtime PM. (c.f. >> Documentation/power/pm_runtime.txt) > > Sure, will do that. Thanks :) > > Just one more silly question, in very frequent operations is there no over-head > even when using autosuspend? Because when I traced last time (without > autosuspend), the depth of pm runtime calls were quite a lot but this could also > be because of the OMAP implementation. The depth of calls is shallow until it's actually time to really runtime suspend (after the autosuspend timeout.) IOW, you don't even get into OMAP specific code until the autosuspend timeout expires, so the overhead is only coming from the runtime PM core, and is very minimal. We're already using this technique in the I2C driver which typically has less data than EDMA, but does have frequent, bursty xfers. Kevin
diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c index 8e1a024..7deacde 100644 --- a/arch/arm/common/edma.c +++ b/arch/arm/common/edma.c @@ -239,6 +239,8 @@ struct edma { /* list of channels with no even trigger; terminated by "-1" */ const s8 *noevent; + struct edma_soc_info *info; + /* The edma_inuse bit for each PaRAM slot is clear unless the * channel is in use ... by ARM or DSP, for QDMA, or whatever. */ @@ -290,13 +292,13 @@ static void map_dmach_queue(unsigned ctlr, unsigned ch_no, ~(0x7 << bit), queue_no << bit); } -static void __init map_queue_tc(unsigned ctlr, int queue_no, int tc_no) +static void map_queue_tc(unsigned ctlr, int queue_no, int tc_no) { int bit = queue_no * 4; edma_modify(ctlr, EDMA_QUETCMAP, ~(0x7 << bit), ((tc_no & 0x7) << bit)); } -static void __init assign_priority_to_queue(unsigned ctlr, int queue_no, +static void assign_priority_to_queue(unsigned ctlr, int queue_no, int priority) { int bit = queue_no * 4; @@ -315,7 +317,7 @@ static void __init assign_priority_to_queue(unsigned ctlr, int queue_no, * included in that particular EDMA variant (Eg : dm646x) * */ -static void __init map_dmach_param(unsigned ctlr) +static void map_dmach_param(unsigned ctlr) { int i; for (i = 0; i < EDMA_MAX_DMACH; i++) @@ -1785,15 +1787,85 @@ static int edma_probe(struct platform_device *pdev) edma_write_array2(j, EDMA_DRAE, i, 1, 0x0); edma_write_array(j, EDMA_QRAE, i, 0x0); } + edma_cc[j]->info = info[j]; arch_num_cc++; } return 0; } +static int edma_pm_suspend(struct device *dev) +{ + int j; + + for (j = 0; j < arch_num_cc; j++) { + struct edma *ecc = edma_cc[j]; + + disable_irq(ecc->irq_res_start); + disable_irq(ecc->irq_res_end); + } + + return 0; +} + +static int edma_pm_resume(struct device *dev) +{ + int i, j; + + for (j = 0; j < arch_num_cc; j++) { + struct edma *cc = edma_cc[j]; + + s8 (*queue_priority_mapping)[2]; + s8 (*queue_tc_mapping)[2]; + + queue_tc_mapping = cc->info->queue_tc_mapping; + queue_priority_mapping = cc->info->queue_priority_mapping; + + /* Event queue to TC mapping */ + for (i = 0; queue_tc_mapping[i][0] != -1; i++) + map_queue_tc(j, queue_tc_mapping[i][0], + queue_tc_mapping[i][1]); + + /* Event queue priority mapping */ + for (i = 0; queue_priority_mapping[i][0] != -1; i++) + assign_priority_to_queue(j, + queue_priority_mapping[i][0], + queue_priority_mapping[i][1]); + + /* Map the channel to param entry if channel mapping logic + * exist + */ + if (edma_read(j, EDMA_CCCFG) & CHMAP_EXIST) + map_dmach_param(j); + + for (i = 0; i < cc->num_channels; i++) { + if (test_bit(i, cc->edma_inuse)) { + /* ensure access through shadow region 0 */ + edma_or_array2(j, EDMA_DRAE, 0, i >> 5, + BIT(i & 0x1f)); + + setup_dma_interrupt(i, + cc->intr_data[i].callback, + cc->intr_data[i].data); + } + } + + enable_irq(cc->irq_res_start); + enable_irq(cc->irq_res_end); + } + + return 0; +} + +static const struct dev_pm_ops edma_pm_ops = { + .suspend = edma_pm_suspend, + .resume_noirq = edma_pm_resume, +}; + static struct platform_driver edma_driver = { .driver = { .name = "edma", + .pm = &edma_pm_ops, .of_match_table = edma_of_ids, }, .probe = edma_probe,
This patch makes the edma driver resume correctly after suspend. Tested on an AM33xx platform with cyclic audio streams and omap_hsmmc. All information can be reconstructed by already known runtime information. As we now use some functions that were previously only used from __init context, annotations had to be dropped. Signed-off-by: Daniel Mack <zonque@gmail.com> --- Ok, here is v5. v4 -> v5: * dropped pm_runtime_* function calls entirely * moved the function pointers to .suspend/resume _noirq Again, thanks for the reviews. I'm still uncertain which function order is most appropriate though. Thanks, Daniel arch/arm/common/edma.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 75 insertions(+), 3 deletions(-)