diff mbox

[RFC] mmc:change mmc_init workqueue into a freezable workqueue

Message ID 35FD53F367049845BC99AC72306C23D103EDAF89E192@CNBJMBX05.corpusers.net (mailing list archive)
State New, archived
Headers show

Commit Message

Wang, Yalin Jan. 8, 2015, 10:06 a.m. UTC
This patch fix the mmc driver suspend/resume conflict problems,
mmc workqueue will queue mmc_rescan(), and it will call some
pm_runtime_* functions, this will conflict with suspend path sometimes,
and will result in some strange behavior:

Suspend path:
	-000 |context_switch(inline)
	-000 |__schedule()
	-001 |schedule_preempt_disabled()
	-002 |spin_lock(inline)
	-002 |__mutex_lock_common(inline)
	-002 |__mutex_lock_slowpath(lock_count = 0xEDCD0F48)
	-003 |__mutex_fastpath_lock(inline)
	-003 |mutex_lock(lock = 0xEDCD0F48)
	-004 |sdhci_do_set_ios(host = 0xEDCD0CC0, ios = 0xEDCD0A70)
	-005 |sdhci_set_ios(?, ios = 0xEDCD0A70)
	-006 |mmc_set_ios(host = 0xEDCD0800)
	-007 |mmc_delay(inline)
	-007 |mmc_power_off(host = 0xEDCD0800)
	-008 |mmc_suspend_host(inline)
	-008 |mmc_suspend_host(host = 0xEDCD0800)
	-009 |mmc_host_suspend(dev = 0xEDCD0808)
	-010 |dpm_run_callback(cb = 0xC0627A88, dev = 0xEDCD0808, state = (event = 2), info = 0xC0B6EF9B)
	-011 |__device_suspend(dev = 0xEDCD0808, state = (event = 2), ?)
	-012 |device_suspend(inline)
	-012 |dpm_suspend(state = (event = 2))
	-013 |suspend_devices_and_enter(state = 3)
	-014 |enter_state(inline)
	-014 |pm_suspend(state = 3)
	-015 |try_to_suspend(?)
	-016 |static_key_false(inline)
	-016 |trace_workqueue_execute_end(inline)
	-016 |process_one_work(worker = 0xD5E87040, work = 0xC0E3FF54)
	-017 |worker_thread(__worker = 0xD5E87040)
	-018 |kthread(_create = 0xE9FFBF10)
	-019 |kernel_thread_exit(asm)
	 --- |end of frame

mmc_rescan() resume path:
	 -000 |context_switch(inline)
	 -000 |__schedule()
	 -001 |do_undefinstr(regs = 0xD12242F0)
	 -002 |__und_svc(asm)
	  --> |exception
	  -003 |sdhci_set_power(host = 0xEDCD0CC0, power = 0)
	  -004 |sdhci_do_set_ios(host = 0xEDCD0CC0, ios = 0xEDCD0A70)
	  -005 |sdhci_set_ios(?, ios = 0xEDCD0A70)
	  -006 |mmc_set_ios(host = 0xEDCD0800)
	  -007 |mmc_delay(inline)
	  -007 |mmc_power_up(host = 0xEDCD0800)
	  -008 |mmc_resume_bus(inline)
	  -008 |mmc_resume_bus(host = 0xEDCD0800)
	  -009 |mmc_rpm_hold(host = 0xEDCD0800, ?)
	  -010 |mmc_rescan(work = 0xEDCD0AB0)
	  -011 |static_key_false(inline)
	  -011 |trace_workqueue_execute_end(inline)
	  -011 |process_one_work(worker = 0xE5F0BBC0, work = 0xEDCD0AB0)
	  -012 |worker_thread(__worker = 0xE5F0BBC0)
	  -013 |kthread(_create = 0xD03A5F10)
	  -014 |kernel_thread_exit(asm)
	   --- |end of frame

most mmc power callback function don't check this special case, and
will cause problems, make sure the workqueue is stopped during suspend
is more safe.
---
 drivers/mmc/core/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Russell King - ARM Linux Jan. 8, 2015, 11:42 a.m. UTC | #1
On Thu, Jan 08, 2015 at 06:06:32PM +0800, Wang, Yalin wrote:
> This patch fix the mmc driver suspend/resume conflict problems,
> mmc workqueue will queue mmc_rescan(), and it will call some
> pm_runtime_* functions, this will conflict with suspend path sometimes,
> and will result in some strange behavior:
> 
> Suspend path:
> 	-000 |context_switch(inline)
> 	-000 |__schedule()
> 	-001 |schedule_preempt_disabled()
> 	-002 |spin_lock(inline)
> 	-002 |__mutex_lock_common(inline)
> 	-002 |__mutex_lock_slowpath(lock_count = 0xEDCD0F48)
> 	-003 |__mutex_fastpath_lock(inline)
> 	-003 |mutex_lock(lock = 0xEDCD0F48)
> 	-004 |sdhci_do_set_ios(host = 0xEDCD0CC0, ios = 0xEDCD0A70)
> 	-005 |sdhci_set_ios(?, ios = 0xEDCD0A70)
> 	-006 |mmc_set_ios(host = 0xEDCD0800)
> 	-007 |mmc_delay(inline)
> 	-007 |mmc_power_off(host = 0xEDCD0800)
> 	-008 |mmc_suspend_host(inline)
> 	-008 |mmc_suspend_host(host = 0xEDCD0800)
> 	-009 |mmc_host_suspend(dev = 0xEDCD0808)
> 	-010 |dpm_run_callback(cb = 0xC0627A88, dev = 0xEDCD0808, state = (event = 2), info = 0xC0B6EF9B)
> 	-011 |__device_suspend(dev = 0xEDCD0808, state = (event = 2), ?)
> 	-012 |device_suspend(inline)

Suspend can't complete until device_suspend() returns.  Hence, mmc_power_off()
must finish.

> mmc_rescan() resume path:
> 	 -000 |context_switch(inline)
> 	 -000 |__schedule()
> 	 -001 |do_undefinstr(regs = 0xD12242F0)
> 	 -002 |__und_svc(asm)
> 	  --> |exception

I assume this is what you mean by "strange behaviour" ?  If so, please
give the full oops.  Have you fully diagnosed the failure?

> most mmc power callback function don't check this special case, and
> will cause problems, make sure the workqueue is stopped during suspend
> is more safe.

I think there's a bad side effect of this: if you have swap on a SD card,
and you ask the system to hibernate, you don't want this thread to freeze.

What you do need to do is to ensure that new requests can't be processed
while the host is suspended.

A hibernate works (approximately) as:

- freeze all tasks
- push as much out to swap as possible
- suspend devices
- create system snapshot
- resume devices
- write system snapshot to swap
- power down

If the MMC thread is frozen, and you have swap on SD, then it can't write
the image to swap.
Wang, Yalin Jan. 9, 2015, 5:38 a.m. UTC | #2
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]
> Sent: Thursday, January 08, 2015 7:42 PM
> To: Wang, Yalin
> Cc: 'chris@printf.net'; 'ulf.hansson@linaro.org'; 'tim.kryger@gmail.com';
> 'tgih.jun@samsung.com'; 'johan.rudholm@axis.com'; 'linux-
> mmc@vger.kernel.org'; 'linux-kernel@vger.kernel.org'; 'linux-arm-
> kernel@lists.infradead.org'
> Subject: Re: [RFC] mmc:change mmc_init workqueue into a freezable workqueue
> 
> On Thu, Jan 08, 2015 at 06:06:32PM +0800, Wang, Yalin wrote:
> > This patch fix the mmc driver suspend/resume conflict problems, mmc
> > workqueue will queue mmc_rescan(), and it will call some
> > pm_runtime_* functions, this will conflict with suspend path
> > sometimes, and will result in some strange behavior:
> >
> > Suspend path:
> > 	-000 |context_switch(inline)
> > 	-000 |__schedule()
> > 	-001 |schedule_preempt_disabled()
> > 	-002 |spin_lock(inline)
> > 	-002 |__mutex_lock_common(inline)
> > 	-002 |__mutex_lock_slowpath(lock_count = 0xEDCD0F48)
> > 	-003 |__mutex_fastpath_lock(inline)
> > 	-003 |mutex_lock(lock = 0xEDCD0F48)
> > 	-004 |sdhci_do_set_ios(host = 0xEDCD0CC0, ios = 0xEDCD0A70)
> > 	-005 |sdhci_set_ios(?, ios = 0xEDCD0A70)
> > 	-006 |mmc_set_ios(host = 0xEDCD0800)
> > 	-007 |mmc_delay(inline)
> > 	-007 |mmc_power_off(host = 0xEDCD0800)
> > 	-008 |mmc_suspend_host(inline)
> > 	-008 |mmc_suspend_host(host = 0xEDCD0800)
> > 	-009 |mmc_host_suspend(dev = 0xEDCD0808)
> > 	-010 |dpm_run_callback(cb = 0xC0627A88, dev = 0xEDCD0808, state =
> (event = 2), info = 0xC0B6EF9B)
> > 	-011 |__device_suspend(dev = 0xEDCD0808, state = (event = 2), ?)
> > 	-012 |device_suspend(inline)
> 
> Suspend can't complete until device_suspend() returns.  Hence,
> mmc_power_off() must finish.
> 
> > mmc_rescan() resume path:
> > 	 -000 |context_switch(inline)
> > 	 -000 |__schedule()
> > 	 -001 |do_undefinstr(regs = 0xD12242F0)
> > 	 -002 |__und_svc(asm)
> > 	  --> |exception
> 
> I assume this is what you mean by "strange behaviour" ?  If so, please give
> the full oops.  Have you fully diagnosed the failure?
> 
> > most mmc power callback function don't check this special case, and
> > will cause problems, make sure the workqueue is stopped during suspend
> > is more safe.
> 
> I think there's a bad side effect of this: if you have swap on a SD card,
> and you ask the system to hibernate, you don't want this thread to freeze.
> 
> What you do need to do is to ensure that new requests can't be processed
> while the host is suspended.
> 
> A hibernate works (approximately) as:
> 
> - freeze all tasks
> - push as much out to swap as possible
> - suspend devices
> - create system snapshot
> - resume devices
> - write system snapshot to swap
> - power down
> 
> If the MMC thread is frozen, and you have swap on SD, then it can't write
> the image to swap.
> 
I see your concerns, thanks for your comment.
I will think about some other solutions for this problems.


Thank you
diff mbox

Patch

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index a6c139d..ae8757f 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -3881,7 +3881,7 @@  static int __init mmc_init(void)
 {
 	int ret;
 
-	workqueue = alloc_ordered_workqueue("kmmcd", 0);
+	workqueue = alloc_ordered_workqueue("kmmcd", WQ_FREEZABLE);
 	if (!workqueue)
 		return -ENOMEM;