diff mbox

i2c-davinci.c: CPU FREQ causes lock up due to xfr_complete

Message ID CADL8D3YchqfT+ikKTCkLNLmwm2HYKDzZy-g4T+mJLYBJ1rC72Q@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jon Cormier July 29, 2014, 2:19 p.m. UTC
Reported issue:

1. Enable I2C1, flash the new kernel and reboot
2. Immediately after reboot, attempt to change the processor clock: "echo
456000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
3. Process blocks

However, if we do the following:

1. Enable I2C1, flash the new kernel and reboot
2. Immediately after reboot, run: "i2cdetect -y 2 0x08 0x08" or just
"i2cdetect -y 2"
3. Then run: "echo 456000 >
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
4. Command succeeds

Here's the kernel hung task stack trace:

INFO: task sh:1428 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
sh D c026dc74 0 1428 1426 0x00000000
[<c026dc74>] (schedule+0x2a8/0x334) from [<c026e2e0>]
(schedule_timeout+0x1c/0x218)
[<c026e2e0>] (schedule_timeout+0x1c/0x218) from [<c026e164>]
(wait_for_common+0xf0/0x1b8)
[<c026e164>] (wait_for_common+0xf0/0x1b8) from [<c019ca1c>]
(i2c_davinci_cpufreq_transition+0x18/0x50)
[<c019ca1c>] (i2c_davinci_cpufreq_transition+0x18/0x50) from [<c00599a4>]
(notifier_call_chain+0x38/0x68)
[<c00599a4>] (notifier_call_chain+0x38/0x68) from [<c0059a80>]
(__srcu_notifier_call_chain+0x40/0x58)
[<c0059a80>] (__srcu_notifier_call_chain+0x40/0x58) from [<c0059aac>]
(srcu_notifier_call_chain+0x14/0x18)
[<c0059aac>] (srcu_notifier_call_chain+0x14/0x18) from [<c019dd78>]
(cpufreq_notify_transition+0xc8/0xfc)
[<c019dd78>] (cpufreq_notify_transition+0xc8/0xfc) from [<c00373ac>]
(davinci_target+0x144/0x154)
[<c00373ac>] (davinci_target+0x144/0x154) from [<c019d404>]
(__cpufreq_driver_target+0x28/0x38)
[<c019d404>] (__cpufreq_driver_target+0x28/0x38) from [<c019f260>]
(cpufreq_set+0x54/0x70)
[<c019f260>] (cpufreq_set+0x54/0x70) from [<c019d698>]
(store_scaling_setspeed+0x58/0x6c)
[<c019d698>] (store_scaling_setspeed+0x58/0x6c) from [<c019e3d0>]
(store+0x58/0x74)
[<c019e3d0>] (store+0x58/0x74) from [<c00d8854>]
(sysfs_write_file+0x108/0x140)
[<c00d8854>] (sysfs_write_file+0x108/0x140) from [<c009512c>]
(vfs_write+0xb0/0x118)
[<c009512c>] (vfs_write+0xb0/0x118) from [<c0095244>] (sys_write+0x3c/0x68)
[<c0095244>] (sys_write+0x3c/0x68) from [<c002bea0>]
(ret_fast_syscall+0x0/0x28)
Kernel panic - not syncing: hung_task: blocked tasks
[<c003069c>] (unwind_backtrace+0x0/0xd0) from [<c026d810>] (panic+0x44/0xc8)
[<c026d810>] (panic+0x44/0xc8) from [<c006aa7c>] (watchdog+0x1d4/0x21c)
[<c006aa7c>] (watchdog+0x1d4/0x21c) from [<c0054670>] (kthread+0x78/0x80)
[<c0054670>] (kthread+0x78/0x80) from [<c002c8dc>]
(kernel_thread_exit+0x0/0x8)

According to the stack trace the kernel gets stuck in the
"i2c_davinci_cpufreq_transition" function when it calls
"wait_for_completion(&dev->xfr_complete);"  The two other places this
xfr_complete variable is referenced is the init_completion in the probe and
the complete at the end of the i2c_davinci_xfer function. My understanding
as to what this was intended for was to ensure that a transfer in progress
completed before changing the clock frequency.  But as its currently done
the only thing it does is make sure there has been a completed i2c transfer
on this device ever.  Is my understanding correct?

Currently the workaround is to simply disable the wait_for_completion as
seen below.  How would you fix this to ensure a transfer in progress
completes before changing clocks without hanging if no transfer was ever
attempted?
                i2c_davinci_calc_clk_dividers(dev);

Patch were this was introduced:
sha: 82c0de11b734c5acec13c0f6007466da81cd16d9  i2c:davinci:Add cpufreq
support

Comments

Brian Niebuhr July 29, 2014, 2:38 p.m. UTC | #1
> -----Original Message-----
> From: davinci-linux-open-source-bounces@linux.davincidsp.com
> [mailto:davinci-linux-open-source-bounces@linux.davincidsp.com] On
> Behalf Of Jon Cormier
> Sent: Tuesday, July 29, 2014 9:20 AM
> To: davinci-linux-open-source@linux.davincidsp.com
> Cc: Tim Iskander
> Subject: i2c-davinci.c: CPU FREQ causes lock up due to xfr_complete
>
> Reported issue:
>
> 1. Enable I2C1, flash the new kernel and reboot
> 2. Immediately after reboot, attempt to change the processor clock: "echo
> 456000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
> 3. Process blocks
>
> However, if we do the following:
> 1. Enable I2C1, flash the new kernel and reboot
> 2. Immediately after reboot, run: "i2cdetect -y 2 0x08 0x08" or just "i2cdetect
> -y 2"
> 3. Then run: "echo 456000 >
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
> 4. Command succeeds

I found this problem as well, and I haven't had time to put together a proper patch.  Right now we're just working around it.  There are several problems with the use of a completion for this task:

1. If no I2C transfer has occurred, a cpufreq transition will block forever (which is the bug you found)
2. Once an I2C transfer has occurred the cpufreq transition will never block since the completion is never reinitialized.
3. Even if you do reinitialize the completion for every I2C transfer, (1) is not solved and there is still a race condition where the cpufreq transition could start just before an I2C transfer starts and the I2C transfer occurs during the cpufreq transition.

Personally I think the correct solution is to use a mutex instead of a completion.  The cpufreq code would take the mutex during the prechange callback and release it during the postchange callback.  Likewise the I2C transfer function would hold the mutex while an I2C transfer is ongoing.  That way an I2C transfer cannot occur during a cpufreq transition and vice-versa.  As I mentioned, I have not had time to put together a proper patch, but I would be happy to see this issue addressed if someone else can do it.


> Here's the kernel hung task stack trace:
>
> INFO: task sh:1428 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> sh D c026dc74 0 1428 1426 0x00000000
> [<c026dc74>] (schedule+0x2a8/0x334) from [<c026e2e0>]
> (schedule_timeout+0x1c/0x218)
> [<c026e2e0>] (schedule_timeout+0x1c/0x218) from [<c026e164>]
> (wait_for_common+0xf0/0x1b8)
> [<c026e164>] (wait_for_common+0xf0/0x1b8) from [<c019ca1c>]
> (i2c_davinci_cpufreq_transition+0x18/0x50)
> [<c019ca1c>] (i2c_davinci_cpufreq_transition+0x18/0x50) from [<c00599a4>]
> (notifier_call_chain+0x38/0x68)
> [<c00599a4>] (notifier_call_chain+0x38/0x68) from [<c0059a80>]
> (__srcu_notifier_call_chain+0x40/0x58)
> [<c0059a80>] (__srcu_notifier_call_chain+0x40/0x58) from [<c0059aac>]
> (srcu_notifier_call_chain+0x14/0x18)
> [<c0059aac>] (srcu_notifier_call_chain+0x14/0x18) from [<c019dd78>]
> (cpufreq_notify_transition+0xc8/0xfc)
> [<c019dd78>] (cpufreq_notify_transition+0xc8/0xfc) from [<c00373ac>]
> (davinci_target+0x144/0x154)
> [<c00373ac>] (davinci_target+0x144/0x154) from [<c019d404>]
> (__cpufreq_driver_target+0x28/0x38)
> [<c019d404>] (__cpufreq_driver_target+0x28/0x38) from [<c019f260>]
> (cpufreq_set+0x54/0x70)
> [<c019f260>] (cpufreq_set+0x54/0x70) from [<c019d698>]
> (store_scaling_setspeed+0x58/0x6c)
> [<c019d698>] (store_scaling_setspeed+0x58/0x6c) from [<c019e3d0>]
> (store+0x58/0x74)
> [<c019e3d0>] (store+0x58/0x74) from [<c00d8854>]
> (sysfs_write_file+0x108/0x140)
> [<c00d8854>] (sysfs_write_file+0x108/0x140) from [<c009512c>]
> (vfs_write+0xb0/0x118)
> [<c009512c>] (vfs_write+0xb0/0x118) from [<c0095244>]
> (sys_write+0x3c/0x68)
> [<c0095244>] (sys_write+0x3c/0x68) from [<c002bea0>]
> (ret_fast_syscall+0x0/0x28)
> Kernel panic - not syncing: hung_task: blocked tasks
> [<c003069c>] (unwind_backtrace+0x0/0xd0) from [<c026d810>]
> (panic+0x44/0xc8)
> [<c026d810>] (panic+0x44/0xc8) from [<c006aa7c>] (watchdog+0x1d4/0x21c)
> [<c006aa7c>] (watchdog+0x1d4/0x21c) from [<c0054670>]
> (kthread+0x78/0x80)
> [<c0054670>] (kthread+0x78/0x80) from [<c002c8dc>]
> (kernel_thread_exit+0x0/0x8)
> According to the stack trace the kernel gets stuck in the
> "i2c_davinci_cpufreq_transition" function when it calls
> "wait_for_completion(&dev->xfr_complete);"  The two other places this
> xfr_complete variable is referenced is the init_completion in the probe and
> the complete at the end of the i2c_davinci_xfer function. My understanding
> as to what this was intended for was to ensure that a transfer in progress
> completed before changing the clock frequency.  But as its currently done
> the only thing it does is make sure there has been a completed i2c transfer
> on this device ever.  Is my understanding correct?
> Currently the workaround is to simply disable the wait_for_completion as
> seen below.  How would you fix this to ensure a transfer in progress
> completes before changing clocks without hanging if no transfer was ever
> attempted?
> diff --git a/drivers/i2c/busses/i2c-davinci.c b/drivers/i2c/busses/i2c-davinci.c
> index a76d85f..564247f 100644
> --- a/drivers/i2c/busses/i2c-davinci.c
> +++ b/drivers/i2c/busses/i2c-davinci.c
> @@ -596,7 +596,7 @@ static int i2c_davinci_cpufreq_transition(struct
> notifier_block *nb,
>
>         dev = container_of(nb, struct davinci_i2c_dev, freq_transition);
>         if (val == CPUFREQ_PRECHANGE) {
> -               wait_for_completion(&dev->xfr_complete);
> +               //wait_for_completion(&dev->xfr_complete);
>                 davinci_i2c_reset_ctrl(dev, 0);
>         } else if (val == CPUFREQ_POSTCHANGE) {
>                 i2c_davinci_calc_clk_dividers(dev);
> Patch were this was introduced:
> sha: 82c0de11b734c5acec13c0f6007466da81cd16d9  i2c:davinci:Add cpufreq
> support
>
> --
> Jonathan Cormier
> CriticalLink
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.
diff mbox

Patch

diff --git a/drivers/i2c/busses/i2c-davinci.c
b/drivers/i2c/busses/i2c-davinci.c
index a76d85f..564247f 100644
--- a/drivers/i2c/busses/i2c-davinci.c
+++ b/drivers/i2c/busses/i2c-davinci.c
@@ -596,7 +596,7 @@  static int i2c_davinci_cpufreq_transition(struct
notifier_block *nb,

        dev = container_of(nb, struct davinci_i2c_dev, freq_transition);
        if (val == CPUFREQ_PRECHANGE) {
-               wait_for_completion(&dev->xfr_complete);
+               //wait_for_completion(&dev->xfr_complete);
                davinci_i2c_reset_ctrl(dev, 0);
        } else if (val == CPUFREQ_POSTCHANGE) {