Message ID | 20130831003641.GE19754@codeaurora.org (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On Friday, August 30, 2013 05:36:41 PM Stephen Boyd wrote: > On 08/29, Viresh Kumar wrote: > > On 28 August 2013 22:22, Stephen Boyd <sboyd@codeaurora.org> wrote: > > > > > > I've applied these patches on top of v3.10 > > > > > > f51e1eb63d9c28cec188337ee656a13be6980cfd (cpufreq: Fix cpufreq regression after suspend/resume > > > aae760ed21cd690fe8a6db9f3a177ad55d7e12ab (cpufreq: Revert commit a66b2e to fix suspend/resume regression) > > > e8d05276f236ee6435e78411f62be9714e0b9377 (cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression) > > > 2a99859932281ed6c2ecdd988855f8f6838f6743 (cpufreq: Fix cpufreq driver module refcount balance after suspend/resume) > > > 419e172145cf6c51d436a8bf4afcd17511f0ff79 (cpufreq: don't leave stale policy pointer in cdbs->cur_policy) > > > 95731ebb114c5f0c028459388560fc2a72fe5049 (cpufreq: Fix governor start/stop race condition) > > > > > > That second to last one causes a NULL pointer exception after the mutex > > > warning above because the limits case does > > > > > > if (policy->max < cpu_cdbs->cur_policy->cur) > > > > > > and that dereferences a NULL cur_policy pointer. > > > > I have seen something similar and the error checking patch that > > I mentioned earlier came as solution to that only.. > > Yes that patch may reduce the chance of the race condition but I > don't believe it removes it entirely. I believe this bug still > exists in linux-next. Consider the scenario where CPU1 is going > down. > > __cpufreq_remove_dev() > ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); > __cpufreq_governor() > policy->governor->governor(policy, CPUFREQ_GOV_STOP); > cpufreq_governor_dbs() > case CPUFREQ_GOV_STOP: > mutex_destroy(&cpu_cdbs->timer_mutex) > cpu_cdbs->cur_policy = NULL; > <PREEMPT> > store() > __cpufreq_set_policy() > ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); > __cpufreq_governor() > policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); > case CPUFREQ_GOV_LIMITS: > mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) > if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL > > Once we stop the governor I don't see how another thread can't > race in and get all the way down into the GOV_LIMITS case. Even > if we wanted to lock out that thread with some mutex or semaphore > it will have to continue running eventually and so we really need > to wait until all the sysfs files are gone before we stop the > governor (in the case of the last cpu for the policy) or we need > to stop and start the governor while holding the policy semaphore > to prevent a race. > > > > > > Are there any fixes that I'm missing? I see that some things are > > > changing in linux-next but they don't look like fixes, more like > > > optimizations. > > > > Getting patches over 3.10 would be tricky.. You are two kernel > > version back and that's not going to help much.. There are too many > > patches in between linux-next and 3.10.. > > > > > > I really can't tell you which specific ones to include, as I am lost in them :) > > That's a problem. 3.10 is the next long term stable kernel and so we need to > backport any fixes to 3.10 for the next two years. Hopefully these bugs I'm > finding in the 3.10 stable kernel's cpufreq code aren't known issues on > 3.11/next. No, they aren't. Well, that's the main reason why I've been pushing back against more churn in the cpuidle subsystem recently. I think we went too far with changes that were not entirely understood and now we're seeing the fallout. It would be great if you could identify the 3.11 changes that fix problems you're seeing in 3.10.y (doing a reverse bisect of drivers/cpufreq/ changes might help, since you have reproducers it seems). Thanks, Rafael
On Saturday, August 31, 2013 02:55:57 AM Rafael J. Wysocki wrote: > On Friday, August 30, 2013 05:36:41 PM Stephen Boyd wrote: > > On 08/29, Viresh Kumar wrote: > > > On 28 August 2013 22:22, Stephen Boyd <sboyd@codeaurora.org> wrote: > > > > > > > > I've applied these patches on top of v3.10 > > > > > > > > f51e1eb63d9c28cec188337ee656a13be6980cfd (cpufreq: Fix cpufreq regression after suspend/resume > > > > aae760ed21cd690fe8a6db9f3a177ad55d7e12ab (cpufreq: Revert commit a66b2e to fix suspend/resume regression) > > > > e8d05276f236ee6435e78411f62be9714e0b9377 (cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression) > > > > 2a99859932281ed6c2ecdd988855f8f6838f6743 (cpufreq: Fix cpufreq driver module refcount balance after suspend/resume) > > > > 419e172145cf6c51d436a8bf4afcd17511f0ff79 (cpufreq: don't leave stale policy pointer in cdbs->cur_policy) > > > > 95731ebb114c5f0c028459388560fc2a72fe5049 (cpufreq: Fix governor start/stop race condition) > > > > > > > > That second to last one causes a NULL pointer exception after the mutex > > > > warning above because the limits case does > > > > > > > > if (policy->max < cpu_cdbs->cur_policy->cur) > > > > > > > > and that dereferences a NULL cur_policy pointer. > > > > > > I have seen something similar and the error checking patch that > > > I mentioned earlier came as solution to that only.. > > > > Yes that patch may reduce the chance of the race condition but I > > don't believe it removes it entirely. I believe this bug still > > exists in linux-next. Consider the scenario where CPU1 is going > > down. > > > > __cpufreq_remove_dev() > > ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); > > __cpufreq_governor() > > policy->governor->governor(policy, CPUFREQ_GOV_STOP); > > cpufreq_governor_dbs() > > case CPUFREQ_GOV_STOP: > > mutex_destroy(&cpu_cdbs->timer_mutex) > > cpu_cdbs->cur_policy = NULL; > > <PREEMPT> > > store() > > __cpufreq_set_policy() > > ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); > > __cpufreq_governor() > > policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); > > case CPUFREQ_GOV_LIMITS: > > mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) > > if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL > > > > Once we stop the governor I don't see how another thread can't > > race in and get all the way down into the GOV_LIMITS case. Even > > if we wanted to lock out that thread with some mutex or semaphore > > it will have to continue running eventually and so we really need > > to wait until all the sysfs files are gone before we stop the > > governor (in the case of the last cpu for the policy) or we need > > to stop and start the governor while holding the policy semaphore > > to prevent a race. > > > > > > > > > Are there any fixes that I'm missing? I see that some things are > > > > changing in linux-next but they don't look like fixes, more like > > > > optimizations. > > > > > > Getting patches over 3.10 would be tricky.. You are two kernel > > > version back and that's not going to help much.. There are too many > > > patches in between linux-next and 3.10.. > > > > > > > > > I really can't tell you which specific ones to include, as I am lost in them :) > > > > That's a problem. 3.10 is the next long term stable kernel and so we need to > > backport any fixes to 3.10 for the next two years. Hopefully these bugs I'm > > finding in the 3.10 stable kernel's cpufreq code aren't known issues on > > 3.11/next. > > No, they aren't. > > Well, that's the main reason why I've been pushing back against more churn in > the cpuidle subsystem recently. I think we went too far with changes that > were not entirely understood and now we're seeing the fallout. s/cpuidle/cpufreq/ Apparently, I'm already too tired. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 31 August 2013 06:06, Stephen Boyd <sboyd@codeaurora.org> wrote: > Yes that patch may reduce the chance of the race condition but I > don't believe it removes it entirely. I believe this bug still > exists in linux-next. Consider the scenario where CPU1 is going > down. > > __cpufreq_remove_dev() > ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); > __cpufreq_governor() > policy->governor->governor(policy, CPUFREQ_GOV_STOP); > cpufreq_governor_dbs() > case CPUFREQ_GOV_STOP: > mutex_destroy(&cpu_cdbs->timer_mutex) > cpu_cdbs->cur_policy = NULL; > <PREEMPT> > store() > __cpufreq_set_policy() > ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); > __cpufreq_governor() > policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); > case CPUFREQ_GOV_LIMITS: > mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) > if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL Some of the crashes you reported would be fixed by the patches I sent today morning. Let me know if anything else is left for latest linux-next... Btw, I am facing another crash which I am not sure how to fix.. It came with your script: [ 190.850481] ------------[ cut here ]------------ [ 190.850489] WARNING: CPU: 3 PID: 14140 at /home/arm/work/kernel/mywork/linux.git/include/linux/kref.h:47 kobject_get+0x42/0x50() [ 190.850490] Modules linked in: nfsd nfs fscache lockd arc4 iwldvm mac80211 i915 iwlwifi drm_kms_helper nfs_acl auth_rpcgss cfg80211 sunrpc drm joyd ev thinkpad_acpi snd_hda_codec_hdmi snd_seq_midi snd_hda_codec_conexant oid_registry btusb snd_rawmidi snd_hda_intel snd_hda_codec i2c_algo_bit rfcomm snd_seq_midi_event bnep psmouse snd_seq snd_hwdep snd_pcm bluetooth snd_timer snd_seq_device parport_pc snd_page_alloc ppdev tpm_tis snd soundcore lp c_ich lp parport video serio_raw mac_hid wmi nvram binfmt_misc btrfs raid6_pq e1000e ptp pps_core xor sdhci_pci sdhci zlib_deflate libcrc32c [ 190.850563] CPU: 3 PID: 14140 Comm: sh Not tainted 3.11.0-rc7-custom #39 [ 190.850567] Hardware name: LENOVO 4236G50/4236G50, BIOS 83ET70WW (1.40 ) 06/12/2012 [ 190.850571] 000000000000002f ffff8800c8bdfc38 ffffffff816746c3 0000000000000007 [ 190.850580] 0000000000000000 ffff8800c8bdfc78 ffffffff8104cf8c ffff88011e5f9b18 [ 190.850587] ffff880118eaf000 0000000000000001 0000000000000202 0000000000000008 [ 190.850593] Call Trace: [ 190.850607] [<ffffffff816746c3>] dump_stack+0x46/0x58 [ 190.850615] [<ffffffff8104cf8c>] warn_slowpath_common+0x8c/0xc0 [ 190.850622] [<ffffffff8104cfda>] warn_slowpath_null+0x1a/0x20 [ 190.850629] [<ffffffff81324e02>] kobject_get+0x42/0x50 [ 190.850638] [<ffffffff81533ab0>] cpufreq_cpu_get+0x80/0xc0 [ 190.850647] [<ffffffff81533c11>] cpufreq_get_policy+0x21/0x120 [ 190.850655] [<ffffffff81533fdf>] store_scaling_min_freq+0x3f/0xa0 [ 190.850666] [<ffffffff816785b6>] ? down_write+0x16/0x40 [ 190.850674] [<ffffffff81533000>] store+0x70/0xb0 [ 190.850683] [<ffffffff811f2582>] sysfs_write_file+0xe2/0x170 [ 190.850693] [<ffffffff81181e8e>] vfs_write+0xce/0x200 [ 190.850700] [<ffffffff81182392>] SyS_write+0x52/0xa0 [ 190.850707] [<ffffffff81683882>] system_call_fastpath+0x16/0x1b -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday, September 01, 2013 11:54:10 AM Viresh Kumar wrote: > On 31 August 2013 06:06, Stephen Boyd <sboyd@codeaurora.org> wrote: > > Yes that patch may reduce the chance of the race condition but I > > don't believe it removes it entirely. I believe this bug still > > exists in linux-next. Consider the scenario where CPU1 is going > > down. > > > > __cpufreq_remove_dev() > > ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP); > > __cpufreq_governor() > > policy->governor->governor(policy, CPUFREQ_GOV_STOP); > > cpufreq_governor_dbs() > > case CPUFREQ_GOV_STOP: > > mutex_destroy(&cpu_cdbs->timer_mutex) > > cpu_cdbs->cur_policy = NULL; > > <PREEMPT> > > store() > > __cpufreq_set_policy() > > ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); > > __cpufreq_governor() > > policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); > > case CPUFREQ_GOV_LIMITS: > > mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) > > if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL > > Some of the crashes you reported would be fixed by the patches I sent > today morning. > > Let me know if anything else is left for latest linux-next... > > Btw, I am facing another crash which I am not sure how to fix.. It > came with your script: This isn't a crash, but a WARN_ON_ONCE() triggering. The comment in kref_get() explains when that occurs, so we seem to have a race between store_scaling_min_freq() and CPU removal. > [ 190.850481] ------------[ cut here ]------------ > [ 190.850489] WARNING: CPU: 3 PID: 14140 at > /home/arm/work/kernel/mywork/linux.git/include/linux/kref.h:47 > kobject_get+0x42/0x50() > [ 190.850490] Modules linked in: nfsd nfs fscache lockd arc4 iwldvm > mac80211 i915 iwlwifi drm_kms_helper nfs_acl auth_rpcgss cfg80211 > sunrpc drm joyd > ev thinkpad_acpi snd_hda_codec_hdmi snd_seq_midi > snd_hda_codec_conexant oid_registry btusb snd_rawmidi snd_hda_intel > snd_hda_codec i2c_algo_bit rfcomm > snd_seq_midi_event bnep psmouse snd_seq snd_hwdep snd_pcm bluetooth > snd_timer snd_seq_device parport_pc snd_page_alloc ppdev tpm_tis snd > soundcore lp > c_ich lp parport video serio_raw mac_hid wmi nvram binfmt_misc btrfs > raid6_pq e1000e ptp pps_core xor sdhci_pci sdhci zlib_deflate > libcrc32c > [ 190.850563] CPU: 3 PID: 14140 Comm: sh Not tainted 3.11.0-rc7-custom #39 > [ 190.850567] Hardware name: LENOVO 4236G50/4236G50, BIOS 83ET70WW > (1.40 ) 06/12/2012 > [ 190.850571] 000000000000002f ffff8800c8bdfc38 ffffffff816746c3 > 0000000000000007 > [ 190.850580] 0000000000000000 ffff8800c8bdfc78 ffffffff8104cf8c > ffff88011e5f9b18 > [ 190.850587] ffff880118eaf000 0000000000000001 0000000000000202 > 0000000000000008 > [ 190.850593] Call Trace: > [ 190.850607] [<ffffffff816746c3>] dump_stack+0x46/0x58 > [ 190.850615] [<ffffffff8104cf8c>] warn_slowpath_common+0x8c/0xc0 > [ 190.850622] [<ffffffff8104cfda>] warn_slowpath_null+0x1a/0x20 > [ 190.850629] [<ffffffff81324e02>] kobject_get+0x42/0x50 > [ 190.850638] [<ffffffff81533ab0>] cpufreq_cpu_get+0x80/0xc0 > [ 190.850647] [<ffffffff81533c11>] cpufreq_get_policy+0x21/0x120 > [ 190.850655] [<ffffffff81533fdf>] store_scaling_min_freq+0x3f/0xa0 > [ 190.850666] [<ffffffff816785b6>] ? down_write+0x16/0x40 > [ 190.850674] [<ffffffff81533000>] store+0x70/0xb0 > [ 190.850683] [<ffffffff811f2582>] sysfs_write_file+0xe2/0x170 > [ 190.850693] [<ffffffff81181e8e>] vfs_write+0xce/0x200 > [ 190.850700] [<ffffffff81182392>] SyS_write+0x52/0xa0 > [ 190.850707] [<ffffffff81683882>] system_call_fastpath+0x16/0x1b
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index cbfe3c1..ae4b59c 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -755,6 +755,29 @@ static int cpufreq_add_dev_interface(unsigned int cpu, if (ret) return ret; + write_lock_irqsave(&cpufreq_driver_lock, flags); + for_each_cpu(j, policy->cpus) { + per_cpu(cpufreq_cpu_data, j) = policy; + per_cpu(cpufreq_policy_cpu, j) = policy->cpu; + } + write_unlock_irqrestore(&cpufreq_driver_lock, flags); + + memcpy(&new_policy, policy, sizeof(struct cpufreq_policy)); + /* assure that the starting sequence is run in __cpufreq_set_policy */ + policy->governor = NULL; + + /* set default policy */ + ret = __cpufreq_set_policy(policy, &new_policy); + policy->user_policy.policy = policy->policy; + policy->user_policy.governor = policy->governor; + + if (ret) { + pr_debug("setting policy failed\n"); + if (cpufreq_driver->exit) + cpufreq_driver->exit(policy); + return ret; + } + /* set up files for this cpu device */ drv_attr = cpufreq_driver->attr; while ((drv_attr) && (*drv_attr)) { @@ -779,31 +802,10 @@ static int cpufreq_add_dev_interface(unsigned int cpu, goto err_out_kobj_put; } - write_lock_irqsave(&cpufreq_driver_lock, flags); - for_each_cpu(j, policy->cpus) { - per_cpu(cpufreq_cpu_data, j) = policy; - per_cpu(cpufreq_policy_cpu, j) = policy->cpu; - } - write_unlock_irqrestore(&cpufreq_driver_lock, flags); - ret = cpufreq_add_dev_symlink(cpu, policy); if (ret) goto err_out_kobj_put; - memcpy(&new_policy, policy, sizeof(struct cpufreq_policy)); - /* assure that the starting sequence is run in __cpufreq_set_policy */ - policy->governor = NULL; - - /* set default policy */ - ret = __cpufreq_set_policy(policy, &new_policy); - policy->user_policy.policy = policy->policy; - policy->user_policy.governor = policy->governor; - - if (ret) { - pr_debug("setting policy failed\n"); - if (cpufreq_driver->exit) - cpufreq_driver->exit(policy); - } return ret; err_out_kobj_put: