diff mbox

[v3.10,regression] deadlock on cpu hotplug

Message ID 51DB724F.9050708@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Michael Wang July 9, 2013, 2:15 a.m. UTC
Hi, Bartlomiej

On 07/08/2013 11:26 PM, Bartlomiej Zolnierkiewicz wrote:
[snip]
> 
> # echo 0 > /sys/devices/system/cpu/cpu3/online
> # echo 0 > /sys/devices/system/cpu/cpu2/online
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> # while true;do echo 1 > /sys/devices/system/cpu/cpu1/online;echo 0 > /sys/devices/system/cpu/cpu1/online;done
> 
> The commit in question (2f7021a8) was merged in v3.10-rc5 as a fix for
> commit 031299b ("cpufreq: governors: Avoid unnecessary per cpu timer
> interrupts") which was causing a kernel warning to show up.
> 
> Michael/Viresh: do you have some idea how to fix the issue?

Thanks for the report :) would you like to take a try
on below patch and see whether it solve the issue?


Regards,
Michael Wang

> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bartlomiej Zolnierkiewicz July 9, 2013, 11:51 a.m. UTC | #1
Hi,

On Tuesday, July 09, 2013 10:15:43 AM Michael Wang wrote:
> Hi, Bartlomiej
> 
> On 07/08/2013 11:26 PM, Bartlomiej Zolnierkiewicz wrote:
> [snip]
> > 
> > # echo 0 > /sys/devices/system/cpu/cpu3/online
> > # echo 0 > /sys/devices/system/cpu/cpu2/online
> > # echo 0 > /sys/devices/system/cpu/cpu1/online
> > # while true;do echo 1 > /sys/devices/system/cpu/cpu1/online;echo 0 > /sys/devices/system/cpu/cpu1/online;done
> > 
> > The commit in question (2f7021a8) was merged in v3.10-rc5 as a fix for
> > commit 031299b ("cpufreq: governors: Avoid unnecessary per cpu timer
> > interrupts") which was causing a kernel warning to show up.
> > 
> > Michael/Viresh: do you have some idea how to fix the issue?
> 
> Thanks for the report :) would you like to take a try
> on below patch and see whether it solve the issue?

It doesn't help and unfortunately it just can't help as it only
addresses lockdep functionality while the issue is not a lockdep
problem but a genuine locking problem. CPU hot-unplug invokes
_cpu_down() which calls cpu_hotplug_begin() which in turn takes
&cpu_hotplug.lock. The lock is then hold during __cpu_notify()
call. Notifier chain goes up to cpufreq_governor_dbs() which for
CPUFREQ_GOV_STOP event does gov_cancel_work(). This function
flushes pending work and waits for it to finish. The all above
happens in one kernel thread. At the same time the other kernel
thread is doing the work we are waiting to complete and it also
happens to do gov_queue_work() which calls get_online_cpus().
Then the code tries to take &cpu_hotplug.lock which is already
held by the first thread and deadlocks.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad..aa05eaa 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>         }
>  }
>  
> +static struct lock_class_key j_cdbs_key;
> +
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>                 struct common_dbs_data *cdata, unsigned int event)
>  {
> @@ -366,6 +368,8 @@ int (struct cpufreq_policy *policy,
>                                         kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>  
>                         mutex_init(&j_cdbs->timer_mutex);
> +                       lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
> +
>                         INIT_DEFERRABLE_WORK(&j_cdbs->work,
>                                              dbs_data->cdata->gov_dbs_timer);
>                 }
> 
> Regards,
> Michael Wang

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Wang July 10, 2013, 2:40 a.m. UTC | #2
On 07/09/2013 07:51 PM, Bartlomiej Zolnierkiewicz wrote:
[snip]
> 
> It doesn't help and unfortunately it just can't help as it only
> addresses lockdep functionality while the issue is not a lockdep
> problem but a genuine locking problem. CPU hot-unplug invokes
> _cpu_down() which calls cpu_hotplug_begin() which in turn takes
> &cpu_hotplug.lock. The lock is then hold during __cpu_notify()
> call. Notifier chain goes up to cpufreq_governor_dbs() which for
> CPUFREQ_GOV_STOP event does gov_cancel_work(). This function
> flushes pending work and waits for it to finish. The all above
> happens in one kernel thread. At the same time the other kernel
> thread is doing the work we are waiting to complete and it also
> happens to do gov_queue_work() which calls get_online_cpus().
> Then the code tries to take &cpu_hotplug.lock which is already
> held by the first thread and deadlocks.

Hmm...I think I get your point, some thread hold the lock and
flush some work which also try to hold the same lock, correct?

Ok, that's a problem, let's figure out a good way to solve it :)

Regards,
Michael Wang




> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> 
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 5af40ad..aa05eaa 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -229,6 +229,8 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>>         }
>>  }
>>  
>> +static struct lock_class_key j_cdbs_key;
>> +
>>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>                 struct common_dbs_data *cdata, unsigned int event)
>>  {
>> @@ -366,6 +368,8 @@ int (struct cpufreq_policy *policy,
>>                                         kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>>  
>>                         mutex_init(&j_cdbs->timer_mutex);
>> +                       lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
>> +
>>                         INIT_DEFERRABLE_WORK(&j_cdbs->work,
>>                                              dbs_data->cdata->gov_dbs_timer);
>>                 }
>>
>> Regards,
>> Michael Wang
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5af40ad..aa05eaa 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -229,6 +229,8 @@  static void set_sampling_rate(struct dbs_data *dbs_data,
        }
 }
 
+static struct lock_class_key j_cdbs_key;
+
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
                struct common_dbs_data *cdata, unsigned int event)
 {
@@ -366,6 +368,8 @@  int (struct cpufreq_policy *policy,
                                        kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
                        mutex_init(&j_cdbs->timer_mutex);
+                       lockdep_set_class(&j_cdbs->timer_mutex, &j_cdbs_key);
+
                        INIT_DEFERRABLE_WORK(&j_cdbs->work,
                                             dbs_data->cdata->gov_dbs_timer);
                }