diff mbox

[v2,5/5] cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end

Message ID 20140428185507.28755.6483.stgit@srivatsabhat.in.ibm.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Srivatsa S. Bhat April 28, 2014, 6:55 p.m. UTC
Some cpufreq drivers were redundantly invoking the _begin() and _end()
APIs around frequency transitions, and this double invocation (one from
the cpufreq core and the other from the cpufreq driver) used to result
in a self-deadlock, leading to system hangs during boot. (The _begin()
API makes contending callers wait until the previous invocation is
complete. Hence, the cpufreq driver would end up waiting on itself!).

Now all such drivers have been fixed, but debugging this issue was not
very straight-forward (even lockdep didn't catch this). So let us add a
debug infrastructure to the cpufreq core to catch such issues more easily
in the future.

We add a new field called 'transition_task' to the policy structure, to keep
track of the task which is performing the frequency transition. Using this
field, we make note of this task during _begin() and print a warning if we
find a case where the same task is calling _begin() again, before completing
the previous frequency transition using the corresponding _end().

One important aspect to consider is that it is valid for ASYNC_NOTIFICATION
drivers to invoke _begin() from one task and then invoke the corresponding
_end() in another task. We take care of this scenario by setting the value
of 'transition_task' once again explicitly in the _end() function. This
helps us handle the particular tricky scenario (shown below) that can occur
in ASYNC_NOTIFICATION drivers:

Scenario 1: (Deadlock-free)
----------

         Task A						Task B

    /* 1st freq transition */
    Invoke _begin() {
            ...
            ...
    }

    Change the frequency

						Got interrupt for successful
						change of frequency.

						/* 1st freq transition */
						Invoke _end() {
							...
							...
    /* 2nd freq transition */    			...
    Invoke _begin() {					...
	    ...	//waiting for B				...
            ... //to finish _end()		}
	    ...
	    ...
    }


This scenario is actually deadlock-free because Task A can wait inside the
second call to _begin() without self-deadlocking, because it is the
responsibility of Task B to finish the first sequence by invoking the
corresponding _end().

By setting the value of 'transition_task' again explicitly in _end(), we
ensure that the code won't print a false-positive warning in this case.

However the same code successfully catches the following deadlock-prone
scenario even in ASYNC_NOTIFICATION drivers:

Scenario 2: (Deadlock-prone)
----------

         Task A						Task B

    /* 1st freq transition */
    Invoke _begin() {
            ...
            ...
    }

    /* 2nd freq transition */
    Invoke _begin() {
	    ...
	    ...
    }

    Change the frequency


Here the bug is that Task A called the second _begin() *before* actually
performing the 1st frequency transition. In other words, it failed to set
Task B in motion for the 1st frequency transition, and hence it will
self-deadlock. This is very similar to the case of drivers which do
synchronous notification, and hence the debug infrastructure developed
in this patch can catch this scenario easily.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 drivers/cpufreq/cpufreq.c |   12 ++++++++++++
 include/linux/cpufreq.h   |    1 +
 2 files changed, 13 insertions(+)


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Viresh Kumar April 29, 2014, 4:51 a.m. UTC | #1
Nice effort.

On 29 April 2014 00:25, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Now all such drivers have been fixed, but debugging this issue was not
> very straight-forward (even lockdep didn't catch this). So let us add a
> debug infrastructure to the cpufreq core to catch such issues more easily
> in the future.

BUT, I am not sure if we really need it :(

I think we just got into the 'barrier' stuff as we had some doubts about it
earlier and were quite sure that nothing else could go wrong. Otherwise
the only problem could have been present was the second queuing
from the same thread. And we will surely get stuck if that happens and
we can't just miss that error..

> Scenario 1: (Deadlock-free)
> ----------
>
>          Task A                                         Task B
>
>     /* 1st freq transition */
>     Invoke _begin() {
>             ...
>             ...
>     }
>
>     Change the frequency
>
>                                                 Got interrupt for successful
>                                                 change of frequency.
>
>                                                 /* 1st freq transition */
>                                                 Invoke _end() {
>                                                         ...
>                                                         ...
>     /* 2nd freq transition */                           ...
>     Invoke _begin() {                                   ...
>             ... //waiting for B                         ...
>             ... //to finish _end()              }
>             ...
>             ...
>     }
>
>
> This scenario is actually deadlock-free because Task A can wait inside the
> second call to _begin() without self-deadlocking, because it is the
> responsibility of Task B to finish the first sequence by invoking the
> corresponding _end().
>
> By setting the value of 'transition_task' again explicitly in _end(), we
> ensure that the code won't print a false-positive warning in this case.
>
> However the same code successfully catches the following deadlock-prone
> scenario even in ASYNC_NOTIFICATION drivers:
>
> Scenario 2: (Deadlock-prone)
> ----------
>
>          Task A                                         Task B
>
>     /* 1st freq transition */
>     Invoke _begin() {
>             ...
>             ...
>     }
>
>     /* 2nd freq transition */
>     Invoke _begin() {
>             ...
>             ...
>     }
>
>     Change the frequency
>
>
> Here the bug is that Task A called the second _begin() *before* actually
> performing the 1st frequency transition. In other words, it failed to set
> Task B in motion for the 1st frequency transition, and hence it will
> self-deadlock. This is very similar to the case of drivers which do
> synchronous notification, and hence the debug infrastructure developed
> in this patch can catch this scenario easily.
>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
>
>  drivers/cpufreq/cpufreq.c |   12 ++++++++++++
>  include/linux/cpufreq.h   |    1 +
>  2 files changed, 13 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index abda660..2c99a6c 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -354,6 +354,10 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
>  void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
>                 struct cpufreq_freqs *freqs)
>  {
> +
> +       /* Catch double invocations of _begin() which lead to self-deadlock */
> +       WARN_ON(current == policy->transition_task);
> +
>  wait:
>         wait_event(policy->transition_wait, !policy->transition_ongoing);
>
> @@ -365,6 +369,7 @@ wait:
>         }
>
>         policy->transition_ongoing = true;
> +       policy->transition_task = current;
>
>         spin_unlock(&policy->transition_lock);
>
> @@ -378,9 +383,16 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>         if (unlikely(WARN_ON(!policy->transition_ongoing)))
>                 return;
>
> +       /*
> +        * The task invoking _end() could be different from the one that
> +        * invoked the _begin(). So set ->transition_task again here
> +        * explicity.
> +        */
> +       policy->transition_task = current;
>         cpufreq_notify_post_transition(policy, freqs, transition_failed);
>
>         policy->transition_ongoing = false;
> +       policy->transition_task = NULL;
>
>         wake_up(&policy->transition_wait);
>  }
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 5ae5100..8f44d79 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -110,6 +110,7 @@ struct cpufreq_policy {
>         bool                    transition_ongoing; /* Tracks transition status */
>         spinlock_t              transition_lock;
>         wait_queue_head_t       transition_wait;
> +       struct task_struct      *transition_task; /* Task which is doing the transition */
>  };
>
>  /* Only for ACPI */
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar April 29, 2014, 4:55 a.m. UTC | #2
On 29 April 2014 10:21, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Nice effort.
>
> On 29 April 2014 00:25, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Now all such drivers have been fixed, but debugging this issue was not
>> very straight-forward (even lockdep didn't catch this). So let us add a
>> debug infrastructure to the cpufreq core to catch such issues more easily
>> in the future.
>
> BUT, I am not sure if we really need it :(
>
> I think we just got into the 'barrier' stuff as we had some doubts about it
> earlier and were quite sure that nothing else could go wrong. Otherwise
> the only problem could have been present was the second queuing
> from the same thread. And we will surely get stuck if that happens and
> we can't just miss that error..
>
>> Scenario 1: (Deadlock-free)
>> ----------
>>
>>          Task A                                         Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>                                                 Got interrupt for successful
>>                                                 change of frequency.
>>
>>                                                 /* 1st freq transition */
>>                                                 Invoke _end() {
>>                                                         ...
>>                                                         ...
>>     /* 2nd freq transition */                           ...
>>     Invoke _begin() {                                   ...
>>             ... //waiting for B                         ...
>>             ... //to finish _end()              }
>>             ...
>>             ...
>>     }
>>
>>
>> This scenario is actually deadlock-free because Task A can wait inside the
>> second call to _begin() without self-deadlocking, because it is the
>> responsibility of Task B to finish the first sequence by invoking the
>> corresponding _end().

WTF, I was writing my mail and it just got send due to some stupid combination
of keys :( .. Sorry.

Also, this might not work as expected. Consider this scenario:

     /* 1st freq transition */
     Invoke _begin() {
             ...
             ...
     }

     Start Change of frequency and return
     back as there is no end from same thread.

     /* 2nd freq transition as there is nobody stopping us */
     Invoke _begin() {
             ... //waiting for B
             ... //to finish _end()
             ...
             ...
     }

                                                              Got
interrupt for successful
                                                              change
of frequency.

                                                              /* 1st
freq transition */
                                                              Invoke _end() {
                                                                     ...
                                                                     ...
                                                             }

And your patch will probably break this ?
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srivatsa S. Bhat April 29, 2014, 6:08 a.m. UTC | #3
On 04/29/2014 10:21 AM, Viresh Kumar wrote:
> Nice effort.
>

Thanks! :-)
 
> On 29 April 2014 00:25, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Now all such drivers have been fixed, but debugging this issue was not
>> very straight-forward (even lockdep didn't catch this). So let us add a
>> debug infrastructure to the cpufreq core to catch such issues more easily
>> in the future.
> 
> BUT, I am not sure if we really need it :(
> 
> I think we just got into the 'barrier' stuff as we had some doubts about it
> earlier and were quite sure that nothing else could go wrong. Otherwise
> the only problem could have been present was the second queuing
> from the same thread. And we will surely get stuck if that happens and
> we can't just miss that error..
> 

Yeah, and we _did_ hit that hang, but it was not at all intuitive at first
as to what was going wrong. Worse, even lockdep is not in a position to catch
such scenarios. So it definitely doesn't hurt to add a small infrastructure
to catch such issues in the future, IMHO.

Besides, if we can add features for users, surely we can also add some
non-intrusive debug code for ourselves too, to make our lives easier, right? :-)
I'm sure we deserve that privilege ;-)

Regards,
Srivatsa S. Bhat

>> Scenario 1: (Deadlock-free)
>> ----------
>>
>>          Task A                                         Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>                                                 Got interrupt for successful
>>                                                 change of frequency.
>>
>>                                                 /* 1st freq transition */
>>                                                 Invoke _end() {
>>                                                         ...
>>                                                         ...
>>     /* 2nd freq transition */                           ...
>>     Invoke _begin() {                                   ...
>>             ... //waiting for B                         ...
>>             ... //to finish _end()              }
>>             ...
>>             ...
>>     }
>>
>>
>> This scenario is actually deadlock-free because Task A can wait inside the
>> second call to _begin() without self-deadlocking, because it is the
>> responsibility of Task B to finish the first sequence by invoking the
>> corresponding _end().
>>
>> By setting the value of 'transition_task' again explicitly in _end(), we
>> ensure that the code won't print a false-positive warning in this case.
>>
>> However the same code successfully catches the following deadlock-prone
>> scenario even in ASYNC_NOTIFICATION drivers:
>>
>> Scenario 2: (Deadlock-prone)
>> ----------
>>
>>          Task A                                         Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     /* 2nd freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>
>> Here the bug is that Task A called the second _begin() *before* actually
>> performing the 1st frequency transition. In other words, it failed to set
>> Task B in motion for the 1st frequency transition, and hence it will
>> self-deadlock. This is very similar to the case of drivers which do
>> synchronous notification, and hence the debug infrastructure developed
>> in this patch can catch this scenario easily.
>>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>
>>  drivers/cpufreq/cpufreq.c |   12 ++++++++++++
>>  include/linux/cpufreq.h   |    1 +
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index abda660..2c99a6c 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -354,6 +354,10 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
>>  void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
>>                 struct cpufreq_freqs *freqs)
>>  {
>> +
>> +       /* Catch double invocations of _begin() which lead to self-deadlock */
>> +       WARN_ON(current == policy->transition_task);
>> +
>>  wait:
>>         wait_event(policy->transition_wait, !policy->transition_ongoing);
>>
>> @@ -365,6 +369,7 @@ wait:
>>         }
>>
>>         policy->transition_ongoing = true;
>> +       policy->transition_task = current;
>>
>>         spin_unlock(&policy->transition_lock);
>>
>> @@ -378,9 +383,16 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>>         if (unlikely(WARN_ON(!policy->transition_ongoing)))
>>                 return;
>>
>> +       /*
>> +        * The task invoking _end() could be different from the one that
>> +        * invoked the _begin(). So set ->transition_task again here
>> +        * explicity.
>> +        */
>> +       policy->transition_task = current;
>>         cpufreq_notify_post_transition(policy, freqs, transition_failed);
>>
>>         policy->transition_ongoing = false;
>> +       policy->transition_task = NULL;
>>
>>         wake_up(&policy->transition_wait);
>>  }
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 5ae5100..8f44d79 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -110,6 +110,7 @@ struct cpufreq_policy {
>>         bool                    transition_ongoing; /* Tracks transition status */
>>         spinlock_t              transition_lock;
>>         wait_queue_head_t       transition_wait;
>> +       struct task_struct      *transition_task; /* Task which is doing the transition */
>>  };
>>
>>  /* Only for ACPI */
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srivatsa S. Bhat April 29, 2014, 6:16 a.m. UTC | #4
On 04/29/2014 10:25 AM, Viresh Kumar wrote:
> On 29 April 2014 10:21, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> Nice effort.
>>
>> On 29 April 2014 00:25, Srivatsa S. Bhat
>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>>> Now all such drivers have been fixed, but debugging this issue was not
>>> very straight-forward (even lockdep didn't catch this). So let us add a
>>> debug infrastructure to the cpufreq core to catch such issues more easily
>>> in the future.
>>
>> BUT, I am not sure if we really need it :(
>>
>> I think we just got into the 'barrier' stuff as we had some doubts about it
>> earlier and were quite sure that nothing else could go wrong. Otherwise
>> the only problem could have been present was the second queuing
>> from the same thread. And we will surely get stuck if that happens and
>> we can't just miss that error..
>>
>>> Scenario 1: (Deadlock-free)
>>> ----------
>>>
>>>          Task A                                         Task B
>>>
>>>     /* 1st freq transition */
>>>     Invoke _begin() {
>>>             ...
>>>             ...
>>>     }
>>>
>>>     Change the frequency
>>>
>>>                                                 Got interrupt for successful
>>>                                                 change of frequency.
>>>
>>>                                                 /* 1st freq transition */
>>>                                                 Invoke _end() {
>>>                                                         ...
>>>                                                         ...
>>>     /* 2nd freq transition */                           ...
>>>     Invoke _begin() {                                   ...
>>>             ... //waiting for B                         ...
>>>             ... //to finish _end()              }
>>>             ...
>>>             ...
>>>     }
>>>
>>>
>>> This scenario is actually deadlock-free because Task A can wait inside the
>>> second call to _begin() without self-deadlocking, because it is the
>>> responsibility of Task B to finish the first sequence by invoking the
>>> corresponding _end().
> 
> WTF, I was writing my mail and it just got send due to some stupid combination
> of keys :( .. Sorry.
>

No problem!
 
> Also, this might not work as expected. Consider this scenario:
> 
>      /* 1st freq transition */
>      Invoke _begin() {
>              ...
>              ...
>      }
> 
>      Start Change of frequency and return
>      back as there is no end from same thread.
> 
>      /* 2nd freq transition as there is nobody stopping us */
>      Invoke _begin() {
>              ... //waiting for B
>              ... //to finish _end()
>              ...
>              ...
>      }
> 
>                                                               Got
> interrupt for successful
>                                                               change
> of frequency.
> 
>                                                               /* 1st
> freq transition */
>                                                               Invoke _end() {
>                                                                      ...
>                                                                      ...
>                                                              }
> 
> And your patch will probably break this ?

Yes, I'm aware that this corner case doesn't work well with my debug
patch. I tried to avoid this but couldn't think of any solution.
(One big-hammer way to avoid this is to exclude this infrastructure
for all ASYNC_NOTIFICATION drivers, but I didn't want to go with that
approach, since it makes it look ugly). Do you have any better ideas
to deal with this scenario?

Also, do we really have cases in mind where a single thread does
multiple frequency transitions in one go? That too in such quick
successions? Echo's to sysfs, changing of governors from userspace etc
all do one frequency transition at a time per-task...


Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar April 29, 2014, 6:49 a.m. UTC | #5
On 29 April 2014 11:46, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Yes, I'm aware that this corner case doesn't work well with my debug

Don't know if its a corner case, it may be the most obvious case for
some :)

> patch. I tried to avoid this but couldn't think of any solution.

The problem is not that it wouldn't work for these systems, but we will
get WARN_ON() when they shouldn't have come :)

> (One big-hammer way to avoid this is to exclude this infrastructure
> for all ASYNC_NOTIFICATION drivers, but I didn't want to go with that
> approach, since it makes it look ugly). Do you have any better ideas
> to deal with this scenario?

Can't think of anything better than this:

+       WARN_ON(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
                             && (current == policy->transition_task));

which you already mentioned.

> Also, do we really have cases in mind where a single thread does
> multiple frequency transitions in one go? That too in such quick
> successions? Echo's to sysfs, changing of governors from userspace etc
> all do one frequency transition at a time per-task...

Its not really about if we can think of a real use case or not. The point
is, governor is free to call transition calls one after the other (will always
happen from a single thread) and it isn't supposed to wait for drivers
to finish earlier transitions as ->target() has already returned.

--
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srivatsa S. Bhat April 29, 2014, 7:35 a.m. UTC | #6
On 04/29/2014 12:19 PM, Viresh Kumar wrote:
> On 29 April 2014 11:46, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> Yes, I'm aware that this corner case doesn't work well with my debug
> 
> Don't know if its a corner case, it may be the most obvious case for
> some :)
>

Yeah, it could be.
 
>> patch. I tried to avoid this but couldn't think of any solution.
> 
> The problem is not that it wouldn't work for these systems, but we will
> get WARN_ON() when they shouldn't have come :)
> 

Yes, I thought about this, and I agree that this is not acceptable.

>> (One big-hammer way to avoid this is to exclude this infrastructure
>> for all ASYNC_NOTIFICATION drivers, but I didn't want to go with that
>> approach, since it makes it look ugly). Do you have any better ideas
>> to deal with this scenario?
> 
> Can't think of anything better than this:
> 
> +       WARN_ON(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>                              && (current == policy->transition_task));
> 
> which you already mentioned.

Yeah, I think we should just go with this. I thought we needed lots of
if-conditions to do exclude these drivers (which would have made it ugly),
but as you pointed above, just this one would suffice.

Besides, the cpufreq core doesn't automatically invoke _begin() and 
_end() for ASYNC_NOTIFICATION drivers. So that means the probability
that such drivers will hit this problem is extremely low, since the
driver alone is responsible for invoking _begin/_end and hence there
shouldn't be much of a conflict. So I think we should really just
skip ASYNC_NOTIFICATION drivers in this debug infrastructure.

> 
>> Also, do we really have cases in mind where a single thread does
>> multiple frequency transitions in one go? That too in such quick
>> successions? Echo's to sysfs, changing of governors from userspace etc
>> all do one frequency transition at a time per-task...
> 
> Its not really about if we can think of a real use case or not. The point
> is, governor is free to call transition calls one after the other (will always
> happen from a single thread) and it isn't supposed to wait for drivers
> to finish earlier transitions as ->target() has already returned.
> 

Yes, I agree now. Making bold assumptions in the cpufreq core about
how many frequency transitions a single task will do etc is potentially
*very* dangerous. Let's not do it that way.

I'll send a v2 excluding the ASYNC_NOTIFICATION drivers.
Thanks a lot for your inputs!

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar April 29, 2014, 8:04 a.m. UTC | #7
On 29 April 2014 13:05, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> On 04/29/2014 12:19 PM, Viresh Kumar wrote:
>> +       WARN_ON(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>>                              && (current == policy->transition_task));
>>
>> which you already mentioned.
>
> Yeah, I think we should just go with this. I thought we needed lots of
> if-conditions to do exclude these drivers (which would have made it ugly),
> but as you pointed above, just this one would suffice.

Okay, I think we can do one more modification here:

>> +       WARN_ON(unlikely(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>>                              && (current == policy->transition_task)));


> Besides, the cpufreq core doesn't automatically invoke _begin() and
> _end() for ASYNC_NOTIFICATION drivers. So that means the probability
> that such drivers will hit this problem is extremely low, since the
> driver alone is responsible for invoking _begin/_end and hence there
> shouldn't be much of a conflict. So I think we should really just
> skip ASYNC_NOTIFICATION drivers in this debug infrastructure.

The only way it can happen (I don't hope somebody would be so
stupid to call begin twice from target() :)), is via transition notifiers,
which in some case starting a new transition..
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srivatsa S. Bhat April 29, 2014, 8:10 a.m. UTC | #8
On 04/29/2014 01:34 PM, Viresh Kumar wrote:
> On 29 April 2014 13:05, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> On 04/29/2014 12:19 PM, Viresh Kumar wrote:
>>> +       WARN_ON(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>>>                              && (current == policy->transition_task));
>>>
>>> which you already mentioned.
>>
>> Yeah, I think we should just go with this. I thought we needed lots of
>> if-conditions to do exclude these drivers (which would have made it ugly),
>> but as you pointed above, just this one would suffice.
> 
> Okay, I think we can do one more modification here:
> 
>>> +       WARN_ON(unlikely(!(cpufreq_driver->flags & CPUFREQ_ASYNC_NOTIFICATION)
>>>                              && (current == policy->transition_task)));
>

WARN_ON and friends already wrap their arguments within unlikely().
So we don't need to add it explicitly.
 
> 
>> Besides, the cpufreq core doesn't automatically invoke _begin() and
>> _end() for ASYNC_NOTIFICATION drivers. So that means the probability
>> that such drivers will hit this problem is extremely low, since the
>> driver alone is responsible for invoking _begin/_end and hence there
>> shouldn't be much of a conflict. So I think we should really just
>> skip ASYNC_NOTIFICATION drivers in this debug infrastructure.
> 
> The only way it can happen (I don't hope somebody would be so
> stupid to call begin twice from target() :)), is via transition notifiers,
> which in some case starting a new transition..

Hmm..

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index abda660..2c99a6c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -354,6 +354,10 @@  static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
 void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs)
 {
+
+	/* Catch double invocations of _begin() which lead to self-deadlock */
+	WARN_ON(current == policy->transition_task);
+
 wait:
 	wait_event(policy->transition_wait, !policy->transition_ongoing);
 
@@ -365,6 +369,7 @@  wait:
 	}
 
 	policy->transition_ongoing = true;
+	policy->transition_task = current;
 
 	spin_unlock(&policy->transition_lock);
 
@@ -378,9 +383,16 @@  void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
 	if (unlikely(WARN_ON(!policy->transition_ongoing)))
 		return;
 
+	/*
+	 * The task invoking _end() could be different from the one that
+	 * invoked the _begin(). So set ->transition_task again here
+	 * explicity.
+	 */
+	policy->transition_task = current;
 	cpufreq_notify_post_transition(policy, freqs, transition_failed);
 
 	policy->transition_ongoing = false;
+	policy->transition_task = NULL;
 
 	wake_up(&policy->transition_wait);
 }
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 5ae5100..8f44d79 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -110,6 +110,7 @@  struct cpufreq_policy {
 	bool			transition_ongoing; /* Tracks transition status */
 	spinlock_t		transition_lock;
 	wait_queue_head_t	transition_wait;
+	struct task_struct	*transition_task; /* Task which is doing the transition */
 };
 
 /* Only for ACPI */