[v4,3/7] genirq: Introduce irq_suspend_one() and irq_resume_one() callbacks

Message ID 1597058460-16211-4-git-send-email-mkshah@codeaurora.org (mailing list archive)
State Superseded
Series irqchip: qcom: pdc: Introduce irq_set_wake call

Commit Message

Maulik Shah Aug. 10, 2020, 11:20 a.m. UTC
From: Douglas Anderson <dianders@chromium.org>

The "struct irq_chip" has two callbacks in it: irq_suspend() and
irq_resume().  These two callbacks are interesting because sometimes
an irq chip needs to know about suspend/resume, but they are a bit
awkward because:
1. They are called once for the whole irq_chip, not once per IRQ.
   It's passed data for one of the IRQs enabled on that chip.  That
   means it's up to the irq_chip driver to aggregate.
2. They are only called if you're using "generic-chip", which not
   everyone is.
3. The implementation uses syscore ops, which apparently have problems
   with s2idle.

Probably the old irq_suspend() and irq_resume() callbacks should be
deprecated.

Let's introcuce a nicer API that works for all irq_chip devices.  This
will be called by the core and is called once per IRQ.  The core will
call the suspend callback after doing its normal suspend operations
and the resume before its normal resume operations.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
---
 include/linux/irq.h    | 13 +++++++++++--
 kernel/irq/chip.c      | 16 ++++++++++++++++
 kernel/irq/internals.h |  2 ++
 kernel/irq/pm.c        | 15 ++++++++++++---
 4 files changed, 41 insertions(+), 5 deletions(-)

Comments

Doug Anderson Aug. 11, 2020, 8:09 p.m. UTC | #1
Hi,

On Mon, Aug 10, 2020 at 4:21 AM Maulik Shah <mkshah@codeaurora.org> wrote:
>
> From: Douglas Anderson <dianders@chromium.org>
>
> The "struct irq_chip" has two callbacks in it: irq_suspend() and
> irq_resume().  These two callbacks are interesting because sometimes
> an irq chip needs to know about suspend/resume, but they are a bit
> awkward because:
> 1. They are called once for the whole irq_chip, not once per IRQ.
>    It's passed data for one of the IRQs enabled on that chip.  That
>    means it's up to the irq_chip driver to aggregate.
> 2. They are only called if you're using "generic-chip", which not
>    everyone is.
> 3. The implementation uses syscore ops, which apparently have problems
>    with s2idle.
>
> Probably the old irq_suspend() and irq_resume() callbacks should be
> deprecated.
>
> Let's introcuce a nicer API that works for all irq_chip devices.  This

You grabbed my patch (which is great, thanks!) but forgot to address
Stephen's early feedback from <https://crrev.com/c/2321123>.
Specifically:

s/introcuce/introduce


> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -468,10 +468,16 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
>   * @irq_bus_sync_unlock:function to sync and unlock slow bus (i2c) chips
>   * @irq_cpu_online:    configure an interrupt source for a secondary CPU
>   * @irq_cpu_offline:   un-configure an interrupt source for a secondary CPU
> + * @irq_suspend_one:   called on an every irq to suspend it; called even if
> + *                     this IRQ is configured for wakeup

s/called on an/called on

> + * @irq_resume_one:    called on an every irq to resume it; called even if
> + *                     this IRQ is configured for wakeup

s/called on an/called on


-Doug
Maulik Shah Aug. 13, 2020, 7:18 a.m. UTC | #2
Hi,

Sure, I will take care of these comments in v5.

Thanks,
Maulik

On 8/12/2020 1:39 AM, Doug Anderson wrote:
> Hi,
>
> On Mon, Aug 10, 2020 at 4:21 AM Maulik Shah <mkshah@codeaurora.org> wrote:
>> From: Douglas Anderson <dianders@chromium.org>
>>
>> The "struct irq_chip" has two callbacks in it: irq_suspend() and
>> irq_resume().  These two callbacks are interesting because sometimes
>> an irq chip needs to know about suspend/resume, but they are a bit
>> awkward because:
>> 1. They are called once for the whole irq_chip, not once per IRQ.
>>     It's passed data for one of the IRQs enabled on that chip.  That
>>     means it's up to the irq_chip driver to aggregate.
>> 2. They are only called if you're using "generic-chip", which not
>>     everyone is.
>> 3. The implementation uses syscore ops, which apparently have problems
>>     with s2idle.
>>
>> Probably the old irq_suspend() and irq_resume() callbacks should be
>> deprecated.
>>
>> Let's introcuce a nicer API that works for all irq_chip devices.  This
> You grabbed my patch (which is great, thanks!) but forgot to address
> Stephen's early feedback from <https://crrev.com/c/2321123>.
> Specifically:
>
> s/introcuce/introduce
>
>
>> --- a/include/linux/irq.h
>> +++ b/include/linux/irq.h
>> @@ -468,10 +468,16 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
>>    * @irq_bus_sync_unlock:function to sync and unlock slow bus (i2c) chips
>>    * @irq_cpu_online:    configure an interrupt source for a secondary CPU
>>    * @irq_cpu_offline:   un-configure an interrupt source for a secondary CPU
>> + * @irq_suspend_one:   called on an every irq to suspend it; called even if
>> + *                     this IRQ is configured for wakeup
> s/called on an/called on
>
>> + * @irq_resume_one:    called on an every irq to resume it; called even if
>> + *                     this IRQ is configured for wakeup
> s/called on an/called on
>
>
> -Doug
Thomas Gleixner Aug. 13, 2020, 9:29 a.m. UTC | #3
Maulik Shah <mkshah@codeaurora.org> writes:
> From: Douglas Anderson <dianders@chromium.org>
>
> The "struct irq_chip" has two callbacks in it: irq_suspend() and
> irq_resume().  These two callbacks are interesting because sometimes
> an irq chip needs to know about suspend/resume, but they are a bit
> awkward because:
> 1. They are called once for the whole irq_chip, not once per IRQ.
>    It's passed data for one of the IRQs enabled on that chip.  That
>    means it's up to the irq_chip driver to aggregate.
> 2. They are only called if you're using "generic-chip", which not
>    everyone is.
> 3. The implementation uses syscore ops, which apparently have problems
>    with s2idle.

The main point is that these callbacks are specific to generic chip and
not used anywhere else.

> Probably the old irq_suspend() and irq_resume() callbacks should be
> deprecated.

You need to analyze first what these callbacks actually do. :)

> Let's introcuce a nicer API that works for all irq_chip devices.

s/Let's intro/Intro/

Let's is pretty useless in a changelog especially if you read it some
time after the patch got applied.

> This will be called by the core and is called once per IRQ.  The core
> will call the suspend callback after doing its normal suspend
> operations and the resume before its normal resume operations.

Will be? You are adding the code which calls that unconditionally even.

> +void suspend_one_irq(struct irq_desc *desc)
> +{
> +	struct irq_chip *chip = desc->irq_data.chip;
> +
> +	if (chip->irq_suspend_one)
> +		chip->irq_suspend_one(&desc->irq_data);
> +}
> +
> +void resume_one_irq(struct irq_desc *desc)
> +{
> +	struct irq_chip *chip = desc->irq_data.chip;
> +
> +	if (chip->irq_resume_one)
> +		chip->irq_resume_one(&desc->irq_data);
> +}

There is not much of a point to have these in chip.c. The functionality is
clearly pm.c only.

>  static bool suspend_device_irq(struct irq_desc *desc)
>  {
> +	bool sync = false;
> +
>  	if (!desc->action || irq_desc_is_chained(desc) ||
>  	    desc->no_suspend_depth)
> -		return false;
> +		goto exit;

What?

If no_suspend_depth is > 0 why would you try to tell the irq chip
that this line needs to be suspended?

If there is no action, then the interrupt line is in shut down
state. What's the point of suspending it?

Chained interrupts are special and you really have to think hard whether
calling suspend for them unconditionally is a good idea. What if a
wakeup irq is connected to this chained thing?

>  	if (irqd_is_wakeup_set(&desc->irq_data)) {
>  		irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
> +
>  		/*
>  		 * We return true here to force the caller to issue
>  		 * synchronize_irq(). We need to make sure that the
>  		 * IRQD_WAKEUP_ARMED is visible before we return from
>  		 * suspend_device_irqs().
>  		 */
> -		return true;
> +		sync = true;
> +		goto exit;

So again. This interrupt is a wakeup source. What's the point of
suspending it unconditionally?

>  	}
>  
>  	desc->istate |= IRQS_SUSPENDED;
> @@ -95,7 +99,10 @@ static bool suspend_device_irq(struct irq_desc *desc)
>  	 */
>  	if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
>  		mask_irq(desc);
> -	return true;
> +
> +exit:
> +	suspend_one_irq(desc);
> +	return sync;

So what happens in this case:

   CPU0                         CPU1
   interrupt                    suspend_device_irq()
     handle()                     chip->suspend_one()
       action()                 ...              
       chip->fiddle();

????

What is the logic here and how is this going to work under all
circumstances without having magic hacks in the irq chip to handle all
the corner cases?

This needs way more thoughts vs. the various states and sync
requirements. Just adding callbacks, invoking them unconditionally, not
giving any rationale how the whole thing is supposed to work and then
let everyone figure out how to deal with the state and corner case
handling at the irq chip driver level does not cut it, really.

State handling is core functionality and if irq chip drivers have
special requirements then they want to be communicated with flags and/or
specialized callbacks.

Thanks,

        tglx
Doug Anderson Aug. 13, 2020, 4:09 p.m. UTC | #4
Hi,

On Thu, Aug 13, 2020 at 2:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Maulik Shah <mkshah@codeaurora.org> writes:
> > From: Douglas Anderson <dianders@chromium.org>
> >
> > The "struct irq_chip" has two callbacks in it: irq_suspend() and
> > irq_resume().  These two callbacks are interesting because sometimes
> > an irq chip needs to know about suspend/resume, but they are a bit
> > awkward because:
> > 1. They are called once for the whole irq_chip, not once per IRQ.
> >    It's passed data for one of the IRQs enabled on that chip.  That
> >    means it's up to the irq_chip driver to aggregate.
> > 2. They are only called if you're using "generic-chip", which not
> >    everyone is.
> > 3. The implementation uses syscore ops, which apparently have problems
> >    with s2idle.
>
> The main point is that these callbacks are specific to generic chip and
> not used anywhere else.

I'm not sure I understand.  This callback is used by drivers that use
generic-chip but I don't think there's anything specific about
generic-chip in these callbacks.  Sure many of them use the
generic-chip's "wake_active" tracking but a different IRQ chip could
track "wake_active" itself without bringing in all of generic-chip and
still might want to accomplish the same thing, right?


> > Probably the old irq_suspend() and irq_resume() callbacks should be
> > deprecated.
>
> You need to analyze first what these callbacks actually do. :)

See below.  I intended my callbacks to be for the same type of thing
as the existing ones, though perhaps either my naming or description
was confusing.


> > Let's introcuce a nicer API that works for all irq_chip devices.
>
> s/Let's intro/Intro/
>
> Let's is pretty useless in a changelog especially if you read it some
> time after the patch got applied.

Sounds fine.  Hopefully Maulik can adjust when he posts the next version.


> > This will be called by the core and is called once per IRQ.  The core
> > will call the suspend callback after doing its normal suspend
> > operations and the resume before its normal resume operations.
>
> Will be? You are adding the code which calls that unconditionally even.
>
> > +void suspend_one_irq(struct irq_desc *desc)
> > +{
> > +     struct irq_chip *chip = desc->irq_data.chip;
> > +
> > +     if (chip->irq_suspend_one)
> > +             chip->irq_suspend_one(&desc->irq_data);
> > +}
> > +
> > +void resume_one_irq(struct irq_desc *desc)
> > +{
> > +     struct irq_chip *chip = desc->irq_data.chip;
> > +
> > +     if (chip->irq_resume_one)
> > +             chip->irq_resume_one(&desc->irq_data);
> > +}
>
> There is not much of a point to have these in chip.c. The functionality is
> clearly pm.c only.

No objections to it moving.  Since Maulik is posting the patches,
hopefully he can move it?


> >  static bool suspend_device_irq(struct irq_desc *desc)
> >  {
> > +     bool sync = false;
> > +
> >       if (!desc->action || irq_desc_is_chained(desc) ||
> >           desc->no_suspend_depth)
> > -             return false;
> > +             goto exit;
>
> What?
>
> If no_suspend_depth is > 0 why would you try to tell the irq chip
> that this line needs to be suspended?
>
> If there is no action, then the interrupt line is in shut down
> state. What's the point of suspending it?
>
> Chained interrupts are special and you really have to think hard whether
> calling suspend for them unconditionally is a good idea. What if a
> wakeup irq is connected to this chained thing?

I think there is a confusion about what this callback is intended to
do and that probably needs to be made clearer, either by renaming or
by comments (or both).  Let's think about these two things that we
might be telling the IRQ:

a) Please disable yourself in preparation for suspending.

b) The system is suspending, please take any action you need to.

I believe you are reading this as a).  I intended it to be b).  Can
you think of a name for these callbacks that would make it clearer?
suspend_notify() / resume_notify() maybe?


Specifically the problem we're trying to address is when an IRQ is
marked as "disabled" (driver called disable_irq()) but also marked as
"wakeup" (driver called enable_irq_wake()).  As per my understanding,
this means:

* Don't call the interrupt handler for this interrupt until I call
enable_irq() but keep tracking it (either in hardware or in software).
Specifically it's a requirement that if the interrupt fires one or
more times while masked the interrupt handler should be called as soon
as enable_irq() is called.

* If this interrupt fires while the system is suspended then please
wake the system up.


On some (many?) interrupt controllers a masked interrupt won't wake
the system up.  Thus we need some point in time where the interrupt
controller can unmask interrupts in hardware so that they can act as
wakeups.  Also: if an interrupt was masked lazily this could be a good
time to ensure that these interrupts _won't_ wake the system up.  Thus
the point of these callbacks is to provide a hook for IRQ chips to do
this.  Now that you understand the motivation perhaps you can suggest
a better way to accomplish this if the approach in this patch is not
OK.


I will note that a quick audit of existing users of the generic-chip's
irq_suspend() shows that they are doing exactly this.  So the point of
my patch is to actually allow other IRQ chips (ones that aren't using
generic-chip) to do this type of thing.  At the same time my patch
provides a way for current users of generic-chip to adapt their
routines so they work without syscore (which, I guess, isn't
compatible with s2idle).


> >       if (irqd_is_wakeup_set(&desc->irq_data)) {
> >               irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
> > +
> >               /*
> >                * We return true here to force the caller to issue
> >                * synchronize_irq(). We need to make sure that the
> >                * IRQD_WAKEUP_ARMED is visible before we return from
> >                * suspend_device_irqs().
> >                */
> > -             return true;
> > +             sync = true;
> > +             goto exit;
>
> So again. This interrupt is a wakeup source. What's the point of
> suspending it unconditionally.

Again this is a confusion about whether I'm saying "please suspend
yourself" or "the system is suspending, please take needed action".


> >       }
> >
> >       desc->istate |= IRQS_SUSPENDED;
> > @@ -95,7 +99,10 @@ static bool suspend_device_irq(struct irq_desc *desc)
> >        */
> >       if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
> >               mask_irq(desc);
> > -     return true;
> > +
> > +exit:
> > +     suspend_one_irq(desc);
> > +     return sync;
>
> So what happens in this case:
>
>    CPU0                         CPU1
>    interrupt                    suspend_device_irq()
>      handle()                     chip->suspend_one()
>        action()                 ...
>        chip->fiddle();
>
> ????

Ah, so I guess we need to move the call to suspend_one_irq() till
after the (potential) call to synchronize_irq() in
suspend_device_irqs()?


> What is the logic here and how is this going to work under all
> circumstances without having magic hacks in the irq chip to handle all
> the corner cases?
>
> This needs way more thoughts vs. the various states and sync
> requirements. Just adding callbacks, invoking them unconditionally, not
> giving any rationale how the whole thing is supposed to work and then
> let everyone figure out how to deal with the state and corner case
> handling at the irq chip driver level does not cut it, really.
>
> State handling is core functionality and if irq chip drivers have
> special requirements then they want to be communicated with flags and/or
> specialized callbacks.

Hopefully with the above explanation this makes more sense?  If not,
hopefully you can suggest how to adapt it to accomplish what we need
(allow wakeup from masked IRQs that have wakeup enabled).

Thanks!

-Doug
Thomas Gleixner Aug. 13, 2020, 10:09 p.m. UTC | #5
Doug,

On Thu, Aug 13 2020 at 09:09, Doug Anderson wrote:
> On Thu, Aug 13, 2020 at 2:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> The main point is that these callbacks are specific to generic chip and
>> not used anywhere else.
>
> I'm not sure I understand.  This callback is used by drivers that use
> generic-chip but I don't think there's anything specific about
> generic-chip in these callbacks.  Sure many of them use the
> generic-chip's "wake_active" tracking but a different IRQ chip could
> track "wake_active" itself without bringing in all of generic-chip and
> still might want to accomplish the same thing, right?

They are not issued for non generic chip based irq chips and they are
not issued from the common irq suspend/resume code.

Wake active tracking is just a convenience function and there is nothing
which prevents any other driver from doing that. The real question is why
would it do so? The state is tracked in the core already. Don't tell me,
I already read your whole reply :)

>> > Probably the old irq_suspend() and irq_resume() callbacks should be
>> > deprecated.
>>
>> You need to analyze first what these callbacks actually do. :)
>
> See below.  I intended my callbacks to be for the same type of thing
> as the existing ones, though perhaps either my naming or description
> was confusing.

IIRC the suspend/resume callbacks were added to get some existing SoC
drivers converted over in a similar way to existing code, but my memory
is faint. But I'm sure it wasn't a design from scratch and the semantics
are rather obscure. But clearly because this was based on syscore ops
this was never meant for S2idle which did not really exist back then.

>> >  static bool suspend_device_irq(struct irq_desc *desc)
>> >  {
>> > +     bool sync = false;
>> > +
>> >       if (!desc->action || irq_desc_is_chained(desc) ||
>> >           desc->no_suspend_depth)
>> > -             return false;
>> > +             goto exit;
>>
>> What?
>>
>> If no_suspend_depth is > 0 why would you try to tell the irq chip
>> that this line needs to be suspended?
>>
>> If there is no action, then the interrupt line is in shut down
>> state. What's the point of suspending it?
>>
>> Chained interrupts are special and you really have to think hard whether
>> calling suspend for them unconditionally is a good idea. What if a
>> wakeup irq is connected to this chained thing?
>
> I think there is a confusion about what this callback is intended to
> do and that probably needs to be made clearer, either by renaming or
> by comments (or both).  Let's think about these two things that we
> might be telling the IRQ:
>
> a) Please disable yourself in preparation for suspending.
>
> b) The system is suspending, please take any action you need to.
>
> I believe you are reading this as a).  I intended it to be b).  Can
> you think of a name for these callbacks that would make it clearer?
> suspend_notify() / resume_notify() maybe?

I probably read it as #a, but even with #b the semantics are completely
unclear. So I started asking questions.

And these questions are important because if we really would add such a
callback then it needs to be clear what semantics and rules are there
for the driver side. If you don't specify that clearly then this is
going to be (ab)used for implementing insanities which bring state out
of sync and cause more problems than they solve. I still can remember
that I had to cleanup tons of nasty irq chip driver code which did
exactly that. I had to do that to be able to change the internals of the
core code.

Guess why the irq subsystem attempts to encapsulate as much as possible
and has nasty struct member names all over the place.

> Specifically the problem we're trying to address is when an IRQ is
> marked as "disabled" (driver called disable_irq()) but also marked as
> "wakeup" (driver called enable_irq_wake()).  As per my understanding,
> this means:
>
> * Don't call the interrupt handler for this interrupt until I call
> enable_irq() but keep tracking it (either in hardware or in software).
> Specifically it's a requirement that if the interrupt fires one or
> more times while masked the interrupt handler should be called as soon
> as enable_irq() is called.

irq_disable() has two operating modes:

    1) Immediately mask the interrupt at the irq chip level

    2) Software disable it. If an interrupt is raised while disabled
       then the flow handler observes disabled state, masks it, marks it
       pending and returns without invoking any device handler.

On a subsequent irq_enable() the interrupt is unmasked if it was masked
and if the interrupt is marked pending and the interrupt is not level
type then it's attempted to retrigger it. Either in hardware or by a
software replay mechanism.
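
As an illustration of the two modes and the replay on enable described
above, a minimal user-space C sketch could look like the following. It is
not kernel code and not from the original mail: the struct and every
model_*() name are hypothetical, and the model assumes a single line with
no locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line; NOT the kernel's struct irq_desc. */
struct model_irq {
	bool disabled;	/* core (software) state: the line was disabled */
	bool masked;	/* mask bit at the irq chip level */
	bool pending;	/* an interrupt was raised while disabled */
	bool level;	/* true: level type, false: edge type */
	int handled;	/* how often the device handler ran */
};

/* Mode 1: mask the interrupt at the irq chip level immediately. */
void model_disable_now(struct model_irq *irq)
{
	irq->disabled = true;
	irq->masked = true;
}

/* Mode 2: software disable only; the hardware stays unmasked for now. */
void model_disable_lazy(struct model_irq *irq)
{
	irq->disabled = true;
}

/* Flow handler: runs when the hardware delivers the interrupt. */
void model_flow_handler(struct model_irq *irq)
{
	assert(!irq->masked);	/* a masked line cannot be delivered */

	if (irq->disabled) {
		/* Observe disabled state: mask, mark pending, no handler. */
		irq->masked = true;
		irq->pending = true;
		return;
	}
	irq->handled++;
}

/* Enable: unmask, then replay a pending non-level interrupt in software. */
void model_enable(struct model_irq *irq)
{
	bool replay = irq->pending && !irq->level;

	irq->disabled = false;
	irq->masked = false;
	irq->pending = false;
	if (replay)
		model_flow_handler(irq);
}
```

In this model a lazily disabled edge interrupt that fires once is masked
and marked pending by the flow handler, and the device handler runs
exactly once when model_enable() is called. A level interrupt is not
replayed because the still-asserted line would fire again by itself after
the unmask.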

> * If this interrupt fires while the system is suspended then please
> wake the system up.

Well, that's kinda contradicting itself. If the interrupt is masked then
what is the point? I'm surely missing something subtle here.

> On some (many?) interrupt controllers a masked interrupt won't wake
> the system up.  Thus we need some point in time where the interrupt
> controller can unmask interrupts in hardware so that they can act as
> wakeups.

So far nobody told me about this until now, but why exactly do we need
yet another unspecified callback instead of simply telling the core via
an irq chip flag that it should always unmask the interrupt if it is a
wakeup source?

> Also: if an interrupt was masked lazily this could be a good
> time to ensure that these interrupts _won't_ wake the system up.

Setting IRQCHIP_MASK_ON_SUSPEND does exactly that. No need for a chip
driver to do any magic. You just have to use it.

So the really obvious counterpart for this is to have:

       IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND

and then do:

@@ -81,6 +81,8 @@ static bool suspend_device_irq(struct ir
 		 * IRQD_WAKEUP_ARMED is visible before we return from
 		 * suspend_device_irqs().
 		 */
+		if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
+			unmask_irq(desc);
 		return true;
 	}
 
plus the counterpart in the resume path. This also ensures that state is
consistent.
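
To make the proposed control flow easier to follow, here is a simplified
user-space C model of the suspend-side decision with such a flag. It is
only a sketch: the MODEL_* flags, struct model_desc and
model_suspend_device_irq() are hypothetical stand-ins, the resume
counterpart is left out, and the real function does more than what is
modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the chip flags discussed in this thread. */
#define MODEL_MASK_ON_SUSPEND		(1u << 0)
#define MODEL_UNMASK_WAKEUP_ON_SUSPEND	(1u << 1)

/* Hypothetical, heavily reduced stand-in for an interrupt descriptor. */
struct model_desc {
	bool has_action;		/* a handler is requested */
	bool chained;			/* chained interrupt */
	unsigned int no_suspend_depth;	/* "don't touch me on suspend" users */
	bool wakeup_set;		/* configured as wakeup source */
	bool wakeup_armed;		/* armed for wakeup by the core */
	bool suspended;			/* marked suspended by the core */
	bool masked;			/* mask state, changed only by the core */
	unsigned int chip_flags;
};

/* Returns true when the caller should synchronize with running handlers. */
bool model_suspend_device_irq(struct model_desc *d)
{
	/* Unused, chained and no-suspend lines are left alone. */
	if (!d->has_action || d->chained || d->no_suspend_depth)
		return false;

	if (d->wakeup_set) {
		d->wakeup_armed = true;
		/* The core itself unmasks, so its state stays consistent. */
		if (d->chip_flags & MODEL_UNMASK_WAKEUP_ON_SUSPEND)
			d->masked = false;
		return true;
	}

	d->suspended = true;
	if (d->chip_flags & MODEL_MASK_ON_SUSPEND)
		d->masked = true;
	return true;
}
```

The point of the model is that the only place which changes the mask
state is the core function, for wakeup sources as well as for ordinary
lines, and that lines failing the first condition are never touched.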

The magic behind the back of the core code unmask brings core state and
hardware state out of sync. So if for whatever reason the interrupt is
raised in the CPU before the resume path can mask it again, then the
flow handler will see disabled state, invoke mask_irq() which does
nothing because core state is masked and if that's a level irq it will
come back forever.
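
The out-of-sync scenario can be reproduced with a small user-space C
model. This is an illustrative sketch with hypothetical names, assuming
(as described above) that the core's mask helper does nothing when the
core already considers the line masked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the core's cached mask state vs. the real hardware. */
struct model_line {
	bool core_masked;	/* what the core believes */
	bool hw_masked;		/* what the hardware actually does */
};

/* Core helper: does nothing when the core state says "already masked". */
void model_mask_irq(struct model_line *l)
{
	if (l->core_masked)
		return;
	l->hw_masked = true;
	l->core_masked = true;
}

/* Core helper: unmask and keep the cached state in sync. */
void model_unmask_irq(struct model_line *l)
{
	if (!l->core_masked)
		return;
	l->hw_masked = false;
	l->core_masked = false;
}

/* Chip driver unmasking behind the back of the core from a private hook. */
void model_chip_unmask_behind_core(struct model_line *l)
{
	l->hw_masked = false;	/* the core state is NOT updated */
}

/*
 * A level interrupt is raised on a disabled line: the flow handler tries
 * to mask it. Returns true if the line is still unmasked in hardware
 * afterwards, i.e. the level interrupt would come back immediately.
 */
bool model_level_irq_refires(struct model_line *l)
{
	model_mask_irq(l);
	return !l->hw_masked;
}
```

Starting from a masked line, model_chip_unmask_behind_core() followed by
model_level_irq_refires() returns true, because model_mask_irq() sees
core_masked set and skips the hardware write. Going through
model_unmask_irq() instead keeps both states in sync and the same call
returns false.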

> Thus the point of these callbacks is to provide a hook for IRQ chips
> to do this.  Now that you understand the motivation perhaps you can
> suggest a better way to accomplish this if the approach in this patch
> is not OK.

See above.

> I will note that a quick audit of existing users of the generic-chip's
> irq_suspend() shows that they are doing exactly this.  So the point of
> my patch is to actually allow other IRQ chips (ones that aren't using
> generic-chip) to do this type of thing.  At the same time my patch
> provides a way for current users of generic-chip to adapt their
> routines so they work without syscore (which, I guess, isn't
> compatible with s2idle).

If that's the main problem which is solved in these callbacks, then I
really have to ask why this has not been raised years ago. Why can't
people talk?

IIRC back then when the callbacks for GC were added the reason was that
the affected chips needed a way to save and restore the full chip state
because the hardware lost it during suspend. S2idle did not exist back
then at least not in its current form. Oh well...

But just replacing them by something which is yet another sinkhole for
horrible hacks behind the core code is not making it any better.

I fear another sweep through the unpleasantries of chip drivers is due
sooner than later. Aside of finding time, I need to find my eyecancer
protection glasses and check my schnaps stock.

>> So what happens in this case:
>>
>>    CPU0                         CPU1
>>    interrupt                    suspend_device_irq()
>>      handle()                     chip->suspend_one()
>>        action()                 ...
>>        chip->fiddle();
>>
>> ????
>
> Ah, so I guess we need to move the call to suspend_one_irq() till
> after the (potential) call to synchronize_irq() in
> suspend_device_irqs()?

For what you are trying to achieve, no. IRQCHIP_MASK_ON_SUSPEND is
already safe.

If we add IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND then there is no sync
problem either.

> Hopefully with the above explanation this makes more sense?

At least the explanation helped to understand the problem, while the
changelog was pretty useless in that regard:

  "These two callbacks are interesting because sometimes an irq chip
   needs to know about suspend/resume."

Really valuable and precise technical information.

But aside of the confusion, even with your explanation of what you are
trying to solve, I really want a coherent explanation why this should be
done for any of those:

  1) an interrupt which has no action, i.e. an interrupt which has no
     active users and is in the worst case completely deactivated or was
     never activated to begin with.

     In the inactive case it might be in a state where unmask issues an
     invalid vector, causes hardware malfunction or hits undefined
     software state in the chip drivers in the hierarchy.

     If you want to be woken up by irq X, then request irq X which
     ensures that irq X is in a usable state at all levels of the
     stack. If you call disable_irq() or mark the interrupt with
     IRQ_NOAUTOEN, fine, it's still consistent state.

  2) interrupts which have no_suspend_depth > 0 which means that
     there is an action requested which explicitly says: don't touch me
     on suspend.

     If that driver invokes disable_irq() then it can keep the pieces.

  3) chained interrupts

     They are never disabled and never masked. So why would anything
     need to be done here?

     Side note: they should not exist at all, but that's a different
     story.

If you don't have coherent explanations, then please just don't touch
that condition at all.

Hint: "Sometimes a chip needs to know" does not qualify :)

Thanks,

        tglx
Doug Anderson Aug. 13, 2020, 10:58 p.m. UTC | #6
Hi,

On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> > Specifically the problem we're trying to address is when an IRQ is
> > marked as "disabled" (driver called disable_irq()) but also marked as
> > "wakeup" (driver called enable_irq_wake()).  As per my understanding,
> > this means:
> >
> > * Don't call the interrupt handler for this interrupt until I call
> > enable_irq() but keep tracking it (either in hardware or in software).
> > Specifically it's a requirement that if the interrupt fires one or
> > more times while masked the interrupt handler should be called as soon
> > as enable_irq() is called.
>
> irq_disable() has two operating modes:
>
>     1) Immediately mask the interrupt at the irq chip level
>
>     2) Software disable it. If an interrupt is raised while disabled
>        then the flow handler observes disabled state, masks it, marks it
>        pending and returns without invoking any device handler.
>
> On a subsequent irq_enable() the interrupt is unmasked if it was masked
> and if the interrupt is marked pending and the interrupt is not level
> type then it's attempted to retrigger it. Either in hardware or by a
> software replay mechanism.
>
> > * If this interrupt fires while the system is suspended then please
> > wake the system up.
>
> Well, that's kinda contradicting itself. If the interrupt is masked then
> what is the point? I'm surely missing something subtle here.

This is how I've always been told that the API works and there are at
least a handful of drivers in the kernel whose suspend routines both
enable wakeup and call disable_irq().  Isn't this also documented as
of commit f9f21cea3113 ("genirq: Clarify that irq wake state is
orthogonal to enable/disable")?


> > On some (many?) interrupt controllers a masked interrupt won't wake
> > the system up.  Thus we need some point in time where the interrupt
> > controller can unmask interrupts in hardware so that they can act as
> > wakeups.
>
> So far nobody told me about this until now, but why exactly do we need
> yet another unspecified callback instead of simply telling the core via
> an irq chip flag that it should always unmask the interrupt if it is a
> wakeup source?
>
> > Also: if an interrupt was masked lazily this could be a good
> > time to ensure that these interrupts _won't_ wake the system up.
>
> Setting IRQCHIP_MASK_ON_SUSPEND does exactly that. No need for a chip
> driver to do any magic. You just have to use it.
>
> So the really obvious counterpart for this is to have:
>
>        IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND
>
> and then do:
>
> @@ -81,6 +81,8 @@ static bool suspend_device_irq(struct ir
>                  * IRQD_WAKEUP_ARMED is visible before we return from
>                  * suspend_device_irqs().
>                  */
> +               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
> +                       unmask_irq(desc);
>                 return true;
>         }
>
> plus the counterpart in the resume path. This also ensures that state is
> consistent.

This sounds wonderful to me.  Maulik: I think you could replace quite
a few of the patches in the series and just use that.


> The magic behind the back of the core code unmask brings core state and
> hardware state out of sync. So if for whatever reason the interrupt is
> raised in the CPU before the resume path can mask it again, then the
> flow handler will see disabled state, invoke mask_irq() which does
> nothing because core state is masked and if that's a level irq it will
> come back forever.
>
> > Thus the point of these callbacks is to provide a hook for IRQ chips
> > to do this.  Now that you understand the motivation perhaps you can
> > suggest a better way to accomplish this if the approach in this patch
> > is not OK.
>
> See above.
>
> > I will note that a quick audit of existing users of the generic-chip's
> > irq_suspend() shows that they are doing exactly this.  So the point of
> > my patch is to actually allow other IRQ chips (ones that aren't using
> > generic-chip) to do this type of thing.  At the same time my patch
> > provides a way for current users of generic-chip to adapt their
> > routines so they work without syscore (which, I guess, isn't
> > compatible with s2idle).
>
> If that's the main problem which is solved in these callbacks, then I
> really have to ask why this has not been raised years ago. Why can't
> people talk?

Not all of us have the big picture that you do to know how things
ought to work, I guess.  If nothing else someone looking at this
problem would think: "this must be a common problem, let's go see how
all the other places do it" and then they find how everyone else is
doing it and do it that way.  It requires the grander picture that a
maintainer has in order to say: whoa, everyone's copying the same
hack--let's come up with a better solution.


> IIRC back then when the callbacks for GC were added the reason was that
> the affected chips needed a way to save and restore the full chip state
> because the hardware lost it during suspend. S2idle did not exist back
> then at least not in its current form. Oh well...
>
> But just replacing them by something which is yet another sinkhole for
> horrible hacks behind the core code is not making it any better.
>
> I fear another sweep through the unpleasantries of chip drivers is due
> sooner than later. Aside of finding time, I need to find my eyecancer
> protection glasses and check my schnaps stock.
>
> >> So what happens in this case:
> >>
> >>    CPU0                         CPU1
> >>    interrupt                    suspend_device_irq()
> >>      handle()                     chip->suspend_one()
> >>        action()                 ...
> >>        chip->fiddle();
> >>
> >> ????
> >
> > Ah, so I guess we need to move the call to suspend_one_irq() till
> > after the (potential) call to synchronize_irq() in
> > suspend_device_irqs()?
>
> For what you are trying to achieve, no. IRQCHIP_MASK_ON_SUSPEND is
> already safe.
>
> If we add IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND then there is no sync
> problem either.
>
> > Hopefully with the above explanation this makes more sense?
>
> At least the explanation helped to understand the problem, while the
> changelog was pretty useless in that regard:
>
>   "These two callbacks are interesting because sometimes an irq chip
>    needs to know about suspend/resume."
>
> Really valuable and precise technical information.

Funny to get yelled at for not providing a detailed enough changelog.
Usually people complain that my changelogs are too detailed.  Sigh.


> But aside of the confusion, even with your explanation of what you are
> trying to solve, I really want a coherent explanation why this should be
> done for any of those:
>
>   1) an interrupt which has no action, i.e. an interrupt which has no
>      active users and is in the worst case completely deactivated or was
>      never activated to begin with.
>
>      In the inactive case it might be in a state where unmask issues an
>      invalid vector, causes hardware malfunction or hits undefined
>      software state in the chip drivers in the hierarchy.
>
>      If you want to be woken up by irq X, then request irq X which
>      ensures that irq X is in a usable state at all levels of the
>      stack. If you call disable_irq() or mark the interrupt with
>      IRQ_NOAUTOEN, fine, it's still consistent state.
>
>   2) interrupts which have no_suspend_depth > 0 which means that
>      there is an action requested which explicitly says: don't touch me
>      on suspend.
>
>      If that driver invokes disable_irq() then it can keep the pieces.
>
>   3) chained interrupts
>
>      They are never disabled and never masked. So why would anything
>      need to be done here?
>
>      Side note: they should not exist at all, but that's a different
>      story.
>
> If you don't have coherent explanations, then please just don't touch
> that condition at all.
>
> Hint: "Sometimes a chip needs to know" does not qualify :)

Clearly I am not coherent.  ;-)  My only goal was to help enable
interrupts that were disabled / marked as wakeup (as per above,
documented to be OK) to work on Qualcomm chips.  This specifically
affects me because a driver that I need to work (cros_ec) does this.
If IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND is good to add then it sounds like
a great plan to me.


-Doug
Thomas Gleixner Aug. 14, 2020, 2:07 a.m. UTC | #7
Doug,

On Thu, Aug 13 2020 at 15:58, Doug Anderson wrote:
> On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > * If this interrupt fires while the system is suspended then please
>> > wake the system up.
>>
>> Well, that's kinda contradicting itself. If the interrupt is masked then
>> what is the point? I'm surely missing something subtle here.
>
> This is how I've always been told that the API works and there are at
> least a handful of drivers in the kernel whose suspend routines both
> enable wakeup and call disable_irq().  Isn't this also documented as
> of commit f9f21cea3113 ("genirq: Clarify that irq wake state is
> orthogonal to enable/disable")?

Fair enough. The wording there is unfortunate and I probably should have
spent more brain cycles before applying it. It suggests that this is a
pure driver problem. I should have asked some of the questions I asked
now back then :(

>> If that's the main problem which is solved in these callbacks, then I
>> really have to ask why this has not been raised years ago. Why can't
>> people talk?
>
> Not all of us have the big picture that you do to know how things
> ought to work, I guess.  If nothing else someone looking at this
> problem would think: "this must be a common problem, let's go see how
> all the other places do it" and then they find how everyone else is
> doing it and do it that way.  It requires the grander picture that a
> maintainer has in order to say: whoa, everyone's copying the same
> hack--let's come up with a better solution.

That's not the point. I know how these things happen, but I fail to
understand why nobody ever looks at this and says: OMG, I need to do yet
another variant of copy&pasta of the same thing every other driver
does. Why is there no infrastructure for that? 

Asking that question does not require a maintainer who always encouraged
people to talk about exactly these kinds of things instead of going off
and creating the gazillionth broken copy of the same thing with yet
another wart working around core code problems and thereby violating
layering and introducing bugs which wouldn't exist otherwise.

Spare me all the $corp reasons. I've heard all of them and if not then
the not yet known reason won't be any more convincing. :)

One of the most underutilized strengths of FOSS is that you can go and
ask someone who has the big picture in his head before you go off and
waste time on disentangling copy&pasta, dealing with the resulting obvious
bugs and then the latent ones which only surface 3 months after the
product has shipped. Or like in this case figure out that the copy&pasta
road is a dead end and then create something new without seeing the big
picture and having analyzed completely what consequences this might have.

I don't know how many hours you and others spent on this. I surely know
that after you gave me proper context it took me less than an hour to
figure out that one problem you were trying to solve was already solved
and the other one was just a matter of doing the counterpart of it. I
definitely spent way more time on reviewing and debating.

So if you had asked upfront, I probably would have spent quite some time
on it as well depending on the quality of the question and explanation
but the total amount on both sides would have been significantly lower,
which I consider a win-win situation.

Of course I know that my $corp MBA foo is close to zero, so I just can
be sure that it would have been a win for me :)

Seriously, we need to promote a 'talk to each other' culture very
actively. The people with the big picture in their head, aka
maintainers, are happy to answer questions and they also want that
others come forth and say "this makes no sense" instead of silently
accepting that the five other drivers do something stupid. This would
help to solve some of the well known problems:

 - Maintainer scalability

   I'd rather discuss a problem with you at the conceptual level upfront
   instead of reviewing patches after the fact and having to tell you
   that it's all wrong. The latter takes way more time.

   Having a quick and dirty POC for illustration is fine and usually
   useful.

 - Maintainer blinders
 
   Maintainers need input from the outside as any other people because
   they become blind to shortcomings in the area they are responsible
   for as any other person. Especially if they maintain complex and
   highly active subsystems.

 - Submitter frustration

   You spent a huge amount of time to come up with a solution for
   something and then you get told by the maintainer/reviewer that the
   time spent was wasted and your solution is crap. It does not matter
   much what the politeness level of that message is. It sets you back
   and causes frustration on both ends.

 - Turn around times

   A lot of time can be spared by talking to each other early. A half
   baked POC patch is fine for opening such a discussion, but going down
   all the way and then having the talk over the final patch review is
   more than suboptimal and causes grief on both sides.

>>   "These two callbacks are interesting because sometimes an irq chip
>>    needs to know about suspend/resume."
>>
>> Really valuable and precise technical information.
>
> Funny to get yelled at for not providing a detailed enough changelog.
> Usually people complain that my changelogs are too detailed.  Sigh.

The complaint you might get from me about an overly detailed changelog
is that it has redundant or pointless information in it, e.g.

  - the 500 lines of debug dump containing about 10 lines of valuable
    information which you already decoded and condensed in order to
    figure the problem out.

  - anecdotes around the discovery which carry zero information and
    often show that the scope of the problem was not fully
    understood.

  - pointless examples of how to trigger the fail

  - In depth explanations of what the patch does instead of a concise
    explanation at the conceptual level.

You won't hear me complain about a concise and coherent in depth
technical explanation of a problem.

Writing changelogs is an art and I surely look at some of my own
changelogs written long ago and yell at myself from time to time.

Reading a patch goes top down obviously:

      1) Subject line
      2) Changelog
      3) Patch.

If I have to rummage for my crystal ball before #3 then I already spent
more time than necessary. If the thing is some random feature then I
might just say: try again. But if I get the sense that it is about a bug
or has some smell of a shortcoming in the core code then I have to bite
the bullet and decode it the hard way. Not the most efficient way. And
from experience I can tell you that if #1 and #2 are already problematic
then #3 needs some serious scrutiny in most cases.

>> Hint: "Sometimes a chip needs to know" does not qualify :)
>
> Clearly I am not coherent.  ;-)  My only goal was to help enable
> interrupts that were disabled / marked as wakeup (as per above,
> documented to be OK) to work on Qualcomm chips.  This specifically
> affects me because a driver that I need to work (cros_ec) does this.

Mission accomplished :)

> If IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND is good to add then it sounds like
> a great plan to me.

If it solves the problem and from what you explained it should do so
then this is definitely the right way to go.

Thanks,

        tglx
Doug Anderson Aug. 14, 2020, 3:04 a.m. UTC | #8
Hi,

On Thu, Aug 13, 2020 at 7:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Doug,
>
> On Thu, Aug 13 2020 at 15:58, Doug Anderson wrote:
> > On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> > * If this interrupt fires while the system is suspended then please
> >> > wake the system up.
> >>
> >> Well, that's kinda contradicting itself. If the interrupt is masked then
> >> what is the point? I'm surely missing something subtle here.
> >
> > This is how I've always been told that the API works and there are at
> > least a handful of drivers in the kernel whose suspend routines both
> > enable wakeup and call disable_irq().  Isn't this also documented as
> > of commit f9f21cea3113 ("genirq: Clarify that irq wake state is
> > orthogonal to enable/disable")?
>
> Fair enough. The wording there is unfortunate and I probably should have
> spent more brain cycles before applying it. It suggests that this is a
> pure driver problem. I should have asked some of the questions I asked
> now back then :(

I mean, certainly a driver could be rewritten not to do this.  ...and,
in fact, the easier approach (for just solving my immediate concern)
would be to change cros-ec not to do this.  However, it was my
understanding that what cros-ec was doing was actually just fine and
part of the API to drivers.  This understanding was solidified when
the patch I mentioned landed.  When looking at this before I found
that certainly there are other drivers that do this and it felt better
to implement the proper thing rather than add a hack to cros-ec to
work around the Qualcomm pinctrl driver.

In general the idea here, I think, is that in the "suspend" call of a
driver it might want to disable interrupts so that it doesn't have to
deal with them after the driver has configured things (and adjusted
its internal data structures) for suspend.  However, it might still
want its interrupt to cause a wakeup.  ...so it wants the wakeup to
happen (and its resume call to be made to get everything back in the
right state) and at the end of the resume call it wants to enable its
interrupt handler again.  That seems like a sane design pattern to me,
but maybe I'm crazy.  Yes, I guess the driver could implement the
"noirq" suspend function, but sometimes it's simpler to have a single
suspend function that first leverages interrupts, then disables them
at an exact point it can control, and then finishes adjusting its
state.
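
To make the semantics I'm describing concrete, here is a minimal
userspace sketch.  It is only a model under my own assumptions, not
kernel code: struct model_irq and every model_*() function are
hypothetical names that stand in for the disable_irq(),
enable_irq_wake() and enable_irq() behaviour as I understand it.

```c
#include <stdbool.h>

/* Hypothetical per-IRQ state; these are not real kernel fields. */
struct model_irq {
	bool disabled;	/* driver called the disable_irq() analogue */
	bool wake;	/* driver called the enable_irq_wake() analogue */
	bool pending;	/* fired while disabled, replay on enable */
	int handled;	/* number of times the handler ran */
};

/* The interrupt fires. Returns true if it should wake a suspended system. */
static bool model_fire(struct model_irq *irq, bool suspended)
{
	if (irq->disabled)
		irq->pending = true;	/* remember it, do not run the handler */
	else
		irq->handled++;
	return suspended && irq->wake;	/* wake state ignores 'disabled' */
}

/* enable_irq() analogue: replay an interrupt that fired while disabled. */
static void model_enable(struct model_irq *irq)
{
	irq->disabled = false;
	if (irq->pending) {
		irq->pending = false;
		irq->handled++;
	}
}

/* Driver suspend: arm the wakeup, then stop handler invocations. */
static void model_driver_suspend(struct model_irq *irq)
{
	irq->wake = true;
	irq->disabled = true;
}

/* Driver resume: disarm the wakeup, re-enable the interrupt last. */
static void model_driver_resume(struct model_irq *irq)
{
	irq->wake = false;
	model_enable(irq);
}
```

In this model an interrupt that fires between model_driver_suspend()
and model_driver_resume() reports a wakeup but only reaches the handler
once the resume path re-enables it.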

I'll also note that the concept that a masked interrupt can "wake you
up" is also not unlike how ARM SoCs work, which is part of what made
me feel like this API was fine.  Specifically if you have interrupts
masked at the CPU level and then enter "WFI" (wait for interrupt) it
will wake up (or come out of idle) from one of those masked
interrupts.


> >> If that's the main problem which is solved in these callbacks, then I
> >> really have to ask why this has not been raised years ago. Why can't
> >> people talk?
> >
> > Not all of us have the big picture that you do to know how things
> > ought to work, I guess.  If nothing else someone looking at this
> > problem would think: "this must be a common problem, let's go see how
> > all the other places do it" and then they find how everyone else is
> > doing it and do it that way.  It requires the grander picture that a
> > maintainer has in order to say: whoa, everyone's copying the same
> > hack--let's come up with a better solution.
>
> That's not the point. I know how these things happen, but I fail to
> understand why nobody ever looks at this and says: OMG, I need to do yet
> another variant of copy&pasta of the same thing every other driver
> does. Why is there no infrastructure for that?
>
> Asking that question does not require a maintainer who always encouraged
> people to talk about exactly these kinds of things instead of going off
> and creating the gazillionth broken copy of the same thing with yet
> another wart working around core code problems and thereby violating
> layering and introducing bugs which wouldn't exist otherwise.
>
> Spare me all the $corp reasons. I've heard all of them and if not then
> the not yet known reason won't be any more convincing. :)

As per above, if I was simply motivated to hack it to get it done I
would have suggested we just muck with cros_ec.  I certainly do have a
bias for getting things done and getting things landed, but I also try
to pride myself in not saying that we should just accept any old hack.
Perhaps many people posting patches just want any old crap landed, but
I'd like to think I'm not one of them.


> One of the most underutilized strengths of FOSS is that you can go and
> ask someone who has the big picture in his head before you go off and
> waste time on disentangling copy&pasta, dealing with the resulting obvious
> bugs and then the latent ones which only surface 3 months after the
> product has shipped. Or like in this case figure out that the copy&pasta
> road is a dead end and then create something new without seeing the big
> picture and having analyzed completely what consequences this might have.

I've found that one of the best ways to get something figured out is
to post a patch, even if it's not perfect.  Perhaps it's different in
cases where you're involved, but in general, when you just ask a
question you get ignored.  You've gotta post a patch.  This solution
was the best I was able to come up with and was discussed with several
people before posting.


> I don't know how many hours you and others spent on this. I surely know
> that after you gave me proper context it took me less than an hour to
> figure out that one problem you were trying to solve was already solved
> and the other one was just a matter of doing the counterpart of it. I
> definitely spent way more time on reviewing and debating.

I did spend quite a bit of time on it, though perhaps it's not
obvious.  Though I agree that the patch in isolation didn't have a
good enough description, I felt like it combined with the later
patches in the series did show what I was trying to do.


> So if you had asked upfront, I probably would have spent quite some time
> on it as well depending on the quality of the question and explanation
> but the total amount on both sides would have been significantly lower,
> which I consider a win-win situation.
>
> Of course I know that my $corp MBA foo is close to zero, so I just can
> be sure that it would have been a win for me :)
>
> Seriously, we need to promote a 'talk to each other' culture very
> actively. The people with the big picture in their head, aka
> maintainers, are happy to answer questions and they also want that
> others come forth and say "this makes no sense" instead of silently
> accepting that the five other drivers do something stupid. This would
> help to solve some of the well known problems:
>
>  - Maintainer scalability
>
>    I'd rather discuss a problem with you at the conceptual level upfront
>    instead of reviewing patches after the fact and having to tell you
>    that it's all wrong. The latter takes way more time.
>
>    Having a quick and dirty POC for illustration is fine and usually
>    useful.

OK, I will try to remember that, in the future, I should send
questions rather than patches to you.  I'm always learning the
workflows of the different maintainers, so sorry for killing so much
time.  :(


>  - Maintainer blinders
>
>    Maintainers need input from the outside as any other people because
>    they become blind to shortcomings in the area they are responsible
>    for as any other person. Especially if they maintain complex and
>    highly active subsystems.
>
>  - Submitter frustration
>
>    You spent a huge amount of time to come up with a solution for
>    something and then you get told by the maintainer/reviewer that the
>    time spent was wasted and your solution is crap. It does not matter
>    much what the politeness level of that message is. It sets you back
>    and causes frustration on both ends.
>
>  - Turn around times
>
>    A lot of time can be spared by talking to each other early. A half
>    baked POC patch is fine for opening such a discussion, but going down
>    all the way and then having the talk over the final patch review is
>    more than suboptimal and causes grief on both sides.

Yup, definitely understand.  Again, sorry for the misunderstandings
this time around and hopefully we can find better ways to interact in
the future.


> >>   "These two callbacks are interesting because sometimes an irq chip
> >>    needs to know about suspend/resume."
> >>
> >> Really valuable and precise technical information.
> >
> > Funny to get yelled at for not providing a detailed enough changelog.
> > Usually people complain that my changelogs are too detailed.  Sigh.
>
> The complaint you might get from me about an overly detailed changelog
> is that it has redundant or pointless information in it, e.g.
>
>   - the 500 lines of debug dump containing about 10 lines of valuable
>     information which you already decoded and condensed in order to
>     figure the problem out.
>
>   - anecdotes around the discovery which carry zero information and
>     often show that the scope of the problem was not fully
>     understood.
>
>   - pointless examples of how to trigger the fail
>
>   - In depth explanations of what the patch does instead of a concise
>     explanation at the conceptual level.
>
> You won't hear me complain about a concise and coherent in depth
> technical explanation of a problem.
>
> Writing changelogs is an art and I surely look at some of my own
> changelogs written long ago and yell at myself from time to time.
>
> Reading a patch goes top down obviously:
>
>       1) Subject line
>       2) Changelog
>       3) Patch.
>
> If I have to rummage for my crystal ball before #3 then I already spent
> more time than necessary. If the thing is some random feature then I
> might just say: try again. But if I get the sense that it is about a bug
> or has some smell of a shortcoming in the core code then I have to bite
> the bullet and decode it the hard way. Not the most efficient way. And
> from experience I can tell you that if #1 and #2 are already problematic
> then #3 needs some serious scrutiny in most cases.
>
> >> Hint: "Sometimes a chip needs to know" does not qualify :)
> >
> > Clearly I am not coherent.  ;-)  My only goal was to help enable
> > interrupts that were disabled / marked as wakeup (as per above,
> > documented to be OK) to work on Qualcomm chips.  This specifically
> > affects me because a driver that I need to work (cros_ec) does this.
>
> Mission accomplished :)
>
> > If IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND is good to add then it sounds like
> > a great plan to me.
>
> If it solves the problem and from what you explained it should do so
> then this is definitely the right way to go.

Wonderful!  Looking forward to Maulik's post doing it this way.

-Doug
Thomas Gleixner Aug. 14, 2020, 12:43 p.m. UTC | #9
Doug,

On Thu, Aug 13 2020 at 20:04, Doug Anderson wrote:
> On Thu, Aug 13, 2020 at 7:07 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>    Having a quick and dirty POC for illustration is fine and usually
>>    useful.
>
> OK, I will try to remember that, in the future, I should send
> questions rather than patches to you.  I'm always learning the

The quick and dirty POC patch for illustration along with the questions
is always good to catch my attention.

> workflows of the different maintainers, so sorry for killing so much
> time.  :(

No problem. 

>> If it solves the problem and from what you explained it should do so
>> then this is definitely the right way to go.
>
> Wonderful!  Looking forward to Maulik's post doing it this way.

/me closes the case for now and moves on.

Thanks

        tglx
Maulik Shah Aug. 18, 2020, 4:35 a.m. UTC | #10
Hi,

On 8/14/2020 4:28 AM, Doug Anderson wrote:
> Hi,
>
> On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>> Specifically the problem we're trying to address is when an IRQ is
>>> marked as "disabled" (driver called disable_irq()) but also marked as
>>> "wakeup" (driver called enable_irq_wake()).  As per my understanding,
>>> this means:
>>>
>>> * Don't call the interrupt handler for this interrupt until I call
>>> enable_irq() but keep tracking it (either in hardware or in software).
>>> Specifically it's a requirement that if the interrupt fires one or
>>> more times while masked the interrupt handler should be called as soon
>>> as enable_irq() is called.
>> irq_disable() has two operating modes:
>>
>>      1) Immediately mask the interrupt at the irq chip level
>>
>>      2) Software disable it. If an interrupt is raised while disabled
>>         then the flow handler observes disabled state, masks it, marks it
>>         pending and returns without invoking any device handler.
>>
>> On a subsequent irq_enable() the interrupt is unmasked if it was masked
>> and if the interrupt is marked pending and the interrupt is not level
>> type then it's attempted to retrigger it. Either in hardware or by a
>> software replay mechanism.
>>
>>> * If this interrupt fires while the system is suspended then please
>>> wake the system up.
>> Well, that's kinda contradicting itself. If the interrupt is masked then
>> what is the point? I'm surely missing something subtle here.
> This is how I've always been told that the API works and there are at
> least a handful of drivers in the kernel whose suspend routines both
> enable wakeup and call disable_irq().  Isn't this also documented as
> of commit f9f21cea3113 ("genirq: Clarify that irq wake state is
> orthogonal to enable/disable")?
>
>
>>> On some (many?) interrupt controllers a masked interrupt won't wake
>>> the system up.  Thus we need some point in time where the interrupt
>>> controller can unmask interrupts in hardware so that they can act as
>>> wakeups.
>> So far nobody told me about this until now, but why exactly do we need
>> yet another unspecified callback instead of simply telling the core via
>> an irq chip flag that it should always unmask the interrupt if it is a
>> wakeup source?
>>
>>> Also: if an interrupt was masked lazily this could be a good
>>> time to ensure that these interrupts _won't_ wake the system up.
>> Setting IRQCHIP_MASK_ON_SUSPEND does exactly that. No need for a chip
>> driver to do any magic. You just have to use it.
>>
>> So the really obvious counterpart for this is to have:
>>
>>         IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND
>>
>> and then do:
>>
>> @@ -81,6 +81,8 @@ static bool suspend_device_irq(struct ir
>>                   * IRQD_WAKEUP_ARMED is visible before we return from
>>                   * suspend_device_irqs().
>>                   */
>> +               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
>> +                       unmask_irq(desc);
>>                  return true;
>>          }
>>
>> plus the counterpart in the resume path. This also ensures that state is
>> consistent.
> This sounds wonderful to me.  Maulik: I think you could replace quite
> a few of the patches in the series and just use that.

Sure.

+               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
+                       unmask_irq(desc);

I tried this patch and it did not work as is.

Calling unmask_irq() only invokes the chip's .irq_unmask callback, but
the underlying irq_chip also has .irq_enable present.

Replacing the call with irq_enable() takes care of this: it invokes the
chip's .irq_enable if it's present, else it invokes unmask_irq().

+
+               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
+                       irq_enable(desc);

Probably IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND should also be renamed to
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND.
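
To illustrate the difference, below is a small userspace sketch.  It is
a model only, not the kernel implementation: struct model_hw, struct
model_chip and the model_*() helpers are hypothetical names, and it
assumes a chip whose .irq_enable does more than its .irq_unmask (here it
also sets a separate enable bit).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hardware state for one interrupt line. */
struct model_hw {
	bool enabled;	/* line enabled at the controller */
	bool masked;	/* line masked at the controller */
};

/* Hypothetical chip: .irq_enable is optional, .irq_unmask is the fallback. */
struct model_chip {
	void (*irq_enable)(struct model_hw *hw);
	void (*irq_unmask)(struct model_hw *hw);
};

/* Assumed callback behaviour: enable sets the enable bit and unmasks. */
static void hw_enable(struct model_hw *hw)
{
	hw->enabled = true;
	hw->masked = false;
}

/* Assumed callback behaviour: unmask only clears the mask bit. */
static void hw_unmask(struct model_hw *hw)
{
	hw->masked = false;
}

/* Models a call that only ever reaches the .irq_unmask callback. */
static void model_unmask_irq(const struct model_chip *chip, struct model_hw *hw)
{
	if (chip->irq_unmask)
		chip->irq_unmask(hw);
}

/* Models the fallback: prefer .irq_enable, else fall back to unmask. */
static void model_irq_enable(const struct model_chip *chip, struct model_hw *hw)
{
	if (chip->irq_enable)
		chip->irq_enable(hw);
	else
		model_unmask_irq(chip, hw);
}

/* In this model a line can wake the system only if enabled and unmasked. */
static bool model_can_wake(const struct model_hw *hw)
{
	return hw->enabled && !hw->masked;
}
```

With a chip that provides both callbacks, model_unmask_irq() leaves a
disabled line unable to wake, while model_irq_enable() makes it wakeup
capable; with an unmask-only chip the two calls behave the same.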

Thanks,
Maulik

>
>
>> The magic behind the back of the core code unmask brings core state and
>> hardware state out of sync. So if for whatever reason the interrupt is
>> raised in the CPU before the resume path can mask it again, then the
>> flow handler will see disabled state, invoke mask_irq() which does
>> nothing because core state is masked and if that's a level irq it will
>> come back forever.
>>
>>> Thus the point of these callbacks is to provide a hook for IRQ chips
>>> to do this.  Now that you understand the motivation perhaps you can
>>> suggest a better way to accomplish this if the approach in this patch
>>> is not OK.
>> See above.
>>
>>> I will note that a quick audit of existing users of the generic-chip's
>>> irq_suspend() shows that they are doing exactly this.  So the point of
>>> my patch is to actually allow other IRQ chips (ones that aren't using
>>> generic-chip) to do this type of thing.  At the same time my patch
>>> provides a way for current users of generic-chip to adapt their
>>> routines so they work without syscore (which, I guess, isn't
>>> compatible with s2idle).
>> If that's the main problem which is solved in these callbacks, then I
>> really have to ask why this has not been raised years ago. Why can't
>> people talk?
> Not all of us have the big picture that you do to know how things
> ought to work, I guess.  If nothing else someone looking at this
> problem would think: "this must be a common problem, let's go see how
> all the other places do it" and then they find how everyone else is
> doing it and do it that way.  It requires the grander picture that a
> maintainer has in order to say: whoa, everyone's copying the same
> hack--let's come up with a better solution.
>
>
>> IIRC back then when the callbacks for GC were added the reason was that
>> the affected chips needed a way to save and restore the full chip state
>> because the hardware lost it during suspend. S2idle did not exist back
>> then at least not in its current form. Oh well...
>>
>> But just replacing them by something which is yet another sinkhole for
>> horrible hacks behind the core code is not making it any better.
>>
>> I fear another sweep through the unpleasantries of chip drivers is due
>> sooner than later. Aside of finding time, I need to find my eyecancer
>> protection glasses and check my schnaps stock.
>>
>>>> So what happens in this case:
>>>>
>>>>     CPU0                         CPU1
>>>>     interrupt                    suspend_device_irq()
>>>>       handle()                     chip->suspend_one()
>>>>         action()                 ...
>>>>         chip->fiddle();
>>>>
>>>> ????
>>> Ah, so I guess we need to move the call to suspend_one_irq() till
>>> after the (potential) call to synchronize_irq() in
>>> suspend_device_irqs()?
>> For what you are trying to achieve, no. IRQCHIP_MASK_ON_SUSPEND is
>> already safe.
>>
>> If we add IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND then there is no sync
>> problem either.
>>
>>> Hopefully with the above explanation this makes more sense?
>> At least the explanation helped to understand the problem, while the
>> changelog was pretty useless in that regard:
>>
>>    "These two callbacks are interesting because sometimes an irq chip
>>     needs to know about suspend/resume."
>>
>> Really valuable and precise technical information.
> Funny to get yelled at for not providing a detailed enough changelog.
> Usually people complain that my changelogs are too detailed.  Sigh.
>
>
>> But aside of the confusion, even with your explanation of what you are
>> trying to solve, I really want a coherent explanation why this should be
>> done for any of those:
>>
>>    1) an interrupt which has no action, i.e. an interrupt which has no
>>       active users and is in the worst case completely deactivated or was
>>       never activated to begin with.
>>
>>       In the inactive case it might be in a state where unmask issues an
>>       invalid vector, causes hardware malfunction or hits undefined
>>       software state in the chip drivers in the hierarchy.
>>
>>       If you want to be woken up by irq X, then request irq X which
>>       ensures that irq X is in a usable state at all levels of the
>>       stack. If you call disable_irq() or mark the interrupt with
>>       IRQ_NOAUTOEN, fine, it's still consistent state.
>>
>>    2) interrupts which have no_suspend_depth > 0 which means that
>>       there is an action requested which explicitly says: don't touch me
>>       on suspend.
>>
>>       If that driver invokes disable_irq() then it can keep the pieces.
>>
>>    3) chained interrupts
>>
>>       They are never disabled and never masked. So why would anything
>>       need to be done here?
>>
>>       Side note: they should not exist at all, but that's a different
>>       story.
>>
>> If you don't have coherent explanations, then please just don't touch
>> that condition at all.
>>
>> Hint: "Sometimes a chip needs to know" does not qualify :)
> Clearly I am not coherent.  ;-)  My only goal was to help enable
> interrupts that were disabled / marked as wakeup (as per above,
> documented to be OK) to work on Qualcomm chips.  This specifically
> affects me because a driver that I need to work (cros_ec) does this.
> If IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND is good to add then it sounds like
> a great plan to me.
>
>
> -Doug
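
For readers following the three cases listed in the review, the unpatched condition in suspend_device_irq() can be modelled outside the kernel. This is only a standalone user-space sketch under stated assumptions, not kernel code: struct fake_desc, its fields, and model_suspend_device_irq() are hypothetical stand-ins for struct irq_desc and the real function, and the model tracks flags only (it does not disable or mask anything).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct irq_desc fields that matter here. */
struct fake_desc {
	bool has_action;               /* models desc->action != NULL */
	bool chained;                  /* models irq_desc_is_chained(desc) */
	unsigned int no_suspend_depth; /* models desc->no_suspend_depth */
	bool wakeup_set;               /* models irqd_is_wakeup_set() */
	bool wakeup_armed;             /* models IRQD_WAKEUP_ARMED */
	bool suspended;                /* models IRQS_SUSPENDED in desc->istate */
};

/* Returns true when the caller would have to issue synchronize_irq(). */
static bool model_suspend_device_irq(struct fake_desc *desc)
{
	assert(desc != NULL);

	/* Cases 1-3 from the review: these interrupts are left untouched. */
	if (!desc->has_action || desc->chained || desc->no_suspend_depth)
		return false;

	/* Wakeup sources are armed instead of being marked suspended. */
	if (desc->wakeup_set) {
		desc->wakeup_armed = true;
		return true;
	}

	desc->suspended = true;
	return true;
}
```

In this model a descriptor with no action is skipped entirely, one with an action and wakeup set ends up armed but not suspended, and one with only an action ends up suspended. The patch under review would additionally call the chip on all three paths, which is what the review objects to.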
Thomas Gleixner Aug. 18, 2020, 2:40 p.m. UTC | #11
Maulik,

On Tue, Aug 18 2020 at 10:05, Maulik Shah wrote:
> On 8/14/2020 4:28 AM, Doug Anderson wrote:
>> On Thu, Aug 13, 2020 at 3:09 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> +               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
> +                       unmask_irq(desc);
>
> I tried this patch and it did not work as is.
>
> Calling unmask_irq() only invokes the chip's .irq_unmask callback, but the
> underlying irq_chip also has .irq_enable present.
>
> Replacing the call with irq_enable() takes care of this: it invokes the
> chip's .irq_enable if it is present, and otherwise falls back to unmask_irq().
>
> +
> +               if (chip->flags & IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND)
> +                       irq_enable(desc);
>
> IRQCHIP_UNMASK_WAKEUP_ON_SUSPEND should then probably also be renamed to
> IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND.

Makes sense and also works when the interrupt is already enabled.

Thanks,

        tglx
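
The difference between the two calls discussed in this reply can be illustrated with a small user-space model. This is a sketch under stated assumptions rather than the kernel implementation: struct fake_chip, count_call(), model_unmask() and model_enable() are hypothetical names, and the model deliberately ignores the disabled/masked state bookkeeping that the real irq_enable() performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of struct irq_chip: only the two callbacks discussed. */
struct fake_chip {
	void (*irq_enable)(int *calls);
	void (*irq_unmask)(int *calls);
};

/* Shared callback body: just records that it was invoked. */
static void count_call(int *calls)
{
	(*calls)++;
}

/* Models unmask_irq(): it only ever reaches the chip's .irq_unmask. */
static void model_unmask(const struct fake_chip *chip, int *unmask_calls)
{
	assert(chip != NULL);
	if (chip->irq_unmask)
		chip->irq_unmask(unmask_calls);
}

/* Models the fallback described above: prefer .irq_enable, else unmask. */
static void model_enable(const struct fake_chip *chip,
			 int *enable_calls, int *unmask_calls)
{
	assert(chip != NULL);
	if (chip->irq_enable)
		chip->irq_enable(enable_calls);
	else
		model_unmask(chip, unmask_calls);
}
```

For a chip that provides both callbacks, model_unmask() never touches .irq_enable, which matches the failure reported with the first version of the change. model_enable() reaches .irq_enable on such a chip and still works on one that only provides .irq_unmask.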

Patch

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 1b7f4df..8d37b32 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -468,10 +468,16 @@  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  * @irq_bus_sync_unlock:function to sync and unlock slow bus (i2c) chips
  * @irq_cpu_online:	configure an interrupt source for a secondary CPU
  * @irq_cpu_offline:	un-configure an interrupt source for a secondary CPU
+ * @irq_suspend_one:	called on every irq to suspend it; called even if
+ *			this IRQ is configured for wakeup
+ * @irq_resume_one:	called on every irq to resume it; called even if
+ *			this IRQ is configured for wakeup
  * @irq_suspend:	function called from core code on suspend once per
- *			chip, when one or more interrupts are installed
+ *			chip, when one or more interrupts are installed;
+ *			only works if using irq/generic-chip
  * @irq_resume:		function called from core code on resume once per chip,
- *			when one ore more interrupts are installed
+ *			when one or more interrupts are installed;
+ *			only works if using irq/generic-chip
  * @irq_pm_shutdown:	function called from core code on shutdown once per chip
  * @irq_calc_mask:	Optional function to set irq_data.mask for special cases
  * @irq_print_chip:	optional to print special chip info in show_interrupts
@@ -515,6 +521,9 @@  struct irq_chip {
 	void		(*irq_cpu_online)(struct irq_data *data);
 	void		(*irq_cpu_offline)(struct irq_data *data);
 
+	void		(*irq_suspend_one)(struct irq_data *data);
+	void		(*irq_resume_one)(struct irq_data *data);
+
 	void		(*irq_suspend)(struct irq_data *data);
 	void		(*irq_resume)(struct irq_data *data);
 	void		(*irq_pm_shutdown)(struct irq_data *data);
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 857f5f4..caf80c1 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -447,6 +447,22 @@  void unmask_threaded_irq(struct irq_desc *desc)
 	unmask_irq(desc);
 }
 
+void suspend_one_irq(struct irq_desc *desc)
+{
+	struct irq_chip *chip = desc->irq_data.chip;
+
+	if (chip->irq_suspend_one)
+		chip->irq_suspend_one(&desc->irq_data);
+}
+
+void resume_one_irq(struct irq_desc *desc)
+{
+	struct irq_chip *chip = desc->irq_data.chip;
+
+	if (chip->irq_resume_one)
+		chip->irq_resume_one(&desc->irq_data);
+}
+
 /*
  *	handle_nested_irq - Handle a nested irq from a irq thread
  *	@irq:	the interrupt number
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 7db284b..11c2dac 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -90,6 +90,8 @@  extern void irq_percpu_disable(struct irq_desc *desc, unsigned int cpu);
 extern void mask_irq(struct irq_desc *desc);
 extern void unmask_irq(struct irq_desc *desc);
 extern void unmask_threaded_irq(struct irq_desc *desc);
+extern void suspend_one_irq(struct irq_desc *desc);
+extern void resume_one_irq(struct irq_desc *desc);
 
 #ifdef CONFIG_SPARSE_IRQ
 static inline void irq_mark_irq(unsigned int irq) { }
diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
index 8f557fa..b9e5338 100644
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -69,19 +69,23 @@  void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action)
 
 static bool suspend_device_irq(struct irq_desc *desc)
 {
+	bool sync = false;
+
 	if (!desc->action || irq_desc_is_chained(desc) ||
 	    desc->no_suspend_depth)
-		return false;
+		goto exit;
 
 	if (irqd_is_wakeup_set(&desc->irq_data)) {
 		irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
+
 		/*
 		 * We return true here to force the caller to issue
 		 * synchronize_irq(). We need to make sure that the
 		 * IRQD_WAKEUP_ARMED is visible before we return from
 		 * suspend_device_irqs().
 		 */
-		return true;
+		sync = true;
+		goto exit;
 	}
 
 	desc->istate |= IRQS_SUSPENDED;
@@ -95,7 +99,10 @@  static bool suspend_device_irq(struct irq_desc *desc)
 	 */
 	if (irq_desc_get_chip(desc)->flags & IRQCHIP_MASK_ON_SUSPEND)
 		mask_irq(desc);
-	return true;
+
+exit:
+	suspend_one_irq(desc);
+	return sync;
 }
 
 /**
@@ -137,6 +144,8 @@  EXPORT_SYMBOL_GPL(suspend_device_irqs);
 
 static void resume_irq(struct irq_desc *desc)
 {
+	resume_one_irq(desc);
+
 	irqd_clear(&desc->irq_data, IRQD_WAKEUP_ARMED);
 
 	if (desc->istate & IRQS_SUSPENDED)