diff mbox

irq: Consider a negative return value of irq_startup() as an error

Message ID 1393345719-3707-1-git-send-email-jjhiblot@traphandler.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jean-Jacques Hiblot Feb. 25, 2014, 4:28 p.m. UTC
The irq_startup() function returns the return value of the irq_startup callback
of the underlying irq_chip. Currently this value only tells if the interrupt is
pending, but we can make it also return an error code when it fails.

Signed-off-by: Jean-Jacques Hiblot <jjhiblot@traphandler.com>
---

Hi Thomas,

This patch updates the semantic of the return value of the irq_startup()
callback of an irq_chip : a negative value indicates an error. The purpose is
to make __setup_irq() fail if the irq_chip can't startup the interrupt.
I checked in the kernel for drivers impacted by this. It turns out that most
of the implementation of irq_startup() return 0. The only drivers that return
non-zero values are:
 * arch/x86/kernel/apic/io_apic.c: returns 1 when an interrupt is pending
 * drivers/pinctrl/pinctrl-adi2.c: may return -ENODEV.

GPIO-based irq_chip could use this feature because an output gpio can't be used
as an interrupt, and thus a call to irq_startup() is likely to fail. You can
check out http://www.spinics.net/lists/arm-kernel/msg302713.html for a
reference.

Regards,

Jean-Jacques

 include/linux/irq.h    | 2 ++
 kernel/irq/autoprobe.c | 2 +-
 kernel/irq/manage.c    | 4 +++-
 3 files changed, 6 insertions(+), 2 deletions(-)

Comments

Thomas Gleixner Feb. 26, 2014, 10:15 p.m. UTC | #1
On Tue, 25 Feb 2014, Jean-Jacques Hiblot wrote:

> The irq_startup() function returns the return value of the
> irq_startup callback of the underlying irq_chip. Currently this
> value only tells if the interrupt is pending, but we can make it
> also return an error code when it fails.
> 
> Signed-off-by: Jean-Jacques Hiblot <jjhiblot@traphandler.com>
> ---
> 
> Hi Thomas,
> 
> This patch updates the semantic of the return value of the irq_startup()
> callback of an irq_chip : a negative value indicates an error. The purpose is
> to make __setup_irq() fail if the irq_chip can't startup the interrupt.

How does that work?

    unsigned int    (*irq_startup)(struct irq_data *data);

> I checked in the kernel for drivers impacted by this. It turns out that most
> of the implementation of irq_startup() return 0. The only drivers that return
> non-zero values are:
>  * arch/x86/kernel/apic/io_apic.c: returns 1 when an interrupt is pending
>  * drivers/pinctrl/pinctrl-adi2.c: may return -ENODEV.

   which is wrong to begin with.
 
> GPIO-based irq_chip could use this feature because an output gpio
> can't be used as an interrupt, and thus a call to irq_startup() is
> likely to fail. You can check out
> http://www.spinics.net/lists/arm-kernel/msg302713.html for a
> reference.

And that tells me that the design of the gpio stuff is just wrong.

The whole drviers/gpio directory is full of this gpio_lock_as_irq()
called from the guts of irq_chip callbacks braindamage.

The comment above gpiod_lock_as_irq() is so wrong it's not even funny
anymore.

"....or in the .irq_enable() from its irq_chip implementation ..."

	void            (*irq_enable)(struct irq_data *data);

So how does that help, if the gpio is not available as irq?

And no, you are not going to cure that design brainfart by adding a
negative return value to a function returning unsigned int. And I'm
not going to accept anything which is just a duct tape based
workaround for a complete braindamaged design.

Lets look at the usage sites:

The following irqchip callback implementations printk some blurb and
proceed:

 sirfsoc_gpio_irq_startup
 nmk_gpio_irq_startup
 msm_gpio_irq_startup
 u300_gpio_irq_enable
 byt_irq_startup
 mcp23s08_irq_startup
 lp_irq_startup
 intel_mid_irq_startup
 em_gio_irq_startup
 bcm_kona_gpio_irq_startup
 adnp_irq_startup

The following functions return an error code from the set_type
callback:

 tegra_gpio_irq_set_type
 gpio_irq_type

The whole idiocy starts with commit d468bf9e, which tells people to
call this from irq_startup/enable. But it fails to tell, that this is
pointless because it wont prevent the interrupt from starting up.

So 11 SoC lemmings followed and added this completely pointless
nonsense to their drivers.

Two lemmings added it to a function which can actually fail. Failure
as well, because there is no guarantee that this function gets invoked
before request_irq(). What's worse is that if request_irq() succeeds
then the following code which tries to set the type fails for no
obvious reason.

Now at least Jean-Jacques tried to make it actually fail, but that
turns out to be a failure on its own.

Dammit, I told you folks often enough that workarounds in some
subsystem do not actually cure a shortcoming in the core code. When
the core code was written, the GPIO case which might actually fail was
definitly not thought of. But that's not a reason for adding
tasteless and useless workarounds to the GPIO subsystem.

Of course you encouraged people to copy that nonsense all over the
place. All startup functions do the same thing:

    if (gpio_lock_as_irq() < 0)
       dev_err("BLA");
    unmask_irq();

Plus the shutdown functions having the gpio_unlock_as_irq() +
mask_irq() sequence. Sigh.

So the proper solution to the problem if we want to use irq_startup()
is:

  Change the return value of the irq_startup() callback to int and
  fixup all existing implementations. This wants to be done with a
  coccinelle script. If you hurry up, I might squeeze it into 3.14 as
  there is no risk at all. Otherwise this is going to happen after the
  3.15 merge window closes.

  After that's done, remove all the copies of startup/shutdown
  nonsense in drivers/gpio and force all gpio chips to call
  gpio_set_irq_chip*() instead of irq_set_irq_chip*() and add generic
  irq_startup/shutdown callbacks which are implemented in the gpiolib
  core code.

There is a less intrusive alternative solution:

  Mark the interrupt as GPIO based in the core and let the core call
  conditonally the gpio_[un]lock_as_irq functions.

And no, I'm not going to provide you the proper solution on the silver
tablet this time. Go and figure it out yourself and if you're
confident enough that it might be an acceptable solution, post it and
we'll see.

Thanks,

	tglx
Jean-Jacques Hiblot Feb. 27, 2014, 4:03 p.m. UTC | #2
2014-02-26 23:15 GMT+01:00 Thomas Gleixner <tglx@linutronix.de>:
> On Tue, 25 Feb 2014, Jean-Jacques Hiblot wrote:
>
>> The irq_startup() function returns the return value of the
>> irq_startup callback of the underlying irq_chip. Currently this
>> value only tells if the interrupt is pending, but we can make it
>> also return an error code when it fails.
>>
>> Signed-off-by: Jean-Jacques Hiblot <jjhiblot@traphandler.com>
>> ---
>>
>> Hi Thomas,
>>
>> This patch updates the semantic of the return value of the irq_startup()
>> callback of an irq_chip : a negative value indicates an error. The purpose is
>> to make __setup_irq() fail if the irq_chip can't startup the interrupt.
>
> How does that work?
>
>     unsigned int    (*irq_startup)(struct irq_data *data);
well .. no excuse sir
>
>> I checked in the kernel for drivers impacted by this. It turns out that most
>> of the implementation of irq_startup() return 0. The only drivers that return
>> non-zero values are:
>>  * arch/x86/kernel/apic/io_apic.c: returns 1 when an interrupt is pending
>>  * drivers/pinctrl/pinctrl-adi2.c: may return -ENODEV.
>
>    which is wrong to begin with.
>
>> GPIO-based irq_chip could use this feature because an output gpio
>> can't be used as an interrupt, and thus a call to irq_startup() is
>> likely to fail. You can check out
>> http://www.spinics.net/lists/arm-kernel/msg302713.html for a
>> reference.
>
> And that tells me that the design of the gpio stuff is just wrong.
>
> The whole drviers/gpio directory is full of this gpio_lock_as_irq()
> called from the guts of irq_chip callbacks braindamage.
>
> The comment above gpiod_lock_as_irq() is so wrong it's not even funny
> anymore.
>
> "....or in the .irq_enable() from its irq_chip implementation ..."
>
>         void            (*irq_enable)(struct irq_data *data);
>
> So how does that help, if the gpio is not available as irq?
>
> And no, you are not going to cure that design brainfart by adding a
> negative return value to a function returning unsigned int. And I'm
> not going to accept anything which is just a duct tape based
> workaround for a complete braindamaged design.
>
> Lets look at the usage sites:
>
> The following irqchip callback implementations printk some blurb and
> proceed:
>
>  sirfsoc_gpio_irq_startup
>  nmk_gpio_irq_startup
>  msm_gpio_irq_startup
>  u300_gpio_irq_enable
>  byt_irq_startup
>  mcp23s08_irq_startup
>  lp_irq_startup
>  intel_mid_irq_startup
>  em_gio_irq_startup
>  bcm_kona_gpio_irq_startup
>  adnp_irq_startup
>
> The following functions return an error code from the set_type
> callback:
>
>  tegra_gpio_irq_set_type
>  gpio_irq_type
>
> The whole idiocy starts with commit d468bf9e, which tells people to
> call this from irq_startup/enable. But it fails to tell, that this is
> pointless because it wont prevent the interrupt from starting up.
>
From a GPIO subsystem standpoint, the IRQ lock is sensible because
using a gpio as an interrupt source, adds some restrictions to its
usage. Ouput configuration and IRQ configuration are exclusive for
example.
That  request_irq() succeeds if a GPIO can't be used as interrupt, is
however a problem. It can be fixed.


> So 11 SoC lemmings followed and added this completely pointless
> nonsense to their drivers.
>
> Two lemmings added it to a function which can actually fail. Failure
> as well, because there is no guarantee that this function gets invoked
> before request_irq(). What's worse is that if request_irq() succeeds
> then the following code which tries to set the type fails for no
> obvious reason.
>
> Now at least Jean-Jacques tried to make it actually fail, but that
> turns out to be a failure on its own.
just to clarify: is the failure the problem with the unsigned or the
principle of making request_irq fail ?

>
> Dammit, I told you folks often enough that workarounds in some
> subsystem do not actually cure a shortcoming in the core code. When
> the core code was written, the GPIO case which might actually fail was
> definitly not thought of. But that's not a reason for adding
> tasteless and useless workarounds to the GPIO subsystem.
>
> Of course you encouraged people to copy that nonsense all over the
> place. All startup functions do the same thing:
>
>     if (gpio_lock_as_irq() < 0)
>        dev_err("BLA");
>     unmask_irq();
>
> Plus the shutdown functions having the gpio_unlock_as_irq() +
> mask_irq() sequence. Sigh.
>
> So the proper solution to the problem if we want to use irq_startup()
> is:
>
>   Change the return value of the irq_startup() callback to int and
>   fixup all existing implementations. This wants to be done with a
>   coccinelle script. If you hurry up, I might squeeze it into 3.14 as
>   there is no risk at all. Otherwise this is going to happen after the
>   3.15 merge window closes.
I'll post a patch doing just this in the next email.

Jean-Jacques
>
>   After that's done, remove all the copies of startup/shutdown
>   nonsense in drivers/gpio and force all gpio chips to call
>   gpio_set_irq_chip*() instead of irq_set_irq_chip*() and add generic
>   irq_startup/shutdown callbacks which are implemented in the gpiolib
>   core code.
>
> There is a less intrusive alternative solution:
>
>   Mark the interrupt as GPIO based in the core and let the core call
>   conditonally the gpio_[un]lock_as_irq functions.
>
> And no, I'm not going to provide you the proper solution on the silver
> tablet this time. Go and figure it out yourself and if you're
> confident enough that it might be an acceptable solution, post it and
> we'll see.
>
> Thanks,
>
>         tglx
Linus Walleij March 7, 2014, 2:49 a.m. UTC | #3
On Thu, Feb 27, 2014 at 6:15 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 25 Feb 2014, Jean-Jacques Hiblot wrote:
>
> The whole drviers/gpio directory is full of this gpio_lock_as_irq()
> called from the guts of irq_chip callbacks braindamage.

OK then we need to figure out where to place this call to lock
the lines as used by IRQs properly.

The whole thing came out as a result of the resolution of a
discussion about semantic dependencies in drivers/gpio, as drivers
were not handling gpio_* calls to gpiolib and irqchip stuff
orthogonally instead relying on implicit handling of call
dependencies.

Anyway, using a GPIO line as an IRQ makes some other usecases
(like setting it to output and bitbanging it as an I2C) impossible,
and the gpiolib definately needs to be aware that it is used as
an IRQ. And some usecases (like the line being an output)
are impossible so the use as IRQ need to be denied.

> The comment above gpiod_lock_as_irq() is so wrong it's not even funny
> anymore.

Sent a patch to fix this.

> The whole idiocy starts with commit d468bf9e, which tells people to
> call this from irq_startup/enable. But it fails to tell, that this is
> pointless because it wont prevent the interrupt from starting up.

The call is still useful for the gpiolib intrinsics, we would be better off
even if the error to lock the line as IRQ was just printed and then
ignored, as then atleast the error message would tell users that they
are trying to do something that is impossible. Before they had no clue
and were debugging their GPIO irqs for days...

> So 11 SoC lemmings followed and added this completely pointless
> nonsense to their drivers.

Maybe for the irq subsystem, but the gpio subsystem is very
happy about now knowing if a GPIO is used as IRQ or not.
It has implications for the (horrid) sysfs API to GPIO for example.

> Dammit, I told you folks often enough that workarounds in some
> subsystem do not actually cure a shortcoming in the core code. When
> the core code was written, the GPIO case which might actually fail was
> definitly not thought of. But that's not a reason for adding
> tasteless and useless workarounds to the GPIO subsystem.

So while it is true that the situation where the locking to IRQ
fails was not handled properly, this was not added to workaround
shortcomings in other subsystems, but rather to make gpiolib
aware that a certain line is used for IRQs so as to stop people
from shooting themselves in the foot using the gpiolib.

> Of course you encouraged people to copy that nonsense all over the
> place.

Not Jean-Jacques! I take the sole honor for that :-)

> All startup functions do the same thing:
>
>     if (gpio_lock_as_irq() < 0)
>        dev_err("BLA");
>     unmask_irq();
>
> Plus the shutdown functions having the gpio_unlock_as_irq() +
> mask_irq() sequence. Sigh.
>
> So the proper solution to the problem if we want to use irq_startup()
> is:
>
>   Change the return value of the irq_startup() callback to int and
>   fixup all existing implementations. This wants to be done with a
>   coccinelle script. If you hurry up, I might squeeze it into 3.14 as
>   there is no risk at all. Otherwise this is going to happen after the
>   3.15 merge window closes.

I think Jean-Jacques says he's onto this so great.

>   After that's done, remove all the copies of startup/shutdown
>   nonsense in drivers/gpio and force all gpio chips to call
>   gpio_set_irq_chip*() instead of irq_set_irq_chip*() and add generic
>   irq_startup/shutdown callbacks which are implemented in the gpiolib
>   core code.

We probably need to mirror irq_set_chip_and_handler() too since
most drivers use this, but I like this idea.

> There is a less intrusive alternative solution:
>
>   Mark the interrupt as GPIO based in the core and let the core call
>   conditonally the gpio_[un]lock_as_irq functions.

I think the first one is better actually.

I'll see what Jean-Jacques comes up with and take it from there unless
he's interested in taking it all the way.

Yours,
Linus Walleij
diff mbox

Patch

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 7dc1003..d566855 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -282,6 +282,8 @@  static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  *
  * @name:		name for /proc/interrupts
  * @irq_startup:	start up the interrupt (defaults to ->enable if NULL)
+ *			negative return values indicate an error, otherwise the
+ *			return values indicates if an interrupt is pending
  * @irq_shutdown:	shut down the interrupt (defaults to ->disable if NULL)
  * @irq_enable:		enable the interrupt (defaults to chip->unmask if NULL)
  * @irq_disable:	disable the interrupt
diff --git a/kernel/irq/autoprobe.c b/kernel/irq/autoprobe.c
index 0119b9d..64794fa 100644
--- a/kernel/irq/autoprobe.c
+++ b/kernel/irq/autoprobe.c
@@ -70,7 +70,7 @@  unsigned long probe_irq_on(void)
 		raw_spin_lock_irq(&desc->lock);
 		if (!desc->action && irq_settings_can_probe(desc)) {
 			desc->istate |= IRQS_AUTODETECT | IRQS_WAITING;
-			if (irq_startup(desc, false))
+			if (irq_startup(desc, false) > 0)
 				desc->istate |= IRQS_PENDING;
 		}
 		raw_spin_unlock_irq(&desc->lock);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 481a13c..58c6be0 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1116,7 +1116,9 @@  __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
 			desc->istate |= IRQS_ONESHOT;
 
 		if (irq_settings_can_autoenable(desc))
-			irq_startup(desc, true);
+			ret = irq_startup(desc, true);
+			if (ret < 0)
+				goto out_mask;
 		else
 			/* Undo nested disables: */
 			desc->depth = 1;