Message ID | 20180503065553.7762-1-jeffy.chen@rock-chips.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
+ Doug Hi Jeffy, On Thu, May 03, 2018 at 02:55:53PM +0800, Jeffy Chen wrote: > We saw spurious irq when changing irq's trigger type, for example > setting gpio-keys's wakeup irq trigger type. > > And according to the TRM: > "Programming the GPIO registers for interrupt capability, edge-sensitive > or level-sensitive interrupts, and interrupt polarity should be > completed prior to enabling the interrupts on Port A in order to prevent > spurious glitches on the interrupt lines to the interrupt controller." > > Reported-by: Brian Norris <briannorris@google.com> > Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> > --- > > drivers/pinctrl/pinctrl-rockchip.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/pinctrl/pinctrl-rockchip.c b/drivers/pinctrl/pinctrl-rockchip.c > index 3924779f55785..7ff45ec8330d1 100644 > --- a/drivers/pinctrl/pinctrl-rockchip.c > +++ b/drivers/pinctrl/pinctrl-rockchip.c > @@ -2727,9 +2727,19 @@ static int rockchip_irq_set_type(struct irq_data *d, unsigned int type) > return -EINVAL; > } > > + /** The typical multiline comment style is to use only 1 asterisk -- 2 asterisks usually denote something more formal, like kerneldoc. > + * According to the TRM, we should keep irq disabled during programming > + * interrupt capability to prevent spurious glitches on the interrupt > + * lines to the interrupt controller. > + */ It's also worth noting that this case may come up in rockchip_irq_demux() for EDGE_BOTH triggers: /* * Triggering IRQ on both rising and falling edge * needs manual intervention. */ if (bank->toggle_edge_mode & BIT(irq)) { ... polarity = readl_relaxed(bank->reg_base + GPIO_INT_POLARITY); if (data & BIT(irq)) polarity &= ~BIT(irq); else polarity |= BIT(irq); writel(polarity, bank->reg_base + GPIO_INT_POLARITY); AIUI, this case is not actually a problem, because this polarity reconfiguration happens before we call generic_handle_irq(), so the extra spurious interrupt is handled and cleared after this point. But it seems like this should be noted somewhere in either the commit message, the code comments, or both. On the other hand...this also implies there may be a race condition there, where we might lose an interrupt if there is an edge between the re-configuration of the polarity in rockchip_irq_demux() and the clearing/handling of the interrupt (handle_edge_irq() -> chip->irq_ack()). If we have an edge between there, then we might ack it, but leave the polarity such that we aren't ready for the next (inverted) edge. I'm not 100% sure about the above, so it might be good to verify it. But I also expect this means there's really no way to 100% reliably implement both-edge trigger types on hardware like this that doesn't directly implement it. So I'm not sure what we'd do with that knowledge. > + data = readl(bank->reg_base + GPIO_INTEN); > + writel_relaxed(data & ~mask, gc->reg_base + GPIO_INTEN); Not sure if this is a needless optimization: but couldn't you only disable the interrupt (and make the level and polarity changes) only if the level or polarity are actually changing? For example, it's possible to end up here when changing from EDGE_BOTH to EDGE_RISING, but we might not actually program a new value. Brian > + > writel_relaxed(level, gc->reg_base + GPIO_INTTYPE_LEVEL); > writel_relaxed(polarity, gc->reg_base + GPIO_INT_POLARITY); > > + writel_relaxed(data, gc->reg_base + GPIO_INTEN); > + > irq_gc_unlock(gc); > raw_spin_unlock_irqrestore(&bank->slock, flags); > clk_disable(bank->clk); > -- > 2.11.0 > >
Hi Brian, On 05/08/2018 06:15 AM, Brian Norris wrote: > + Doug > > Hi Jeffy, > > On Thu, May 03, 2018 at 02:55:53PM +0800, Jeffy Chen wrote: >> We saw spurious irq when changing irq's trigger type, for example >> setting gpio-keys's wakeup irq trigger type. >> >> And according to the TRM: >> "Programming the GPIO registers for interrupt capability, edge-sensitive >> or level-sensitive interrupts, and interrupt polarity should be >> completed prior to enabling the interrupts on Port A in order to prevent >> spurious glitches on the interrupt lines to the interrupt controller." >> >> Reported-by: Brian Norris <briannorris@google.com> >> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> >> --- >> >> drivers/pinctrl/pinctrl-rockchip.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/drivers/pinctrl/pinctrl-rockchip.c b/drivers/pinctrl/pinctrl-rockchip.c >> index 3924779f55785..7ff45ec8330d1 100644 >> --- a/drivers/pinctrl/pinctrl-rockchip.c >> +++ b/drivers/pinctrl/pinctrl-rockchip.c >> @@ -2727,9 +2727,19 @@ static int rockchip_irq_set_type(struct irq_data *d, unsigned int type) >> return -EINVAL; >> } >> >> + /** > > The typical multiline comment style is to use only 1 asterisk -- 2 > asterisks usually denote something more formal, like kerneldoc. ok, will fix it. > >> + * According to the TRM, we should keep irq disabled during programming >> + * interrupt capability to prevent spurious glitches on the interrupt >> + * lines to the interrupt controller. >> + */ > > It's also worth noting that this case may come up in > rockchip_irq_demux() for EDGE_BOTH triggers: > > /* > * Triggering IRQ on both rising and falling edge > * needs manual intervention. > */ > if (bank->toggle_edge_mode & BIT(irq)) { > ... > polarity = readl_relaxed(bank->reg_base + > GPIO_INT_POLARITY); > if (data & BIT(irq)) > polarity &= ~BIT(irq); > else > polarity |= BIT(irq); > writel(polarity, > bank->reg_base + GPIO_INT_POLARITY); > > AIUI, this case is not actually a problem, because this polarity > reconfiguration happens before we call generic_handle_irq(), so the > extra spurious interrupt is handled and cleared after this point. But it > seems like this should be noted somewhere in either the commit message, > the code comments, or both. just from my testing, the spurious irq only happen when set EDGE_RISING to a gpio which is already high level, or set EDGE_FALLING to a gpio which is already low level.so the demux / EDGE_BOTH seem ok. but adding comments as your suggested is a good idea, will do that. > > On the other hand...this also implies there may be a race condition > there, where we might lose an interrupt if there is an edge between the > re-configuration of the polarity in rockchip_irq_demux() and the > clearing/handling of the interrupt (handle_edge_irq() -> > chip->irq_ack()). If we have an edge between there, then we might ack > it, but leave the polarity such that we aren't ready for the next > (inverted) edge. if let me guess, the unexpected irq we saw is the hardware trying to avoid losing irq? for example, we set a EDGE_RISING, and the hardware saw the gpio is already high, then though it might lost an irq, so fake one for safe? i'll try to confirm it with IC guys. > > I'm not 100% sure about the above, so it might be good to verify it. But > I also expect this means there's really no way to 100% reliably > implement both-edge trigger types on hardware like this that doesn't > directly implement it. So I'm not sure what we'd do with that knowledge. > >> + data = readl(bank->reg_base + GPIO_INTEN); >> + writel_relaxed(data & ~mask, gc->reg_base + GPIO_INTEN); > > Not sure if this is a needless optimization: but couldn't you only > disable the interrupt (and make the level and polarity changes) only if > the level or polarity are actually changing? For example, it's possible > to end up here when changing from EDGE_BOTH to EDGE_RISING, but we might > not actually program a new value. right, i noticed that too, will add a patch for that in v2. > > Brian > >> + >> writel_relaxed(level, gc->reg_base + GPIO_INTTYPE_LEVEL); >> writel_relaxed(polarity, gc->reg_base + GPIO_INT_POLARITY); >> >> + writel_relaxed(data, gc->reg_base + GPIO_INTEN); >> + >> irq_gc_unlock(gc); >> raw_spin_unlock_irqrestore(&bank->slock, flags); >> clk_disable(bank->clk); >> -- >> 2.11.0 >> >> > > >
On Tue, May 08, 2018 at 09:36:24AM +0800, Jeffy Chen wrote: > On 05/08/2018 06:15 AM, Brian Norris wrote: > > On the other hand...this also implies there may be a race condition > > there, where we might lose an interrupt if there is an edge between the > > re-configuration of the polarity in rockchip_irq_demux() and the > > clearing/handling of the interrupt (handle_edge_irq() -> > > chip->irq_ack()). If we have an edge between there, then we might ack > > it, but leave the polarity such that we aren't ready for the next > > (inverted) edge. > > if let me guess, the unexpected irq we saw is the hardware trying to avoid > losing irq? for example, we set a EDGE_RISING, and the hardware saw the gpio > is already high, then though it might lost an irq, so fake one for safe? I won't pretend to know what the IC designers were doing, but I don't think that would resolve the problem I'm talking about. The sequence is something like: 1. EDGE_BOTH IRQ occurs (e.g., low to high) 2. reconfigure polarity in rockchip_irq_demux() (polarity=low) 3. continue to handle_edge_irq() 4. another HW edge occurs (e.g., high to low) 5. handle_edge_irq() (from 3) acks (clears) IRQ (before a subsequent rockchip_irq_demux() gets a chance to run and flip the polarity) ... Now the polarity is still low, but the next trigger should be a low-to-high edge. > i'll try to confirm it with IC guys. Brian
Hi Brian, On 05/08/2018 09:56 AM, Brian Norris wrote: > On Tue, May 08, 2018 at 09:36:24AM +0800, Jeffy Chen wrote: >> On 05/08/2018 06:15 AM, Brian Norris wrote: >>> On the other hand...this also implies there may be a race condition >>> there, where we might lose an interrupt if there is an edge between the >>> re-configuration of the polarity in rockchip_irq_demux() and the >>> clearing/handling of the interrupt (handle_edge_irq() -> >>> chip->irq_ack()). If we have an edge between there, then we might ack >>> it, but leave the polarity such that we aren't ready for the next >>> (inverted) edge. >> >> if let me guess, the unexpected irq we saw is the hardware trying to avoid >> losing irq? for example, we set a EDGE_RISING, and the hardware saw the gpio >> is already high, then though it might lost an irq, so fake one for safe? > > I won't pretend to know what the IC designers were doing, but I don't > think that would resolve the problem I'm talking about. The sequence is > something like: > 1. EDGE_BOTH IRQ occurs (e.g., low to high) > 2. reconfigure polarity in rockchip_irq_demux() (polarity=low) > 3. continue to handle_edge_irq() > 4. another HW edge occurs (e.g., high to low) > 5. handle_edge_irq() (from 3) acks (clears) IRQ (before a subsequent > rockchip_irq_demux() gets a chance to run and flip the polarity) > ... > > Now the polarity is still low, but the next trigger should be a > low-to-high edge. oops, i see the problem. so what if we do these: 1/ edge irq triggered 2/ read gpio level 3/ handle irq(ack irq) 4/ toggle edge mode(with a while gpio level check) if the gpio changed in 2/ -> 3/, the 4/ will trigger an irq when writing GPIO_INT_POLARITY(which is what we are trying to avoid in the set_type case) but this would not work if i'm wrong about how the HW fake an irq when changing POLARITY... or maybe we could just check the gpio status again after handle_edge_irq, and correct the polarity in this case > >> i'll try to confirm it with IC guys. > > Brian > > >
Hi Brian, On 05/08/2018 10:31 AM, JeffyChen wrote: > Hi Brian, > > On 05/08/2018 09:56 AM, Brian Norris wrote: >> On Tue, May 08, 2018 at 09:36:24AM +0800, Jeffy Chen wrote: >>> On 05/08/2018 06:15 AM, Brian Norris wrote: >>>> On the other hand...this also implies there may be a race condition >>>> there, where we might lose an interrupt if there is an edge between the >>>> re-configuration of the polarity in rockchip_irq_demux() and the >>>> clearing/handling of the interrupt (handle_edge_irq() -> >>>> chip->irq_ack()). If we have an edge between there, then we might ack >>>> it, but leave the polarity such that we aren't ready for the next >>>> (inverted) edge. >>> >>> if let me guess, the unexpected irq we saw is the hardware trying to >>> avoid >>> losing irq? for example, we set a EDGE_RISING, and the hardware saw >>> the gpio >>> is already high, then though it might lost an irq, so fake one for safe? >> >> I won't pretend to know what the IC designers were doing, but I don't >> think that would resolve the problem I'm talking about. The sequence is >> something like: >> 1. EDGE_BOTH IRQ occurs (e.g., low to high) >> 2. reconfigure polarity in rockchip_irq_demux() (polarity=low) >> 3. continue to handle_edge_irq() >> 4. another HW edge occurs (e.g., high to low) >> 5. handle_edge_irq() (from 3) acks (clears) IRQ (before a subsequent >> rockchip_irq_demux() gets a chance to run and flip the polarity) >> ... >> >> Now the polarity is still low, but the next trigger should be a >> low-to-high edge. > > oops, i see the problem. > > so what if we do these: > 1/ edge irq triggered > 2/ read gpio level > 3/ handle irq(ack irq) > 4/ toggle edge mode(with a while gpio level check) > > if the gpio changed in 2/ -> 3/, the 4/ will trigger an irq when writing > GPIO_INT_POLARITY(which is what we are trying to avoid in the set_type > case) > > but this would not work if i'm wrong about how the HW fake an irq when > changing POLARITY... > > > or maybe we could just check the gpio status again after > handle_edge_irq, and correct the polarity in this case i saw the pinctrl-msm.c do this in the ack(), and also at the end of set_type(), which might avoid another race in the set_type() > >> >>> i'll try to confirm it with IC guys. >> >> Brian >> >> >> >
Hi, On Mon, May 7, 2018 at 6:56 PM, Brian Norris <briannorris@chromium.org> wrote: > On Tue, May 08, 2018 at 09:36:24AM +0800, Jeffy Chen wrote: >> On 05/08/2018 06:15 AM, Brian Norris wrote: >> > On the other hand...this also implies there may be a race condition >> > there, where we might lose an interrupt if there is an edge between the >> > re-configuration of the polarity in rockchip_irq_demux() and the >> > clearing/handling of the interrupt (handle_edge_irq() -> >> > chip->irq_ack()). If we have an edge between there, then we might ack >> > it, but leave the polarity such that we aren't ready for the next >> > (inverted) edge. >> >> if let me guess, the unexpected irq we saw is the hardware trying to avoid >> losing irq? for example, we set a EDGE_RISING, and the hardware saw the gpio >> is already high, then though it might lost an irq, so fake one for safe? > > I won't pretend to know what the IC designers were doing, but I don't > think that would resolve the problem I'm talking about. The sequence is > something like: > 1. EDGE_BOTH IRQ occurs (e.g., low to high) > 2. reconfigure polarity in rockchip_irq_demux() (polarity=low) > 3. continue to handle_edge_irq() > 4. another HW edge occurs (e.g., high to low) > 5. handle_edge_irq() (from 3) acks (clears) IRQ (before a subsequent > rockchip_irq_demux() gets a chance to run and flip the polarity) > ... > > Now the polarity is still low, but the next trigger should be a > low-to-high edge. > >> i'll try to confirm it with IC guys. One note is that in the case Brian points at (where we need to simulate EDGE_BOTH by swapping edges) we purposely ignored the TRM and we needed to do that to avoid losing interrupts. For details, see commit 53b1bfc76df2 ("pinctrl: rockchip: Avoid losing interrupts when supporting both edges"). We did this because: 1. We believed that the IP block in Rockchip SoCs has nearly the same logic as "gpio-dwapb.c" and that's what "gpio-dwapb.c" did. 2. We were actually losing real interrupts and this was the only way we could figure out how to fix it. When I tested that back in the day I was fairly convinced that we weren't losing any interrupts in the EDGE_BOTH case after my fix, but I certainly could have messed up. For the EDGE_BOTH case it was important not to lose an interrupt because, as you guys are talking about, we could end up configured the wrong way. I think in your case where you're just picking one polarity losing an interrupt shouldn't matter since it's undefined exactly if an edge happens while you're in the middle of executing rockchip_irq_set_type(). Is that right? -Doug
Hi Doug, On 05/09/2018 03:46 AM, Doug Anderson wrote: > One note is that in the case Brian points at (where we need to > simulate EDGE_BOTH by swapping edges) we purposely ignored the TRM and > we needed to do that to avoid losing interrupts. For details, see > commit 53b1bfc76df2 ("pinctrl: rockchip: Avoid losing interrupts when > supporting both edges"). We did this because: > > 1. We believed that the IP block in Rockchip SoCs has nearly the same > logic as "gpio-dwapb.c" and that's what "gpio-dwapb.c" did. > hmm, but i saw the gpio-dwapb.c actually toggle trigger after handle irq, which might avoid the race Brian mentioned: + generic_handle_irq(gpio_irq); + irq_status &= ~BIT(hwirq); + + if ((irq_get_trigger_type(gpio_irq) & IRQ_TYPE_SENSE_MASK) + == IRQ_TYPE_EDGE_BOTH) + dwapb_toggle_trigger(gpio, hwirq); and i also saw ./qcom/pinctrl-msm.c do the toggle(msm_gpio_update_dual_edge_pos) at the end of msm_gpio_irq_set_type and msm_gpio_irq_ack, that seems can avoid the polarity races too. > 2. We were actually losing real interrupts and this was the only way > we could figure out how to fix it. > > When I tested that back in the day I was fairly convinced that we > weren't losing any interrupts in the EDGE_BOTH case after my fix, but > I certainly could have messed up. > > > For the EDGE_BOTH case it was important not to lose an interrupt > because, as you guys are talking about, we could end up configured the > wrong way. I think in your case where you're just picking one > polarity losing an interrupt shouldn't matter since it's undefined > exactly if an edge happens while you're in the middle of executing > rockchip_irq_set_type(). Is that right? right, so we now have 2 cases: rockchip_irq_demux/ rockchip_irq_set_type if i'm right about the spurious irq(only happen when set rising for a high gpio, or set falling for a low gpio), then: 1/ rockchip_irq_demux it's important to not losing irqs in this case, maybe we can a) ack irq b) update polarity for edge both irq we don't need to disable irq in b), since we would not hit the spurious irq cases here(always check gpio level to toggle it) 2/ rockchip_irq_set_type it's important to not having spurious irqs so we can disable irq during changing polarity only in these case: ((rising && gpio is heigh) || (falling && gpio is low)) i'm still confirming the spurious irq with IC guys. > > > -Doug > > >
Hi, On Tue, May 8, 2018 at 7:21 PM, JeffyChen <jeffy.chen@rock-chips.com> wrote: > Hi Doug, > > On 05/09/2018 03:46 AM, Doug Anderson wrote: >> >> One note is that in the case Brian points at (where we need to >> simulate EDGE_BOTH by swapping edges) we purposely ignored the TRM and >> we needed to do that to avoid losing interrupts. For details, see >> commit 53b1bfc76df2 ("pinctrl: rockchip: Avoid losing interrupts when >> supporting both edges"). We did this because: >> >> 1. We believed that the IP block in Rockchip SoCs has nearly the same >> logic as "gpio-dwapb.c" and that's what "gpio-dwapb.c" did. >> > > hmm, but i saw the gpio-dwapb.c actually toggle trigger after handle irq, > which might avoid the race Brian mentioned: > + generic_handle_irq(gpio_irq); > + irq_status &= ~BIT(hwirq); > + > + if ((irq_get_trigger_type(gpio_irq) & IRQ_TYPE_SENSE_MASK) > + == IRQ_TYPE_EDGE_BOTH) > + dwapb_toggle_trigger(gpio, hwirq); The code you point at in dwapb_toggle_trigger() specifically is an example of toggling the polarity _without_ disabling the interrupt in between. We took this as "working" code and as proof that it was OK to change polarity without disabling the interrupt in-between. > and i also saw ./qcom/pinctrl-msm.c do the > toggle(msm_gpio_update_dual_edge_pos) at the end of msm_gpio_irq_set_type > and msm_gpio_irq_ack, that seems can avoid the polarity races too. > >> 2. We were actually losing real interrupts and this was the only way >> we could figure out how to fix it. >> >> When I tested that back in the day I was fairly convinced that we >> weren't losing any interrupts in the EDGE_BOTH case after my fix, but >> I certainly could have messed up. >> >> >> For the EDGE_BOTH case it was important not to lose an interrupt >> because, as you guys are talking about, we could end up configured the >> wrong way. I think in your case where you're just picking one >> polarity losing an interrupt shouldn't matter since it's undefined >> exactly if an edge happens while you're in the middle of executing >> rockchip_irq_set_type(). Is that right? > > > right, so we now have 2 cases: rockchip_irq_demux/ rockchip_irq_set_type > > if i'm right about the spurious irq(only happen when set rising for a high > gpio, or set falling for a low gpio), then: > > 1/ rockchip_irq_demux > it's important to not losing irqs in this case, maybe we can > > a) ack irq > b) update polarity for edge both irq > > we don't need to disable irq in b), since we would not hit the spurious irq > cases here(always check gpio level to toggle it) Unless you have some sort of proof that rockchip_irq_demux(), I would take it as an example of something that works. I remember stress testing the heck out of it. Do you have some evidence that it's not working? I think Brian was simply speculating that there might be a race here, but I don't think anyone has shown it have they? Looking back at my notes, the thing I really made sure to stress was that we never got into a situation where we were losing an edge (AKA we were never configured to look for a falling edge when the line was already low). I'm not sure I confirmed that we never got an extra interrupt. I'm at home right now and I can't add prints and poke at things, but as I understand it for edge interrupts the usual flow to make sure interrupts aren't ever lost is: 1. See that the interrupt went off 2. Ack it (clear it) 3. Call the interrupt handler ...presumably in this case rockchip_irq_demux() is called after step #2 (but I don't know if it's called before or after step #3). If the line is toggling like crazy while the 3 steps are going on, it's OK if the interrupt handler is called more than once. In general this could be considered expected. That's why you Ack before handling--any extra edges that come in any time after the interrupt handler starts (even after the very first instruction) need to cause the interrupt handler to get called again. This is different than Brian's understanding since he seemed to think the Ack was happening later. If you're in front of something where you can add printouts, maybe you can tell us. I tried to look through the code and it was too twisted for me to be sure. > 2/ rockchip_irq_set_type > it's important to not having spurious irqs > > so we can disable irq during changing polarity only in these case: > ((rising && gpio is heigh) || (falling && gpio is low)) > > i'm still confirming the spurious irq with IC guys. Hmmm, thinking about all this more, I'm curious how callers expect this to work. Certainly things are undefined if you have the following situation: Start: rising edge trigger, line is actually high Request: change to falling edge trigger Line falls during the request In that case it's OK to throw the interrupt away because it can be argued that the line could have fallen before the request actually took place. ...but it seems like there could be situations where the user wouldn't expect interrupts to be thrown away by a call to irq_set_type(). In other words: Start: rising edge trigger, line is actually high Request: change to falling edge trigger Line falls, rises, and falls during the request ...in that case you'd expect that some sort of interrupt would have gone off and not be thrown away. No matter what instant in time the request actually took place it should have caught an edge, right? Said another way: As a client of irq_set_type() I'd expect it to not throw away interrupts, but I'd expect that the change in type would be atomic. That is: if the interrupt came in before the type change in type applied then it should trigger with the old rules. If the interrupt came in after the type change then it should trigger with the new rules. I would be tempted to confirm your testing and just clear the spurious interrupts that you're aware of. AKA: if there's no interrupt pending and you change the type to "IRQ_TYPE_EDGE_RISING" or "IRQ_TYPE_EDGE_FALLING" then you should clear the interrupt. It's still racy, but I guess it's the best you can do unless IC guys come up with something better. Anyway, it's past my bedtime. Hopefully some of the above made sense. I'm sure you'll tell me if it didn't or if I said something stupid/wrong. :-P -Doug
Hi Doug, Thanks for your reply :) On 05/09/2018 01:18 PM, Doug Anderson wrote: >> > >> > >> >right, so we now have 2 cases: rockchip_irq_demux/ rockchip_irq_set_type >> > >> >if i'm right about the spurious irq(only happen when set rising for a high >> >gpio, or set falling for a low gpio), then: >> > >> >1/ rockchip_irq_demux >> >it's important to not losing irqs in this case, maybe we can >> > >> >a) ack irq >> >b) update polarity for edge both irq >> > >> >we don't need to disable irq in b), since we would not hit the spurious irq >> >cases here(always check gpio level to toggle it) > Unless you have some sort of proof that rockchip_irq_demux(), I would > take it as an example of something that works. I remember stress > testing the heck out of it. Do you have some evidence that it's not > working? I think Brian was simply speculating that there might be a > race here, but I don't think anyone has shown it have they? Looking > back at my notes, the thing I really made sure to stress was that we > never got into a situation where we were losing an edge (AKA we were > never configured to look for a falling edge when the line was already > low). I'm not sure I confirmed that we never got an extra interrupt. > > I'm at home right now and I can't add prints and poke at things, but > as I understand it for edge interrupts the usual flow to make sure > interrupts aren't ever lost is: > > 1. See that the interrupt went off > 2. Ack it (clear it) > 3. Call the interrupt handler > > ...presumably in this case rockchip_irq_demux() is called after step > #2 (but I don't know if it's called before or after step #3). If the > line is toggling like crazy while the 3 steps are going on, it's OK if > the interrupt handler is called more than once. In general this could > be considered expected. That's why you Ack before handling--any extra > edges that come in any time after the interrupt handler starts (even > after the very first instruction) need to cause the interrupt handler > to get called again. > > This is different than Brian's understanding since he seemed to think > the Ack was happening later. If you're in front of something where > you can add printouts, maybe you can tell us. I tried to look through > the code and it was too twisted for me to be sure. i think the current edge both irq flow for rk3399 would be: gic_handle_irq //irq-gic-v3.c handle_domain_irq rockchip_irq_demux //pinctrl-rockchip.c toggle polarity generic_handle_irq handle_edge_irq //kernel/irq irq_ack handle_irq_event action->handler so i think the race might actually exist (maybe we can add some delay after toggle polarity to confirm) BTW, checking other drivers, there're quite a few using this kind of toggle edge for edge both, and they go different ways: 1/ toggle it in ack(): mediatek/pinctrl-mtk-common.c gpio-ingenic.c gpio-ep93xx.c 2/ toggle it before handle_irq: pinctrl-rockchip.c pinctrl-armada-37xx.c gpio-ath79.c gpio-mxs.c gpio-omap.c gpio-mvebu.c 3/ toggle it after handle_irq: gpio-dwapb.c gpio-pmic-eic-sprd.c would it make sense to support this kind of emulate edge both irq in some core codes? > > >> >2/ rockchip_irq_set_type >> >it's important to not having spurious irqs >> > >> >so we can disable irq during changing polarity only in these case: >> >((rising && gpio is heigh) || (falling && gpio is low)) >> > >> >i'm still confirming the spurious irq with IC guys. > Hmmm, thinking about all this more, I'm curious how callers expect > this to work. Certainly things are undefined if you have the > following situation: > > Start: rising edge trigger, line is actually high > Request: change to falling edge trigger > Line falls during the request > > In that case it's OK to throw the interrupt away because it can be > argued that the line could have fallen before the request actually > took place. ...but it seems like there could be situations where the > user wouldn't expect interrupts to be thrown away by a call to > irq_set_type(). In other words: > > Start: rising edge trigger, line is actually high > Request: change to falling edge trigger > Line falls, rises, and falls during the request > > ...in that case you'd expect that some sort of interrupt would have > gone off and not be thrown away. No matter what instant in time the > request actually took place it should have caught an edge, right? > > > Said another way: As a client of irq_set_type() I'd expect it to not > throw away interrupts, but I'd expect that the change in type would be > atomic. That is: if the interrupt came in before the type change in > type applied then it should trigger with the old rules. If the > interrupt came in after the type change then it should trigger with > the new rules. > > > I would be tempted to confirm your testing and just clear the spurious > interrupts that you're aware of. AKA: if there's no interrupt pending > and you change the type to "IRQ_TYPE_EDGE_RISING" or > "IRQ_TYPE_EDGE_FALLING" then you should clear the interrupt. It's > still racy, but I guess it's the best you can do unless IC guys come > up with something better. > hmmm, right, clear the spurious irq seems to be a better way, will try to do it in the next version. > > > Anyway, it's past my bedtime. Hopefully some of the above made sense. > I'm sure you'll tell me if it didn't or if I said something > stupid/wrong.:-P > > -Doug > > >
+ irqchip maintainers [ irqchip is weird -- it's all over drivers/{pinctrl,gpio,irqchip}/ :D ] Hi Doug, On Tue, May 08, 2018 at 10:18:18PM -0700, Doug Anderson wrote: > On Tue, May 8, 2018 at 7:21 PM, JeffyChen <jeffy.chen@rock-chips.com> wrote: > > On 05/09/2018 03:46 AM, Doug Anderson wrote: > >> One note is that in the case Brian points at (where we need to > >> simulate EDGE_BOTH by swapping edges) we purposely ignored the TRM and > >> we needed to do that to avoid losing interrupts. For details, see > >> commit 53b1bfc76df2 ("pinctrl: rockchip: Avoid losing interrupts when > >> supporting both edges"). We did this because: > >> > >> 1. We believed that the IP block in Rockchip SoCs has nearly the same > >> logic as "gpio-dwapb.c" and that's what "gpio-dwapb.c" did. > >> > > > > hmm, but i saw the gpio-dwapb.c actually toggle trigger after handle irq, > > which might avoid the race Brian mentioned: > > + generic_handle_irq(gpio_irq); > > + irq_status &= ~BIT(hwirq); > > + > > + if ((irq_get_trigger_type(gpio_irq) & IRQ_TYPE_SENSE_MASK) > > + == IRQ_TYPE_EDGE_BOTH) > > + dwapb_toggle_trigger(gpio, hwirq); > > The code you point at in dwapb_toggle_trigger() specifically is an > example of toggling the polarity _without_ disabling the interrupt in > between. We took this as "working" code and as proof that it was OK > to change polarity without disabling the interrupt in-between. There's a crucial ordering difference though: gpio-dwapb performs its polarity adjustments *after* calling generic_handle_irq(), which among other things calls ->irq_ack(). This means that it's re-ensuring that the polarity is correct *after* the point at which it last ack'ed the interrupt. So there's no chance of it clearing a second interrupt without appropriately reconfiguring the polarity. > > and i also saw ./qcom/pinctrl-msm.c do the > > toggle(msm_gpio_update_dual_edge_pos) at the end of msm_gpio_irq_set_type > > and msm_gpio_irq_ack, that seems can avoid the polarity races too. > > > >> 2. We were actually losing real interrupts and this was the only way > >> we could figure out how to fix it. > >> > >> When I tested that back in the day I was fairly convinced that we > >> weren't losing any interrupts in the EDGE_BOTH case after my fix, but > >> I certainly could have messed up. > >> > >> > >> For the EDGE_BOTH case it was important not to lose an interrupt > >> because, as you guys are talking about, we could end up configured the > >> wrong way. I think in your case where you're just picking one > >> polarity losing an interrupt shouldn't matter since it's undefined > >> exactly if an edge happens while you're in the middle of executing > >> rockchip_irq_set_type(). Is that right? > > > > > > right, so we now have 2 cases: rockchip_irq_demux/ rockchip_irq_set_type > > > > if i'm right about the spurious irq(only happen when set rising for a high > > gpio, or set falling for a low gpio), then: > > > > 1/ rockchip_irq_demux > > it's important to not losing irqs in this case, maybe we can > > > > a) ack irq > > b) update polarity for edge both irq > > > > we don't need to disable irq in b), since we would not hit the spurious irq > > cases here(always check gpio level to toggle it) > > Unless you have some sort of proof that rockchip_irq_demux(), I would > take it as an example of something that works. I remember stress > testing the heck out of it. Do you have some evidence that it's not > working? I think Brian was simply speculating that there might be a > race here, but I don't think anyone has shown it have they? Looking > back at my notes, the thing I really made sure to stress was that we > never got into a situation where we were losing an edge (AKA we were > never configured to look for a falling edge when the line was already > low). I'm not sure I confirmed that we never got an extra interrupt. I'll agree wholeheartedly that I'm only at the speculation stage right now :) I can try to poke at it sometime if I get some cycles. I'd definitely want to get better test results to prove this before changing this part. This is really just a side tangent anyway, because apparently the existing code is working well enough for rockchip_irq_demux(), and for $subject, we're really only working on improving set_type(). > I'm at home right now and I can't add prints and poke at things, but > as I understand it for edge interrupts the usual flow to make sure > interrupts aren't ever lost is: > > 1. See that the interrupt went off > 2. Ack it (clear it) > 3. Call the interrupt handler > > ...presumably in this case rockchip_irq_demux() is called after step > #2 (but I don't know if it's called before or after step #3). If the One thing to note here is that we're talking about a 2-level (chained) interrupt system. We've got the GIC, which does all of 1, 2, 3 at its level, and as part of #3 for the GIC, it runs the 2nd-level handler -- rockchip_irq_demux() -- which has to perform all of 1, 2, 3 at its level. 1 -> this is looping over GPIO_INT_STATUS in rockchip_irq_demux() 2 -> this happens when writing to GPIO_PORTS_EOI, which only is called by ->irq_ack() (irq_gc_ack_set_bit()) ... which happens from: rockchip_irq_demux() -> generic_handle_irq() -> handle_edge_irq() -> desc->irq_data.chip->irq_ack() = irq_gc_ack_set_bit() 3 -> same callpath as 2, except it's -> handle_edge_irq() -> handle_irq_event() Those all happen in the correct order. But the problem is that you left out the part about "change polarity for emulating EDGE_BOTH"; in your example (gpio-dwapb.c), polarity adjustment happens *after* 2; in Rockchip's driver, we have it before. I'm pretty sure dwapb is correct, and we are not. > line is toggling like crazy while the 3 steps are going on, it's OK if > the interrupt handler is called more than once. In general this could > be considered expected. That's why you Ack before handling--any extra > edges that come in any time after the interrupt handler starts (even > after the very first instruction) need to cause the interrupt handler > to get called again. > > This is different than Brian's understanding since he seemed to think > the Ack was happening later. If you're in front of something where > you can add printouts, maybe you can tell us. I tried to look through > the code and it was too twisted for me to be sure. I'm not sure your understanding of my understanding is accurate :) Hopefully the above clarifies what I'm thinking? > > 2/ rockchip_irq_set_type > > it's important to not having spurious irqs > > > > so we can disable irq during changing polarity only in these case: > > ((rising && gpio is heigh) || (falling && gpio is low)) > > > > i'm still confirming the spurious irq with IC guys. > > Hmmm, thinking about all this more, I'm curious how callers expect > this to work. Certainly things are undefined if you have the > following situation: > > Start: rising edge trigger, line is actually high > Request: change to falling edge trigger > Line falls during the request > > In that case it's OK to throw the interrupt away because it can be > argued that the line could have fallen before the request actually > took place. ...but it seems like there could be situations where the > user wouldn't expect interrupts to be thrown away by a call to > irq_set_type(). In other words: > > Start: rising edge trigger, line is actually high > Request: change to falling edge trigger > Line falls, rises, and falls during the request > > ...in that case you'd expect that some sort of interrupt would have > gone off and not be thrown away. No matter what instant in time the > request actually took place it should have caught an edge, right? > > > Said another way: As a client of irq_set_type() I'd expect it to not > throw away interrupts, but I'd expect that the change in type would be > atomic. That is: if the interrupt came in before the type change in > type applied then it should trigger with the old rules. If the > interrupt came in after the type change then it should trigger with > the new rules. I'm not sure it's totally possible to differentiate these, but that seems about right I think. > I would be tempted to confirm your testing and just clear the spurious > interrupts that you're aware of. AKA: if there's no interrupt pending > and you change the type to "IRQ_TYPE_EDGE_RISING" or > "IRQ_TYPE_EDGE_FALLING" then you should clear the interrupt. It's > still racy, but I guess it's the best you can do unless IC guys come > up with something better. Thanks! Yeah, clearing (rather than temporarily disabling) seems to make more sense. > Anyway, it's past my bedtime. Hopefully some of the above made sense. > I'm sure you'll tell me if it didn't or if I said something > stupid/wrong. :-P Brian
diff --git a/drivers/pinctrl/pinctrl-rockchip.c b/drivers/pinctrl/pinctrl-rockchip.c index 3924779f55785..7ff45ec8330d1 100644 --- a/drivers/pinctrl/pinctrl-rockchip.c +++ b/drivers/pinctrl/pinctrl-rockchip.c @@ -2727,9 +2727,19 @@ static int rockchip_irq_set_type(struct irq_data *d, unsigned int type) return -EINVAL; } + /** + * According to the TRM, we should keep irq disabled during programming + * interrupt capability to prevent spurious glitches on the interrupt + * lines to the interrupt controller. + */ + data = readl(bank->reg_base + GPIO_INTEN); + writel_relaxed(data & ~mask, gc->reg_base + GPIO_INTEN); + writel_relaxed(level, gc->reg_base + GPIO_INTTYPE_LEVEL); writel_relaxed(polarity, gc->reg_base + GPIO_INT_POLARITY); + writel_relaxed(data, gc->reg_base + GPIO_INTEN); + irq_gc_unlock(gc); raw_spin_unlock_irqrestore(&bank->slock, flags); clk_disable(bank->clk);
We saw spurious irq when changing irq's trigger type, for example setting gpio-keys's wakeup irq trigger type. And according to the TRM: "Programming the GPIO registers for interrupt capability, edge-sensitive or level-sensitive interrupts, and interrupt polarity should be completed prior to enabling the interrupts on Port A in order to prevent spurious glitches on the interrupt lines to the interrupt controller." Reported-by: Brian Norris <briannorris@google.com> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> --- drivers/pinctrl/pinctrl-rockchip.c | 10 ++++++++++ 1 file changed, 10 insertions(+)