Message ID | 20220223185353.51370-1-andriy.shevchenko@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | [v1,1/1] IB/hfi1: Don't cast parameter in bit operations |
From: Andy Shevchenko
> Sent: 23 February 2022 18:54
>
> While in this particular case it would not be a (critical) issue,
> the pattern itself is bad and error prone in case somebody blindly
> copies to their code.

It is horribly wrong on BE systems.

> Don't cast parameter to unsigned long pointer in the bit operations.
> Instead copy to a local variable on stack of a proper type and use.
>
> Fixes: 7724105686e7 ("IB/hfi1: add driver files")
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
>  drivers/infiniband/hw/hfi1/chip.c | 29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
> index f1245c94ae26..100274b926d3 100644
> --- a/drivers/infiniband/hw/hfi1/chip.c
> +++ b/drivers/infiniband/hw/hfi1/chip.c
> @@ -8286,34 +8286,33 @@ static void is_interrupt(struct hfi1_devdata *dd, unsigned int source)
>  irqreturn_t general_interrupt(int irq, void *data)
>  {
>  	struct hfi1_devdata *dd = data;
> -	u64 regs[CCE_NUM_INT_CSRS];
> +	DECLARE_BITMAP(pending, CCE_NUM_INT_CSRS * 64);
> +	u64 value;
>  	u32 bit;
>  	int i;
> -	irqreturn_t handled = IRQ_NONE;
>
>  	this_cpu_inc(*dd->int_counter);
>
>  	/* phase 1: scan and clear all handled interrupts */
>  	for (i = 0; i < CCE_NUM_INT_CSRS; i++) {
> -		if (dd->gi_mask[i] == 0) {
> -			regs[i] = 0;	/* used later */
> -			continue;
> -		}
> -		regs[i] = read_csr(dd, CCE_INT_STATUS + (8 * i)) &
> -				dd->gi_mask[i];
> +		if (dd->gi_mask[i] == 0)
> +			value = 0;	/* used later */
> +		else
> +			value = read_csr(dd, CCE_INT_STATUS + (8 * i)) & dd->gi_mask[i];
> +
> +		/* save for further use */
> +		bitmap_from_u64(&pending[BITS_TO_LONGS(i * 64)], value);
> +
>  		/* only clear if anything is set */
> -		if (regs[i])
> -			write_csr(dd, CCE_INT_CLEAR + (8 * i), regs[i]);
> +		if (value)
> +			write_csr(dd, CCE_INT_CLEAR + (8 * i), value);
>  	}

I think I'd leave all that alone.

>  	/* phase 2: call the appropriate handler */
> -	for_each_set_bit(bit, (unsigned long *)&regs[0],
> -			 CCE_NUM_INT_CSRS * 64) {
> +	for_each_set_bit(bit, pending, CCE_NUM_INT_CSRS * 64)

And do something else for that loop instead.

>  		is_interrupt(dd, bit);
> -		handled = IRQ_HANDLED;
> -	}
>
> -	return handled;
> +	return IRQ_RETVAL(!bitmap_empty(pending, CCE_NUM_INT_CSRS * 64));

You really don't want to scan the bitmap again.

Actually, on the face of it, you could merge the two loops.
Provided you clear the status bit before calling the relevant
handler I expect it will all work.

	David

>  }
>
>  irqreturn_t sdma_interrupt(int irq, void *data)
> --
> 2.34.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
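Editorial note: the suggestion to merge the two loops can be sketched in userspace as follows. This is only a rough model under stated assumptions, not the driver's code and not the v2 patch: `NUM_CSRS`, `fake_status`, `fake_mask` and every `model_*` function are hypothetical stand-ins for `CCE_NUM_INT_CSRS`, the CSR accessors and `is_interrupt()`, and `__builtin_ctzll()` is a GCC/Clang builtin. It shows one way to walk the set bits of each 64-bit value directly, so that no `unsigned long *` cast and no second bitmap scan is needed.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CSRS 4	/* hypothetical stand-in for CCE_NUM_INT_CSRS */

/* Hypothetical device model: status registers and per-register masks. */
static uint64_t fake_status[NUM_CSRS];
static uint64_t fake_mask[NUM_CSRS];

/* Records which sources were dispatched, in order. */
static unsigned int dispatched[NUM_CSRS * 64];
static unsigned int num_dispatched;

static uint64_t model_read_status(int i)
{
	return fake_status[i];
}

static void model_clear_status(int i, uint64_t value)
{
	fake_status[i] &= ~value;
}

static void model_handle(unsigned int source)
{
	assert(num_dispatched < NUM_CSRS * 64);
	dispatched[num_dispatched++] = source;
}

/* Single pass: read, clear, then dispatch. Returns 1 if anything was handled. */
static int model_general_interrupt(void)
{
	int handled = 0;

	for (int i = 0; i < NUM_CSRS; i++) {
		uint64_t value;

		if (fake_mask[i] == 0)
			continue;

		value = model_read_status(i) & fake_mask[i];
		if (value == 0)
			continue;

		/* Clear the status bits before calling any handler. */
		model_clear_status(i, value);
		handled = 1;

		while (value) {
			unsigned int bit = (unsigned int)__builtin_ctzll(value);

			model_handle((unsigned int)i * 64 + bit);
			value &= value - 1;	/* drop the lowest set bit */
		}
	}

	return handled;
}
```

Note that this changes the ordering relative to the original two-phase code, which clears every register before the first handler runs; whether that matters for the hardware is not established in this thread.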
On Wed, Feb 23, 2022 at 09:44:32PM +0000, David Laight wrote:
> From: Andy Shevchenko
> > Sent: 23 February 2022 18:54
> >
> > While in this particular case it would not be a (critical) issue,
> > the pattern itself is bad and error prone in case somebody blindly
> > copies to their code.
>
> It is horribly wrong on BE systems.

You mean the pattern? Yes, it has three issues regarding endianness and
potential out-of-bounds access.

...

> > -	return handled;
> > +	return IRQ_RETVAL(!bitmap_empty(pending, CCE_NUM_INT_CSRS * 64));

> You really don't want to scan the bitmap again.

Either way it wastes cycles; the outcome depends on the actual distribution of
the interrupts across the bitmap. If they are gathered closer to the beginning
of the bitmap, my code wins, otherwise the original one does.

> Actually, on the face of it, you could merge the two loops.
> Provided you clear the status bit before calling the relevant
> handler I expect it will all work.

True. I will consider that for v2.
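Editorial note: the endianness problem with the cast can be illustrated with a small userspace model. It assumes a 32-bit `unsigned long`, modelled here by the hypothetical type `model_ulong`, because that is where a `u64` reinterpreted as an array of longs comes out in the wrong word order on a big-endian CPU. All `model_*` names are made up for illustration; `model_bitmap_from_u64()` only mimics the low-half-first copy that the kernel's `bitmap_from_u64()` is used for in the patch.

```c
#include <stdint.h>

/* Hypothetical 32-bit "unsigned long", so the effect shows on any host. */
typedef uint32_t model_ulong;

/* Portable copy: low half goes to word 0, so bit N of the u64 is bit N of the map. */
static void model_bitmap_from_u64(model_ulong dst[2], uint64_t value)
{
	dst[0] = (model_ulong)(value & 0xffffffffu);
	dst[1] = (model_ulong)(value >> 32);
}

/*
 * What casting a u64 to (unsigned long *) would expose on a 32-bit
 * big-endian CPU: the high half sits at the lower address, so it
 * becomes word 0 of the "bitmap".
 */
static void model_cast_view_be32(model_ulong dst[2], uint64_t value)
{
	dst[0] = (model_ulong)(value >> 32);
	dst[1] = (model_ulong)(value & 0xffffffffu);
}

/* Bitmap-style lookup: word index first, then the bit within that word. */
static int model_test_bit(const model_ulong *map, unsigned int nr)
{
	return (int)((map[nr / 32] >> (nr % 32)) & 1u);
}
```

In this model, bit 5 of the 64-bit value shows up as bit 37 in the cast view, while the explicit copy keeps it at bit 5. On a 64-bit CPU the two views coincide, which is consistent with the remark in the commit message that this particular case is not a critical issue.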
From: 'Andy Shevchenko'
> Sent: 23 February 2022 22:30
>
> On Wed, Feb 23, 2022 at 09:44:32PM +0000, David Laight wrote:
> > From: Andy Shevchenko
> > > Sent: 23 February 2022 18:54
> > >
> > > While in this particular case it would not be a (critical) issue,
> > > the pattern itself is bad and error prone in case somebody blindly
> > > copies to their code.
> >
> > It is horribly wrong on BE systems.
>
> You mean the pattern? Yes, it has three issues regarding endianness and
> potential out-of-bounds access.

Never mind the misaligned page-boundary-crossing locked access.

> ...
>
> > > -	return handled;
> > > +	return IRQ_RETVAL(!bitmap_empty(pending, CCE_NUM_INT_CSRS * 64));
>
> > You really don't want to scan the bitmap again.
>
> Either way it wastes cycles; the outcome depends on the actual distribution of
> the interrupts across the bitmap. If they are gathered closer to the beginning
> of the bitmap, my code wins, otherwise the original one does.

The loop in bitmap_empty() will kill you - even if the first word is non-zero.

Or just 'or' together the 'value' written to clear the pending
interrupts in the first loop.

Or just return IRQ_HANDLED ;-)

Depending on exactly how the interrupt system works on your hardware
it is perfectly possible to get another ISR entry for an IRQ bit
you just cleared.
Which can generate a 'spurious interrupt' message when IRQ_HANDLED
isn't returned (maybe not in Linux...)

It is easiest to see how that can happen with a level sensitive
interrupt request.
The write to clear the pending register can get delayed (posted bus write)
long enough for the CPU to have actually exited the ISR.
So the IRQ line is still set and the ISR re-entered.
But no pending bits are now set.

Put enough PCIe bridges in a system and overload PCIe links
and you might get the same to happen for MSI-X.
Especially since there will be additional delays on the device itself
converting the internal IRQ into the required PCIe write.
	David
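Editorial note: the "just 'or' together the 'value'" idea can be sketched as below. This is a userspace approximation under assumptions, not driver code: `model_scan()` and `model_irq_retval()` are hypothetical names, the arrays stand in for the CSR reads, and the final comparison only mirrors what `IRQ_RETVAL()` does with a boolean in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Phase 1 only: latch the masked status of each register into regs[]
 * and accumulate the OR of everything that was latched.
 */
static uint64_t model_scan(const uint64_t *status, const uint64_t *mask,
			   uint64_t *regs, size_t n)
{
	uint64_t any = 0;

	for (size_t i = 0; i < n; i++) {
		regs[i] = mask[i] ? (status[i] & mask[i]) : 0;
		any |= regs[i];
	}

	return any;
}

/* Nonzero means "handled"; no second pass over the bitmap is required. */
static int model_irq_retval(uint64_t any)
{
	return any != 0;
}
```

The accumulator costs one OR per register inside a loop that already runs, so the return value no longer depends on where the pending bits happen to sit.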
On Wed, Feb 23, 2022 at 10:50:19PM +0000, David Laight wrote:
> From: 'Andy Shevchenko'
> > Sent: 23 February 2022 22:30
> > On Wed, Feb 23, 2022 at 09:44:32PM +0000, David Laight wrote:
> > > From: Andy Shevchenko
> > > > Sent: 23 February 2022 18:54

...

> > Either way it wastes cycles; the outcome depends on the actual distribution of
> > the interrupts across the bitmap. If they are gathered closer to the beginning
> > of the bitmap, my code wins, otherwise the original one does.
>
> The loop in bitmap_empty() will kill you - even if the first word is non-zero.

What loop? Did you really look into the implementation of bitmap_empty()?
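Editorial note: the disagreement is about how much work an emptiness check does once a set bit exists. The sketch below is a simplified userspace model, not the kernel's implementation; it assumes, as this editor recalls for kernels of that period, that `bitmap_empty()` on a multi-word bitmap reduces to a find-first-bit style scan that stops at the first non-zero word. The `model_*` names and the `words_read` counter are hypothetical, and `__builtin_ctzl()` is a GCC/Clang builtin.

```c
#include <limits.h>
#include <stddef.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Returns the index of the first set bit, or nbits if none is set.
 * *words_read reports how many words had to be inspected.
 */
static size_t model_find_first_bit(const unsigned long *map, size_t nbits,
				   size_t *words_read)
{
	size_t nwords = (nbits + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;

	*words_read = 0;
	for (size_t w = 0; w < nwords; w++) {
		unsigned long word = map[w];

		(*words_read)++;
		if (word) {
			size_t bit = w * MODEL_BITS_PER_LONG +
				     (size_t)__builtin_ctzl(word);

			/* Ignore stray bits past the end of the bitmap. */
			return bit < nbits ? bit : nbits;
		}
	}

	return nbits;
}

/* Empty means no set bit was found before the scan ran out of bits. */
static int model_bitmap_empty(const unsigned long *map, size_t nbits,
			      size_t *words_read)
{
	return model_find_first_bit(map, nbits, words_read) == nbits;
}
```

Under that assumption the check touches one word when the first word is non-zero and every word only when the bitmap really is empty, which matches the earlier point in the thread that the cost depends on where the pending bits are.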
On 2/23/22 4:44 PM, David Laight wrote:
> From: Andy Shevchenko
>> Sent: 23 February 2022 18:54
>>
>> While in this particular case it would not be a (critical) issue,
>> the pattern itself is bad and error prone in case somebody blindly
>> copies to their code.
>
> It is horribly wrong on BE systems.

Note that this driver has in Kconfig: depends on X86_64.

However it's a good point about not wanting anyone to blindly copy.

-Denny
On 2/23/22 5:30 PM, 'Andy Shevchenko' wrote:
> On Wed, Feb 23, 2022 at 09:44:32PM +0000, David Laight wrote:
>> From: Andy Shevchenko
>>> Sent: 23 February 2022 18:54
>>>
>>> While in this particular case it would not be a (critical) issue,
>>> the pattern itself is bad and error prone in case somebody blindly
>>> copies to their code.
>>
>> It is horribly wrong on BE systems.
>
> You mean the pattern? Yes, it has three issues regarding endianness and
> potential out-of-bounds access.
>
> ...
>
>>> -	return handled;
>>> +	return IRQ_RETVAL(!bitmap_empty(pending, CCE_NUM_INT_CSRS * 64));
>
>> You really don't want to scan the bitmap again.
>
> Either way it wastes cycles; the outcome depends on the actual distribution of
> the interrupts across the bitmap. If they are gathered closer to the beginning
> of the bitmap, my code wins, otherwise the original one does.
>
>> Actually, on the face of it, you could merge the two loops.
>> Provided you clear the status bit before calling the relevant
>> handler I expect it will all work.
>
> True. I will consider that for v2.

Will wait for a v2 patch and I'll test it. We are very sensitive to
performance changes.

-Denny
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index f1245c94ae26..100274b926d3 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8286,34 +8286,33 @@ static void is_interrupt(struct hfi1_devdata *dd, unsigned int source)
 irqreturn_t general_interrupt(int irq, void *data)
 {
 	struct hfi1_devdata *dd = data;
-	u64 regs[CCE_NUM_INT_CSRS];
+	DECLARE_BITMAP(pending, CCE_NUM_INT_CSRS * 64);
+	u64 value;
 	u32 bit;
 	int i;
-	irqreturn_t handled = IRQ_NONE;
 
 	this_cpu_inc(*dd->int_counter);
 
 	/* phase 1: scan and clear all handled interrupts */
 	for (i = 0; i < CCE_NUM_INT_CSRS; i++) {
-		if (dd->gi_mask[i] == 0) {
-			regs[i] = 0;	/* used later */
-			continue;
-		}
-		regs[i] = read_csr(dd, CCE_INT_STATUS + (8 * i)) &
-				dd->gi_mask[i];
+		if (dd->gi_mask[i] == 0)
+			value = 0;	/* used later */
+		else
+			value = read_csr(dd, CCE_INT_STATUS + (8 * i)) & dd->gi_mask[i];
+
+		/* save for further use */
+		bitmap_from_u64(&pending[BITS_TO_LONGS(i * 64)], value);
+
 		/* only clear if anything is set */
-		if (regs[i])
-			write_csr(dd, CCE_INT_CLEAR + (8 * i), regs[i]);
+		if (value)
+			write_csr(dd, CCE_INT_CLEAR + (8 * i), value);
 	}
 
 	/* phase 2: call the appropriate handler */
-	for_each_set_bit(bit, (unsigned long *)&regs[0],
-			 CCE_NUM_INT_CSRS * 64) {
+	for_each_set_bit(bit, pending, CCE_NUM_INT_CSRS * 64)
 		is_interrupt(dd, bit);
-		handled = IRQ_HANDLED;
-	}
 
-	return handled;
+	return IRQ_RETVAL(!bitmap_empty(pending, CCE_NUM_INT_CSRS * 64));
 }
 
 irqreturn_t sdma_interrupt(int irq, void *data)
While in this particular case it would not be a (critical) issue,
the pattern itself is bad and error prone in case somebody blindly
copies to their code.

Don't cast parameter to unsigned long pointer in the bit operations.
Instead copy to a local variable on stack of a proper type and use.

Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/infiniband/hw/hfi1/chip.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)