Message ID | 1517958324-13536-1-git-send-email-greearb@candelatech.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
On 2018-02-07 00:05, greearb@candelatech.com wrote: > From: Ben Greear <greearb@candelatech.com> > > In case where the system is sluggish, we should probably break out > early. Maybe this will fix issues where the OS thinks the IRQ handler > is not responding and disables the IRQ because 'nobody cared' > > Signed-off-by: Ben Greear <greearb@candelatech.com> 5 jiffies as a hardcoded value is a bad idea, since it produces different behavior based on CONFIG_HZ. - Felix
On Wed, 2018-02-07 at 10:16 +0100, Felix Fietkau wrote: > On 2018-02-07 00:05, greearb@candelatech.com wrote: > > From: Ben Greear <greearb@candelatech.com> > > > > In case where the system is sluggish, we should probably break out > > early. Maybe this will fix issues where the OS thinks the IRQ handler > > is not responding and disables the IRQ because 'nobody cared' > > > > Signed-off-by: Ben Greear <greearb@candelatech.com> > > 5 jiffies as a hardcoded value is a bad idea, since it produces > different behavior based on CONFIG_HZ. Also, err, NAPI? Or is something else is going on here? johannes
On 02/07/2018 02:55 AM, Johannes Berg wrote: > On Wed, 2018-02-07 at 10:16 +0100, Felix Fietkau wrote: >> On 2018-02-07 00:05, greearb@candelatech.com wrote: >>> From: Ben Greear <greearb@candelatech.com> >>> >>> In case where the system is sluggish, we should probably break out >>> early. Maybe this will fix issues where the OS thinks the IRQ handler >>> is not responding and disables the IRQ because 'nobody cared' >>> >>> Signed-off-by: Ben Greear <greearb@candelatech.com> >> >> 5 jiffies as a hardcoded value is a bad idea, since it produces >> different behavior based on CONFIG_HZ. I figured that was a benefit since it would run shorter duration on systems with a faster HZ clock. > > Also, err, NAPI? Or is something else is going on here? I don't really know, but part of my test was running traffic while creating 1200 stations, so likely there were lots of higher-level lock contention that slowed down sending pkts up the stack. I got a bunch of errors about IRQs being ignored because nobody cared. I noticed that the ath9k loop could handle up to 500 or so frames, and that seemed like too many for my particular test case. Once I put in this patch, I did not see the 'nobody cared' error again. There could easily be a better fix. If you all want me to use a fixed time instead of HZ, then please suggest a value. I was testing with HZ of 1000, btw. Thanks, Ben > > johannes >
On 02/07/2018 07:39 AM, Ben Greear wrote: > > > On 02/07/2018 02:55 AM, Johannes Berg wrote: >> On Wed, 2018-02-07 at 10:16 +0100, Felix Fietkau wrote: >>> On 2018-02-07 00:05, greearb@candelatech.com wrote: >>>> From: Ben Greear <greearb@candelatech.com> >>>> >>>> In case where the system is sluggish, we should probably break out >>>> early. Maybe this will fix issues where the OS thinks the IRQ handler >>>> is not responding and disables the IRQ because 'nobody cared' >>>> >>>> Signed-off-by: Ben Greear <greearb@candelatech.com> >>> >>> 5 jiffies as a hardcoded value is a bad idea, since it produces >>> different behavior based on CONFIG_HZ. > > I figured that was a benefit since it would run shorter duration on systems with > a faster HZ clock. > >> >> Also, err, NAPI? Or is something else is going on here? > > I don't really know, but part of my test was running traffic while creating > 1200 stations, so likely there were lots of higher-level lock contention that > slowed down sending pkts up the stack. > > I got a bunch of errors about IRQs being ignored because nobody cared. I noticed > that the ath9k loop could handle up to 500 or so frames, and that seemed like too > many for my particular test case. > > Once I put in this patch, I did not see the 'nobody cared' error again. > > There could easily be a better fix. If you all want me to use a fixed time instead > of HZ, then please suggest a value. I was testing with HZ of 1000, btw. Hello, I don't mind changing this patch, but I could use some guidance as to what values you all want me to use. Should I use a millisecond based clock instead of jiffies? What time duration do you want if 5 Jiffies (or 5ms) is not desired? Thanks, Ben
On 2/26/2018 10:39 PM, Ben Greear wrote: > On 02/07/2018 07:39 AM, Ben Greear wrote: >> >> >> On 02/07/2018 02:55 AM, Johannes Berg wrote: >>> On Wed, 2018-02-07 at 10:16 +0100, Felix Fietkau wrote: >>>> On 2018-02-07 00:05, greearb@candelatech.com wrote: >>>>> From: Ben Greear <greearb@candelatech.com> >>>>> >>>>> In case where the system is sluggish, we should probably break out >>>>> early. Maybe this will fix issues where the OS thinks the IRQ handler >>>>> is not responding and disables the IRQ because 'nobody cared' >>>>> >>>>> Signed-off-by: Ben Greear <greearb@candelatech.com> >>>> >>>> 5 jiffies as a hardcoded value is a bad idea, since it produces >>>> different behavior based on CONFIG_HZ. >> >> I figured that was a benefit since it would run shorter duration on >> systems with >> a faster HZ clock. >> >>> >>> Also, err, NAPI? Or is something else is going on here? >> >> I don't really know, but part of my test was running traffic while >> creating >> 1200 stations, so likely there were lots of higher-level lock >> contention that >> slowed down sending pkts up the stack. >> >> I got a bunch of errors about IRQs being ignored because nobody >> cared. I noticed >> that the ath9k loop could handle up to 500 or so frames, and that >> seemed like too >> many for my particular test case. >> >> Once I put in this patch, I did not see the 'nobody cared' error again. >> >> There could easily be a better fix. If you all want me to use a fixed >> time instead >> of HZ, then please suggest a value. I was testing with HZ of 1000, btw. > > Hello, > > I don't mind changing this patch, but I could use some guidance as to what > values you all want me to use. > > Should I use a millisecond based clock instead of jiffies? > > What time duration do you want if 5 Jiffies (or 5ms) is not desired? Hi Ben, Instead of using some time unit you could consider breaking out after handing 'x' number of frames and make 'x' configurable through debugfs. Regards, Arend
On 02/26/2018 02:08 PM, Arend van Spriel wrote: > On 2/26/2018 10:39 PM, Ben Greear wrote: >> On 02/07/2018 07:39 AM, Ben Greear wrote: >>> >>> >>> On 02/07/2018 02:55 AM, Johannes Berg wrote: >>>> On Wed, 2018-02-07 at 10:16 +0100, Felix Fietkau wrote: >>>>> On 2018-02-07 00:05, greearb@candelatech.com wrote: >>>>>> From: Ben Greear <greearb@candelatech.com> >>>>>> >>>>>> In case where the system is sluggish, we should probably break out >>>>>> early. Maybe this will fix issues where the OS thinks the IRQ handler >>>>>> is not responding and disables the IRQ because 'nobody cared' >>>>>> >>>>>> Signed-off-by: Ben Greear <greearb@candelatech.com> >>>>> >>>>> 5 jiffies as a hardcoded value is a bad idea, since it produces >>>>> different behavior based on CONFIG_HZ. >>> >>> I figured that was a benefit since it would run shorter duration on >>> systems with >>> a faster HZ clock. >>> >>>> >>>> Also, err, NAPI? Or is something else is going on here? >>> >>> I don't really know, but part of my test was running traffic while >>> creating >>> 1200 stations, so likely there were lots of higher-level lock >>> contention that >>> slowed down sending pkts up the stack. >>> >>> I got a bunch of errors about IRQs being ignored because nobody >>> cared. I noticed >>> that the ath9k loop could handle up to 500 or so frames, and that >>> seemed like too >>> many for my particular test case. >>> >>> Once I put in this patch, I did not see the 'nobody cared' error again. >>> >>> There could easily be a better fix. If you all want me to use a fixed >>> time instead >>> of HZ, then please suggest a value. I was testing with HZ of 1000, btw. >> >> Hello, >> >> I don't mind changing this patch, but I could use some guidance as to what >> values you all want me to use. >> >> Should I use a millisecond based clock instead of jiffies? >> >> What time duration do you want if 5 Jiffies (or 5ms) is not desired? > > Hi Ben, > > Instead of using some time unit you could consider breaking out after handing 'x' number of frames and make 'x' configurable through debugfs. I don't see why you would care about number of pkts...it is just a proxy for time, right? So, in that case, then using jiffies (or some other fast timer) seems the most useful. Thanks, Ben > > Regards, > Arend >
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c index b90ea2b..274814c 100644 --- a/drivers/net/wireless/ath/ath9k/recv.c +++ b/drivers/net/wireless/ath/ath9k/recv.c @@ -1084,6 +1084,7 @@ int ath_rx_tasklet(struct ath_softc *sc, int flush, bool hp) dma_addr_t new_buf_addr; unsigned int budget = 512; struct ieee80211_hdr *hdr; + unsigned long expires_jiffies = jiffies + 5; if (edma) dma_type = DMA_BIDIRECTIONAL; @@ -1241,6 +1242,9 @@ int ath_rx_tasklet(struct ath_softc *sc, int flush, bool hp) if (!budget--) break; + + if (time_is_before_jiffies(expires_jiffies)) + break; } while (1); if (!(ah->imask & ATH9K_INT_RXEOL)) {