Message ID | 1504122674-3379-3-git-send-email-gbhat@marvell.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote: > Current driver prints dev_alloc_skb failures everytime while > submitting RX URBs. This failure might be frequent in some > low resource platforms. So, wait for a threshold failure > count before start priting the error. This change is a follow > up for the 'commit 7b368e3d15c3 > ("mwifiex: resubmit failed to submit RX URBs in main thread")' [] > diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c [] > @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct urb_context *ctx, int size) > if (card->rx_cmd_ep != ctx->ep) { > ctx->skb = dev_alloc_skb(size); > if (!ctx->skb) { > - mwifiex_dbg(adapter, ERROR, > - "%s: dev_alloc_skb failed\n", __func__); > + if (++card->rx_urb_failure_count > > + MWIFIEX_RX_URB_FAILURE_THRESHOLD) { > + mwifiex_dbg(adapter, ERROR, > + "%s: dev_alloc_skb failed, failure count = %u\n", > + __func__, > + card->rx_urb_failure_count); > + } > return -ENOMEM; Why not use a ratelimit?
Hi Joe, > On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote: > > Current driver prints dev_alloc_skb failures everytime while > > submitting RX URBs. This failure might be frequent in some low > > resource platforms. So, wait for a threshold failure count before > > start priting the error. This change is a follow up for the 'commit > > 7b368e3d15c3 > > ("mwifiex: resubmit failed to submit RX URBs in main thread")' > > [] > > > diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c > > b/drivers/net/wireless/marvell/mwifiex/usb.c > [] > > @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct > urb_context *ctx, int size) > > if (card->rx_cmd_ep != ctx->ep) { > > ctx->skb = dev_alloc_skb(size); > > if (!ctx->skb) { > > - mwifiex_dbg(adapter, ERROR, > > - "%s: dev_alloc_skb failed\n", __func__); > > + if (++card->rx_urb_failure_count > > > + MWIFIEX_RX_URB_FAILURE_THRESHOLD) { > > + mwifiex_dbg(adapter, ERROR, > > + "%s: dev_alloc_skb failed, failure > count = %u\n", > > + __func__, > > + card->rx_urb_failure_count); > > + } > > return -ENOMEM; > > Why not use a ratelimit? Since this is for receive, the packets are from AP side and we cannot lower the rate from AP. On some low performance systems this change will be helpful. Regards, Ganapathi
Brian Norris <briannorris@chromium.org> writes: > Hi Ganapathi, > > On Thu, Sep 14, 2017 at 02:14:24PM +0000, Ganapathi Bhat wrote: >> > On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote: >> > > @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct >> > urb_context *ctx, int size) >> > > if (card->rx_cmd_ep != ctx->ep) { >> > > ctx->skb = dev_alloc_skb(size); >> > > if (!ctx->skb) { >> > > - mwifiex_dbg(adapter, ERROR, >> > > - "%s: dev_alloc_skb failed\n", __func__); >> > > + if (++card->rx_urb_failure_count > >> > > + MWIFIEX_RX_URB_FAILURE_THRESHOLD) { >> > > + mwifiex_dbg(adapter, ERROR, >> > > + "%s: dev_alloc_skb failed, failure >> > count = %u\n", >> > > + __func__, >> > > + card->rx_urb_failure_count); >> > > + } >> > > return -ENOMEM; >> > >> > Why not use a ratelimit? >> Since this is for receive, the packets are from AP side and we cannot >> lower the rate from AP. On some low performance systems this change >> will be helpful. > > I think Joe was referring to things like printk_ratelimited() or > dev_err_ratelimited(). Those automatically ratelimit prints for you, > using a static counter. You'd just need to make a small warpper for > mwifiex_dbg() using __ratelimit(). > > Those sort of rate limits are significantly different than yours though. > You were looking to avoid printing errors when there are only a few > failures in a row, whereas the existing rate-limiting infrastructure > looks to avoid printing errors if too many happen in a row. Those are > different goals. Are you saying that this patch is good to take? Or should Ganapathi submit v2?
Hi Kalle, On Wed, Sep 20, 2017 at 07:30:29AM +0300, Kalle Valo wrote: > Brian Norris <briannorris@chromium.org> writes: > > > Hi Ganapathi, > > > > On Thu, Sep 14, 2017 at 02:14:24PM +0000, Ganapathi Bhat wrote: > >> > On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote: > >> > Why not use a ratelimit? > >> Since this is for receive, the packets are from AP side and we cannot > >> lower the rate from AP. On some low performance systems this change > >> will be helpful. > > > > I think Joe was referring to things like printk_ratelimited() or > > dev_err_ratelimited(). Those automatically ratelimit prints for you, > > using a static counter. You'd just need to make a small warpper for > > mwifiex_dbg() using __ratelimit(). > > > > Those sort of rate limits are significantly different than yours though. > > You were looking to avoid printing errors when there are only a few > > failures in a row, whereas the existing rate-limiting infrastructure > > looks to avoid printing errors if too many happen in a row. Those are > > different goals. > > Are you saying that this patch is good to take? Or should Ganapathi > submit v2? If you're asking me... All I was saying was that I don't think Joe's suggestion will help Ganapathi. I'd expect Ganapathi could confirm/deny that part. (Or Joe could correct me if my interpretation is wrong.) I'm also not familiar with how we expect dev_alloc_skb() failures to be handled. If that's a common expected failure mode in low-memory situations (seems reasonable?) and the driver handles these failure just fine (completely unreviewed by me, so far; I suspect it doesn't do this completely correctly), then sure, being less noisy about it as done in this patch should be fine. IOW, I don't have concrete feedback for Ganapathi to address, but I'm not exactly "ack"ing it myself. Brian
Brian Norris <briannorris@chromium.org> writes: > On Wed, Sep 20, 2017 at 07:30:29AM +0300, Kalle Valo wrote: >> Brian Norris <briannorris@chromium.org> writes: >> >> > Hi Ganapathi, >> > >> > On Thu, Sep 14, 2017 at 02:14:24PM +0000, Ganapathi Bhat wrote: >> >> > On Thu, 2017-08-31 at 01:21 +0530, Ganapathi Bhat wrote: > >> >> > Why not use a ratelimit? >> >> Since this is for receive, the packets are from AP side and we cannot >> >> lower the rate from AP. On some low performance systems this change >> >> will be helpful. >> > >> > I think Joe was referring to things like printk_ratelimited() or >> > dev_err_ratelimited(). Those automatically ratelimit prints for you, >> > using a static counter. You'd just need to make a small warpper for >> > mwifiex_dbg() using __ratelimit(). >> > >> > Those sort of rate limits are significantly different than yours though. >> > You were looking to avoid printing errors when there are only a few >> > failures in a row, whereas the existing rate-limiting infrastructure >> > looks to avoid printing errors if too many happen in a row. Those are >> > different goals. >> >> Are you saying that this patch is good to take? Or should Ganapathi >> submit v2? > > If you're asking me... Yeah, I was asking you because to me this patch looks like an ugly workaround to a bug. And now that looked patch 1 more closely it feels the same. > All I was saying was that I don't think Joe's suggestion will help > Ganapathi. I'd expect Ganapathi could confirm/deny that part. (Or Joe > could correct me if my interpretation is wrong.) Ok. > I'm also not familiar with how we expect dev_alloc_skb() failures to be > handled. If that's a common expected failure mode in low-memory > situations (seems reasonable?) and the driver handles these failure just > fine (completely unreviewed by me, so far; I suspect it doesn't do this > completely correctly), then sure, being less noisy about it as done in > this patch should be fine. But this is a debug message so it should not bother normal users, right? I think that having a threshold like this is just hiding problems and not solving them. The real issue here is that dev_alloc_skb() is failing and that's what should be solved, not to paper it over by limiting debug messages. It just means that the real issue will be even more difficult to detect in the future. > IOW, I don't have concrete feedback for Ganapathi to address, but I'm > not exactly "ack"ing it myself. I'm not very confident about this patch either, it's not just making any sense.
diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c index f4f2b9b..98f6973 100644 --- a/drivers/net/wireless/marvell/mwifiex/usb.c +++ b/drivers/net/wireless/marvell/mwifiex/usb.c @@ -300,9 +300,16 @@ static int mwifiex_usb_submit_rx_urb(struct urb_context *ctx, int size) if (card->rx_cmd_ep != ctx->ep) { ctx->skb = dev_alloc_skb(size); if (!ctx->skb) { - mwifiex_dbg(adapter, ERROR, - "%s: dev_alloc_skb failed\n", __func__); + if (++card->rx_urb_failure_count > + MWIFIEX_RX_URB_FAILURE_THRESHOLD) { + mwifiex_dbg(adapter, ERROR, + "%s: dev_alloc_skb failed, failure count = %u\n", + __func__, + card->rx_urb_failure_count); + } return -ENOMEM; + } else { + card->rx_urb_failure_count = 0; } } diff --git a/drivers/net/wireless/marvell/mwifiex/usb.h b/drivers/net/wireless/marvell/mwifiex/usb.h index 37abd22..dc4750b 100644 --- a/drivers/net/wireless/marvell/mwifiex/usb.h +++ b/drivers/net/wireless/marvell/mwifiex/usb.h @@ -43,6 +43,7 @@ #define MWIFIEX_TX_DATA_URB 6 #define MWIFIEX_RX_DATA_URB 6 #define MWIFIEX_USB_TIMEOUT 100 +#define MWIFIEX_RX_URB_FAILURE_THRESHOLD 20 #define USB8766_DEFAULT_FW_NAME "mrvl/usb8766_uapsta.bin" #define USB8797_DEFAULT_FW_NAME "mrvl/usb8797_uapsta.bin" @@ -117,6 +118,7 @@ struct usb_card_rec { u8 rx_cmd_interval; int tx_cmd_ep_type; u8 tx_cmd_interval; + u32 rx_urb_failure_count; }; struct fw_header {