Message ID | 20211214223901.1.I777939e0ef1e89872d4ab65340f3fd756615a047@changeid (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | ath10k: enable threaded napi on ath10k driver |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
Hi All,

This patch is to trigger a discussion on the best approach to enable threaded NAPI on ath10k. Threaded NAPI feature was added in (net: extract napi poll functionality to __napi_poll() commit 898f8015ffe74118e7b461827451f2cc6e51035b) and showed good results on ath10k snoc based solution.

If we come to a consensus with this as the best approach to enable threaded NAPI on ath10k, then we can move ahead with the implementation and enable across sdio and pci, or if there is any objection then we can discuss it here.

Thanks
Abhishek

On Tue, Dec 14, 2021 at 2:41 PM Abhishek Kumar <kuabhs@chromium.org> wrote:
>
> NAPI poll can be done in threaded context along with soft irq
> context. Threaded context can be scheduled efficiently, thus
> creating less of bottleneck during Rx processing. This patch is
> to enable threaded NAPI on ath10k driver.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
> Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
> ---
>
>  drivers/net/wireless/ath/ath10k/pci.c  | 1 +
>  drivers/net/wireless/ath/ath10k/sdio.c | 1 +
>  drivers/net/wireless/ath/ath10k/snoc.c | 2 +-
>  3 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
> index 4d4e2f91e15c..584307574d99 100644
> --- a/drivers/net/wireless/ath/ath10k/pci.c
> +++ b/drivers/net/wireless/ath/ath10k/pci.c
> @@ -1958,6 +1958,7 @@ static int ath10k_pci_hif_start(struct ath10k *ar)
>
>  	ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif start\n");
>
> +	dev_set_threaded(&ar->napi_dev, true);
>  	ath10k_core_napi_enable(ar);
>
>  	ath10k_pci_irq_enable(ar);
> diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
> index 63e1c2d783c5..52ef74d9811a 100644
> --- a/drivers/net/wireless/ath/ath10k/sdio.c
> +++ b/drivers/net/wireless/ath/ath10k/sdio.c
> @@ -1862,6 +1862,7 @@ static int ath10k_sdio_hif_start(struct ath10k *ar)
>  	struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
>  	int ret;
>
> +	dev_set_threaded(&ar->napi_dev, true);
>  	ath10k_core_napi_enable(ar);
>
>  	/* Sleep 20 ms before HIF interrupts are disabled.
> diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
> index 9513ab696fff..e7d12dbb3fa5 100644
> --- a/drivers/net/wireless/ath/ath10k/snoc.c
> +++ b/drivers/net/wireless/ath/ath10k/snoc.c
> @@ -926,7 +926,7 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
>  	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
>
>  	bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
> -
> +	dev_set_threaded(&ar->napi_dev, true);
>  	ath10k_core_napi_enable(ar);
>  	ath10k_snoc_irq_enable(ar);
>  	ath10k_snoc_rx_post(ar);
> --
> 2.34.1.173.g76aa8bc2d0-goog
>
On Tue, 14 Dec 2021 22:39:36 +0000 Abhishek Kumar wrote:
> NAPI poll can be done in threaded context along with soft irq
> context. Threaded context can be scheduled efficiently, thus
> creating less of bottleneck during Rx processing. This patch is
> to enable threaded NAPI on ath10k driver.

You need to explain in more detail what you mean by "can be scheduled efficiently". mt76 had an issue where Rx and Tx would use the same IRQ and threaded NAPI allowed them to be run on separate cores. What's the challenge for ath10k HW?
Abhishek Kumar <kuabhs@chromium.org> writes:

> This patch is to trigger a discussion on the best approach to enable
> threaded NAPI on ath10k. Threaded NAPI feature was added in (net:
> extract napi poll functionality to __napi_poll() commit
> 898f8015ffe74118e7b461827451f2cc6e51035b) and showed good results on
> ath10k snoc based solution.
>
> If we come to a consensus with this as the best approach to enable
> threaded NAPI on ath10k, then we can move ahead with the
> implementation and enable across sdio and pci, or if there is any
> objection then we can discuss it here.

A few tips:

It's a good idea to mark patches like this with "[PATCH RFC]":

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches#use_rfc_or_rft_for_patches_not_ready

And you can add text like the above after the "---" line in the patch itself.
Abhishek Kumar <kuabhs@chromium.org> writes:

> NAPI poll can be done in threaded context along with soft irq
> context. Threaded context can be scheduled efficiently, thus
> creating less of bottleneck during Rx processing. This patch is
> to enable threaded NAPI on ath10k driver.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
> Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
> ---
>
>  drivers/net/wireless/ath/ath10k/pci.c  | 1 +
>  drivers/net/wireless/ath/ath10k/sdio.c | 1 +
>  drivers/net/wireless/ath/ath10k/snoc.c | 2 +-

I think pci and sdio should also be tested.
On 2021-12-14 23:39, Abhishek Kumar wrote:
> NAPI poll can be done in threaded context along with soft irq
> context. Threaded context can be scheduled efficiently, thus
> creating less of bottleneck during Rx processing. This patch is
> to enable threaded NAPI on ath10k driver.
>
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
> Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>

The name of the dummy device should be filled in as well. How about something like this:

--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -3576,6 +3576,9 @@ struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev,
 		  ath10k_core_set_coverage_class_work);

 	init_dummy_netdev(&ar->napi_dev);
+	snprintf(ar->napi_dev.name, sizeof(ar->napi_dev.name), "%s",
+		 wiphy_name(ar->hw->wiphy));
+	ar->napi_dev.threaded = 1;

 	ret = ath10k_coredump_create(ar);
 	if (ret)
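One likely reason for filling in the name: as far as I recall, the kernel names each NAPI poll thread with a pattern along the lines of "napi/<dev name>-<napi id>", so an unnamed dummy device would leave the thread hard to identify in ps or top. The sketch below is a user-space model of that, not kernel code. `struct fake_netdev`, `FAKE_IFNAMSIZ`, `fake_set_name()` and `fake_thread_name()` are hypothetical stand-ins, and the "napi/%s-%d" format is an assumption about the kernel's naming scheme.

```c
#include <stdio.h>

/* Assumed to mirror the kernel's 16-byte interface name buffer. */
#define FAKE_IFNAMSIZ 16

/* Hypothetical stand-in for the name field of struct net_device. */
struct fake_netdev {
	char name[FAKE_IFNAMSIZ];
};

/* Bounded copy, same pattern as the snprintf() call suggested above:
 * the result is always NUL-terminated and truncated if too long. */
static void fake_set_name(struct fake_netdev *dev, const char *wiphy)
{
	snprintf(dev->name, sizeof(dev->name), "%s", wiphy);
}

/* Format a poll-thread name using the assumed "napi/%s-%d" pattern. */
static void fake_thread_name(char *buf, size_t len,
			     const struct fake_netdev *dev, int napi_id)
{
	snprintf(buf, len, "napi/%s-%d", dev->name, napi_id);
}
```

With an empty name the model yields "napi/-7" for id 7, and "napi/phy0-7" once the name is set to "phy0", which is the practical difference the suggested change would make.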
What is your definition of "good results"?

I would really love it if I could get back more flent benchmarks like this one:

https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002

The ath10k has cost me more hair and time than I care to think about.
diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 4d4e2f91e15c..584307574d99 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1958,6 +1958,7 @@ static int ath10k_pci_hif_start(struct ath10k *ar)

 	ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot hif start\n");

+	dev_set_threaded(&ar->napi_dev, true);
 	ath10k_core_napi_enable(ar);

 	ath10k_pci_irq_enable(ar);
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index 63e1c2d783c5..52ef74d9811a 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -1862,6 +1862,7 @@ static int ath10k_sdio_hif_start(struct ath10k *ar)
 	struct ath10k_sdio *ar_sdio = ath10k_sdio_priv(ar);
 	int ret;

+	dev_set_threaded(&ar->napi_dev, true);
 	ath10k_core_napi_enable(ar);

 	/* Sleep 20 ms before HIF interrupts are disabled.
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 9513ab696fff..e7d12dbb3fa5 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -926,7 +926,7 @@ static int ath10k_snoc_hif_start(struct ath10k *ar)
 	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);

 	bitmap_clear(ar_snoc->pending_ce_irqs, 0, CE_COUNT_MAX);
-
+	dev_set_threaded(&ar->napi_dev, true);
 	ath10k_core_napi_enable(ar);
 	ath10k_snoc_irq_enable(ar);
 	ath10k_snoc_rx_post(ar);
NAPI poll can be done in threaded context along with soft irq
context. Threaded context can be scheduled efficiently, thus
creating less of bottleneck during Rx processing. This patch is
to enable threaded NAPI on ath10k driver.

Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2-00696-QCAHLSWMTPL-1
Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
---

 drivers/net/wireless/ath/ath10k/pci.c  | 1 +
 drivers/net/wireless/ath/ath10k/sdio.c | 1 +
 drivers/net/wireless/ath/ath10k/snoc.c | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)
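
The patch switches threaded NAPI on from inside the driver. For comparison, on a registered network interface the same switch is exposed to user space through the `threaded` attribute under /sys/class/net/<iface>/. The ath10k NAPI instance hangs off a dummy netdev, which presumably has no such sysfs entry, and that would explain why the driver has to call dev_set_threaded() itself. Below is a minimal user-space sketch of the sysfs route, assuming a kernel with threaded NAPI support. The helper names `napi_threaded_path()` and `napi_threaded_write()` are hypothetical, and any interface name passed in is only an example.

```c
#include <stdio.h>

/* Build "/sys/class/net/<ifname>/threaded" into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes. */
static int napi_threaded_path(char *buf, size_t len, const char *ifname)
{
	int n = snprintf(buf, len, "/sys/class/net/%s/threaded", ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" (enable) or "0" (disable) to the attribute at path.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
static int napi_threaded_write(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(enable ? "1" : "0", f) >= 0;
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the path for an interface it knows to be registered and then call `napi_threaded_write(path, 1)`, typically as root. This is handy for A/B benchmarking on drivers with a real netdev before hard-coding the setting the way this patch does.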