From patchwork Wed Nov 18 19:10:05 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 11915769
X-Patchwork-Delegate: kuba@kernel.org
Date: Wed, 18 Nov 2020 11:10:05 -0800
In-Reply-To: <20201118191009.3406652-1-weiwan@google.com>
Message-Id: <20201118191009.3406652-2-weiwan@google.com>
Subject: [PATCH net-next v3 1/5] net: implement threaded-able napi poll loop support
From: Wei Wang
To: David Miller, Jakub Kicinski, netdev@vger.kernel.org
Cc: Eric Dumazet, Felix Fietkau, Paolo Abeni, Hannes Frederic Sowa, Hillf Danton, Wei Wang

From: Paolo Abeni

This patch allows running each napi poll loop inside its own kernel
thread. The threaded rx mode can be enabled per napi instance via the
newly added napi_set_threaded() API; the requested kthread is created
on demand and shut down on device stop.

Once threaded mode is enabled and the kthread is started,
napi_schedule() wakes up that thread instead of scheduling the softirq.

The threaded poll loop behaves much like net_rx_action(), but it does
not have to manipulate local irqs and uses an explicit scheduling point
based on netdev_budget.

Signed-off-by: Paolo Abeni
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
---
 include/linux/netdevice.h |   5 ++
 net/core/dev.c            | 113 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 03433a4c929e..5ba430f56085 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -347,6 +347,7 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
 	unsigned int		napi_id;
+	struct task_struct	*thread;
 };
 
 enum {
@@ -357,6 +358,7 @@ enum {
 	NAPI_STATE_LISTED,	/* NAPI added to system lists */
 	NAPI_STATE_NO_BUSY_POLL,/* Do not add in napi_hash, no busy polling */
 	NAPI_STATE_IN_BUSY_POLL,/* sk_busy_loop() owns this NAPI */
+	NAPI_STATE_THREADED,	/* The poll is performed inside its own thread*/
 };
 
 enum {
@@ -367,6 +369,7 @@ enum {
 	NAPIF_STATE_LISTED	 = BIT(NAPI_STATE_LISTED),
 	NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL),
 	NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL),
+	NAPIF_STATE_THREADED	 = BIT(NAPI_STATE_THREADED),
 };
 
 enum gro_result {
@@ -488,6 +491,8 @@ static inline bool napi_complete(struct napi_struct *n)
 	return napi_complete_done(n, 0);
 }
 
+int napi_set_threaded(struct napi_struct *n, bool threaded);
+
 /**
  *	napi_disable - prevent NAPI from scheduling
  *	@n: NAPI context
diff --git a/net/core/dev.c b/net/core/dev.c
index 4bfdcd6b20e8..a5d2ead8be78 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -91,6 +91,7 @@
 #include
 #include
 #include
+#include <linux/kthread.h>
 #include
 #include
 #include
@@ -1488,9 +1489,19 @@ void netdev_notify_peers(struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_notify_peers);
 
+static int napi_threaded_poll(void *data);
+
+static void napi_thread_start(struct napi_struct *n)
+{
+	if (test_bit(NAPI_STATE_THREADED, &n->state) && !n->thread)
+		n->thread = kthread_create(napi_threaded_poll, n, "%s-%d",
+					   n->dev->name, n->napi_id);
+}
+
 static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct napi_struct *n;
 	int ret;
 
 	ASSERT_RTNL();
@@ -1522,6 +1533,9 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
+	list_for_each_entry(n, &dev->napi_list, dev_list)
+		napi_thread_start(n);
+
 	netpoll_poll_enable(dev);
 
 	if (ret)
@@ -1567,6 +1581,14 @@ int dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 }
 EXPORT_SYMBOL(dev_open);
 
+static void napi_thread_stop(struct napi_struct *n)
+{
+	if (!n->thread)
+		return;
+	kthread_stop(n->thread);
+	n->thread = NULL;
+}
+
 static void __dev_close_many(struct list_head *head)
 {
 	struct net_device *dev;
@@ -1595,6 +1617,7 @@ static void __dev_close_many(struct list_head *head)
 
 	list_for_each_entry(dev, head, close_list) {
 		const struct net_device_ops *ops = dev->netdev_ops;
+		struct napi_struct *n;
 
 		/*
 		 *	Call the device specific close. This cannot fail.
@@ -1606,6 +1629,9 @@ static void __dev_close_many(struct list_head *head)
 		if (ops->ndo_stop)
 			ops->ndo_stop(dev);
 
+		list_for_each_entry(n, &dev->napi_list, dev_list)
+			napi_thread_stop(n);
+
 		dev->flags &= ~IFF_UP;
 		netpoll_poll_enable(dev);
 	}
@@ -4245,6 +4271,11 @@ int gro_normal_batch __read_mostly = 8;
 static inline void ____napi_schedule(struct softnet_data *sd,
 				     struct napi_struct *napi)
 {
+	if (napi->thread) {
+		wake_up_process(napi->thread);
+		return;
+	}
+
 	list_add_tail(&napi->poll_list, &sd->poll_list);
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 }
@@ -6667,6 +6698,30 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }
 
+int napi_set_threaded(struct napi_struct *n, bool threaded)
+{
+	ASSERT_RTNL();
+
+	if (n->dev->flags & IFF_UP)
+		return -EBUSY;
+
+	if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
+		return 0;
+	if (threaded)
+		set_bit(NAPI_STATE_THREADED, &n->state);
+	else
+		clear_bit(NAPI_STATE_THREADED, &n->state);
+
+	/* if the device is initializing, nothing todo */
+	if (test_bit(__LINK_STATE_START, &n->dev->state))
+		return 0;
+
+	napi_thread_stop(n);
+	napi_thread_start(n);
+	return 0;
+}
+EXPORT_SYMBOL(napi_set_threaded);
+
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
@@ -6807,6 +6862,64 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	return work;
 }
 
+static int napi_thread_wait(struct napi_struct *napi)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+
+	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
+		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+			__set_current_state(TASK_RUNNING);
+			return 0;
+		}
+
+		schedule();
+		set_current_state(TASK_INTERRUPTIBLE);
+	}
+	__set_current_state(TASK_RUNNING);
+	return -1;
+}
+
+static int napi_threaded_poll(void *data)
+{
+	struct napi_struct *napi = data;
+
+	while (!napi_thread_wait(napi)) {
+		struct list_head dummy_repoll;
+		int budget = netdev_budget;
+		unsigned long time_limit;
+		bool again = true;
+
+		INIT_LIST_HEAD(&dummy_repoll);
+		local_bh_disable();
+		time_limit = jiffies + 2;
+		do {
+			/* ensure that the poll list is not empty */
+			if (list_empty(&dummy_repoll))
+				list_add(&napi->poll_list, &dummy_repoll);
+
+			budget -= napi_poll(napi, &dummy_repoll);
+			if (unlikely(budget <= 0 ||
+				     time_after_eq(jiffies, time_limit))) {
+				cond_resched();
+
+				/* refresh the budget */
+				budget = netdev_budget;
+				__kfree_skb_flush();
+				time_limit = jiffies + 2;
+			}
+
+			if (napi_disable_pending(napi))
+				again = false;
+			else if (!test_bit(NAPI_STATE_SCHED, &napi->state))
+				again = false;
+		} while (again);
+
+		__kfree_skb_flush();
+		local_bh_enable();
+	}
+	return 0;
+}
+
 static __latent_entropy void net_rx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
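[Editor's illustration, not part of the patch] A minimal sketch of how a control path could use the napi_set_threaded() API added above. The example_priv structure and function name are made up for illustration; at this point in the series the call must be made under RTNL while the device is down, otherwise it returns -EBUSY.

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

/* Hypothetical driver-private data; only the embedded napi matters here. */
struct example_priv {
	struct napi_struct napi;
};

static int example_enable_threaded(struct example_priv *priv)
{
	int err;

	rtnl_lock();			/* napi_set_threaded() asserts RTNL */
	err = napi_set_threaded(&priv->napi, true);
	rtnl_unlock();

	return err;			/* -EBUSY if the device is already up */
}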
From patchwork Wed Nov 18 19:10:06 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 11915773
X-Patchwork-Delegate: kuba@kernel.org
Date: Wed, 18 Nov 2020 11:10:06 -0800
In-Reply-To: <20201118191009.3406652-1-weiwan@google.com>
Message-Id: <20201118191009.3406652-3-weiwan@google.com>
Subject: [PATCH net-next v3 2/5] net: add sysfs attribute to control napi threaded mode
From: Wei Wang
To: David Miller, Jakub Kicinski, netdev@vger.kernel.org
Cc: Eric Dumazet, Felix Fietkau, Paolo Abeni, Hannes Frederic Sowa, Hillf Danton, Wei Wang

From: Paolo Abeni

This patch adds a new sysfs attribute to the network device class. The
attribute is a bitmask that controls the threaded mode for all the napi
instances of the given network device. The threaded mode can be
switched only while the related network device is down.

Signed-off-by: Paolo Abeni
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
---
 net/core/net-sysfs.c | 103 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 94fff0700bdd..df8dd25e5e4b 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -538,6 +538,108 @@ static ssize_t phys_switch_id_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(phys_switch_id);
 
+static unsigned long *__alloc_thread_bitmap(struct net_device *netdev,
+					    int *bits)
+{
+	struct napi_struct *n;
+
+	*bits = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list)
+		(*bits)++;
+
+	return kmalloc_array(BITS_TO_LONGS(*bits), sizeof(unsigned long),
+			     GFP_ATOMIC | __GFP_ZERO);
+}
+
+static ssize_t threaded_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct napi_struct *n;
+	unsigned long *bmap;
+	size_t count = 0;
+	int i, bits;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev))
+		goto unlock;
+
+	bmap = __alloc_thread_bitmap(netdev, &bits);
+	if (!bmap) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+
+	i = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list) {
+		if (test_bit(NAPI_STATE_THREADED, &n->state))
+			set_bit(i, bmap);
+		i++;
+	}
+
+	count = bitmap_print_to_pagebuf(true, buf, bmap, bits);
+	kfree(bmap);
+
+unlock:
+	rtnl_unlock();
+
+	return count;
+}
+
+static ssize_t threaded_store(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t len)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct napi_struct *n;
+	unsigned long *bmap;
+	int i, bits;
+	size_t ret;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev)) {
+		ret = len;
+		goto unlock;
+	}
+
+	if (netdev->flags & IFF_UP) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
+	bmap = __alloc_thread_bitmap(netdev, &bits);
+	if (!bmap) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	ret = bitmap_parselist(buf, bmap, bits);
+	if (ret)
+		goto free_unlock;
+
+	i = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list) {
+		napi_set_threaded(n, test_bit(i, bmap));
+		i++;
+	}
+	ret = len;
+
+free_unlock:
+	kfree(bmap);
+
+unlock:
+	rtnl_unlock();
+	return ret;
+}
+static DEVICE_ATTR_RW(threaded);
+
 static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_netdev_group.attr,
 	&dev_attr_type.attr,
@@ -570,6 +672,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_proto_down.attr,
 	&dev_attr_carrier_up_count.attr,
 	&dev_attr_carrier_down_count.attr,
+	&dev_attr_threaded.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(net_class);
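[Editor's illustration, not part of the patch] A small userspace sketch of driving the new "threaded" attribute: it writes a range list, which threaded_store() parses with bitmap_parselist(), to mark the first two napi instances as threaded. The interface name "eth0" is an assumption; at this point in the series the device must be down for the write to succeed.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/sys/class/net/eth0/threaded"; /* assumed ifname */
	const char *mask = "0-1";	/* napi instances 0 and 1 -> threaded */
	int fd = open(path, O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, mask, strlen(mask)) < 0)
		perror("write");
	close(fd);
	return 0;
}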
From patchwork Wed Nov 18 19:10:07 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 11915775
X-Patchwork-Delegate: kuba@kernel.org
Date: Wed, 18 Nov 2020 11:10:07 -0800
In-Reply-To: <20201118191009.3406652-1-weiwan@google.com>
Message-Id: <20201118191009.3406652-4-weiwan@google.com>
Subject: [PATCH net-next v3 3/5] net: extract napi poll functionality to __napi_poll()
From: Wei Wang
To: David Miller, Jakub Kicinski, netdev@vger.kernel.org
Cc: Eric Dumazet, Felix Fietkau, Paolo Abeni, Hannes Frederic Sowa, Hillf Danton, Wei Wang

From: Felix Fietkau

This commit introduces a new function __napi_poll() which contains the
main logic of the existing napi_poll() function and will be called by
other functions in later commits. The idea and implementation are by
Felix Fietkau, proposed as part of his patch to move napi work to
workqueue context. By itself, this commit is a pure code restructure.

Signed-off-by: Felix Fietkau
Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
---
 net/core/dev.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a5d2ead8be78..a739dbbe4d89 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6796,15 +6796,10 @@ void __netif_napi_del(struct napi_struct *napi)
 }
 EXPORT_SYMBOL(__netif_napi_del);
 
-static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+static int __napi_poll(struct napi_struct *n, bool *repoll)
 {
-	void *have;
 	int work, weight;
 
-	list_del_init(&n->poll_list);
-
-	have = netpoll_poll_lock(n);
-
 	weight = n->weight;
 
 	/* This NAPI_STATE_SCHED test is for avoiding a race
@@ -6824,7 +6819,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 			    n->poll, work, weight);
 
 	if (likely(work < weight))
-		goto out_unlock;
+		return work;
 
 	/* Drivers must not modify the NAPI state if they
 	 * consume the entire weight.  In such cases this code
@@ -6833,7 +6828,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	 */
 	if (unlikely(napi_disable_pending(n))) {
 		napi_complete(n);
-		goto out_unlock;
+		return work;
 	}
 
 	if (n->gro_bitmask) {
@@ -6845,6 +6840,26 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 
 	gro_normal_list(n);
 
+	*repoll = true;
+
+	return work;
+}
+
+static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+{
+	bool do_repoll = false;
+	void *have;
+	int work;
+
+	list_del_init(&n->poll_list);
+
+	have = netpoll_poll_lock(n);
+
+	work = __napi_poll(n, &do_repoll);
+
+	if (!do_repoll)
+		goto out_unlock;
+
 	/* Some drivers may have called napi_schedule
 	 * prior to exhausting their budget.
 	 */
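[Editor's illustration, not part of the patch] A sketch, written as if it lived next to __napi_poll() in net/core/dev.c, spelling out the contract of the extracted helper: poll-list handling and netpoll locking stay with the caller, and *repoll reports whether the napi instance still needs servicing after this pass. The function name is hypothetical.

static int example_poll_once(struct napi_struct *n)
{
	bool repoll = false;
	void *have;
	int work;

	have = netpoll_poll_lock(n);
	work = __napi_poll(n, &repoll);	/* may set repoll = true */
	netpoll_poll_unlock(have);

	if (repoll) {
		/* The caller decides what "poll again" means: re-add the
		 * napi to a softirq poll list, or loop again in a kthread.
		 */
	}
	return work;
}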
From patchwork Wed Nov 18 19:10:08 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 11915777
X-Patchwork-Delegate: kuba@kernel.org
Date: Wed, 18 Nov 2020 11:10:08 -0800
In-Reply-To: <20201118191009.3406652-1-weiwan@google.com>
Message-Id: <20201118191009.3406652-5-weiwan@google.com>
Subject: [PATCH net-next v3 4/5] net: modify kthread handler to use __napi_poll()
From: Wei Wang
To: David Miller, Jakub Kicinski, netdev@vger.kernel.org
Cc: Eric Dumazet, Felix Fietkau, Paolo Abeni, Hannes Frederic Sowa, Hillf Danton, Wei Wang

From: Jakub Kicinski

The current kthread handler calls napi_poll() and has to pass it a
dummy repoll list, which seems redundant. The newly proposed kthread
handler calls the newly introduced __napi_poll() and respects
napi->weight as before. If a repoll is needed, cond_resched() is called
first to give other tasks a chance to run before repolling.

This change was proposed by Jakub Kicinski on top of the previous patch.

Signed-off-by: Jakub Kicinski
Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
---
 net/core/dev.c | 62 +++++++++++++++++++-------------------------------
 1 file changed, 24 insertions(+), 38 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a739dbbe4d89..88437cdf29f1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6840,6 +6840,15 @@ static int __napi_poll(struct napi_struct *n, bool *repoll)
 
 	gro_normal_list(n);
 
+	/* Some drivers may have called napi_schedule
+	 * prior to exhausting their budget.
+	 */
+	if (unlikely(!list_empty(&n->poll_list))) {
+		pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
+			     n->dev ? n->dev->name : "backlog");
+		return work;
+	}
+
 	*repoll = true;
 
 	return work;
@@ -6860,15 +6869,6 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	if (!do_repoll)
 		goto out_unlock;
 
-	/* Some drivers may have called napi_schedule
-	 * prior to exhausting their budget.
-	 */
-	if (unlikely(!list_empty(&n->poll_list))) {
-		pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
-			     n->dev ? n->dev->name : "backlog");
-		goto out_unlock;
-	}
-
 	list_add_tail(&n->poll_list, repoll);
 
 out_unlock:
@@ -6897,40 +6897,26 @@ static int napi_thread_wait(struct napi_struct *napi)
 static int napi_threaded_poll(void *data)
 {
 	struct napi_struct *napi = data;
+	void *have;
 
 	while (!napi_thread_wait(napi)) {
-		struct list_head dummy_repoll;
-		int budget = netdev_budget;
-		unsigned long time_limit;
-		bool again = true;
+		for (;;) {
+			bool repoll = false;
 
-		INIT_LIST_HEAD(&dummy_repoll);
-		local_bh_disable();
-		time_limit = jiffies + 2;
-		do {
-			/* ensure that the poll list is not empty */
-			if (list_empty(&dummy_repoll))
-				list_add(&napi->poll_list, &dummy_repoll);
-
-			budget -= napi_poll(napi, &dummy_repoll);
-			if (unlikely(budget <= 0 ||
-				     time_after_eq(jiffies, time_limit))) {
-				cond_resched();
-
-				/* refresh the budget */
-				budget = netdev_budget;
-				__kfree_skb_flush();
-				time_limit = jiffies + 2;
-			}
+			local_bh_disable();
 
-			if (napi_disable_pending(napi))
-				again = false;
-			else if (!test_bit(NAPI_STATE_SCHED, &napi->state))
-				again = false;
-		} while (again);
+			have = netpoll_poll_lock(napi);
+			__napi_poll(napi, &repoll);
+			netpoll_poll_unlock(have);
 
-		__kfree_skb_flush();
-		local_bh_enable();
+			__kfree_skb_flush();
+			local_bh_enable();
+
+			if (!repoll)
+				break;
+
+			cond_resched();
+		}
 	}
 	return 0;
 }
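[Editor's illustration, not part of the patch] A hypothetical driver interrupt handler; drivers are unchanged by this series. With threaded mode enabled, napi_schedule() ends up in ____napi_schedule(), which wakes the per-napi kthread from patch 1/5 instead of raising NET_RX_SOFTIRQ, and the kthread then loops over __napi_poll() as reworked in this patch. The function name is made up.

#include <linux/interrupt.h>
#include <linux/netdevice.h>

static irqreturn_t example_rx_irq(int irq, void *data)
{
	struct napi_struct *napi = data;

	napi_schedule(napi);	/* wakes napi->thread when threaded */
	return IRQ_HANDLED;
}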
From patchwork Wed Nov 18 19:10:09 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 11915779
X-Patchwork-Delegate: kuba@kernel.org
Date: Wed, 18 Nov 2020 11:10:09 -0800
In-Reply-To: <20201118191009.3406652-1-weiwan@google.com>
Message-Id: <20201118191009.3406652-6-weiwan@google.com>
Subject: [PATCH net-next v3 5/5] net: improve napi threaded config
From: Wei Wang
To: David Miller, Jakub Kicinski, netdev@vger.kernel.org
Cc: Eric Dumazet, Felix Fietkau, Paolo Abeni, Hannes Frederic Sowa, Hillf Danton, Wei Wang

This commit reworks the threaded config so that switching between
softirq-based and kthread-based NAPI processing no longer requires a
device down/up. It also moves the kthread_run() call into the sysfs
handler invoked when the user enables "threaded" on a napi instance,
and properly handles kthread_run() failure. This is needed because
certain drivers do not have their napi instances created and linked to
the dev when dev_open() is called, so the previous implementation did
not work properly for them.

Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
---
 net/core/dev.c       | 53 ++++++++++++++++++++++++++------------------
 net/core/net-sysfs.c |  9 +++-----
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 88437cdf29f1..7788899b100f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1491,17 +1491,28 @@ EXPORT_SYMBOL(netdev_notify_peers);
 
 static int napi_threaded_poll(void *data);
 
-static void napi_thread_start(struct napi_struct *n)
+static int napi_kthread_create(struct napi_struct *n)
 {
-	if (test_bit(NAPI_STATE_THREADED, &n->state) && !n->thread)
-		n->thread = kthread_create(napi_threaded_poll, n, "%s-%d",
-					   n->dev->name, n->napi_id);
+	int err = 0;
+
+	/* Create and wake up the kthread once to put it in
+	 * TASK_INTERRUPTIBLE mode to avoid the blocked task
+	 * warning and work with loadavg.
+	 */
+	n->thread = kthread_run(napi_threaded_poll, n, "napi/%s-%d",
+				n->dev->name, n->napi_id);
+	if (IS_ERR(n->thread)) {
+		err = PTR_ERR(n->thread);
+		pr_err("kthread_run failed with err %d\n", err);
+		n->thread = NULL;
+	}
+
+	return err;
 }
 
 static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
-	struct napi_struct *n;
 	int ret;
 
 	ASSERT_RTNL();
@@ -1533,9 +1544,6 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
-	list_for_each_entry(n, &dev->napi_list, dev_list)
-		napi_thread_start(n);
-
 	netpoll_poll_enable(dev);
 
 	if (ret)
@@ -1586,6 +1594,7 @@ static void napi_thread_stop(struct napi_struct *n)
 	if (!n->thread)
 		return;
 	kthread_stop(n->thread);
+	clear_bit(NAPI_STATE_THREADED, &n->state);
 	n->thread = NULL;
 }
 
@@ -4271,7 +4280,7 @@ int gro_normal_batch __read_mostly = 8;
 static inline void ____napi_schedule(struct softnet_data *sd,
 				     struct napi_struct *napi)
 {
-	if (napi->thread) {
+	if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
 		wake_up_process(napi->thread);
 		return;
 	}
@@ -6700,25 +6709,25 @@ static void init_gro_hash(struct napi_struct *napi)
 
 int napi_set_threaded(struct napi_struct *n, bool threaded)
 {
-	ASSERT_RTNL();
+	int err = 0;
 
-	if (n->dev->flags & IFF_UP)
-		return -EBUSY;
+	ASSERT_RTNL();
 
 	if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
 		return 0;
-	if (threaded)
+	if (threaded) {
+		if (!n->thread) {
+			err = napi_kthread_create(n);
+			if (err)
+				goto out;
+		}
 		set_bit(NAPI_STATE_THREADED, &n->state);
-	else
+	} else {
 		clear_bit(NAPI_STATE_THREADED, &n->state);
+	}
 
-	/* if the device is initializing, nothing todo */
-	if (test_bit(__LINK_STATE_START, &n->dev->state))
-		return 0;
-
-	napi_thread_stop(n);
-	napi_thread_start(n);
-	return 0;
+out:
+	return err;
 }
 EXPORT_SYMBOL(napi_set_threaded);
 
@@ -6763,6 +6772,7 @@ void napi_disable(struct napi_struct *n)
 		msleep(1);
 
 	hrtimer_cancel(&n->timer);
+	napi_thread_stop(n);
 
 	clear_bit(NAPI_STATE_DISABLE, &n->state);
 }
@@ -6883,6 +6893,7 @@ static int napi_thread_wait(struct napi_struct *napi)
 
 	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
 		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+			WARN_ON(!list_empty(&napi->poll_list));
 			__set_current_state(TASK_RUNNING);
 			return 0;
 		}
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index df8dd25e5e4b..1e24c1e81ad8 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -609,11 +609,6 @@ static ssize_t threaded_store(struct device *dev,
 		goto unlock;
 	}
 
-	if (netdev->flags & IFF_UP) {
-		ret = -EBUSY;
-		goto unlock;
-	}
-
 	bmap = __alloc_thread_bitmap(netdev, &bits);
 	if (!bmap) {
 		ret = -ENOMEM;
 		goto unlock;
 	}
@@ -626,7 +621,9 @@ static ssize_t threaded_store(struct device *dev,
 
 	i = 0;
 	list_for_each_entry(n, &netdev->napi_list, dev_list) {
-		napi_set_threaded(n, test_bit(i, bmap));
+		ret = napi_set_threaded(n, test_bit(i, bmap));
+		if (ret)
+			goto free_unlock;
 		i++;
 	}
 	ret = len;
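[Editor's illustration, not part of the patch] A sketch of a runtime toggle under the reworked napi_set_threaded(): the device no longer has to be down, the kthread is created on first enable, and a kthread_run() failure is now reported to the caller, so the return value should be checked. The function name is made up.

#include <linux/netdevice.h>
#include <linux/rtnetlink.h>

static int example_toggle_threaded(struct napi_struct *napi, bool on)
{
	int err;

	rtnl_lock();			/* napi_set_threaded() still asserts RTNL */
	err = napi_set_threaded(napi, on);	/* may fail if kthread_run() fails */
	rtnl_unlock();

	return err;
}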