From patchwork Tue Jun 20 11:18:49 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wei Liu
X-Patchwork-Id: 9799247
Date: Tue, 20 Jun 2017 12:18:49 +0100
From: Wei Liu
To: Jean-Louis Dupond
Cc: paul.durrant@citrix.com, wei.liu2@citrix.com, xen-devel@lists.xen.org
Message-ID: <20170620111849.aiouc66mps3jbjvo@citrix.com>
References: <9718d7ecc813e1ee50bd17b21d1ec049@dupond.be>
In-Reply-To: <9718d7ecc813e1ee50bd17b21d1ec049@dupond.be>
Subject: Re: [Xen-devel] Lockup/High ksoftirqd when rate-limiting is enabled
List-Id: Xen developer discussion

On Tue, Jun 20, 2017 at 11:31:02AM +0200, Jean-Louis Dupond wrote:
> Hi,
>
> As requested via IRC, I'm sending this to xen-devel & netback maintainers.
>
> We are using Xen 4.4.4-23.el6 with kernel 3.18.44-20.el6.x86_64.
> Now recently we're having issues with rate-limiting enabled.
>
> When we enable rate limiting in Xen, and then do a lot of outbound traffic on
> the domU, we notice a high ksoftirqd load.
> But in some cases the system locks up completely.
>

Can you give this patch a try?

---8<--
From a242d4a74cc4ec46c5e3d43dd07eb146be4ca233 Mon Sep 17 00:00:00 2001
From: Wei Liu
Date: Tue, 20 Jun 2017 11:49:28 +0100
Subject: [PATCH] xen-netback: correctly schedule rate-limited queues

Add a flag to indicate if a queue is rate-limited. Test the flag in
the NAPI poll handler and avoid rescheduling the queue if true;
otherwise we risk locking up the host. The rescheduling shall be done
when replenishing credit.

Reported-by: Jean-Louis Dupond
Signed-off-by: Wei Liu
Tested-by: Jean-Louis Dupond
---
 drivers/net/xen-netback/common.h    | 1 +
 drivers/net/xen-netback/interface.c | 6 +++++-
 drivers/net/xen-netback/netback.c   | 6 +++++-
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 530586be05b4..5b1d2e8402d9 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -199,6 +199,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 	unsigned long remaining_credit;
 	struct timer_list credit_timeout;
 	u64 credit_window_start;
+	bool rate_limited;
 
 	/* Statistics */
 	struct xenvif_stats stats;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 8397f6c92451..e322a862ddfe 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -106,7 +106,11 @@ static int xenvif_poll(struct napi_struct *napi, int budget)
 
 	if (work_done < budget) {
 		napi_complete_done(napi, work_done);
-		xenvif_napi_schedule_or_enable_events(queue);
+		/* If the queue is rate-limited, it shall be
+		 * rescheduled in the timer callback.
+		 */
+		if (likely(!queue->rate_limited))
+			xenvif_napi_schedule_or_enable_events(queue);
 	}
 
 	return work_done;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 602d408fa25e..5042ff8d449a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -180,6 +180,7 @@ static void tx_add_credit(struct xenvif_queue *queue)
 		max_credit = ULONG_MAX; /* wrapped: clamp to ULONG_MAX */
 
 	queue->remaining_credit = min(max_credit, max_burst);
+	queue->rate_limited = false;
 }
 
 void xenvif_tx_credit_callback(unsigned long data)
@@ -686,8 +687,10 @@ static bool tx_credit_exceeded(struct xenvif_queue *queue, unsigned size)
 		msecs_to_jiffies(queue->credit_usec / 1000);
 
 	/* Timer could already be pending in rare cases. */
-	if (timer_pending(&queue->credit_timeout))
+	if (timer_pending(&queue->credit_timeout)) {
+		queue->rate_limited = true;
 		return true;
+	}
 
 	/* Passed the point where we can replenish credit? */
 	if (time_after_eq64(now, next_credit)) {
@@ -702,6 +705,7 @@ static bool tx_credit_exceeded(struct xenvif_queue *queue, unsigned size)
 		mod_timer(&queue->credit_timeout,
 			  next_credit);
 		queue->credit_window_start = next_credit;
+		queue->rate_limited = true;
 
 		return true;
 	}
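
For anyone who wants to see the scheduling change in isolation, here is a
rough userspace sketch of the idea, not the driver code itself. Everything
prefixed toy_ is hypothetical and made up for illustration: the struct is a
heavily simplified stand-in for struct xenvif_queue, every packet has the
same size, and "scheduled" is only a boolean standing in for NAPI being
scheduled. The honour_flag parameter selects between the old behaviour
(always reschedule while work is pending) and the patched one (leave the
rescheduling to the credit timer while rate-limited).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct xenvif_queue. */
struct toy_queue {
	unsigned long remaining_credit; /* bytes still allowed in this window */
	unsigned long credit_bytes;     /* bytes granted per window */
	unsigned long pkt_size;         /* every packet has this size (assumption) */
	size_t pending;                 /* packets waiting on the ring */
	bool timer_pending;             /* credit timer is armed */
	bool rate_limited;              /* the flag the patch introduces */
	bool scheduled;                 /* stand-in for "NAPI is scheduled" */
};

/* Loosely follows tx_credit_exceeded(): out of credit means arm the
 * timer (if not armed yet) and mark the queue as rate-limited.
 */
bool toy_credit_exceeded(struct toy_queue *q)
{
	if (q->timer_pending) {
		q->rate_limited = true;
		return true;
	}
	if (q->pkt_size > q->remaining_credit) {
		q->timer_pending = true;
		q->rate_limited = true;
		return true;
	}
	return false;
}

/* Loosely follows xenvif_poll(): send what credit allows, then decide
 * whether to reschedule. With honour_flag == false this is the old logic.
 */
int toy_poll(struct toy_queue *q, int budget, bool honour_flag)
{
	int work_done = 0;

	assert(budget > 0);
	q->scheduled = false;
	while (work_done < budget && q->pending > 0 &&
	       !toy_credit_exceeded(q)) {
		q->remaining_credit -= q->pkt_size;
		q->pending--;
		work_done++;
	}

	if (work_done < budget) {
		/* Patched behaviour: a rate-limited queue waits for the timer. */
		if (!honour_flag || !q->rate_limited)
			q->scheduled = q->pending > 0;
	} else {
		q->scheduled = true; /* budget used up, keep polling */
	}
	return work_done;
}

/* Loosely follows the credit timer callback: refill, clear the flag,
 * and reschedule if there is still work.
 */
void toy_credit_callback(struct toy_queue *q)
{
	q->timer_pending = false;
	q->remaining_credit = q->credit_bytes;
	q->rate_limited = false;
	q->scheduled = q->pending > 0;
}

/* Poll for as long as the queue keeps itself scheduled, up to max_polls.
 * Returns the number of polls that ran.
 */
int toy_run_until_idle(struct toy_queue *q, bool honour_flag, int max_polls)
{
	int polls = 0;

	while (q->scheduled && polls < max_polls) {
		toy_poll(q, 64, honour_flag);
		polls++;
	}
	return polls;
}
```

In this toy model, a queue with pending packets and no credit left keeps
rescheduling itself until max_polls is reached when honour_flag is false,
which is the busy loop behind the ksoftirqd load. With honour_flag set, it
goes idle after one poll and only resumes once toy_credit_callback() has
refilled the credit.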