From patchwork Sun Nov 17 17:41:38 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 3194551
Message-ID: <1384710098.8604.58.camel@edumazet-glaptop2.roam.corp.google.com>
Subject: Re: [BUG, REGRESSION?] 3.11.6+, 3.12: GbE iface rate drops to few KB/s
From: Eric Dumazet
To: Willy Tarreau
Date: Sun, 17 Nov 2013 09:41:38 -0800
In-Reply-To: <20131117141940.GA18569@1wt.eu>
References: <8761s0cqhh.fsf@natisbad.org> <87y54u59zq.fsf@natisbad.org>
 <20131112083633.GB10318@1wt.eu> <87a9hagex1.fsf@natisbad.org>
 <20131112100126.GB23981@1wt.eu> <87vbzxd473.fsf@natisbad.org>
 <20131113072257.GB10591@1wt.eu> <20131117141940.GA18569@1wt.eu>
Cc: Thomas Petazzoni, netdev@vger.kernel.org, Arnaud Ebalard,
 edumazet@google.com, Cong Wang, linux-arm-kernel@lists.infradead.org

On Sun, 2013-11-17 at 15:19 +0100, Willy Tarreau wrote:

>
> So it is fairly possible that in your case you can't fill the link if you
> consume too many descriptors. For example, if your server uses TCP_NODELAY
> and sends incomplete segments (which is quite common), it's very easy to
> run out of descriptors before the link is full.

BTW, I have a very simple patch for the TCP stack that could help in this
exact situation...

The idea is to use TCP Small Queues so that we don't fill the qdisc/TX ring
with very small frames, and give tcp_sendmsg() more chance to build
complete packets.

Again, for this to work well, the NIC needs to perform TX completion in a
reasonable amount of time...

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3dc0c6c..10456cf 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -624,13 +624,19 @@ static inline void tcp_push(struct sock *sk, int flags, int mss_now,
 {
 	if (tcp_send_head(sk)) {
 		struct tcp_sock *tp = tcp_sk(sk);
+		struct sk_buff *skb = tcp_write_queue_tail(sk);
 
 		if (!(flags & MSG_MORE) || forced_push(tp))
-			tcp_mark_push(tp, tcp_write_queue_tail(sk));
+			tcp_mark_push(tp, skb);
 
 		tcp_mark_urg(tp, flags);
-		__tcp_push_pending_frames(sk, mss_now,
-					  (flags & MSG_MORE) ? TCP_NAGLE_CORK : nonagle);
+		if (flags & MSG_MORE)
+			nonagle = TCP_NAGLE_CORK;
+		if (atomic_read(&sk->sk_wmem_alloc) > 2048) {
+			set_bit(TSQ_THROTTLED, &tp->tsq_flags);
+			nonagle = TCP_NAGLE_CORK;
+		}
+		__tcp_push_pending_frames(sk, mss_now, nonagle);
 	}
 }
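
For anyone who wants to reason about the hunk without a kernel tree at hand,
the decision it adds to tcp_push() can be modelled in userspace roughly as
below. This is only a sketch, not kernel code: model_tcp_push(),
struct push_decision and the MODEL_* names are made up for illustration, the
constants are local stand-ins assumed to mirror MSG_MORE, TCP_NAGLE_OFF and
TCP_NAGLE_CORK, and the tcp_send_head() check, the push/urg marking and the
real socket state are left out. The 2048 byte threshold is the one used in
the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the kernel constants (assumed values, sketch only). */
#define MODEL_MSG_MORE       0x8000
#define MODEL_NAGLE_OFF      1
#define MODEL_NAGLE_CORK     2
#define MODEL_WMEM_THRESHOLD 2048	/* threshold taken from the patch */

/* Hypothetical result type: what the patched tcp_push() would decide. */
struct push_decision {
	int nonagle;	/* mode that would reach __tcp_push_pending_frames() */
	bool throttled;	/* whether TSQ_THROTTLED would be set */
};

/*
 * Model of the patched branch of tcp_push().
 * flags:      send flags (only MODEL_MSG_MORE is looked at)
 * nonagle:    Nagle mode passed in by the caller
 * wmem_alloc: stand-in for atomic_read(&sk->sk_wmem_alloc)
 */
static inline struct push_decision model_tcp_push(int flags, int nonagle,
						  int wmem_alloc)
{
	struct push_decision d = { .nonagle = nonagle, .throttled = false };

	/* Precondition of the model: queued byte count is never negative. */
	assert(wmem_alloc >= 0);

	/* Caller announced more data: cork, as before the patch. */
	if (flags & MODEL_MSG_MORE)
		d.nonagle = MODEL_NAGLE_CORK;

	/* New in the patch: enough bytes already queued below the socket,
	 * so cork and mark the socket as throttled by TSQ.
	 */
	if (wmem_alloc > MODEL_WMEM_THRESHOLD) {
		d.throttled = true;
		d.nonagle = MODEL_NAGLE_CORK;
	}
	return d;
}
```

With flags == 0 and a caller passing MODEL_NAGLE_OFF, a socket with 1024
bytes outstanding keeps its mode and is not throttled, while one with 2049
bytes outstanding is corked and marked throttled. The comparison is strict,
so exactly 2048 bytes does not trigger it. In the real stack the throttled
bit is what lets the TX completion path resume the push later, which is why
timely TX completion matters for this approach.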