From: Joanne Koong
To: netdev@vger.kernel.org
Cc: edumazet@google.com, kafai@fb.com, kuba@kernel.org, davem@davemloft.net, Joanne Koong
Subject: [PATCH net-next v3 0/2] Add a bhash2 table hashed by port + address
Date: Tue, 10 May 2022 17:04:22 -0700
Message-Id: <20220511000424.2223932-1-joannelkoong@gmail.com>
X-Mailing-List: netdev@vger.kernel.org
X-Patchwork-Submitter: Joanne Koong
X-Patchwork-Id: 12845625
X-Patchwork-Delegate: kuba@kernel.org

This patchset proposes adding a bhash2 table that hashes by port and
address.
The motivation behind bhash2 is to expedite bind requests when a port has
many sockets in its bhash table entry. In that case, checking for bind
conflicts is costly, especially because the table entry spinlock is held
while doing so, which can cause softirq CPU lockups and can prevent new
TCP connections.

We ran into this problem at Meta, where the traffic team binds a large
number of IPs to port 443. The bind() call took a significant amount of
time, which led to CPU softirq lockups, which in turn caused packet drops
and other failures on the machine.

The patches are as follows:

1/2 - Adds a second bhash table (bhash2) hashed by port and address
2/2 - Adds a test for timing how long an additional bind request takes
      when the bhash entry is populated

When experimentally testing this on a local server with ~24k sockets bound
to the port, the results were:

ipv4:
before - 0.002317 seconds
with bhash2 - 0.000018 seconds

ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds

v2 -> v3:
v2: https://lore.kernel.org/netdev/20220510005316.3967597-1-joannelkoong@gmail.com/
  * Fix bhash2 allocation error handling for dccp
  * Rebase onto net-next/master

v1 -> v2:
v1: https://lore.kernel.org/netdev/20220421221449.1817041-1-joannelkoong@gmail.com/
  * Attached test for timing bind request

Joanne Koong (2):
  net: Add a second bind table hashed by port and address
  selftests: Add test for timing a bind request to a port with a
    populated bhash entry

 include/net/inet_connection_sock.h            |   3 +
 include/net/inet_hashtables.h                 |  56 ++++-
 include/net/sock.h                            |  14 ++
 net/dccp/proto.c                              |  34 ++-
 net/ipv4/inet_connection_sock.c               | 227 +++++++++++++-----
 net/ipv4/inet_hashtables.c                    | 188 ++++++++++++++-
 net/ipv4/tcp.c                                |  14 +-
 tools/testing/selftests/net/.gitignore        |   1 +
 tools/testing/selftests/net/Makefile          |   2 +
 tools/testing/selftests/net/bind_bhash_test.c | 119 +++++++++
 10 files changed, 576 insertions(+), 82 deletions(-)
 create mode 100644 tools/testing/selftests/net/bind_bhash_test.c