From patchwork Mon Mar 25 20:24:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602871 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6002.amazon.com (smtp-fw-6002.amazon.com [52.95.49.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB52D43AB6 for ; Mon, 25 Mar 2024 20:25:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.49.90 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398307; cv=none; b=U4JDPlVxerBpy+YRwrhYY0DnhVy0pBT1f4YkNL0r1giOiKikxJPr/X9/0nfRs/5Qr2we9dD+B8stmY2BosMX4nTieWeCrDXgNd1sNap3ltaDULX5Gg5U/8mFLn9WsaNN1tbaaeIJJQp7yIx8RjiEetyCf3fTdNUJ0W3UrrrClms= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398307; c=relaxed/simple; bh=7p09WJiSGsXQo7PM/MWCnQsbBhKl3oZQNHb6ZPosCB0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=a81JH5PO/OdwiWvHN063ECWARF/AF+MPHgHh/0BLBRV8omhztdLDa7/nHnZhuRvIUklZwPTQUUelshp95vN39uD6pFPwnlVUzTbbDfcNjGfUGiSkIhpQbwtBbeUWLJX2Ck/Q+iCxnNcoXWyPD8fkbItL+5IAc816tu8Tg3OtpyA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=A40rn+OK; arc=none smtp.client-ip=52.95.49.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="A40rn+OK" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398307; x=1742934307; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Asgho1UWYenYA2GErEKKF8xNZYxLWHCdH/1HGwxHS7Y=; b=A40rn+OKjTdjaq3B8kcyc2ZUFEP5wjB+5+rejPUtk8J66hf+kRl9AkwD iVcpjPg52VoHFHKw4yYkfB93Tj4CefwdcB84cAl1BfQeOvdxg13kKJ8AP 6Mw8xTo86Uhva1zWiY5ygFToNICW6pqd0VuRA+rcAZGgBYQCDB4AZq1V+ I=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="395948054" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-6002.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:25:04 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.7.35:43025] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.3.112:2525] with esmtp (Farcaster) id 96be7fbe-66ed-4f8a-ac86-6719e46955b8; Mon, 25 Mar 2024 20:25:02 +0000 (UTC) X-Farcaster-Flow-ID: 96be7fbe-66ed-4f8a-ac86-6719e46955b8 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:02 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:00 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 01/15] af_unix: Allocate struct unix_vertex for each inflight AF_UNIX fd. Date: Mon, 25 Mar 2024 13:24:11 -0700 Message-ID: <20240325202425.60930-2-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D039UWA001.ant.amazon.com (10.13.139.110) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We will replace the garbage collection algorithm for AF_UNIX, where we will consider each inflight AF_UNIX socket as a vertex and its file descriptor as an edge in a directed graph. This patch introduces a new struct unix_vertex representing a vertex in the graph and adds its pointer to struct unix_sock. When we send a fd using the SCM_RIGHTS message, we allocate struct scm_fp_list to struct scm_cookie in scm_fp_copy(). Then, we bump each refcount of the inflight fds' struct file and save them in scm_fp_list.fp. After that, unix_attach_fds() inexplicably clones scm_fp_list of scm_cookie and sets it to skb. (We will remove this part after replacing GC.) Here, we add a new function call in unix_attach_fds() to preallocate struct unix_vertex per inflight AF_UNIX fd and link each vertex to skb's scm_fp_list.vertices. When sendmsg() succeeds later, if the socket of the inflight fd is still not inflight yet, we will set the preallocated vertex to struct unix_sock.vertex and link it to a global list unix_unvisited_vertices under spin_lock(&unix_gc_lock). If the socket is already inflight, we free the preallocated vertex. This is to avoid taking the lock unnecessarily when sendmsg() could fail later. In the following patch, we will similarly allocate another struct per edge, which will finally be linked to the inflight socket's unix_vertex.edges. And then, we will count the number of edges as unix_vertex.out_degree. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 9 +++++++++ include/net/scm.h | 3 +++ net/core/scm.c | 7 +++++++ net/unix/af_unix.c | 6 ++++++ net/unix/garbage.c | 38 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 63 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 627ea8e2d915..c270877a5256 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -22,9 +22,17 @@ extern unsigned int unix_tot_inflight; void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); +int unix_prepare_fpl(struct scm_fp_list *fpl); +void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); void wait_for_unix_gc(struct scm_fp_list *fpl); +struct unix_vertex { + struct list_head edges; + struct list_head entry; + unsigned long out_degree; +}; + struct sock *unix_peer_get(struct sock *sk); #define UNIX_HASH_MOD (256 - 1) @@ -62,6 +70,7 @@ struct unix_sock { struct path path; struct mutex iolock, bindlock; struct sock *peer; + struct unix_vertex *vertex; struct list_head link; unsigned long inflight; spinlock_t lock; diff --git a/include/net/scm.h b/include/net/scm.h index 92276a2c5543..e34321b6e204 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -27,6 +27,9 @@ struct scm_fp_list { short count; short count_unix; short max; +#ifdef CONFIG_UNIX + struct list_head vertices; +#endif struct user_struct *user; struct file *fp[SCM_MAX_FD]; }; diff --git a/net/core/scm.c b/net/core/scm.c index 9cd4b0a01cd6..87dfec1c3378 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -89,6 +89,9 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->count_unix = 0; fpl->max = SCM_MAX_FD; fpl->user = NULL; +#if IS_ENABLED(CONFIG_UNIX) + INIT_LIST_HEAD(&fpl->vertices); +#endif } fpp = &fpl->fp[fpl->count]; @@ -376,8 +379,12 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) if (new_fpl) { for (i = 0; i < fpl->count; i++) get_file(fpl->fp[i]); + new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); +#if IS_ENABLED(CONFIG_UNIX) + INIT_LIST_HEAD(&new_fpl->vertices); +#endif } return new_fpl; } diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5b41e2321209..a3b25d311560 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -980,6 +980,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->inflight = 0; + u->vertex = NULL; u->path.dentry = NULL; u->path.mnt = NULL; spin_lock_init(&u->lock); @@ -1805,6 +1806,9 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) for (i = scm->fp->count - 1; i >= 0; i--) unix_inflight(scm->fp->user, scm->fp->fp[i]); + if (unix_prepare_fpl(UNIXCB(skb).fp)) + return -ENOMEM; + return 0; } @@ -1815,6 +1819,8 @@ static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) scm->fp = UNIXCB(skb).fp; UNIXCB(skb).fp = NULL; + unix_destroy_fpl(scm->fp); + for (i = scm->fp->count - 1; i >= 0; i--) unix_notinflight(scm->fp->user, scm->fp->fp[i]); } diff --git a/net/unix/garbage.c b/net/unix/garbage.c index fa39b6265238..75bdf66b81df 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,44 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static void unix_free_vertices(struct scm_fp_list *fpl) +{ + struct unix_vertex *vertex, *next_vertex; + + list_for_each_entry_safe(vertex, next_vertex, &fpl->vertices, entry) { + list_del(&vertex->entry); + kfree(vertex); + } +} + +int unix_prepare_fpl(struct scm_fp_list *fpl) +{ + struct unix_vertex *vertex; + int i; + + if (!fpl->count_unix) + return 0; + + for (i = 0; i < fpl->count_unix; i++) { + vertex = kmalloc(sizeof(*vertex), GFP_KERNEL); + if (!vertex) + goto err; + + list_add(&vertex->entry, &fpl->vertices); + } + + return 0; + +err: + unix_free_vertices(fpl); + return -ENOMEM; +} + +void unix_destroy_fpl(struct scm_fp_list *fpl) +{ + unix_free_vertices(fpl); +} + DEFINE_SPINLOCK(unix_gc_lock); unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); From patchwork Mon Mar 25 20:24:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602872 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com [72.21.196.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8870945033 for ; Mon, 25 Mar 2024 20:25:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=72.21.196.25 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398333; cv=none; b=gTKGI43FnA2vyZdTJIgqPBVZ9sxIvTa5CGek/CTmGydOYGVesWA/BLYjksQoeWPh/50vKcoQwdvF0+a6v+G5wzi8qXoREN20WlJPpeFs9IOpifuBlotwcXGd7OFP6ij7cn20gWMsCgr+THvdCfR0FRG2no7okYMsi7L6PAZ10u4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398333; c=relaxed/simple; bh=cvAFTB5Fl4ABRm6q8gAQa76ykQI5qyDDvhFnu3Nlluc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=rCoziSIazAKc1fUsbLurwS7wQBDT1oeXdNaqt8rn14xCz6ovxD1YujwX08j+gC7d2R4vE/adXMdkciixpYFrklTk6hH398Pao6zEbSeLXxYsJZ1GaYbFVlembM1hXsRJw2kjcGZ9e2rtpMyyoAEPgCWgiqEYSDp2fVpY9u6kqRw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=WU3n53h8; arc=none smtp.client-ip=72.21.196.25 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="WU3n53h8" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398332; x=1742934332; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=G1uch6QHxqxni8gJ+VBi5uykeMpQSVHkOLmEZAqaNWs=; b=WU3n53h8wFFfz4wkpkGmuZRGQWw7gGf6nMvdHtHJ8kf05WeRtNFm+FUx g9RgmqMfsE4nN8taaXDhg1mMI3GCEHxFot4K9NroQAgMaFtaiqtUJgyFT EgGgqpiC3UxmG2EXsDKC9Hw3GRbmyXvLEx0LuBGnssi5LdlY8UCO8QgqE I=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="390419938" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:25:28 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:28254] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.58.246:2525] with esmtp (Farcaster) id 23061839-1b53-47ca-8bc1-56d281280c32; Mon, 25 Mar 2024 20:25:26 +0000 (UTC) X-Farcaster-Flow-ID: 23061839-1b53-47ca-8bc1-56d281280c32 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:26 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:24 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 02/15] af_unix: Allocate struct unix_edge for each inflight AF_UNIX fd. Date: Mon, 25 Mar 2024 13:24:12 -0700 Message-ID: <20240325202425.60930-3-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D038UWB003.ant.amazon.com (10.13.139.157) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org As with the previous patch, we preallocate to skb's scm_fp_list an array of struct unix_edge in the number of inflight AF_UNIX fds. There we just preallocate memory and do not use immediately because sendmsg() could fail after this point. The actual use will be in the next patch. When we queue skb with inflight edges, we will set the inflight socket's unix_sock as unix_edge->predecessor and the receiver's unix_sock as successor, and then we will link the edge to the inflight socket's unix_vertex.edges. Note that we set NULL to cloned scm_fp_list.edges in scm_fp_dup() so that MSG_PEEK does not change the shape of the directed graph. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 6 ++++++ include/net/scm.h | 5 +++++ net/core/scm.c | 2 ++ net/unix/garbage.c | 6 ++++++ 4 files changed, 19 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index c270877a5256..55c4abc26a71 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -33,6 +33,12 @@ struct unix_vertex { unsigned long out_degree; }; +struct unix_edge { + struct unix_sock *predecessor; + struct unix_sock *successor; + struct list_head vertex_entry; +}; + struct sock *unix_peer_get(struct sock *sk); #define UNIX_HASH_MOD (256 - 1) diff --git a/include/net/scm.h b/include/net/scm.h index e34321b6e204..5f5154e5096d 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -23,12 +23,17 @@ struct scm_creds { kgid_t gid; }; +#ifdef CONFIG_UNIX +struct unix_edge; +#endif + struct scm_fp_list { short count; short count_unix; short max; #ifdef CONFIG_UNIX struct list_head vertices; + struct unix_edge *edges; #endif struct user_struct *user; struct file *fp[SCM_MAX_FD]; diff --git a/net/core/scm.c b/net/core/scm.c index 87dfec1c3378..1bcc8a2d65e3 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -90,6 +90,7 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->max = SCM_MAX_FD; fpl->user = NULL; #if IS_ENABLED(CONFIG_UNIX) + fpl->edges = NULL; INIT_LIST_HEAD(&fpl->vertices); #endif } @@ -383,6 +384,7 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); #if IS_ENABLED(CONFIG_UNIX) + new_fpl->edges = NULL; INIT_LIST_HEAD(&new_fpl->vertices); #endif } diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 75bdf66b81df..f31917683288 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -127,6 +127,11 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) list_add(&vertex->entry, &fpl->vertices); } + fpl->edges = kvmalloc_array(fpl->count_unix, sizeof(*fpl->edges), + GFP_KERNEL_ACCOUNT); + if (!fpl->edges) + goto err; + return 0; err: @@ -136,6 +141,7 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) void unix_destroy_fpl(struct scm_fp_list *fpl) { + kvfree(fpl->edges); unix_free_vertices(fpl); } From patchwork Mon Mar 25 20:24:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602873 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52002.amazon.com (smtp-fw-52002.amazon.com [52.119.213.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 500A71CA9C for ; Mon, 25 Mar 2024 20:25:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.150 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398356; cv=none; b=WIHa6non54UJNq29vRRwPlFS+5E3eoLk9wS7jDNn8hMz8EXv8xfQXGkuCydrh1oju6JNPqMFZ6+d+3u9Y2O835M8x4lp1MFrLlUZ+NWalamavALGlW2mJrZ0IsTmnzsE7G354HvI4m3Yr5YdUsT1+xDkksumW3T8OwU0+HQSuFM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398356; c=relaxed/simple; bh=DZUO+FDhceUXRBkmBe5BuPc11t4zdycYZc9z/Y1adXY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=dBp24sD9KjKu35TMimbOzxc+1HQAhcWVmWuXOt2XBvNmtW5dtLNBGye8T76ubHTY1GBgNeC0g84RwirwgUeMCNW2qtPBjdDC/GkUlWiQCNCn5GtG4M/TWajuGVTjRu+m0ugGsE0ohiICM+rxsdxbG87elL5VfcWwj2mLZjtX4lg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=LszJFdns; arc=none smtp.client-ip=52.119.213.150 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="LszJFdns" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398355; x=1742934355; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=G4QSG30Sdhxz7PgBOvOV909i+5vnr7o73dq6xLDBcyQ=; b=LszJFdns1VVJHUQNsJnk8xg5ud9U9kcA5FzW61468lh/naF4PuuL7tqm Ea8JToZ66aAa83uSLgNiYA396TtVMHeRwPql0551ILp9XraXuN+H6O8cO PDJcQk3qmH/PDqsggELF9u37ek3H+x7b+Rth8dK5cA8TqEM+ZKHEnCaWy 4=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="622262284" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:25:51 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.38.20:7196] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.11.19:2525] with esmtp (Farcaster) id 2f7e7d93-8c92-4de6-b0dd-2599e1d4a302; Mon, 25 Mar 2024 20:25:50 +0000 (UTC) X-Farcaster-Flow-ID: 2f7e7d93-8c92-4de6-b0dd-2599e1d4a302 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:50 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:25:48 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 03/15] af_unix: Link struct unix_edge when queuing skb. Date: Mon, 25 Mar 2024 13:24:13 -0700 Message-ID: <20240325202425.60930-4-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D046UWB001.ant.amazon.com (10.13.139.187) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Just before queuing skb with inflight fds, we call scm_stat_add(), which is a good place to set up the preallocated struct unix_vertex and struct unix_edge in UNIXCB(skb).fp. Then, we call unix_add_edges() and construct the directed graph as follows: 1. Set the inflight socket's unix_sock to unix_edge.predecessor. 2. Set the receiver's unix_sock to unix_edge.successor. 3. Set the preallocated vertex to inflight socket's unix_sock.vertex. 4. Link inflight socket's unix_vertex.entry to unix_unvisited_vertices. 5. Link unix_edge.vertex_entry to the inflight socket's unix_vertex.edges. Let's say we pass the fd of AF_UNIX socket A to B and the fd of B to C. The graph looks like this: +-------------------------+ | unix_unvisited_vertices | <-------------------------. +-------------------------+ | + | | +--------------+ +--------------+ | +--------------+ | | unix_sock A | <---. .---> | unix_sock B | <-|-. .---> | unix_sock C | | +--------------+ | | +--------------+ | | | +--------------+ | .-+ | vertex | | | .-+ | vertex | | | | | vertex | | | +--------------+ | | | +--------------+ | | | +--------------+ | | | | | | | | | | +--------------+ | | | +--------------+ | | | | '-> | unix_vertex | | | '-> | unix_vertex | | | | | +--------------+ | | +--------------+ | | | `---> | entry | +---------> | entry | +-' | | |--------------| | | |--------------| | | | edges | <-. | | | edges | <-. | | +--------------+ | | | +--------------+ | | | | | | | | | .----------------------' | | .----------------------' | | | | | | | | | +--------------+ | | | +--------------+ | | | | unix_edge | | | | | unix_edge | | | | +--------------+ | | | +--------------+ | | `-> | vertex_entry | | | `-> | vertex_entry | | | |--------------| | | |--------------| | | | predecessor | +---' | | predecessor | +---' | |--------------| | |--------------| | | successor | +-----' | successor | +-----' +--------------+ +--------------+ Henceforth, we denote such a graph as A -> B (-> C). Now, we can express all inflight fd graphs that do not contain embryo sockets. We will support the particular case later. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 + include/net/scm.h | 1 + net/core/scm.c | 2 + net/unix/af_unix.c | 8 +++- net/unix/garbage.c | 90 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 100 insertions(+), 3 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 55c4abc26a71..f31ad1166346 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -22,6 +22,8 @@ extern unsigned int unix_tot_inflight; void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); +void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); +void unix_del_edges(struct scm_fp_list *fpl); int unix_prepare_fpl(struct scm_fp_list *fpl); void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); diff --git a/include/net/scm.h b/include/net/scm.h index 5f5154e5096d..bbc5527809d1 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -32,6 +32,7 @@ struct scm_fp_list { short count_unix; short max; #ifdef CONFIG_UNIX + bool inflight; struct list_head vertices; struct unix_edge *edges; #endif diff --git a/net/core/scm.c b/net/core/scm.c index 1bcc8a2d65e3..5763f3320358 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -90,6 +90,7 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->max = SCM_MAX_FD; fpl->user = NULL; #if IS_ENABLED(CONFIG_UNIX) + fpl->inflight = false; fpl->edges = NULL; INIT_LIST_HEAD(&fpl->vertices); #endif @@ -384,6 +385,7 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); #if IS_ENABLED(CONFIG_UNIX) + new_fpl->inflight = false; new_fpl->edges = NULL; INIT_LIST_HEAD(&new_fpl->vertices); #endif diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a3b25d311560..24adbc4d5188 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1943,8 +1943,10 @@ static void scm_stat_add(struct sock *sk, struct sk_buff *skb) struct scm_fp_list *fp = UNIXCB(skb).fp; struct unix_sock *u = unix_sk(sk); - if (unlikely(fp && fp->count)) + if (unlikely(fp && fp->count)) { atomic_add(fp->count, &u->scm_stat.nr_fds); + unix_add_edges(fp, u); + } } static void scm_stat_del(struct sock *sk, struct sk_buff *skb) @@ -1952,8 +1954,10 @@ static void scm_stat_del(struct sock *sk, struct sk_buff *skb) struct scm_fp_list *fp = UNIXCB(skb).fp; struct unix_sock *u = unix_sk(sk); - if (unlikely(fp && fp->count)) + if (unlikely(fp && fp->count)) { atomic_sub(fp->count, &u->scm_stat.nr_fds); + unix_del_edges(fp); + } } /* diff --git a/net/unix/garbage.c b/net/unix/garbage.c index f31917683288..36d665936096 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,38 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static LIST_HEAD(unix_unvisited_vertices); + +static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) +{ + struct unix_vertex *vertex = edge->predecessor->vertex; + + if (!vertex) { + vertex = list_first_entry(&fpl->vertices, typeof(*vertex), entry); + vertex->out_degree = 0; + INIT_LIST_HEAD(&vertex->edges); + + list_move_tail(&vertex->entry, &unix_unvisited_vertices); + edge->predecessor->vertex = vertex; + } + + vertex->out_degree++; + list_add_tail(&edge->vertex_entry, &vertex->edges); +} + +static void unix_del_edge(struct scm_fp_list *fpl, struct unix_edge *edge) +{ + struct unix_vertex *vertex = edge->predecessor->vertex; + + list_del(&edge->vertex_entry); + vertex->out_degree--; + + if (!vertex->out_degree) { + edge->predecessor->vertex = NULL; + list_move_tail(&vertex->entry, &fpl->vertices); + } +} + static void unix_free_vertices(struct scm_fp_list *fpl) { struct unix_vertex *vertex, *next_vertex; @@ -111,6 +143,60 @@ static void unix_free_vertices(struct scm_fp_list *fpl) } } +DEFINE_SPINLOCK(unix_gc_lock); + +void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) +{ + int i = 0, j = 0; + + spin_lock(&unix_gc_lock); + + if (!fpl->count_unix) + goto out; + + do { + struct unix_sock *inflight = unix_get_socket(fpl->fp[j++]); + struct unix_edge *edge; + + if (!inflight) + continue; + + edge = fpl->edges + i++; + edge->predecessor = inflight; + edge->successor = receiver; + + unix_add_edge(fpl, edge); + } while (i < fpl->count_unix); + +out: + spin_unlock(&unix_gc_lock); + + fpl->inflight = true; + + unix_free_vertices(fpl); +} + +void unix_del_edges(struct scm_fp_list *fpl) +{ + int i = 0; + + spin_lock(&unix_gc_lock); + + if (!fpl->count_unix) + goto out; + + do { + struct unix_edge *edge = fpl->edges + i++; + + unix_del_edge(fpl, edge); + } while (i < fpl->count_unix); + +out: + spin_unlock(&unix_gc_lock); + + fpl->inflight = false; +} + int unix_prepare_fpl(struct scm_fp_list *fpl) { struct unix_vertex *vertex; @@ -141,11 +227,13 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) void unix_destroy_fpl(struct scm_fp_list *fpl) { + if (fpl->inflight) + unix_del_edges(fpl); + kvfree(fpl->edges); unix_free_vertices(fpl); } -DEFINE_SPINLOCK(unix_gc_lock); unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); From patchwork Mon Mar 25 20:24:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602874 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6002.amazon.com (smtp-fw-6002.amazon.com [52.95.49.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5E0413DABEF for ; Mon, 25 Mar 2024 20:26:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.49.90 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398379; cv=none; b=eUFG9z3CjXfetCAPN27rUC+nQipyNQkqf4cYLle7dgykVYt6kY42YxU0Divf9OAb7g/bmM1M6lfL/XL1GSWGQvkbsmFLWFt7v9/uZDuGzYeg9g5nndZFVsqSTihsWhJuTXIM7H+ihyaonbnXwN0vvgWrLaZzvkdKerzzQSnv1bU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398379; c=relaxed/simple; bh=RrrkbnJ8j9faRNlMqUvJ4cKRPZHzubkW90VXR+n9f5w=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DusweAo5WFFgU9bjFUKPayVF8S4NJ56Cr+jmLbJ20zZlVBjTawonkTs5PJilbJ6rD2BRx9AiR7+sOR4SQO/zvnD+cec7gpBGoHLfubOq6nJlJGYWBhhRu7eXDvukyLC0msj0G63zlVljY+fWD9tIb9OdQYNwaMvOT5bn7t8IMug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=oytdXGEU; arc=none smtp.client-ip=52.95.49.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="oytdXGEU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398377; x=1742934377; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ijyoOVzvkQmDmVQzGC69VWdIjFlPPcoVQQsg3Gm2rOQ=; b=oytdXGEUfd2M2waw0gn3mh8xQRfctmCfRNTH/gk93t1m1r+CDw1t0haH UK6na0RMPvPgPZ6yzg5fZ8fqnm3nNXZYAQROTqUaRKxoQxX6zFHZGcLHd mquf5UjCwoEyThyDVyOvHXeVU6kvTMEHlYT7cYc8L2vST5Vcq0rqQo+pn o=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="395948302" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-6002.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:26:15 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.7.35:13203] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.51.164:2525] with esmtp (Farcaster) id ac27db10-20b0-4625-b33c-e5f23a587a7b; Mon, 25 Mar 2024 20:26:15 +0000 (UTC) X-Farcaster-Flow-ID: ac27db10-20b0-4625-b33c-e5f23a587a7b Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:26:14 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:26:12 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 04/15] af_unix: Bulk update unix_tot_inflight/unix_inflight when queuing skb. Date: Mon, 25 Mar 2024 13:24:14 -0700 Message-ID: <20240325202425.60930-5-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D043UWA003.ant.amazon.com (10.13.139.31) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Currently, we track the number of inflight sockets in two variables. unix_tot_inflight is the total number of inflight AF_UNIX sockets on the host, and user->unix_inflight is the number of inflight fds per user. We update them one by one in unix_inflight(), which can be done once in batch. Also, sendmsg() could fail even after unix_inflight(), then we need to acquire unix_gc_lock only to decrement the counters. Let's bulk update the counters in unix_add_edges() and unix_del_edges(), which is called only for successfully passed fds. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 36d665936096..84c8ea524a98 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -144,6 +144,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl) } DEFINE_SPINLOCK(unix_gc_lock); +unsigned int unix_tot_inflight; void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) { @@ -168,7 +169,10 @@ void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) unix_add_edge(fpl, edge); } while (i < fpl->count_unix); + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + fpl->count_unix); out: + WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight + fpl->count); + spin_unlock(&unix_gc_lock); fpl->inflight = true; @@ -191,7 +195,10 @@ void unix_del_edges(struct scm_fp_list *fpl) unix_del_edge(fpl, edge); } while (i < fpl->count_unix); + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - fpl->count_unix); out: + WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight - fpl->count); + spin_unlock(&unix_gc_lock); fpl->inflight = false; @@ -234,7 +241,6 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } -unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); @@ -255,13 +261,8 @@ void unix_inflight(struct user_struct *user, struct file *filp) WARN_ON_ONCE(list_empty(&u->link)); } u->inflight++; - - /* Paired with READ_ONCE() in wait_for_unix_gc() */ - WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + 1); } - WRITE_ONCE(user->unix_inflight, user->unix_inflight + 1); - spin_unlock(&unix_gc_lock); } @@ -278,13 +279,8 @@ void unix_notinflight(struct user_struct *user, struct file *filp) u->inflight--; if (!u->inflight) list_del_init(&u->link); - - /* Paired with READ_ONCE() in wait_for_unix_gc() */ - WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - 1); } - WRITE_ONCE(user->unix_inflight, user->unix_inflight - 1); - spin_unlock(&unix_gc_lock); } From patchwork Mon Mar 25 20:24:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602884 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80006.amazon.com (smtp-fw-80006.amazon.com [99.78.197.217]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8CA246444 for ; Mon, 25 Mar 2024 20:26:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.217 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398413; cv=none; b=RnaHGJOw+8WI3GcQBtfXDAHTkxTyjxhrb6XgpHAugNotDjj6FAN7Y0r6/PUj4eG8F8HUHYNgiVk5hq3PQhuyc95SFx5EzimnEbG9sPrCoIt5rU26iG1QWxngdaih0EM5xBfa/Zetup7pdovoDqkbBdkrUdy1TzBfpb/lJo8e25E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398413; c=relaxed/simple; bh=ph2yf7J6LKq4YYRMk+Y5URfzdxOTm+QE2Ee2wVW/qI8=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=p9f14tecdQLl/fX2tE8miEjNY4G/C1z2W3Y0G6Op7W9XaVOVu20EZUbFHv5brEtyLp5IeSE5pkaBp4HvY2RL7WPbE1/D5lLcRN3hygQ5GPrQ9VKcEk151DkP4X8LXE0k6h9aF+1R+mBrK4Is9oND0hZW0P0OHTPrGRk+YzKFq8s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=R4tlxAGL; arc=none smtp.client-ip=99.78.197.217 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="R4tlxAGL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398412; x=1742934412; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GT6vMIEY3HLrFx24YMvpjbDCcF/1ON4MQ2a4ka0flj0=; b=R4tlxAGL9OeAl2vu6jXQVPU1DrJAs7fc6tYARNfij5McP0bOTC74LjHs Xw8ag15KgdhFnwQvZAn/Sb24eBVaO/cEI3kEJIVIBnTDootMfYgCV6luP 23QIcqsDQjFYe7jr7HWW3UyTBnJsvuEggmDUv/CCq9QqFOjRGIloihQ7p M=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="282771327" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:26:48 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.7.35:10915] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.17.10:2525] with esmtp (Farcaster) id 7c44c0f6-d099-4d0f-abba-646504841a95; Mon, 25 Mar 2024 20:26:47 +0000 (UTC) X-Farcaster-Flow-ID: 7c44c0f6-d099-4d0f-abba-646504841a95 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:26:38 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:26:36 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 05/15] af_unix: Iterate all vertices by DFS. Date: Mon, 25 Mar 2024 13:24:15 -0700 Message-ID: <20240325202425.60930-6-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D031UWA004.ant.amazon.com (10.13.139.19) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org The new GC will use a depth first search graph algorithm to find cyclic references. The algorithm visits every vertex exactly once. Here, we implement the DFS part without recursion so that no one can abuse it. unix_walk_scc() marks every vertex unvisited by initialising index as UNIX_VERTEX_INDEX_UNVISITED and iterates inflight vertices in unix_unvisited_vertices and call __unix_walk_scc() to start DFS from an arbitrary vertex. __unix_walk_scc() iterates all edges starting from the vertex and explores the neighbour vertices with DFS using edge_stack. After visiting all neighbours, __unix_walk_scc() moves the visited vertex to unix_visited_vertices so that unix_walk_scc() will not restart DFS from the visited vertex. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 ++ net/unix/garbage.c | 74 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index f31ad1166346..970a91da2239 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -33,12 +33,14 @@ struct unix_vertex { struct list_head edges; struct list_head entry; unsigned long out_degree; + unsigned long index; }; struct unix_edge { struct unix_sock *predecessor; struct unix_sock *successor; struct list_head vertex_entry; + struct list_head stack_entry; }; struct sock *unix_peer_get(struct sock *sk); diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 84c8ea524a98..8b16ab9e240e 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -103,6 +103,11 @@ struct unix_sock *unix_get_socket(struct file *filp) static LIST_HEAD(unix_unvisited_vertices); +enum unix_vertex_index { + UNIX_VERTEX_INDEX_UNVISITED, + UNIX_VERTEX_INDEX_START, +}; + static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; @@ -241,6 +246,73 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static LIST_HEAD(unix_visited_vertices); + +static void __unix_walk_scc(struct unix_vertex *vertex) +{ + unsigned long index = UNIX_VERTEX_INDEX_START; + struct unix_edge *edge; + LIST_HEAD(edge_stack); + +next_vertex: + vertex->index = index; + index++; + + /* Explore neighbour vertices (receivers of the current vertex's fd). */ + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + struct unix_vertex *next_vertex = edge->successor->vertex; + + if (!next_vertex) + continue; + + if (next_vertex->index == UNIX_VERTEX_INDEX_UNVISITED) { + /* Iterative deepening depth first search + * + * 1. Push a forward edge to edge_stack and set + * the successor to vertex for the next iteration. + */ + list_add(&edge->stack_entry, &edge_stack); + + vertex = next_vertex; + goto next_vertex; + + /* 2. Pop the edge directed to the current vertex + * and restore the ancestor for backtracking. + */ +prev_vertex: + edge = list_first_entry(&edge_stack, typeof(*edge), stack_entry); + list_del_init(&edge->stack_entry); + + vertex = edge->predecessor->vertex; + } + } + + /* Don't restart DFS from this vertex in unix_walk_scc(). */ + list_move_tail(&vertex->entry, &unix_visited_vertices); + + /* Need backtracking ? */ + if (!list_empty(&edge_stack)) + goto prev_vertex; +} + +static void unix_walk_scc(void) +{ + struct unix_vertex *vertex; + + list_for_each_entry(vertex, &unix_unvisited_vertices, entry) + vertex->index = UNIX_VERTEX_INDEX_UNVISITED; + + /* Visit every vertex exactly once. + * __unix_walk_scc() moves visited vertices to unix_visited_vertices. + */ + while (!list_empty(&unix_unvisited_vertices)) { + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); + __unix_walk_scc(vertex); + } + + list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); +} + static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); @@ -388,6 +460,8 @@ static void __unix_gc(struct work_struct *work) spin_lock(&unix_gc_lock); + unix_walk_scc(); + /* First, select candidates for garbage collection. Only * in-flight sockets are considered, and from those only ones * which don't have any external reference. From patchwork Mon Mar 25 20:24:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602885 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-6002.amazon.com (smtp-fw-6002.amazon.com [52.95.49.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 69E6545BE1 for ; Mon, 25 Mar 2024 20:27:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.49.90 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398428; cv=none; b=TwuspH3WJ5qHTlrsKCDXBxocQOeyQjtxazMFOBytBx1SiQIVG3Ym+Hl12vMst50R3DXoUCKabtoieR6qpYAEZ/AMaIIzV4JbcGdvyAIiYz/ANh53Q6ZlGPQ677LLT1dOoGMwEgFwFAeegs0StobzgjobMH8H/2NqI2T+XGWCHCQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398428; c=relaxed/simple; bh=Y9TiENNXjMbVM4Omft3lFQ3UrG16LedJwh5qux3ba90=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cGdteLpvVsYQKS/I7K0ouhBib89Aj7VUBQPC4XK68zt/XCHX9RQE+oO6hwJBob5jeKMSMEXhszILLSUKxcHNdwjxZ+2ag1r8ItjhJmSdjKbYkFOmWFNjUTmB+iqp/QvVpnFjCGbYheutGgNC3nBaBbIOpq18Q8LipWgQUXywU5c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=JpEkVQ8+; arc=none smtp.client-ip=52.95.49.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="JpEkVQ8+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398426; x=1742934426; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Hh17owvbeYZuCpueUg7lO9ANNgjljH7sMZQVKp3izgI=; b=JpEkVQ8+PF8umXt0dRA/Rp0gr1m4FAYmVpfb7oa5LIjVVGfjwyPkG9cB MsLtpj4MaEUIem35HmJIF8ZRulMPNIKrujwYupTV0qdZtr1SPXntSOKlO fCmuJVRhQ2kp7eHXYz0mk68HM62uGf3c0l67T712QHcIQAq0cW1Z515HV A=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="395948520" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-6002.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:27:05 +0000 Received: from EX19MTAUWB002.ant.amazon.com [10.0.21.151:18014] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.3.112:2525] with esmtp (Farcaster) id b884be06-3e13-4f1a-9ed8-01b9f31fe68c; Mon, 25 Mar 2024 20:27:04 +0000 (UTC) X-Farcaster-Flow-ID: b884be06-3e13-4f1a-9ed8-01b9f31fe68c Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB002.ant.amazon.com (10.250.64.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:02 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:00 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 06/15] af_unix: Detect Strongly Connected Components. Date: Mon, 25 Mar 2024 13:24:16 -0700 Message-ID: <20240325202425.60930-7-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D043UWA002.ant.amazon.com (10.13.139.53) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org In the new GC, we use a simple graph algorithm, Tarjan's Strongly Connected Components (SCC) algorithm, to find cyclic references. The algorithm visits every vertex exactly once using depth-first search (DFS). DFS starts by pushing an input vertex to a stack and assigning it a unique number. Two fields, index and lowlink, are initialised with the number, but lowlink could be updated later during DFS. If a vertex has an edge to an unvisited inflight vertex, we visit it and do the same processing. So, we will have vertices in the stack in the order they appear and number them consecutively in the same order. If a vertex has a back-edge to a visited vertex in the stack, we update the predecessor's lowlink with the successor's index. After iterating edges from the vertex, we check if its index equals its lowlink. If the lowlink is different from the index, it shows there was a back-edge. Then, we go backtracking and propagate the lowlink to its predecessor and resume the previous edge iteration from the next edge. If the lowlink is the same as the index, we pop vertices before and including the vertex from the stack. Then, the set of vertices is SCC, possibly forming a cycle. At the same time, we move the vertices to unix_visited_vertices. When we finish the algorithm, all vertices in each SCC will be linked via unix_vertex.scc_entry. Let's take an example. We have a graph including five inflight vertices (F is not inflight): A -> B -> C -> D -> E (-> F) ^ | `---------' Suppose that we start DFS from C. We will visit C, D, and B first and initialise their index and lowlink. Then, the stack looks like this: > B = (3, 3) (index, lowlink) D = (2, 2) C = (1, 1) When checking B's edge to C, we update B's lowlink with C's index and propagate it to D. B = (3, 1) (index, lowlink) > D = (2, 1) C = (1, 1) Next, we visit E, which has no edge to an inflight vertex. > E = (4, 4) (index, lowlink) B = (3, 1) D = (2, 1) C = (1, 1) When we leave from E, its index and lowlink are the same, so we pop E from the stack as single-vertex SCC. Next, we leave from B and D but do nothing because their lowlink are different from their index. B = (3, 1) (index, lowlink) D = (2, 1) > C = (1, 1) Then, we leave from C, whose index and lowlink are the same, so we pop B, D and C as SCC. Last, we do DFS for the rest of vertices, A, which is also a single-vertex SCC. Finally, each unix_vertex.scc_entry is linked as follows: A -. B -> C -> D E -. ^ | ^ | ^ | `--' `---------' `--' We use SCC later to decide whether we can garbage-collect the sockets. Note that we still cannot detect SCC properly if an edge points to an embryo socket. The following two patches will sort it out. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 3 +++ net/unix/garbage.c | 46 +++++++++++++++++++++++++++++++++++++++++-- 2 files changed, 47 insertions(+), 2 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 970a91da2239..67736767b616 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -32,8 +32,11 @@ void wait_for_unix_gc(struct scm_fp_list *fpl); struct unix_vertex { struct list_head edges; struct list_head entry; + struct list_head scc_entry; unsigned long out_degree; unsigned long index; + unsigned long lowlink; + bool on_stack; }; struct unix_edge { diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 8b16ab9e240e..33aadaa35346 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -251,11 +251,19 @@ static LIST_HEAD(unix_visited_vertices); static void __unix_walk_scc(struct unix_vertex *vertex) { unsigned long index = UNIX_VERTEX_INDEX_START; + LIST_HEAD(vertex_stack); struct unix_edge *edge; LIST_HEAD(edge_stack); next_vertex: + /* Push vertex to vertex_stack. + * The vertex will be popped when finalising SCC later. + */ + vertex->on_stack = true; + list_add(&vertex->scc_entry, &vertex_stack); + vertex->index = index; + vertex->lowlink = index; index++; /* Explore neighbour vertices (receivers of the current vertex's fd). */ @@ -283,12 +291,46 @@ static void __unix_walk_scc(struct unix_vertex *vertex) edge = list_first_entry(&edge_stack, typeof(*edge), stack_entry); list_del_init(&edge->stack_entry); + next_vertex = vertex; vertex = edge->predecessor->vertex; + + /* If the successor has a smaller lowlink, two vertices + * are in the same SCC, so propagate the smaller lowlink + * to skip SCC finalisation. + */ + vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); + } else if (next_vertex->on_stack) { + /* Loop detected by a back/cross edge. + * + * The successor is on vertex_stack, so two vertices are + * in the same SCC. If the successor has a smaller index, + * propagate it to skip SCC finalisation. + */ + vertex->lowlink = min(vertex->lowlink, next_vertex->index); + } else { + /* The successor was already grouped as another SCC */ } } - /* Don't restart DFS from this vertex in unix_walk_scc(). */ - list_move_tail(&vertex->entry, &unix_visited_vertices); + if (vertex->index == vertex->lowlink) { + struct list_head scc; + + /* SCC finalised. + * + * If the lowlink was not updated, all the vertices above on + * vertex_stack are in the same SCC. Group them using scc_entry. + */ + __list_cut_position(&scc, &vertex_stack, &vertex->scc_entry); + + list_for_each_entry_reverse(vertex, &scc, scc_entry) { + /* Don't restart DFS from this vertex in unix_walk_scc(). */ + list_move_tail(&vertex->entry, &unix_visited_vertices); + + vertex->on_stack = false; + } + + list_del(&scc); + } /* Need backtracking ? */ if (!list_empty(&edge_stack)) From patchwork Mon Mar 25 20:24:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602887 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80006.amazon.com (smtp-fw-80006.amazon.com [99.78.197.217]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B41C4597D for ; Mon, 25 Mar 2024 20:27:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.217 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398449; cv=none; b=ae6PgD+SnMa1GROgnTr98XWVLmp6/sfb9mdc3LyevNc8ET9MkYU9qCTUZHUVDlpASB6TzKNZtDGq5sVulWArm4kgY9Ijja1tloRMRPEgHDQA7l/5K0inTQglowuZ2/QVSL+2bhHKazEztSB8eOh8plfBUC/JNdfRrPIqgA2T+o8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398449; c=relaxed/simple; bh=RZWS7glsf3FVPy/SJ7WMtTihIZZowfEc/e54NcYbMfY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MDMoO8JTC3Q7ZPK1K2I3ImIczqrvK3ZdMSFDROno43JEyPnokQB1oe5B7P3CJ4Mo8k7XhFtfnxVR5hQlmFqQSvnsfp+dMvL778V2KzvkKY72ILGqYaAbc7sy2aggM3aiV55jE8cnP3wy80e21DbCgW3CqF3ToGByFPp4Vnnz1QM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=TDVXJwQ2; arc=none smtp.client-ip=99.78.197.217 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="TDVXJwQ2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398448; x=1742934448; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RClNWoP4mmxHjeK91vF3Qzm5FnqmxwlH+sv+IDMvaUA=; b=TDVXJwQ2tM96hnCiINY+VXkqnqtoJjfwlPvpgEN2lE2CJkf2OK5LInA9 8HGBinS4iRP844Tlj993UD84W0DMdYjRUAsCwxQCJZB4wtyJWjOAigeKL /eR6Ah17IO+3z41dDx+o+AQHy6YdC889KgM5fb+3TcSKltp7rFugmVFgd A=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="282771433" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:27:28 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.7.35:48513] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.5.115:2525] with esmtp (Farcaster) id 2b5ce983-203e-47d6-a1b5-5f520e520786; Mon, 25 Mar 2024 20:27:27 +0000 (UTC) X-Farcaster-Flow-ID: 2b5ce983-203e-47d6-a1b5-5f520e520786 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:27 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:24 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 07/15] af_unix: Save listener for embryo socket. Date: Mon, 25 Mar 2024 13:24:17 -0700 Message-ID: <20240325202425.60930-8-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D040UWA002.ant.amazon.com (10.13.139.113) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org This is a prep patch for the following change, where we need to fetch the listening socket from the successor embryo socket during GC. We add a new field to struct unix_sock to save a pointer to a listening socket. We set it when connect() creates a new socket, and clear it when accept() is called. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 + net/unix/af_unix.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 67736767b616..dc7469191195 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -83,6 +83,7 @@ struct unix_sock { struct path path; struct mutex iolock, bindlock; struct sock *peer; + struct sock *listener; struct unix_vertex *vertex; struct list_head link; unsigned long inflight; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 24adbc4d5188..af74e7ebc35a 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -979,6 +979,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); + u->listener = NULL; u->inflight = 0; u->vertex = NULL; u->path.dentry = NULL; @@ -1598,6 +1599,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, newsk->sk_type = sk->sk_type; init_peercred(newsk); newu = unix_sk(newsk); + newu->listener = other; RCU_INIT_POINTER(newsk->sk_wq, &newu->peer_wq); otheru = unix_sk(other); @@ -1693,8 +1695,8 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, bool kern) { struct sock *sk = sock->sk; - struct sock *tsk; struct sk_buff *skb; + struct sock *tsk; int err; err = -EOPNOTSUPP; @@ -1719,6 +1721,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, } tsk = skb->sk; + unix_sk(tsk)->listener = NULL; skb_free_datagram(sk, skb); wake_up_interruptible(&unix_sk(sk)->peer_wait); From patchwork Mon Mar 25 20:24:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602888 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52002.amazon.com (smtp-fw-52002.amazon.com [52.119.213.150]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C5C43535D4 for ; Mon, 25 Mar 2024 20:27:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.150 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398475; cv=none; b=OsieAiSoMxKzK1+mmU48nvaQmVHOrK7mVlYOVbov91NORXKijg+AIFUuI9zoXTlpWXq8c3Y2VCQZJFSA/mzGpOkfRV8ZAbar4t3FQVnh97zuhVDgMQkoBbFoC/5dkPeh3hoOuxcwymZJ8FfYo1LNYeHzicAcbOPyMxvdcddKKaM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398475; c=relaxed/simple; bh=fMNo5fTNohMnyJLbk2mLIG3DR4lnfsRZBiVmOqueyfc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=jwM7yyZzl2ALMxGdPN0RcWKtvlc336aG1gr7Y24gLKtmk7pYIa60yNnQxEniUJp6y/P1vlBw4ElkU+2nYOTY9p1JOQ6J5kE1cdvAw7GEf4ZpJkImVb5r35OCA5DR0k6f//Y284X/rA2F9j8vIyVCIWEO2BUeP/5lSOIhIrXadRo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=l7cZbZbw; arc=none smtp.client-ip=52.119.213.150 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="l7cZbZbw" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398474; x=1742934474; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iVCz26quLMy3v5LmxdtliORIVFIpNx7M1m4xqBeJoes=; b=l7cZbZbwlXsjSLTiWoA50zx89H83/Xgj/eje6jGwpOwUp+lVoSzGBEBc FshfFjGDpiWeedSJndDry05T3VHhElgIQjSLkpLvllp8HUrQ0/QCdb8ai Tj40lXoC3qOCqDbMNi++8ZxVwTq/JN/0ho1+uhyKdDQ0HaHkFHwtGEp3v g=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="622262724" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:27:52 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.38.20:9932] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.5.115:2525] with esmtp (Farcaster) id 2e28201e-854a-4cc1-ad71-d9ee598015b6; Mon, 25 Mar 2024 20:27:51 +0000 (UTC) X-Farcaster-Flow-ID: 2e28201e-854a-4cc1-ad71-d9ee598015b6 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:51 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:27:48 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 08/15] af_unix: Fix up unix_edge.successor for embryo socket. Date: Mon, 25 Mar 2024 13:24:18 -0700 Message-ID: <20240325202425.60930-9-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D043UWA004.ant.amazon.com (10.13.139.41) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org To garbage collect inflight AF_UNIX sockets, we must define the cyclic reference appropriately. This is a bit tricky if the loop consists of embryo sockets. Suppose that the fd of AF_UNIX socket A is passed to D and the fd B to C and that C and D are embryo sockets of A and B, respectively. It may appear that there are two separate graphs, A (-> D) and B (-> C), but this is not correct. A --. .-- B X C <-' `-> D Now, D holds A's refcount, and C has B's refcount, so unix_release() will never be called for A and B when we close() them. However, no one can call close() for D and C to free skbs holding refcounts of A and B because C/D is in A/B's receive queue, which should have been purged by unix_release() for A and B. So, here's another type of cyclic reference. When a fd of an AF_UNIX socket is passed to an embryo socket, the reference is indirectly held by its parent listening socket. .-> A .-> B | `- sk_receive_queue | `- sk_receive_queue | `- skb | `- skb | `- sk == C | `- sk == D | `- sk_receive_queue | `- sk_receive_queue | `- skb +---------' `- skb +-. | | `---------------------------------------------------------' Technically, the graph must be denoted as A <-> B instead of A (-> D) and B (-> C) to find such a cyclic reference without touching each socket's receive queue. .-> A --. .-- B <-. | X | == A <-> B `-- C <-' `-> D --' We apply this fixup during GC by fetching the real successor by unix_edge_successor(). When we call accept(), we clear unix_sock.listener under unix_gc_lock not to confuse GC. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 + net/unix/af_unix.c | 2 +- net/unix/garbage.c | 20 +++++++++++++++++++- 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index dc7469191195..414463803b7e 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -24,6 +24,7 @@ void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); void unix_del_edges(struct scm_fp_list *fpl); +void unix_update_edges(struct unix_sock *receiver); int unix_prepare_fpl(struct scm_fp_list *fpl); void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index af74e7ebc35a..ae77e2dc0dae 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1721,7 +1721,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, } tsk = skb->sk; - unix_sk(tsk)->listener = NULL; + unix_update_edges(unix_sk(tsk)); skb_free_datagram(sk, skb); wake_up_interruptible(&unix_sk(sk)->peer_wait); diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 33aadaa35346..8d0912c1d01a 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,17 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) +{ + /* If an embryo socket has a fd, + * the listener indirectly holds the fd's refcnt. + */ + if (edge->successor->listener) + return unix_sk(edge->successor->listener)->vertex; + + return edge->successor->vertex; +} + static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { @@ -209,6 +220,13 @@ void unix_del_edges(struct scm_fp_list *fpl) fpl->inflight = false; } +void unix_update_edges(struct unix_sock *receiver) +{ + spin_lock(&unix_gc_lock); + receiver->listener = NULL; + spin_unlock(&unix_gc_lock); +} + int unix_prepare_fpl(struct scm_fp_list *fpl) { struct unix_vertex *vertex; @@ -268,7 +286,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) /* Explore neighbour vertices (receivers of the current vertex's fd). */ list_for_each_entry(edge, &vertex->edges, vertex_entry) { - struct unix_vertex *next_vertex = edge->successor->vertex; + struct unix_vertex *next_vertex = unix_edge_successor(edge); if (!next_vertex) continue; From patchwork Mon Mar 25 20:24:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602889 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-33001.amazon.com (smtp-fw-33001.amazon.com [207.171.190.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 98CFA433CD for ; Mon, 25 Mar 2024 20:28:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.190.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398504; cv=none; b=mLHxuoKy8bbfaFDJc2lHOYj32o4q13PfiG76Bj7xGDRTjVwNfcYFPZv9D9Xebw2KuR23GfRLWefhUhC48EC0VTA5O4LuAY3P6RVVlfE6EjLKXUiJOjXMEKIxKECAxMtl0rURjBRmiLcXV5mjY5X7uOaIBtDh6z8LgViOPjkQABo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398504; c=relaxed/simple; bh=UhxXVQX/qIBEHS4c1juEEOSzeK4sHV3x3+Rqx8ZRy4k=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tYZ3hb3f+xj/mcwjpZ0PiRD0g//tagXeQQvh3LwwyfV2HFht5lg+pSrM8HlED1+qA8nEVchm3oLwccyPzCjkNX6vxmubFTCc/BWPf0WyItAtSkKVGyHivQKn1gYYZOjmMMd+1QYRWtGZd45kDW0rXmdxUjAgFz1C0Gi0uB6XohU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=JBjRtyt3; arc=none smtp.client-ip=207.171.190.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="JBjRtyt3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398502; x=1742934502; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+x/RkUCDR2KHmrxUf/Wd+6uUv0azM4+5StGhAaki6hw=; b=JBjRtyt3J53bwsUx2UABFC2RqWGL0Edl85iJJYBkBsaAB3uGkiHDXGXS gpw8d/PSgVRXrS0C3cCNEjZ7cen+aNjsfx06hcNBtOExZM5c906MUldUC vFgLyeRN+kLdbvbmvYYzrsu7ZFlR+3MZ0EmWNuRuMD21ib9wUp29X7eGp w=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="335112133" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-33001.sea14.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:28:16 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.38.20:52854] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.19.88:2525] with esmtp (Farcaster) id 95ab79b5-0f38-4a52-be65-3102bbb885a6; Mon, 25 Mar 2024 20:28:15 +0000 (UTC) X-Farcaster-Flow-ID: 95ab79b5-0f38-4a52-be65-3102bbb885a6 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:28:15 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:28:12 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 09/15] af_unix: Save O(n) setup of Tarjan's algo. Date: Mon, 25 Mar 2024 13:24:19 -0700 Message-ID: <20240325202425.60930-10-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D032UWB004.ant.amazon.com (10.13.139.136) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Before starting Tarjan's algorithm, we need to mark all vertices as unvisited. We can save this O(n) setup by reserving two special indices (0, 1) and using two variables. The first time we link a vertex to unix_unvisited_vertices, we set unix_vertex_unvisited_index to index. During DFS, we can see that the index of unvisited vertices is the same as unix_vertex_unvisited_index. When we finalise SCC later, we set unix_vertex_grouped_index to each vertex's index. Then, we can know (i) that the vertex is on the stack if the index of a visited vertex is >= 2 and (ii) that it is not on the stack and belongs to a different SCC if the index is unix_vertex_grouped_index. After the whole algorithm, all indices of vertices are set as unix_vertex_grouped_index. Next time we start DFS, we know that all unvisited vertices have unix_vertex_grouped_index, and we can use unix_vertex_unvisited_index as the not-on-stack marker. To use the same variable in __unix_walk_scc(), we can swap unix_vertex_(grouped|unvisited)_index at the end of Tarjan's algorithm. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 - net/unix/garbage.c | 26 +++++++++++++++----------- 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 414463803b7e..ec040caaa4b5 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -37,7 +37,6 @@ struct unix_vertex { unsigned long out_degree; unsigned long index; unsigned long lowlink; - bool on_stack; }; struct unix_edge { diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 8d0912c1d01a..cec6189126ba 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -115,16 +115,20 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { - UNIX_VERTEX_INDEX_UNVISITED, + UNIX_VERTEX_INDEX_MARK1, + UNIX_VERTEX_INDEX_MARK2, UNIX_VERTEX_INDEX_START, }; +static unsigned long unix_vertex_unvisited_index = UNIX_VERTEX_INDEX_MARK1; + static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; if (!vertex) { vertex = list_first_entry(&fpl->vertices, typeof(*vertex), entry); + vertex->index = unix_vertex_unvisited_index; vertex->out_degree = 0; INIT_LIST_HEAD(&vertex->edges); @@ -265,6 +269,7 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) } static LIST_HEAD(unix_visited_vertices); +static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; static void __unix_walk_scc(struct unix_vertex *vertex) { @@ -274,10 +279,10 @@ static void __unix_walk_scc(struct unix_vertex *vertex) LIST_HEAD(edge_stack); next_vertex: - /* Push vertex to vertex_stack. + /* Push vertex to vertex_stack and mark it as on-stack + * (index >= UNIX_VERTEX_INDEX_START). * The vertex will be popped when finalising SCC later. */ - vertex->on_stack = true; list_add(&vertex->scc_entry, &vertex_stack); vertex->index = index; @@ -291,7 +296,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) if (!next_vertex) continue; - if (next_vertex->index == UNIX_VERTEX_INDEX_UNVISITED) { + if (next_vertex->index == unix_vertex_unvisited_index) { /* Iterative deepening depth first search * * 1. Push a forward edge to edge_stack and set @@ -317,7 +322,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) * to skip SCC finalisation. */ vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); - } else if (next_vertex->on_stack) { + } else if (next_vertex->index != unix_vertex_grouped_index) { /* Loop detected by a back/cross edge. * * The successor is on vertex_stack, so two vertices are @@ -344,7 +349,8 @@ static void __unix_walk_scc(struct unix_vertex *vertex) /* Don't restart DFS from this vertex in unix_walk_scc(). */ list_move_tail(&vertex->entry, &unix_visited_vertices); - vertex->on_stack = false; + /* Mark vertex as off-stack. */ + vertex->index = unix_vertex_grouped_index; } list_del(&scc); @@ -357,20 +363,18 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { - struct unix_vertex *vertex; - - list_for_each_entry(vertex, &unix_unvisited_vertices, entry) - vertex->index = UNIX_VERTEX_INDEX_UNVISITED; - /* Visit every vertex exactly once. * __unix_walk_scc() moves visited vertices to unix_visited_vertices. */ while (!list_empty(&unix_unvisited_vertices)) { + struct unix_vertex *vertex; + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); __unix_walk_scc(vertex); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); + swap(unix_vertex_unvisited_index, unix_vertex_grouped_index); } static LIST_HEAD(gc_candidates); From patchwork Mon Mar 25 20:24:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602890 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com [207.171.184.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 586BE45037 for ; Mon, 25 Mar 2024 20:29:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.184.29 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398542; cv=none; b=YvOxvXWVQ8WOsTcP5GkFOWx7ivzBaJ1e8ro/Z3HhitBoqmo1yGN9cyhQIudVv/n8QCX0Mu9ZFtCjzA1w6VZ7gj58GPPUvG/u+BbqBB2CEfzIBh44fMYyZsWlcn+sS1e/z1QAxIe+EQ3H4mfCpLmmjd1EVVP/K2h3uHfnEOt3H+k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398542; c=relaxed/simple; bh=oHc5lY+1upZyboSRysdLhLg6IEQJYdqe+Zf0MKp/Uo8=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=hVWZn5oTj1JjXKDdyoMS9w1Vh7GwOLlndyibzowk4anL6liCghTSAE1mJxm0AuE7jC0Oe9Md7Eh1njb3LhGEeQG02tyVYiS0fAWD0idv/xtks3BNQu9aEAn4ipr3ORkwzmWVZQAeXB+ar0bMO+7Ioor54yXiH9DggxBabMWD9sM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=HrN3rUAY; arc=none smtp.client-ip=207.171.184.29 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="HrN3rUAY" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398541; x=1742934541; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QnSXI56Eclyl3I2hheFsjL3bUFOtceUbzamdSwVdlqs=; b=HrN3rUAYc4VnrfYfHirZDM5gcsj+adM2C4zdW2XRAzZyLoHqwtyvo47D 4LAUps6i+mpcvXAeQdWDIxU/fzUaZT+0BMTxZ9deMfCt6WzsEvpVXKekK 0wDmml/YfS0BrITljIyIg9vZiYGzumMYycXp0brJLNGGB498RrhiIbJuw 8=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="406641177" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:28:43 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:9185] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.51.164:2525] with esmtp (Farcaster) id c1fa2a4b-452f-4547-a356-32735a71d3c6; Mon, 25 Mar 2024 20:28:42 +0000 (UTC) X-Farcaster-Flow-ID: c1fa2a4b-452f-4547-a356-32735a71d3c6 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:28:39 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:28:37 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 10/15] af_unix: Skip GC if no cycle exists. Date: Mon, 25 Mar 2024 13:24:20 -0700 Message-ID: <20240325202425.60930-11-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D043UWC004.ant.amazon.com (10.13.139.206) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org We do not need to run GC if there is no possible cyclic reference. We use unix_graph_maybe_cyclic to decide if we should run GC. If a fd of an AF_UNIX socket is passed to an already inflight AF_UNIX socket, they could form a cyclic reference. Then, we set true to unix_graph_maybe_cyclic and later run Tarjan's algorithm to group them into SCC. Once we run Tarjan's algorithm, we are 100% sure whether cyclic references exist or not. If there is no cycle, we set false to unix_graph_maybe_cyclic and can skip the entire garbage collection next time. When finalising SCC, we set true to unix_graph_maybe_cyclic if SCC consists of multiple vertices. Even if SCC is a single vertex, a cycle might exist as self-fd passing. Given the corner case is rare, we detect it by checking all edges of the vertex and set true to unix_graph_maybe_cyclic. With this change, __unix_gc() is just a spin_lock() dance in the normal usage. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 48 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index cec6189126ba..5a1fae78d6dc 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -112,6 +112,19 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) return edge->successor->vertex; } +static bool unix_graph_maybe_cyclic; + +static void unix_update_graph(struct unix_vertex *vertex) +{ + /* If the receiver socket is not inflight, no cyclic + * reference could be formed. + */ + if (!vertex) + return; + + unix_graph_maybe_cyclic = true; +} + static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { @@ -138,12 +151,16 @@ static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) vertex->out_degree++; list_add_tail(&edge->vertex_entry, &vertex->edges); + + unix_update_graph(unix_edge_successor(edge)); } static void unix_del_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; + unix_update_graph(unix_edge_successor(edge)); + list_del(&edge->vertex_entry); vertex->out_degree--; @@ -227,6 +244,7 @@ void unix_del_edges(struct scm_fp_list *fpl) void unix_update_edges(struct unix_sock *receiver) { spin_lock(&unix_gc_lock); + unix_update_graph(unix_sk(receiver->listener)->vertex); receiver->listener = NULL; spin_unlock(&unix_gc_lock); } @@ -268,6 +286,26 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static bool unix_scc_cyclic(struct list_head *scc) +{ + struct unix_vertex *vertex; + struct unix_edge *edge; + + /* SCC containing multiple vertices ? */ + if (!list_is_singular(scc)) + return true; + + vertex = list_first_entry(scc, typeof(*vertex), scc_entry); + + /* Self-reference or a embryo-listener circle ? */ + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + if (unix_edge_successor(edge) == vertex) + return true; + } + + return false; +} + static LIST_HEAD(unix_visited_vertices); static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; @@ -353,6 +391,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex) vertex->index = unix_vertex_grouped_index; } + if (!unix_graph_maybe_cyclic) + unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); + list_del(&scc); } @@ -363,6 +404,8 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { + unix_graph_maybe_cyclic = false; + /* Visit every vertex exactly once. * __unix_walk_scc() moves visited vertices to unix_visited_vertices. */ @@ -524,6 +567,9 @@ static void __unix_gc(struct work_struct *work) spin_lock(&unix_gc_lock); + if (!unix_graph_maybe_cyclic) + goto skip_gc; + unix_walk_scc(); /* First, select candidates for garbage collection. Only @@ -617,7 +663,7 @@ static void __unix_gc(struct work_struct *work) /* All candidates should have been detached by now. */ WARN_ON_ONCE(!list_empty(&gc_candidates)); - +skip_gc: /* Paired with READ_ONCE() in wait_for_unix_gc(). */ WRITE_ONCE(gc_in_progress, false); From patchwork Mon Mar 25 20:24:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602891 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52005.amazon.com (smtp-fw-52005.amazon.com [52.119.213.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 30C1745972 for ; Mon, 25 Mar 2024 20:29:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398551; cv=none; b=q8rDbd1HF/ai4Sw0UAdVgX6lGNWNmsd15RW+dPRJyrNK3RCbTSJtCrDa7IdtJG9paF0ccDSvFgp3hYyPrv/tVMyQeRY0Ofa46w8qZcl0C8KEkeb5SZ80+Iq38OtGYO2TZwVMA/6hpTLl9urgbTecoxHZPmlH1X/RPMxwvDqGZ6o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398551; c=relaxed/simple; bh=4RU+h+5donowN8bS3OtgmVLABK8LqXyfrHQiX4dXwJU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cKB8SluHkAsJ38/JnGdgPCHSHPIvYD1z1Kn/wuXvEeKXXeMk+WxOK/FKTrnH3FvHPQuH3yYIzH3msd6Pz9u0U7h5B6tLll1uIbbUD3fkp3yRAowH//8MXuwO89gp76LzkhFIXjXV2N0jqTBa4NyYDct0lylC3mZYXcvXuBkepC0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=XNTVQEcu; arc=none smtp.client-ip=52.119.213.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="XNTVQEcu" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398548; x=1742934548; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=lJOQdYNUXyb8txp+ZSSVJSROSCRjARTHKFWayP/P3Vo=; b=XNTVQEcuC0eCVO77XvlOK5QXEAQW+T74e5iqzY800+UEv4LoODMzadlw E4yhroDWmAvcl3f0Xp8CO1VBrD2LkA8p6f3SaET7wQ3TKqKyiYxmXAYCN p7hj9C/sMUpulZwPiprMfagfRAEPUqKkk9bU5DO/pDJFVEe+QylKCg7RM A=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="643489446" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52005.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:29:05 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.7.35:10737] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.19.88:2525] with esmtp (Farcaster) id c37b3c0f-67ce-42a3-9dfc-6c80d9df32f9; Mon, 25 Mar 2024 20:29:04 +0000 (UTC) X-Farcaster-Flow-ID: c37b3c0f-67ce-42a3-9dfc-6c80d9df32f9 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:03 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:01 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 11/15] af_unix: Avoid Tarjan's algorithm if unnecessary. Date: Mon, 25 Mar 2024 13:24:21 -0700 Message-ID: <20240325202425.60930-12-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D035UWA004.ant.amazon.com (10.13.139.109) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org Once a cyclic reference is formed, we need to run GC to check if there is dead SCC. However, we do not need to run Tarjan's algorithm if we know that the shape of the inflight graph has not been changed. If an edge is added/updated/deleted and the edge's successor is inflight, we set false to unix_graph_grouped, which means we need to re-classify SCC. Once we finalise SCC, we set true to unix_graph_grouped. While unix_graph_grouped is true, we can iterate the grouped SCC using vertex->scc_entry in unix_walk_scc_fast(). list_add() and list_for_each_entry_reverse() uses seem weird, but they are to keep the vertex order consistent and make writing test easier. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 5a1fae78d6dc..654aa8e30a8b 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -113,6 +113,7 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) } static bool unix_graph_maybe_cyclic; +static bool unix_graph_grouped; static void unix_update_graph(struct unix_vertex *vertex) { @@ -123,6 +124,7 @@ static void unix_update_graph(struct unix_vertex *vertex) return; unix_graph_maybe_cyclic = true; + unix_graph_grouped = false; } static LIST_HEAD(unix_unvisited_vertices); @@ -144,6 +146,7 @@ static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) vertex->index = unix_vertex_unvisited_index; vertex->out_degree = 0; INIT_LIST_HEAD(&vertex->edges); + INIT_LIST_HEAD(&vertex->scc_entry); list_move_tail(&vertex->entry, &unix_unvisited_vertices); edge->predecessor->vertex = vertex; @@ -418,6 +421,26 @@ static void unix_walk_scc(void) list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); swap(unix_vertex_unvisited_index, unix_vertex_grouped_index); + + unix_graph_grouped = true; +} + +static void unix_walk_scc_fast(void) +{ + while (!list_empty(&unix_unvisited_vertices)) { + struct unix_vertex *vertex; + struct list_head scc; + + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); + list_add(&scc, &vertex->scc_entry); + + list_for_each_entry_reverse(vertex, &scc, scc_entry) + list_move_tail(&vertex->entry, &unix_visited_vertices); + + list_del(&scc); + } + + list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); } static LIST_HEAD(gc_candidates); @@ -570,7 +593,10 @@ static void __unix_gc(struct work_struct *work) if (!unix_graph_maybe_cyclic) goto skip_gc; - unix_walk_scc(); + if (unix_graph_grouped) + unix_walk_scc_fast(); + else + unix_walk_scc(); /* First, select candidates for garbage collection. Only * in-flight sockets are considered, and from those only ones From patchwork Mon Mar 25 20:24:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602892 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52003.amazon.com (smtp-fw-52003.amazon.com [52.119.213.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9114645972 for ; Mon, 25 Mar 2024 20:29:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398573; cv=none; b=azWbv+wVAHTfdtHWEdmkpkmVrmmI9i07oMRn47ewARnnJkuJxBwEcBpISReIIlaH43dLytMNAH6sndEVI3CnxtPrmGLaxhwtE7jiNzBADGcVRZrP7Fgk0cpjkG+SFZW343XZokGTMKjkOPBLtjBvaMwlGCPFUaBBjV9/GSYNuBI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398573; c=relaxed/simple; bh=w0Vrdmdimm8UuYquSrqMFzVpPdChLb0CELUeZejvTuU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=O9w751U/czcke/uYMO22IJ5evTZihZHEWYdWw5GiPRVCezRLsJdjyomPI0lK1zbqa4xoeuArJQVbfEHKSRJ1mK9nqLrvQk/PMHHrXLBeCe77JLoQYzj+3D4QHwv5oRqMgdD/Z/OdaZzb3t6EFPl0hV7LYahCSr9VxSfYODxhmGg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=mEn1toaG; arc=none smtp.client-ip=52.119.213.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="mEn1toaG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398571; x=1742934571; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Lhheo73g5CW3+CtsXsEdNPrtaQcTOOYkkOFc6juVERk=; b=mEn1toaGPNQknfvQ1dPSPlgeJ1G4lTMP8EMjvTWyzHlnNrwRr0OGqDgx ysToLpf70RNIHGxxeeMh9Tol8u03TNmCZ8D567BlqAWUS7tKRZdGGHW8m DkkJSa/ryURySkesvdC85oIIOJs87yzdDcQr85p4Ltw3TcgsKDEMgv9Lo M=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="647498465" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52003.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:29:29 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.7.35:43667] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.58.246:2525] with esmtp (Farcaster) id 28f6a9a3-20fb-4d07-b4c4-7d2cd473f1be; Mon, 25 Mar 2024 20:29:28 +0000 (UTC) X-Farcaster-Flow-ID: 28f6a9a3-20fb-4d07-b4c4-7d2cd473f1be Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:27 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:25 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 12/15] af_unix: Assign a unique index to SCC. Date: Mon, 25 Mar 2024 13:24:22 -0700 Message-ID: <20240325202425.60930-13-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D037UWB003.ant.amazon.com (10.13.138.115) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org The definition of the lowlink in Tarjan's algorithm is the smallest index of a vertex that is reachable with at most one back-edge in SCC. This is not useful for a cross-edge. If we start traversing from A in the following graph, the final lowlink of D is 3. The cross-edge here is one between D and C. A -> B -> D D = (4, 3) (index, lowlink) ^ | | C = (3, 1) | V | B = (2, 1) `--- C <--' A = (1, 1) This is because the lowlink of D is updated with the index of C. In the following patch, we detect a dead SCC by checking two conditions for each vertex. 1) vertex has no edge directed to another SCC (no bridge) 2) vertex's out_degree is the same as the refcount of its file If 1) is false, there is a receiver of all fds of the SCC and its ancestor SCC. To evaluate 1), we need to assign a unique index to each SCC and assign it to all vertices in the SCC. This patch changes the lowlink update logic for cross-edge so that in the example above, the lowlink of D is updated with the lowlink of C. A -> B -> D D = (4, 1) (index, lowlink) ^ | | C = (3, 1) | V | B = (2, 1) `--- C <--' A = (1, 1) Then, all vertices in the same SCC have the same lowlink, and we can quickly find the bridge connecting to different SCC if exists. However, it is no longer called lowlink, so we rename it to scc_index. (It's sometimes called lowpoint.) Also, we add a global variable to hold the last index used in DFS so that we do not reset the initial index in each DFS. This patch can be squashed to the SCC detection patch but is split deliberately for anyone wondering why lowlink is not used as used in the original Tarjan's algorithm and many reference implementations. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 +- net/unix/garbage.c | 29 +++++++++++++++-------------- 2 files changed, 16 insertions(+), 15 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index ec040caaa4b5..696d997a5ac9 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -36,7 +36,7 @@ struct unix_vertex { struct list_head scc_entry; unsigned long out_degree; unsigned long index; - unsigned long lowlink; + unsigned long scc_index; }; struct unix_edge { diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 654aa8e30a8b..3f59cee3ccbc 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -312,9 +312,8 @@ static bool unix_scc_cyclic(struct list_head *scc) static LIST_HEAD(unix_visited_vertices); static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; -static void __unix_walk_scc(struct unix_vertex *vertex) +static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_index) { - unsigned long index = UNIX_VERTEX_INDEX_START; LIST_HEAD(vertex_stack); struct unix_edge *edge; LIST_HEAD(edge_stack); @@ -326,9 +325,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex) */ list_add(&vertex->scc_entry, &vertex_stack); - vertex->index = index; - vertex->lowlink = index; - index++; + vertex->index = *last_index; + vertex->scc_index = *last_index; + (*last_index)++; /* Explore neighbour vertices (receivers of the current vertex's fd). */ list_for_each_entry(edge, &vertex->edges, vertex_entry) { @@ -358,30 +357,30 @@ static void __unix_walk_scc(struct unix_vertex *vertex) next_vertex = vertex; vertex = edge->predecessor->vertex; - /* If the successor has a smaller lowlink, two vertices - * are in the same SCC, so propagate the smaller lowlink + /* If the successor has a smaller scc_index, two vertices + * are in the same SCC, so propagate the smaller scc_index * to skip SCC finalisation. */ - vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); + vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index); } else if (next_vertex->index != unix_vertex_grouped_index) { /* Loop detected by a back/cross edge. * - * The successor is on vertex_stack, so two vertices are - * in the same SCC. If the successor has a smaller index, + * The successor is on vertex_stack, so two vertices are in + * the same SCC. If the successor has a smaller *scc_index*, * propagate it to skip SCC finalisation. */ - vertex->lowlink = min(vertex->lowlink, next_vertex->index); + vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index); } else { /* The successor was already grouped as another SCC */ } } - if (vertex->index == vertex->lowlink) { + if (vertex->index == vertex->scc_index) { struct list_head scc; /* SCC finalised. * - * If the lowlink was not updated, all the vertices above on + * If the scc_index was not updated, all the vertices above on * vertex_stack are in the same SCC. Group them using scc_entry. */ __list_cut_position(&scc, &vertex_stack, &vertex->scc_entry); @@ -407,6 +406,8 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { + unsigned long last_index = UNIX_VERTEX_INDEX_START; + unix_graph_maybe_cyclic = false; /* Visit every vertex exactly once. @@ -416,7 +417,7 @@ static void unix_walk_scc(void) struct unix_vertex *vertex; vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); - __unix_walk_scc(vertex); + __unix_walk_scc(vertex, &last_index); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); From patchwork Mon Mar 25 20:24:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602893 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52005.amazon.com (smtp-fw-52005.amazon.com [52.119.213.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 618A51CA9C for ; Mon, 25 Mar 2024 20:29:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398596; cv=none; b=gEU+FfnWtQHoVtZr527KcqAYGX2y7zJKAe5KiFK1CpYo+YX4JphllxEWN3NlWrNNm0XrRsOZtxsictWuPvG6KBG8aneES6oTj0tfyUCosSw504OAk3Yp4Fhd78EtQmjnFLdhfURO21yISyq18DJslaCV5Ds+d7Yk2yVolYnsTjo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398596; c=relaxed/simple; bh=iyLbj9cLUkJQWzK6FVbQudD3bJV+J4MED1IxXdUb8I0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=e23xbqy6DhMZSd5VN0UVlkQHkyxVhf2G8/KeCRFgmTPj63oqv7tPBTcGDXp0POMw4vet4iQfIAaO0asjWoRAH0MxsVB4NBBhim9C3ofoi7QAY1t1vfbh5SVy3OD+9auAImymF9wEEG9yhX8f/SC6ycK8WtfGqKWLzQjE95Toz6Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=wG7u8ZF1; arc=none smtp.client-ip=52.119.213.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="wG7u8ZF1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398595; x=1742934595; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=t330FiCJhqSQFxRYxWgnuJiNxej6KLHAyCk43tk+cjc=; b=wG7u8ZF1K94l2zUHVygp5sGvbUB33bIKceSJ6koe0x7gk4dyrN0Zk0YH jyec57xAuVsedGeNLX8CYELRVvTOnaNYv2ZaCraFwkIycHDgHcbK2P1eo PBmHhenYDbn1oNMNNPAO7YDSl/opd7WlIA9OzH7ov3a1z4pElePdCEjV+ E=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="643489636" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52005.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:29:53 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.38.20:50058] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.5.115:2525] with esmtp (Farcaster) id c747e186-927a-4f00-a663-4e0fe5be824b; Mon, 25 Mar 2024 20:29:52 +0000 (UTC) X-Farcaster-Flow-ID: c747e186-927a-4f00-a663-4e0fe5be824b Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:52 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:29:49 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 13/15] af_unix: Detect dead SCC. Date: Mon, 25 Mar 2024 13:24:23 -0700 Message-ID: <20240325202425.60930-14-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D032UWA003.ant.amazon.com (10.13.139.37) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org When iterating SCC, we call unix_vertex_dead() for each vertex to check if the vertex is close()d and has no bridge to another SCC. If both conditions are true for every vertex in SCC, we can execute garbage collection for all skb in the SCC. The actual garbage collection is done in the following patch, replacing the old implementation. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 44 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 3f59cee3ccbc..0a6b38da578c 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -289,6 +289,39 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static bool unix_vertex_dead(struct unix_vertex *vertex) +{ + struct unix_edge *edge; + struct unix_sock *u; + long total_ref; + + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + struct unix_vertex *next_vertex = unix_edge_successor(edge); + + /* The vertex's fd can be received by a non-inflight socket. */ + if (!next_vertex) + return false; + + /* The vertex's fd can be received by an inflight socket in + * another SCC. + */ + if (next_vertex->scc_index != vertex->scc_index) + return false; + } + + /* No receiver exists out of the same SCC. */ + + edge = list_first_entry(&vertex->edges, typeof(*edge), vertex_entry); + u = edge->predecessor; + total_ref = file_count(u->sk.sk_socket->file); + + /* If not close()d, total_ref > out_degree. */ + if (total_ref != vertex->out_degree) + return false; + + return true; +} + static bool unix_scc_cyclic(struct list_head *scc) { struct unix_vertex *vertex; @@ -377,6 +410,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde if (vertex->index == vertex->scc_index) { struct list_head scc; + bool scc_dead = true; /* SCC finalised. * @@ -391,6 +425,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde /* Mark vertex as off-stack. */ vertex->index = unix_vertex_grouped_index; + + if (scc_dead) + scc_dead = unix_vertex_dead(vertex); } if (!unix_graph_maybe_cyclic) @@ -431,13 +468,18 @@ static void unix_walk_scc_fast(void) while (!list_empty(&unix_unvisited_vertices)) { struct unix_vertex *vertex; struct list_head scc; + bool scc_dead = true; vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); list_add(&scc, &vertex->scc_entry); - list_for_each_entry_reverse(vertex, &scc, scc_entry) + list_for_each_entry_reverse(vertex, &scc, scc_entry) { list_move_tail(&vertex->entry, &unix_visited_vertices); + if (scc_dead) + scc_dead = unix_vertex_dead(vertex); + } + list_del(&scc); } From patchwork Mon Mar 25 20:24:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602894 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52004.amazon.com (smtp-fw-52004.amazon.com [52.119.213.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 702BF12B87 for ; Mon, 25 Mar 2024 20:30:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398622; cv=none; b=iqNPA5IkC9nPx2ASIW2Ec8wVS3vlnSsJ3d7sNhuhP4CDkeBvbg0PFEEdvrfnclerPJPUFfDv+j5UCQkO1qx05iNddwYfJ15JsIAld0JojYJb2BeiQ5IIbRpsRlI6GTYiv5MfQ2JAD6Dktp1aq+FppQxNbPCYCt4CpluV/kFaSEs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398622; c=relaxed/simple; bh=SmVsW/8w8U8lz79iK/8D4EdYvZSRaMxqJLCsp75zjoU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=J5CS7bjLojqUV2AJvT1rYZcmEqULpgaX5U6WcvwIPtPH/HNRmYPJmR1XD4TfriO6Z/8NNaTnoha2EjVHMBFtYZO+VF3YCGMxp6xR/gWIdaIwJLxh95yOZ2GQN+dNRlkKGRkkfNimuKBtztadDNQYMMUct0JhuAJ22qj1gU0yurU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=v3CtLKWe; arc=none smtp.client-ip=52.119.213.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="v3CtLKWe" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398621; x=1742934621; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5FQSQC4F5dYzrZG+44maqHNSw3qhe6NgmSvkvTbI0s0=; b=v3CtLKWe6WLvTPsVa2aQdeNQXqFNmiYg6HIUAPIrWCefsSgEnXyn9RV3 qur9femlOSf/LYDbyn9QqhEGD+pWQYObA78JvygRwPrzZsCGX2ETrPZk8 Kxt6lEruJjFXv8AmyHK5A4jhNjpyTgoI0ho1gdpO9mBWJzfs4+jWUiRKp I=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="194255746" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:30:18 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.21.151:6692] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.19.88:2525] with esmtp (Farcaster) id 23112c7d-77c3-40cf-94d8-47b1b6813e09; Mon, 25 Mar 2024 20:30:16 +0000 (UTC) X-Farcaster-Flow-ID: 23112c7d-77c3-40cf-94d8-47b1b6813e09 Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:30:16 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:30:13 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 14/15] af_unix: Replace garbage collection algorithm. Date: Mon, 25 Mar 2024 13:24:24 -0700 Message-ID: <20240325202425.60930-15-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWA001.ant.amazon.com (10.13.139.92) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org If we find a dead SCC during iteration, we call unix_collect_skb() to splice all skb in the SCC to the global sk_buff_head, hitlist. After iterating all SCC, we unlock unix_gc_lock and purge the queue. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 8 -- net/unix/af_unix.c | 12 -- net/unix/garbage.c | 302 +++++++++--------------------------------- 3 files changed, 64 insertions(+), 258 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 696d997a5ac9..226a8da2cbe3 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -19,9 +19,6 @@ static inline struct unix_sock *unix_get_socket(struct file *filp) extern spinlock_t unix_gc_lock; extern unsigned int unix_tot_inflight; - -void unix_inflight(struct user_struct *user, struct file *fp); -void unix_notinflight(struct user_struct *user, struct file *fp); void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); void unix_del_edges(struct scm_fp_list *fpl); void unix_update_edges(struct unix_sock *receiver); @@ -85,12 +82,7 @@ struct unix_sock { struct sock *peer; struct sock *listener; struct unix_vertex *vertex; - struct list_head link; - unsigned long inflight; spinlock_t lock; - unsigned long gc_flags; -#define UNIX_GC_CANDIDATE 0 -#define UNIX_GC_MAYBE_CYCLE 1 struct socket_wq peer_wq; wait_queue_entry_t peer_wake; struct scm_stat scm_stat; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ae77e2dc0dae..27ca50ab1cd1 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -980,12 +980,10 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->listener = NULL; - u->inflight = 0; u->vertex = NULL; u->path.dentry = NULL; u->path.mnt = NULL; spin_lock_init(&u->lock); - INIT_LIST_HEAD(&u->link); mutex_init(&u->iolock); /* single task reading lock */ mutex_init(&u->bindlock); /* single task binding lock */ init_waitqueue_head(&u->peer_wait); @@ -1793,8 +1791,6 @@ static inline bool too_many_unix_fds(struct task_struct *p) static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) { - int i; - if (too_many_unix_fds(current)) return -ETOOMANYREFS; @@ -1806,9 +1802,6 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) if (!UNIXCB(skb).fp) return -ENOMEM; - for (i = scm->fp->count - 1; i >= 0; i--) - unix_inflight(scm->fp->user, scm->fp->fp[i]); - if (unix_prepare_fpl(UNIXCB(skb).fp)) return -ENOMEM; @@ -1817,15 +1810,10 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) { - int i; - scm->fp = UNIXCB(skb).fp; UNIXCB(skb).fp = NULL; unix_destroy_fpl(scm->fp); - - for (i = scm->fp->count - 1; i >= 0; i--) - unix_notinflight(scm->fp->user, scm->fp->fp[i]); } static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 0a6b38da578c..89ea71d9297b 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -322,6 +322,52 @@ static bool unix_vertex_dead(struct unix_vertex *vertex) return true; } +enum unix_recv_queue_lock_class { + U_RECVQ_LOCK_NORMAL, + U_RECVQ_LOCK_EMBRYO, +}; + +static void unix_collect_skb(struct list_head *scc, struct sk_buff_head *hitlist) +{ + struct unix_vertex *vertex; + + list_for_each_entry_reverse(vertex, scc, scc_entry) { + struct sk_buff_head *queue; + struct unix_edge *edge; + struct unix_sock *u; + + edge = list_first_entry(&vertex->edges, typeof(*edge), vertex_entry); + u = edge->predecessor; + queue = &u->sk.sk_receive_queue; + + spin_lock(&queue->lock); + + if (u->sk.sk_state == TCP_LISTEN) { + struct sk_buff *skb; + + skb_queue_walk(queue, skb) { + struct sk_buff_head *embryo_queue = &skb->sk->sk_receive_queue; + + /* listener -> embryo order, the inversion never happens. */ + spin_lock_nested(&embryo_queue->lock, U_RECVQ_LOCK_EMBRYO); + skb_queue_splice_init(embryo_queue, hitlist); + spin_unlock(&embryo_queue->lock); + } + } else { + skb_queue_splice_init(queue, hitlist); + +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) + if (u->oob_skb) { + kfree_skb(u->oob_skb); + u->oob_skb = NULL; + } +#endif + } + + spin_unlock(&queue->lock); + } +} + static bool unix_scc_cyclic(struct list_head *scc) { struct unix_vertex *vertex; @@ -345,7 +391,8 @@ static bool unix_scc_cyclic(struct list_head *scc) static LIST_HEAD(unix_visited_vertices); static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; -static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_index) +static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_index, + struct sk_buff_head *hitlist) { LIST_HEAD(vertex_stack); struct unix_edge *edge; @@ -430,7 +477,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde scc_dead = unix_vertex_dead(vertex); } - if (!unix_graph_maybe_cyclic) + if (scc_dead) + unix_collect_skb(&scc, hitlist); + else if (!unix_graph_maybe_cyclic) unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); list_del(&scc); @@ -441,7 +490,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde goto prev_vertex; } -static void unix_walk_scc(void) +static void unix_walk_scc(struct sk_buff_head *hitlist) { unsigned long last_index = UNIX_VERTEX_INDEX_START; @@ -454,7 +503,7 @@ static void unix_walk_scc(void) struct unix_vertex *vertex; vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); - __unix_walk_scc(vertex, &last_index); + __unix_walk_scc(vertex, &last_index, hitlist); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); @@ -463,7 +512,7 @@ static void unix_walk_scc(void) unix_graph_grouped = true; } -static void unix_walk_scc_fast(void) +static void unix_walk_scc_fast(struct sk_buff_head *hitlist) { while (!list_empty(&unix_unvisited_vertices)) { struct unix_vertex *vertex; @@ -480,263 +529,40 @@ static void unix_walk_scc_fast(void) scc_dead = unix_vertex_dead(vertex); } + if (scc_dead) + unix_collect_skb(&scc, hitlist); + list_del(&scc); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); } -static LIST_HEAD(gc_candidates); -static LIST_HEAD(gc_inflight_list); - -/* Keep the number of times in flight count for the file - * descriptor if it is for an AF_UNIX socket. - */ -void unix_inflight(struct user_struct *user, struct file *filp) -{ - struct unix_sock *u = unix_get_socket(filp); - - spin_lock(&unix_gc_lock); - - if (u) { - if (!u->inflight) { - WARN_ON_ONCE(!list_empty(&u->link)); - list_add_tail(&u->link, &gc_inflight_list); - } else { - WARN_ON_ONCE(list_empty(&u->link)); - } - u->inflight++; - } - - spin_unlock(&unix_gc_lock); -} - -void unix_notinflight(struct user_struct *user, struct file *filp) -{ - struct unix_sock *u = unix_get_socket(filp); - - spin_lock(&unix_gc_lock); - - if (u) { - WARN_ON_ONCE(!u->inflight); - WARN_ON_ONCE(list_empty(&u->link)); - - u->inflight--; - if (!u->inflight) - list_del_init(&u->link); - } - - spin_unlock(&unix_gc_lock); -} - -static void scan_inflight(struct sock *x, void (*func)(struct unix_sock *), - struct sk_buff_head *hitlist) -{ - struct sk_buff *skb; - struct sk_buff *next; - - spin_lock(&x->sk_receive_queue.lock); - skb_queue_walk_safe(&x->sk_receive_queue, skb, next) { - /* Do we have file descriptors ? */ - if (UNIXCB(skb).fp) { - bool hit = false; - /* Process the descriptors of this socket */ - int nfd = UNIXCB(skb).fp->count; - struct file **fp = UNIXCB(skb).fp->fp; - - while (nfd--) { - /* Get the socket the fd matches if it indeed does so */ - struct unix_sock *u = unix_get_socket(*fp++); - - /* Ignore non-candidates, they could have been added - * to the queues after starting the garbage collection - */ - if (u && test_bit(UNIX_GC_CANDIDATE, &u->gc_flags)) { - hit = true; - - func(u); - } - } - if (hit && hitlist != NULL) { - __skb_unlink(skb, &x->sk_receive_queue); - __skb_queue_tail(hitlist, skb); - } - } - } - spin_unlock(&x->sk_receive_queue.lock); -} - -static void scan_children(struct sock *x, void (*func)(struct unix_sock *), - struct sk_buff_head *hitlist) -{ - if (x->sk_state != TCP_LISTEN) { - scan_inflight(x, func, hitlist); - } else { - struct sk_buff *skb; - struct sk_buff *next; - struct unix_sock *u; - LIST_HEAD(embryos); - - /* For a listening socket collect the queued embryos - * and perform a scan on them as well. - */ - spin_lock(&x->sk_receive_queue.lock); - skb_queue_walk_safe(&x->sk_receive_queue, skb, next) { - u = unix_sk(skb->sk); - - /* An embryo cannot be in-flight, so it's safe - * to use the list link. - */ - WARN_ON_ONCE(!list_empty(&u->link)); - list_add_tail(&u->link, &embryos); - } - spin_unlock(&x->sk_receive_queue.lock); - - while (!list_empty(&embryos)) { - u = list_entry(embryos.next, struct unix_sock, link); - scan_inflight(&u->sk, func, hitlist); - list_del_init(&u->link); - } - } -} - -static void dec_inflight(struct unix_sock *usk) -{ - usk->inflight--; -} - -static void inc_inflight(struct unix_sock *usk) -{ - usk->inflight++; -} - -static void inc_inflight_move_tail(struct unix_sock *u) -{ - u->inflight++; - - /* If this still might be part of a cycle, move it to the end - * of the list, so that it's checked even if it was already - * passed over - */ - if (test_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags)) - list_move_tail(&u->link, &gc_candidates); -} - static bool gc_in_progress; static void __unix_gc(struct work_struct *work) { struct sk_buff_head hitlist; - struct unix_sock *u, *next; - LIST_HEAD(not_cycle_list); - struct list_head cursor; spin_lock(&unix_gc_lock); - if (!unix_graph_maybe_cyclic) + if (!unix_graph_maybe_cyclic) { + spin_unlock(&unix_gc_lock); goto skip_gc; - - if (unix_graph_grouped) - unix_walk_scc_fast(); - else - unix_walk_scc(); - - /* First, select candidates for garbage collection. Only - * in-flight sockets are considered, and from those only ones - * which don't have any external reference. - * - * Holding unix_gc_lock will protect these candidates from - * being detached, and hence from gaining an external - * reference. Since there are no possible receivers, all - * buffers currently on the candidates' queues stay there - * during the garbage collection. - * - * We also know that no new candidate can be added onto the - * receive queues. Other, non candidate sockets _can_ be - * added to queue, so we must make sure only to touch - * candidates. - */ - list_for_each_entry_safe(u, next, &gc_inflight_list, link) { - long total_refs; - - total_refs = file_count(u->sk.sk_socket->file); - - WARN_ON_ONCE(!u->inflight); - WARN_ON_ONCE(total_refs < u->inflight); - if (total_refs == u->inflight) { - list_move_tail(&u->link, &gc_candidates); - __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags); - __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); - } - } - - /* Now remove all internal in-flight reference to children of - * the candidates. - */ - list_for_each_entry(u, &gc_candidates, link) - scan_children(&u->sk, dec_inflight, NULL); - - /* Restore the references for children of all candidates, - * which have remaining references. Do this recursively, so - * only those remain, which form cyclic references. - * - * Use a "cursor" link, to make the list traversal safe, even - * though elements might be moved about. - */ - list_add(&cursor, &gc_candidates); - while (cursor.next != &gc_candidates) { - u = list_entry(cursor.next, struct unix_sock, link); - - /* Move cursor to after the current position. */ - list_move(&cursor, &u->link); - - if (u->inflight) { - list_move_tail(&u->link, ¬_cycle_list); - __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); - scan_children(&u->sk, inc_inflight_move_tail, NULL); - } } - list_del(&cursor); - /* Now gc_candidates contains only garbage. Restore original - * inflight counters for these as well, and remove the skbuffs - * which are creating the cycle(s). - */ - skb_queue_head_init(&hitlist); - list_for_each_entry(u, &gc_candidates, link) { - scan_children(&u->sk, inc_inflight, &hitlist); + __skb_queue_head_init(&hitlist); -#if IS_ENABLED(CONFIG_AF_UNIX_OOB) - if (u->oob_skb) { - kfree_skb(u->oob_skb); - u->oob_skb = NULL; - } -#endif - } - - /* not_cycle_list contains those sockets which do not make up a - * cycle. Restore these to the inflight list. - */ - while (!list_empty(¬_cycle_list)) { - u = list_entry(not_cycle_list.next, struct unix_sock, link); - __clear_bit(UNIX_GC_CANDIDATE, &u->gc_flags); - list_move_tail(&u->link, &gc_inflight_list); - } + if (unix_graph_grouped) + unix_walk_scc_fast(&hitlist); + else + unix_walk_scc(&hitlist); spin_unlock(&unix_gc_lock); - /* Here we are. Hitlist is filled. Die. */ __skb_queue_purge(&hitlist); - - spin_lock(&unix_gc_lock); - - /* All candidates should have been detached by now. */ - WARN_ON_ONCE(!list_empty(&gc_candidates)); skip_gc: - /* Paired with READ_ONCE() in wait_for_unix_gc(). */ WRITE_ONCE(gc_in_progress, false); - - spin_unlock(&unix_gc_lock); } static DECLARE_WORK(unix_gc_work, __unix_gc); From patchwork Mon Mar 25 20:24:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13602895 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52003.amazon.com (smtp-fw-52003.amazon.com [52.119.213.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 066EC28E7 for ; Mon, 25 Mar 2024 20:30:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398653; cv=none; b=ZXPcMGdageOHKIHOzkdJqpqJa5c7JIkPqj27VxhZrLbuHX68AiCeiDh+JUcaBVAlQcTW5B2wu/cU2njWWOwq4aAQNtc6Lm+8EFiDM81XNgORsOVp3U1VSi3p0SkzLdPXQgfm6DJCMGt2G8BuUpvWOyJayIzqW7J41nygQwL9vvI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711398653; c=relaxed/simple; bh=ZuRdZHLz50uFNXbgNZaAm/C6KqlpRhYt39V0cu7ic4E=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Yg3A5XZOb4ZDIqjZYP6JvK+B9bmLDDkubGvNG3Hbz0QJi2xcrvKNw+XPCCLZ80umEJT/R1eG1r2wgyxps3JmRFMowb9nVtALL9LT+xFvGDlLnbZ9VTJjS+pT7xy2FPwPBLxBP82YOehxvk8S5V3xdNj1BcZoPywbnV4/MjwnFak= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=BT6gCAhZ; arc=none smtp.client-ip=52.119.213.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="BT6gCAhZ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1711398652; x=1742934652; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Iq/zFFFJ8V14KK9QW8av9KdokGOaG8yduusDkCXibtE=; b=BT6gCAhZukmK2OZgtr+mWfilylXZy4erl+yEIwbOu8UKUq+5/Ep2nw9Z QxfHtzt9em4/hk7uPNUlzrvawRuioNj7ZcqYUdy37JNXEIiORaqzEiEoA jDoAXW2iFjRwf6LhUCmuRIDnThVV/EM54wQ0XopS8ws0mh1YDoJyf77zn I=; X-IronPort-AV: E=Sophos;i="6.07,154,1708387200"; d="scan'208";a="647498815" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52003.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 20:30:50 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.7.35:3511] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.34.239:2525] with esmtp (Farcaster) id dd3acbfe-ff28-4911-9849-47268fb2da2d; Mon, 25 Mar 2024 20:30:49 +0000 (UTC) X-Farcaster-Flow-ID: dd3acbfe-ff28-4911-9849-47268fb2da2d Received: from EX19D004ANA001.ant.amazon.com (10.37.240.138) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:30:40 +0000 Received: from 88665a182662.ant.amazon.com (10.187.171.62) by EX19D004ANA001.ant.amazon.com (10.37.240.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1258.28; Mon, 25 Mar 2024 20:30:38 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v5 net-next 15/15] selftest: af_unix: Test GC for SCM_RIGHTS. Date: Mon, 25 Mar 2024 13:24:25 -0700 Message-ID: <20240325202425.60930-16-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com> References: <20240325202425.60930-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D041UWA004.ant.amazon.com (10.13.139.9) To EX19D004ANA001.ant.amazon.com (10.37.240.138) X-Patchwork-Delegate: kuba@kernel.org This patch adds test cases to verify the new GC. We run each test for the following cases: * SOCK_DGRAM * SOCK_STREAM without embryo socket * SOCK_STREAM without embryo socket + MSG_OOB * SOCK_STREAM with embryo sockets * SOCK_STREAM with embryo sockets + MSG_OOB Before and after running each test case, we ensure that there is no AF_UNIX socket left in the netns by reading /proc/net/protocols. We cannot use /proc/net/unix and UNIX_DIAG because the embryo socket does not show up there. Each test creates multiple sockets in an array. We pass sockets in the even index using the peer sockets in the odd index. So, send_fd(0, 1) actually sends fd[0] to fd[2] via fd[0 + 1]. Test 1 : A <-> A Test 2 : A <-> B Test 3 : A -> B -> C <- D ^.___|___.' ^ `---------' Signed-off-by: Kuniyuki Iwashima --- tools/testing/selftests/net/.gitignore | 1 + tools/testing/selftests/net/af_unix/Makefile | 2 +- .../selftests/net/af_unix/scm_rights.c | 286 ++++++++++++++++++ 3 files changed, 288 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/net/af_unix/scm_rights.c diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 2f9d378edec3..d996a0ab0765 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -31,6 +31,7 @@ reuseport_dualstack rxtimestamp sctp_hello scm_pidfd +scm_rights sk_bind_sendto_listen sk_connect_zero_addr socket diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index 221c387a7d7f..3b83c797650d 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -1,4 +1,4 @@ CFLAGS += $(KHDR_INCLUDES) -TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd +TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd scm_rights include ../../lib.mk diff --git a/tools/testing/selftests/net/af_unix/scm_rights.c b/tools/testing/selftests/net/af_unix/scm_rights.c new file mode 100644 index 000000000000..bab606c9f1eb --- /dev/null +++ b/tools/testing/selftests/net/af_unix/scm_rights.c @@ -0,0 +1,286 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright Amazon.com Inc. or its affiliates. */ +#define _GNU_SOURCE +#include + +#include +#include +#include +#include +#include +#include + +#include "../../kselftest_harness.h" + +FIXTURE(scm_rights) +{ + int fd[16]; +}; + +FIXTURE_VARIANT(scm_rights) +{ + char name[16]; + int type; + int flags; + bool test_listener; +}; + +FIXTURE_VARIANT_ADD(scm_rights, dgram) +{ + .name = "UNIX ", + .type = SOCK_DGRAM, + .flags = 0, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = 0, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_oob) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = MSG_OOB, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_listener) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = 0, + .test_listener = true, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_listener_oob) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = MSG_OOB, + .test_listener = true, +}; + +static int count_sockets(struct __test_metadata *_metadata, + const FIXTURE_VARIANT(scm_rights) *variant) +{ + int sockets = -1, len, ret; + char *line = NULL; + size_t unused; + FILE *f; + + f = fopen("/proc/net/protocols", "r"); + ASSERT_NE(NULL, f); + + len = strlen(variant->name); + + while (getline(&line, &unused, f) != -1) { + int unused2; + + if (strncmp(line, variant->name, len)) + continue; + + ret = sscanf(line + len, "%d %d", &unused2, &sockets); + ASSERT_EQ(2, ret); + + break; + } + + free(line); + + ret = fclose(f); + ASSERT_EQ(0, ret); + + return sockets; +} + +FIXTURE_SETUP(scm_rights) +{ + int ret; + + ret = unshare(CLONE_NEWNET); + ASSERT_EQ(0, ret); + + ret = count_sockets(_metadata, variant); + ASSERT_EQ(0, ret); +} + +FIXTURE_TEARDOWN(scm_rights) +{ + int ret; + + sleep(1); + + ret = count_sockets(_metadata, variant); + ASSERT_EQ(0, ret); +} + +static void create_listeners(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + int n) +{ + struct sockaddr_un addr = { + .sun_family = AF_UNIX, + }; + socklen_t addrlen; + int i, ret; + + for (i = 0; i < n * 2; i += 2) { + self->fd[i] = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, self->fd[i]); + + addrlen = sizeof(addr.sun_family); + ret = bind(self->fd[i], (struct sockaddr *)&addr, addrlen); + ASSERT_EQ(0, ret); + + ret = listen(self->fd[i], -1); + ASSERT_EQ(0, ret); + + addrlen = sizeof(addr); + ret = getsockname(self->fd[i], (struct sockaddr *)&addr, &addrlen); + ASSERT_EQ(0, ret); + + self->fd[i + 1] = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, self->fd[i + 1]); + + ret = connect(self->fd[i + 1], (struct sockaddr *)&addr, addrlen); + ASSERT_EQ(0, ret); + } +} + +static void create_socketpairs(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int n) +{ + int i, ret; + + ASSERT_GE(sizeof(self->fd) / sizeof(int), n); + + for (i = 0; i < n * 2; i += 2) { + ret = socketpair(AF_UNIX, variant->type, 0, self->fd + i); + ASSERT_EQ(0, ret); + } +} + +static void __create_sockets(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int n) +{ + if (variant->test_listener) + create_listeners(_metadata, self, n); + else + create_socketpairs(_metadata, self, variant, n); +} + +static void __close_sockets(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + int n) +{ + int i, ret; + + ASSERT_GE(sizeof(self->fd) / sizeof(int), n); + + for (i = 0; i < n * 2; i++) { + ret = close(self->fd[i]); + ASSERT_EQ(0, ret); + } +} + +void __send_fd(struct __test_metadata *_metadata, + const FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int inflight, int receiver) +{ +#define MSG "nop" +#define MSGLEN 3 + struct { + struct cmsghdr cmsghdr; + int fd[2]; + } cmsg = { + .cmsghdr = { + .cmsg_len = CMSG_LEN(sizeof(cmsg.fd)), + .cmsg_level = SOL_SOCKET, + .cmsg_type = SCM_RIGHTS, + }, + .fd = { + self->fd[inflight * 2], + self->fd[inflight * 2], + }, + }; + struct iovec iov = { + .iov_base = MSG, + .iov_len = MSGLEN, + }; + struct msghdr msg = { + .msg_name = NULL, + .msg_namelen = 0, + .msg_iov = &iov, + .msg_iovlen = 1, + .msg_control = &cmsg, + .msg_controllen = CMSG_SPACE(sizeof(cmsg.fd)), + }; + int ret; + + ret = sendmsg(self->fd[receiver * 2 + 1], &msg, variant->flags); + ASSERT_EQ(MSGLEN, ret); +} + +#define create_sockets(n) \ + __create_sockets(_metadata, self, variant, n) +#define close_sockets(n) \ + __close_sockets(_metadata, self, n) +#define send_fd(inflight, receiver) \ + __send_fd(_metadata, self, variant, inflight, receiver) + +TEST_F(scm_rights, self_ref) +{ + create_sockets(2); + + send_fd(0, 0); + + send_fd(1, 1); + + close_sockets(2); +} + +TEST_F(scm_rights, triangle) +{ + create_sockets(6); + + send_fd(0, 1); + send_fd(1, 2); + send_fd(2, 0); + + send_fd(3, 4); + send_fd(4, 5); + send_fd(5, 3); + + close_sockets(6); +} + +TEST_F(scm_rights, cross_edge) +{ + create_sockets(8); + + send_fd(0, 1); + send_fd(1, 2); + send_fd(2, 0); + send_fd(1, 3); + send_fd(3, 2); + + send_fd(4, 5); + send_fd(5, 6); + send_fd(6, 4); + send_fd(5, 7); + send_fd(7, 6); + + close_sockets(8); +} + +TEST_HARNESS_MAIN