From patchwork Sat Jun 22 15:49:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13708353
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1902C16F918; Sat, 22 Jun 2024 15:23:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069792; cv=none; b=CZ8Tb899QuFMPeff+r9MDKnvHeBNdpe/eLa2DZj0mXmCrjhiFq4Ek+tEX+HC6BwEu8HyLZQ1KFETH5bxlCGm2+uwAxXXIvX5AGNZ+ndOk4bCc6CAkjJzIzYZjLD7JlRCbESGPbWuxHXROuJvD6qcLID0AuE8LbLlMSPXJ/HCXIc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069792; c=relaxed/simple; bh=090Oenp2xYBj31TJSwOgyDVmmqhieSXMUyvU+EdGtCE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Hoa4lpvh90aLqU8Kscgoh9j5ls4KeDAi/di1h/mfUOwNkvCbbNOtjk8G5+/48XpStwd0JbIGRv/6s95LyEdzOyAH2L/JCoCWcNzgLtU9D+P6fgGt4Cbv50XatpBLdEjS11cwvNcfD+yebUyGCszXdnUnd/ElVtIsGNajSCiwWAE=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=dUVUWlc2; arc=none smtp.client-ip=192.198.163.7
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="dUVUWlc2"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719069791; x=1750605791; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=090Oenp2xYBj31TJSwOgyDVmmqhieSXMUyvU+EdGtCE=; b=dUVUWlc2RjsV3j7GLBQ0lCTZn19KGO3ECtL1BmAcOMnveCD7w/0KE6ps LpXSFh5r3cq0rlKd4RhaSUzIcwqLtO2vGGOoWUrNZk2/+M+s0iGucU1Qb /1HnjS8lKiqvS6AdbeaAZxScog6uSXQXdvek2CS/4AgHFGlieS08rwHYv KdeNfX1JX3HMn3MCyWZpMTuT28Hn3nj+3usLvgx+4+GOIpVEZuPk8fhZ4 m2KJeh5QSiVUMQgx04I6m8CZ1m2nFjIOjunhJS5u4JEjU1UIY4Lhn/wHV Htyt9DQz6yVNamEiI56ANRtw4BFKMGS/jjLI/3k7Mu4WN7C6Cba6/V3Li g==;
X-CSE-ConnectionGUID: A/N53p59RzCm8CVGnv6mnQ==
X-CSE-MsgGUID: 4AZ2bA7WQTm69EqwaaBQcw==
X-IronPort-AV: E=McAfee;i="6700,10204,11111"; a="41495806"
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="41495806"
Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Jun 2024 08:23:11 -0700
X-CSE-ConnectionGUID: xDaJIqpyTqCByot2QvZNTA==
X-CSE-MsgGUID: k8vnp9ckQQ+3ZRmCcXKjKA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="42680518"
Received: from linux-pnp-server-16.sh.intel.com ([10.239.177.152]) by fmviesa006.fm.intel.com with ESMTP; 22 Jun 2024 08:23:08 -0700
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, mjguzik@gmail.com, edumazet@google.com
Cc: yu.ma@intel.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, pan.deng@intel.com, tianyou.li@intel.com, tim.c.chen@intel.com, tim.c.chen@linux.intel.com
Subject: [PATCH v2 1/3] fs/file.c: add fast path in alloc_fd()
Date: Sat, 22 Jun 2024 11:49:02 -0400
Message-ID: <20240622154904.3774273-2-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240622154904.3774273-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com> <20240622154904.3774273-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

In most cases there is a free fd in the lower 64 bits of the open_fds
bitmap when we look for an available
fd slot. Skip the two-level search via find_next_zero_bit() for this
common fast path.

Look directly for a free bit in the lower 64 bits of the open_fds bitmap
when a free slot is available there, as:
(1) The fd allocation algorithm always allocates fds from small to
    large, so lower bits in the open_fds bitmap are used much more
    frequently than higher bits.
(2) After the fdt is expanded (the bitmap size doubles on each
    expansion), it is never shrunk. The search size increases, but few
    open fds are found there.
(3) find_next_zero_bit() itself has a fast path to speed up searching
    when size<=64.

Besides, "!start" is added to the fast path condition to ensure the
allocated fd is greater than or equal to start (i.e. >=0), given that
alloc_fd() is only called in two scenarios:
(1) Allocating a new fd (the most common usage scenario) via
    get_unused_fd_flags(), which searches from bit 0 in the fdt
    (i.e. start==0).
(2) Duplicating an fd (less common usage) via dup_fd(), which searches
    from old_fd's index in the fdt and is only called by the fcntl
    syscall.

With the fast path added in alloc_fd(), pts/blogbench-1.1.0 read is
improved by 17% and write by 9% on an Intel ICX 160-core configuration
with v6.10-rc4.

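For illustration only, the fast-path decision described above can be
modelled in user space roughly as follows. This is a minimal sketch under
stated assumptions, not the kernel code: MODEL_BITS_PER_LONG,
model_first_zero(), model_scan() and model_pick_fd() are hypothetical
names, the bitmap is a plain array, and the slow path is a simple linear
scan standing in for find_next_fd().

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define MODEL_BITS_PER_LONG ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/* First zero bit in 'word' at or after 'from'; MODEL_BITS_PER_LONG if none. */
static unsigned model_first_zero(unsigned long word, unsigned from)
{
	for (unsigned bit = from; bit < MODEL_BITS_PER_LONG; bit++)
		if (!(word & (1UL << bit)))
			return bit;
	return MODEL_BITS_PER_LONG;
}

/* Assumed slow path: linear scan of the whole bitmap; max_fds if full. */
static unsigned model_scan(const unsigned long *open_fds, unsigned max_fds,
			   unsigned from)
{
	for (unsigned fd = from; fd < max_fds; fd++) {
		unsigned long word = open_fds[fd / MODEL_BITS_PER_LONG];

		if (!(word & (1UL << (fd % MODEL_BITS_PER_LONG))))
			return fd;
	}
	return max_fds;
}

/*
 * Pick a candidate fd. In this model 'next_fd' is assumed to be a hint
 * that never exceeds the lowest free fd.
 */
static unsigned model_pick_fd(const unsigned long *open_fds, unsigned max_fds,
			      unsigned start, unsigned next_fd)
{
	unsigned fd = start < next_fd ? next_fd : start;

	assert(max_fds >= MODEL_BITS_PER_LONG);
	if (fd >= max_fds)
		return fd;	/* the caller would have to grow the table */

	/* Fast path: word 0 has a free bit and the search starts at 0. */
	if (~open_fds[0] && !start)
		return model_first_zero(open_fds[0], fd);

	return model_scan(open_fds, max_fds, fd);
}
```

With word 0 equal to 0x7 (fds 0 to 2 taken) and start == 0, the model
takes the fast path and returns 3. With start == 5 the "!start" test
fails, so it falls back to the scan and returns 5.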
Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
---
 fs/file.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index a3b72aa64f11..50e900a47107 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -515,28 +515,35 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
 	if (fd < files->next_fd)
 		fd = files->next_fd;
 
-	if (fd < fdt->max_fds)
+	error = -EMFILE;
+	if (likely(fd < fdt->max_fds)) {
+		if (~fdt->open_fds[0] && !start) {
+			fd = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, fd);
+			goto fastreturn;
+		}
 		fd = find_next_fd(fdt, fd);
+	}
+
+	if (unlikely(fd >= fdt->max_fds)) {
+		error = expand_files(files, fd);
+		if (error < 0)
+			goto out;
+		/*
+		 * If we needed to expand the fs array we
+		 * might have blocked - try again.
+		 */
+		if (error)
+			goto repeat;
+	}
 
+fastreturn:
 	/*
 	 * N.B. For clone tasks sharing a files structure, this test
 	 * will limit the total number of files that can be opened.
 	 */
-	error = -EMFILE;
-	if (fd >= end)
+	if (unlikely(fd >= end))
 		goto out;
 
-	error = expand_files(files, fd);
-	if (error < 0)
-		goto out;
-
-	/*
-	 * If we needed to expand the fs array we
-	 * might have blocked - try again.
-	 */
-	if (error)
-		goto repeat;
-
 	if (start <= files->next_fd)
 		files->next_fd = fd + 1;
 

From patchwork Sat Jun 22 15:49:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13708354
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C2F3170831; Sat, 22 Jun 2024 15:23:14 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069796; cv=none; b=JsSDLPwV8PwM/6LrktvjY55bBdCNSO6UJG94guqzyVTcQIjAj0KuevxucEbbulw9+CD0XYwdKk2oUAUas9Gmd83k594ndZGy0lezJ+DINgPzEGCRYBsjL2Cr+UTpWGSuiZAgEvScaS2DsogZD+eKmopaIfEYq+4RL8nXpMxjaYo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069796; c=relaxed/simple; bh=g9kJGRRRbLpi96eXV/nMIG4vmmZQtau79XbMSAyQ/oM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KaPS11TodLmBcLKCJfyroHiJ1E3R+eUSKNBUuNuw6YCVlL4zAPoX8IVKpLwvu9E8FB0mre/79Fba37qfwMc/CT0hOaGt08eSalddDKfktfH49MVxpbDXhZKFO+z2MXluHHrfHJNzumLaPj7fghxNbW8LaLcVr66jMZaj9v7PtLs=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=S0VCrHOz; arc=none smtp.client-ip=192.198.163.7
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="S0VCrHOz"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt;
	s=Intel; t=1719069794; x=1750605794; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=g9kJGRRRbLpi96eXV/nMIG4vmmZQtau79XbMSAyQ/oM=; b=S0VCrHOzbciFLCEkpS2gFgnxtbBKaS3nRCKWUvIOOXVmBukg+zvrLHzF O6SiEZAoiPlkak4gS6wD1tDfzSWveXhKAhZx4IPfe59K0yawhZRgEyfN9 Sdk01QplOEnfahJTa6JayBooI5KjWCap6XlF0AHSdEwh7WdH07xHYTMs9 RWRfPagiAxgl9pHRh8k1WIwuBVogdsLByLCmmu613JEC0+e1T7LZrc8pu J95CLzpXH1R/g36X7EBgdZmMX7lYkZcWC6T0qBO+35sVUmTR7ATSZvgK1 MxoOFQ+TkMYNnp6RvCXO1FSpQC/Z9LnwRh7rbWQAXCUfG2cWpArolfNQs w==;
X-CSE-ConnectionGUID: bW3lbH/QTpKTApcFr2HTbQ==
X-CSE-MsgGUID: pKuo/2foTE+IvcDqdS0wTA==
X-IronPort-AV: E=McAfee;i="6700,10204,11111"; a="41495813"
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="41495813"
Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Jun 2024 08:23:14 -0700
X-CSE-ConnectionGUID: +3omq9eXQair3IC0x4AoTg==
X-CSE-MsgGUID: w2ieawAzQOaGiuF/JBgQow==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="42680522"
Received: from linux-pnp-server-16.sh.intel.com ([10.239.177.152]) by fmviesa006.fm.intel.com with ESMTP; 22 Jun 2024 08:23:12 -0700
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, mjguzik@gmail.com, edumazet@google.com
Cc: yu.ma@intel.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, pan.deng@intel.com, tianyou.li@intel.com, tim.c.chen@intel.com, tim.c.chen@linux.intel.com
Subject: [PATCH v2 2/3] fs/file.c: conditionally clear full_fds
Date: Sat, 22 Jun 2024 11:49:03 -0400
Message-ID: <20240622154904.3774273-3-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240622154904.3774273-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com> <20240622154904.3774273-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

64 bits in open_fds
are mapped to a common bit in full_fds_bits. It is very likely that the
corresponding bit in full_fds_bits is already clear by the time
__clear_open_fd() runs. Test the bit in full_fds_bits before clearing it
to avoid an unnecessary write and cache bouncing. See commit fc90888d07b8
("vfs: conditionally clear close-on-exec flag") for a similar
optimization.

Together with patch 1, this improves pts/blogbench-1.1.0 read by 27% and
write by 14% on an Intel ICX 160-core configuration with v6.10-rc4.

Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
Reviewed-by: Jan Kara
---
 fs/file.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 50e900a47107..b4d25f6d4c19 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -268,7 +268,9 @@ static inline void __set_open_fd(unsigned int fd, struct fdtable *fdt)
 static inline void __clear_open_fd(unsigned int fd, struct fdtable *fdt)
 {
 	__clear_bit(fd, fdt->open_fds);
-	__clear_bit(fd / BITS_PER_LONG, fdt->full_fds_bits);
+	fd /= BITS_PER_LONG;
+	if (test_bit(fd, fdt->full_fds_bits))
+		__clear_bit(fd, fdt->full_fds_bits);
 }
 
 static inline bool fd_is_open(unsigned int fd, const struct fdtable *fdt)

From patchwork Sat Jun 22 15:49:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13708355
Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3ACDF16F903; Sat, 22 Jun 2024 15:23:18 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069800; cv=none; b=FNSxG1Xks6kaIY7F/B3m/auGK1e4XFhBhHFmH1ZEtk6hn5lWWFgreqKk0rA9KmNGp4C7Xf6tVH3QyEaPKb0hfk5TjwfHeJxSHlKuYLE5CVcW9CvuT5V/oncifhhxY1+A9MWSWX36rhnyfbYr46XTuG/4MVoUi1taXxtczidy0/E=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1719069800; c=relaxed/simple; bh=+3EtXD3anw5T5SPSXo7iEqIIBqdxGY53eBDTDUvUN6A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RKdr+fP8WDVisMiHQPTIGww7yew8HyZOt6OMb60QaixAiq5sHdClG3yHfZWPL1OUdLsxH1wz6pvOMdzN48ifJ4suOgt9kx2ERyG1qH9dJumZ57Mx32JpW73h4lfBtgNi1AnS+X0Kg4y20ti9RBOCNSbWbcS8Lz9fMi4OOCpg7tY=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=EUHnRP7j; arc=none smtp.client-ip=192.198.163.7
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="EUHnRP7j"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719069798; x=1750605798; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+3EtXD3anw5T5SPSXo7iEqIIBqdxGY53eBDTDUvUN6A=; b=EUHnRP7jGDm2Tx80mmXBHZye71Pzbr/AcqP1ih2PL52ljheDemfie3Rt Qdu5XM5xphoi6LyFQTdeqlczig3r9kr4J3Xl07DLJ8OzK/bOoF0v4GlfZ rBdSLwDCOhdpp05J9R7t+sPYg9wM3G6phCx9dkPKPzz1nSmT7w6o7bUsb ohzbfC297heVAW8VsR8Im9bI5ELY6HwIfS2vRVIVv3alZSoeBCuIHJYlS 2hgxvTcXVinLFwOvSO2XcvlHWS4mUAgTfJhoXAscSSIV8ip5KThe78TML iq8H3fw7+Y4CGkdM8wZqD+6spf8AtNsjmJ2xxUHpsHh85S97ZdI8B5uf4 g==;
X-CSE-ConnectionGUID: ZDPF5iUMR7mpEXujn3mikw==
X-CSE-MsgGUID: YDs3RqLuSLOHRSfF7u7Kxg==
X-IronPort-AV: E=McAfee;i="6700,10204,11111"; a="41495819"
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="41495819"
Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Jun 2024 08:23:18
	-0700
X-CSE-ConnectionGUID: LAVAn+nqTZC3/Mh2iI7B5A==
X-CSE-MsgGUID: CofonECtRpuZaK2eOtuHsA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.08,258,1712646000"; d="scan'208";a="42680526"
Received: from linux-pnp-server-16.sh.intel.com ([10.239.177.152]) by fmviesa006.fm.intel.com with ESMTP; 22 Jun 2024 08:23:15 -0700
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, mjguzik@gmail.com, edumazet@google.com
Cc: yu.ma@intel.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, pan.deng@intel.com, tianyou.li@intel.com, tim.c.chen@intel.com, tim.c.chen@linux.intel.com
Subject: [PATCH v2 3/3] fs/file.c: remove sanity_check from alloc_fd()
Date: Sat, 22 Jun 2024 11:49:04 -0400
Message-ID: <20240622154904.3774273-4-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240622154904.3774273-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com> <20240622154904.3774273-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

alloc_fd() has a sanity check to make sure the struct file pointer
mapped to the allocated fd is NULL. Remove this sanity check, since that
is already guaranteed by the existing zero initialization and by setting
the slot to NULL when an fd is recycled.

Combined with patches 1 and 2 in the series, pts/blogbench-1.1.0 read
improves by 32% and write by 17% on an Intel ICX 160-core configuration
with v6.10-rc4.

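For illustration only, the invariant this argument relies on (a slot
whose bit is clear always holds NULL) can be shown with a toy table in
user space. This is a minimal sketch under stated assumptions, not the
kernel's fdtable: struct toy_fdtable and every toy_*() helper are
hypothetical names, and the table is fixed at eight slots.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8u	/* hypothetical fixed table size */

struct toy_fdtable {
	unsigned long open_fds;		/* bit n set: fd n is allocated */
	void *slot[TOY_MAX_FDS];	/* stands in for fdt->fd[] */
};

/* Zero initialization: every slot starts out NULL. */
static void toy_init(struct toy_fdtable *t)
{
	t->open_fds = 0;
	for (unsigned fd = 0; fd < TOY_MAX_FDS; fd++)
		t->slot[fd] = NULL;
}

/* Allocate the lowest free fd, or -1 if the table is full. */
static int toy_alloc(struct toy_fdtable *t)
{
	for (unsigned fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!(t->open_fds & (1UL << fd))) {
			t->open_fds |= 1UL << fd;
			return (int)fd;
		}
	}
	return -1;
}

/* Publish a file pointer in an already allocated slot. */
static void toy_install(struct toy_fdtable *t, int fd, void *file)
{
	assert(fd >= 0 && (unsigned)fd < TOY_MAX_FDS);
	t->slot[fd] = file;
}

/* Recycle an fd: reset the slot to NULL, then clear its bit. */
static void toy_close(struct toy_fdtable *t, int fd)
{
	assert(fd >= 0 && (unsigned)fd < TOY_MAX_FDS);
	t->slot[fd] = NULL;
	t->open_fds &= ~(1UL << fd);
}

/* Returns 1 if every slot whose bit is clear holds NULL, else 0. */
static int toy_free_slots_are_null(const struct toy_fdtable *t)
{
	for (unsigned fd = 0; fd < TOY_MAX_FDS; fd++)
		if (!(t->open_fds & (1UL << fd)) && t->slot[fd] != NULL)
			return 0;
	return 1;
}
```

As long as toy_init() and toy_close() are the only ways a bit becomes
clear, toy_alloc() can never hand out a slot that still holds a pointer,
so a check at allocation time would be redundant in this model.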
Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
---
 fs/file.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index b4d25f6d4c19..1153b0b7ba3d 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
 	else
 		__clear_close_on_exec(fd, fdt);
 	error = fd;
-#if 1
-	/* Sanity check */
-	if (rcu_access_pointer(fdt->fd[fd]) != NULL) {
-		printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
-		rcu_assign_pointer(fdt->fd[fd], NULL);
-	}
-#endif
 
 out:
 	spin_unlock(&files->file_lock);