From patchwork Fri Jun 14 16:34:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13698915
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 tim.c.chen@linux.intel.com, tim.c.chen@intel.com, pan.deng@intel.com,
 tianyou.li@intel.com, yu.ma@intel.com
Subject: [PATCH 1/3] fs/file.c: add fast path in alloc_fd()
Date: Fri, 14 Jun 2024 12:34:14 -0400
Message-ID: <20240614163416.728752-2-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240614163416.728752-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

In most cases an fd is available in the lower 64 bits of the open_fds
bitmap when we look for a free fd slot. Skip the two-level search via
find_next_zero_bit() for this common fast path.

Look directly for a free bit in the lower 64 bits of the open_fds bitmap
when a free slot is available there, because:
(1) The fd allocation algorithm always allocates fds from small to
    large, so the lower bits of the open_fds bitmap are used much more
    frequently than the higher bits.
(2) Once the fdt has been expanded (the bitmap size doubles on each
    expansion), it never shrinks. The search size increases, but few
    open fds are available there.
(3) find_next_zero_bit() has a fast path of its own when size <= 64,
    which speeds up the search.

With the fast path added in alloc_fd() through a single bitmap search,
pts/blogbench-1.1.0 read improves by 20% and write by 10% on an Intel
ICX 160-core configuration with v6.8-rc6.

Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
---
 fs/file.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 3b683b9101d8..e8d2f9ef7fd1 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -510,8 +510,13 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
 	if (fd < files->next_fd)
 		fd = files->next_fd;
 
-	if (fd < fdt->max_fds)
+	if (fd < fdt->max_fds) {
+		if (~fdt->open_fds[0]) {
+			fd = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, fd);
+			goto success;
+		}
 		fd = find_next_fd(fdt, fd);
+	}
 
 	/*
 	 * N.B. For clone tasks sharing a files structure, this test
@@ -531,7 +536,7 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
 	 */
 	if (error)
 		goto repeat;
-
+success:
 	if (start <= files->next_fd)
 		files->next_fd = fd + 1;
 

From patchwork Fri Jun 14 16:34:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13698916
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 tim.c.chen@linux.intel.com, tim.c.chen@intel.com, pan.deng@intel.com,
 tianyou.li@intel.com, yu.ma@intel.com
Subject: [PATCH 2/3] fs/file.c: conditionally clear full_fds
Date: Fri, 14 Jun 2024 12:34:15 -0400
Message-ID: <20240614163416.728752-3-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240614163416.728752-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Every 64 bits in open_fds map to a single bit in full_fds_bits. By the
time __clear_open_fd() runs, that bit in full_fds_bits has very likely
been cleared already. Test the bit in full_fds_bits before clearing it
to avoid an unnecessary write and cache bouncing.

See commit fc90888d07b8 ("vfs: conditionally clear close-on-exec flag")
for a similar optimization.

Together with patch 1, this improves pts/blogbench-1.1.0 read by 28%
and write by 14% on an Intel ICX 160-core configuration with v6.8-rc6.

Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
---
 fs/file.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index e8d2f9ef7fd1..a0e94a178c0b 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -268,7 +268,9 @@ static inline void __set_open_fd(unsigned int fd, struct fdtable *fdt)
 static inline void __clear_open_fd(unsigned int fd, struct fdtable *fdt)
 {
 	__clear_bit(fd, fdt->open_fds);
-	__clear_bit(fd / BITS_PER_LONG, fdt->full_fds_bits);
+	fd /= BITS_PER_LONG;
+	if (test_bit(fd, fdt->full_fds_bits))
+		__clear_bit(fd, fdt->full_fds_bits);
 }
 
 static unsigned int count_open_files(struct fdtable *fdt)

From patchwork Fri Jun 14 16:34:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ma, Yu"
X-Patchwork-Id: 13698917
From: Yu Ma
To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 tim.c.chen@linux.intel.com, tim.c.chen@intel.com, pan.deng@intel.com,
 tianyou.li@intel.com, yu.ma@intel.com
Subject: [PATCH 3/3] fs/file.c: move sanity_check from alloc_fd() to
 put_unused_fd()
Date: Fri, 14 Jun 2024 12:34:16 -0400
Message-ID: <20240614163416.728752-4-yu.ma@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240614163416.728752-1-yu.ma@intel.com>
References: <20240614163416.728752-1-yu.ma@intel.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

alloc_fd() contains a sanity check that the file object mapped to the
allocated fd is NULL. Move this check from the performance-critical
alloc_fd() path to the non-performance-critical put_unused_fd() path.

Since the initial NULL condition is guaranteed by zero initialization
in init_file, we only need to make sure the slot is NULL when an fd is
recycled. Three functions call __put_unused_fd() to return an fd:
file_close_fd_locked(), do_close_on_exec() and put_unused_fd().
file_close_fd_locked() and do_close_on_exec() already implement the
NULL check, so add a NULL check to put_unused_fd() to cover all release
paths.

Combined with patches 1 and 2 in this series, pts/blogbench-1.1.0 read
improves by 32% and write by 15% on an Intel ICX 160-core configuration
with v6.8-rc6.

Reviewed-by: Tim Chen
Signed-off-by: Yu Ma
---
 fs/file.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index a0e94a178c0b..59d62909e2e3 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -548,13 +548,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
 	else
 		__clear_close_on_exec(fd, fdt);
 	error = fd;
-#if 1
-	/* Sanity check */
-	if (rcu_access_pointer(fdt->fd[fd]) != NULL) {
-		printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
-		rcu_assign_pointer(fdt->fd[fd], NULL);
-	}
-#endif
 
 out:
 	spin_unlock(&files->file_lock);
@@ -572,7 +565,7 @@ int get_unused_fd_flags(unsigned flags)
 }
 EXPORT_SYMBOL(get_unused_fd_flags);
 
-static void __put_unused_fd(struct files_struct *files, unsigned int fd)
+static inline void __put_unused_fd(struct files_struct *files, unsigned int fd)
 {
 	struct fdtable *fdt = files_fdtable(files);
 	__clear_open_fd(fd, fdt);
@@ -583,7 +576,12 @@ static void __put_unused_fd(struct files_struct *files, unsigned int fd)
 void put_unused_fd(unsigned int fd)
 {
 	struct files_struct *files = current->files;
+	struct fdtable *fdt = files_fdtable(files);
 	spin_lock(&files->file_lock);
+	if (unlikely(rcu_access_pointer(fdt->fd[fd]))) {
+		printk(KERN_WARNING "put_unused_fd: slot %d not NULL!\n", fd);
+		rcu_assign_pointer(fdt->fd[fd], NULL);
+	}
 	__put_unused_fd(files, fd);
 	spin_unlock(&files->file_lock);
 }
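
The two bitmap ideas in patches 1 and 2 can be modelled in user space.
The sketch below is not kernel code and makes several assumptions:
struct toy_fdtable, the toy_* helpers, WORD_BITS and NWORDS are
hypothetical names invented for illustration, and full_fds_bits is
shrunk to a single word. The bounds guard in toy_find_fd(), which falls
back to the slow path when the first word has no free bit at or above
start, is also an assumption of this sketch; the posted diff jumps to
success unconditionally.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define NWORDS 4	/* hypothetical table size: NWORDS * WORD_BITS fds */

/* Hypothetical user-space stand-in for the kernel's struct fdtable. */
struct toy_fdtable {
	unsigned long open_fds[NWORDS];	/* bit set => fd is in use */
	unsigned long full_fds_bits;	/* bit w set => open_fds[w] is full */
};

static void toy_set_open_fd(struct toy_fdtable *t, unsigned int fd)
{
	unsigned int w = fd / WORD_BITS;

	assert(fd < NWORDS * WORD_BITS);
	t->open_fds[w] |= 1UL << (fd % WORD_BITS);
	if (!~t->open_fds[w])		/* the word just became full */
		t->full_fds_bits |= 1UL << w;
}

/* Idea of patch 2: write full_fds_bits only if the bit is actually set. */
static void toy_clear_open_fd(struct toy_fdtable *t, unsigned int fd)
{
	unsigned int w = fd / WORD_BITS;

	t->open_fds[w] &= ~(1UL << (fd % WORD_BITS));
	if (t->full_fds_bits & (1UL << w))
		t->full_fds_bits &= ~(1UL << w);
}

/* Lowest clear bit at or above 'from' in one word, or WORD_BITS if none. */
static unsigned int toy_next_zero(unsigned long word, unsigned int from)
{
	for (unsigned int i = from; i < WORD_BITS; i++)
		if (!(word & (1UL << i)))
			return i;
	return WORD_BITS;
}

/* Lowest free fd >= start, or NWORDS * WORD_BITS if none is free. */
static unsigned int toy_find_fd(const struct toy_fdtable *t, unsigned int start)
{
	/* Idea of patch 1: try the first word before the two-level search. */
	if (start < WORD_BITS && ~t->open_fds[0]) {
		unsigned int fd = toy_next_zero(t->open_fds[0], start);

		if (fd < WORD_BITS)	/* guard: an assumption of this sketch */
			return fd;
	}

	/* Slow path: skip words that full_fds_bits marks as full. */
	for (unsigned int w = start / WORD_BITS; w < NWORDS; w++) {
		unsigned int from = (w == start / WORD_BITS) ? start % WORD_BITS : 0;
		unsigned int bit;

		if (t->full_fds_bits & (1UL << w))
			continue;
		bit = toy_next_zero(t->open_fds[w], from);
		if (bit < WORD_BITS)
			return w * WORD_BITS + bit;
	}
	return NWORDS * WORD_BITS;
}
```

In this model the fast path only ever returns a bit it actually found
free, and any request it cannot satisfy (start beyond the first word, or
no free bit at or above start) falls through to the two-level search.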