From patchwork Mon Jun 26 18:21:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13293332
Subject: [PATCH v4 3/3] shmem: stable directory offsets
From: Chuck Lever
To: viro@zeniv.linux.org.uk, brauner@kernel.org, hughd@google.com,
 akpm@linux-foundation.org
Cc: Chuck Lever, jlayton@redhat.com, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org
Date: Mon, 26 Jun 2023 14:21:40 -0400
Message-ID: <168780370085.2142.4461798321197359310.stgit@manet.1015granger.net>
In-Reply-To: <168780354647.2142.537463116658872680.stgit@manet.1015granger.net>
References: <168780354647.2142.537463116658872680.stgit@manet.1015granger.net>
User-Agent: StGit/1.5

From: Chuck Lever

The current cursor-based directory offset mechanism doesn't work
when a tmpfs filesystem is exported via NFS. This is because NFS
clients do not open directories. Each server-side READDIR operation
has to open the directory, read it, then close it. The cursor state
for that directory, being associated strictly with the opened
struct file, is thus discarded after each NFS READDIR operation.

Directory offsets are cached not only by NFS clients, but also by
user space libraries on those clients. Essentially there is no way
to invalidate those caches when the offset-to-dentry mapping changes
on an NFS server. Thus the whole application stack depends on
unchanging directory offsets.

The solution we've come up with is to make the directory offset for
each file in a tmpfs filesystem stable for the life of the directory
entry it represents.

shmem_readdir() and shmem_dir_llseek() now use an xarray to map each
directory offset (an loff_t integer) to the memory address of a
struct dentry.
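The offset-to-entry mapping described above can be illustrated with a
small user-space model. This is only a sketch under stated assumptions:
the toy_* names, the fixed-size array standing in for the xarray, and
the choice to start handing out offsets at 2 are all hypothetical and
are not taken from this patch or from the kernel API. It shows why a
reader that remembers nothing but an offset can resume correctly after
other entries are removed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel implementation. */
#define TOY_MAX_ENTRIES  16
#define TOY_FIRST_OFFSET 2	/* assumption: low offsets are reserved */

struct toy_dir {
	const char *slot[TOY_MAX_ENTRIES];	/* offset -> name, NULL if unused */
	long next_offset;			/* next offset to hand out */
};

static void toy_dir_init(struct toy_dir *dir)
{
	for (size_t i = 0; i < TOY_MAX_ENTRIES; i++)
		dir->slot[i] = NULL;
	dir->next_offset = TOY_FIRST_OFFSET;
}

/*
 * Give a new entry its offset. The offset never changes while the
 * entry exists. Returns the offset, or -1 if the toy table is full.
 */
static long toy_offset_add(struct toy_dir *dir, const char *name)
{
	assert(name != NULL);
	if (dir->next_offset >= TOY_MAX_ENTRIES)
		return -1;
	dir->slot[dir->next_offset] = name;
	return dir->next_offset++;
}

/* Drop an entry; the offsets of all other entries are left untouched. */
static void toy_offset_remove(struct toy_dir *dir, long offset)
{
	if (offset >= TOY_FIRST_OFFSET && offset < TOY_MAX_ENTRIES)
		dir->slot[offset] = NULL;
}

/*
 * Stateless readdir step: return the first live entry at or after
 * *pos and advance *pos past it. Returns NULL at end of directory.
 * The caller keeps only *pos between calls, as an NFS client would.
 */
static const char *toy_readdir(const struct toy_dir *dir, long *pos)
{
	long off = *pos < TOY_FIRST_OFFSET ? TOY_FIRST_OFFSET : *pos;

	for (; off < TOY_MAX_ENTRIES; off++) {
		if (dir->slot[off]) {
			*pos = off + 1;
			return dir->slot[off];
		}
	}
	*pos = TOY_MAX_ENTRIES;
	return NULL;
}
```

In this model, a caller that reads one entry, saves the returned
position, and comes back after a different entry has been removed
simply continues with the next live entry. No per-open cursor state
is needed on the directory side.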
Signed-off-by: Chuck Lever
---
 mm/shmem.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 47 insertions(+), 7 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 721f9fd064aa..89012f3583b1 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2410,7 +2410,8 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
 			/* Some things misbehave if size == 0 on a directory */
 			inode->i_size = 2 * BOGO_DIRENT_SIZE;
 			inode->i_op = &shmem_dir_inode_operations;
-			inode->i_fop = &simple_dir_operations;
+			inode->i_fop = &stable_dir_operations;
+			stable_offset_init(inode);
 			break;
 		case S_IFLNK:
 			/*
@@ -2950,7 +2951,10 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 		if (error && error != -EOPNOTSUPP)
 			goto out_iput;
 
-		error = 0;
+		error = stable_offset_add(dir, dentry);
+		if (error)
+			goto out_iput;
+
 		dir->i_size += BOGO_DIRENT_SIZE;
 		dir->i_ctime = dir->i_mtime = current_time(dir);
 		inode_inc_iversion(dir);
@@ -3027,6 +3031,13 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
 			goto out;
 	}
 
+	ret = stable_offset_add(dir, dentry);
+	if (ret) {
+		if (inode->i_nlink)
+			shmem_free_inode(inode->i_sb);
+		goto out;
+	}
+
 	dir->i_size += BOGO_DIRENT_SIZE;
 	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
 	inode_inc_iversion(dir);
@@ -3045,6 +3056,8 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
 		shmem_free_inode(inode->i_sb);
 
+	stable_offset_remove(dir, dentry);
+
 	dir->i_size -= BOGO_DIRENT_SIZE;
 	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
 	inode_inc_iversion(dir);
@@ -3103,24 +3116,41 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 {
 	struct inode *inode = d_inode(old_dentry);
 	int they_are_dirs = S_ISDIR(inode->i_mode);
+	int error;
 
 	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
 		return -EINVAL;
 
-	if (flags & RENAME_EXCHANGE)
+	if (flags & RENAME_EXCHANGE) {
+		error = stable_offset_add(new_dir, old_dentry);
+		if (error)
+			return error;
+		error = stable_offset_add(old_dir, new_dentry);
+		if (error) {
+			stable_offset_remove(new_dir, old_dentry);
+			return error;
+		}
+		stable_offset_remove(old_dir, old_dentry);
+		stable_offset_remove(new_dir, new_dentry);
+
+		/* Always returns zero */
 		return simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry);
+	}
 
 	if (!simple_empty(new_dentry))
 		return -ENOTEMPTY;
 
 	if (flags & RENAME_WHITEOUT) {
-		int error;
-
 		error = shmem_whiteout(idmap, old_dir, old_dentry);
 		if (error)
 			return error;
 	}
 
+	stable_offset_remove(old_dir, old_dentry);
+	error = stable_offset_add(new_dir, old_dentry);
+	if (error)
+		return error;
+
 	if (d_really_is_positive(new_dentry)) {
 		(void) shmem_unlink(new_dir, new_dentry);
 		if (they_are_dirs) {
@@ -3164,19 +3194,23 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	if (error && error != -EOPNOTSUPP)
 		goto out_iput;
 
+	error = stable_offset_add(dir, dentry);
+	if (error)
+		goto out_iput;
+
 	inode->i_size = len-1;
 	if (len <= SHORT_SYMLINK_LEN) {
 		inode->i_link = kmemdup(symname, len, GFP_KERNEL);
 		if (!inode->i_link) {
 			error = -ENOMEM;
-			goto out_iput;
+			goto out_remove_offset;
 		}
 		inode->i_op = &shmem_short_symlink_operations;
 	} else {
 		inode_nohighmem(inode);
 		error = shmem_get_folio(inode, 0, &folio, SGP_WRITE);
 		if (error)
-			goto out_iput;
+			goto out_remove_offset;
 		inode->i_mapping->a_ops = &shmem_aops;
 		inode->i_op = &shmem_symlink_inode_operations;
 		memcpy(folio_address(folio), symname, len);
@@ -3185,12 +3219,16 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 		folio_unlock(folio);
 		folio_put(folio);
 	}
+
 	dir->i_size += BOGO_DIRENT_SIZE;
 	dir->i_ctime = dir->i_mtime = current_time(dir);
 	inode_inc_iversion(dir);
 	d_instantiate(dentry, inode);
 	dget(dentry);
 	return 0;
+
+out_remove_offset:
+	stable_offset_remove(dir, dentry);
 out_iput:
 	iput(inode);
 	return error;
@@ -3920,6 +3958,8 @@ static void shmem_destroy_inode(struct inode *inode)
 {
 	if (S_ISREG(inode->i_mode))
 		mpol_free_shared_policy(&SHMEM_I(inode)->policy);
+	if (S_ISDIR(inode->i_mode))
+		stable_offset_destroy(inode);
 }
 
 static void shmem_init_inode(void *foo)