From patchwork Fri Dec 20 02:54:19 2024
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916083
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 01/11] VFS: introduce vfs_mkdir_return()
Date: Fri, 20 Dec 2024 13:54:19 +1100
Message-ID: <20241220030830.272429-2-neilb@suse.de>
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

vfs_mkdir() does not guarantee to make the child dentry positive on
success.  It may leave it negative and then the caller needs to perform a
lookup to find the target dentry.

This patch introduces vfs_mkdir_return() which performs the lookup if
needed so that this code is centralised.

This prepares for a new inode operation which will perform mkdir and
return the correct dentry.

Signed-off-by: NeilBrown
---
 fs/cachefiles/namei.c    |  7 +----
 fs/namei.c               | 63 ++++++++++++++++++++++++++++++++++++++++
 fs/nfsd/vfs.c            | 21 ++------------
 fs/overlayfs/dir.c       | 33 +--------------------
 fs/overlayfs/overlayfs.h | 10 +++----
 fs/overlayfs/super.c     |  2 +-
 fs/smb/server/vfs.c      | 23 +++------------
 include/linux/fs.h       |  2 ++
 8 files changed, 80 insertions(+), 81 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 7cf59713f0f7..3c866c3b9534 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -95,7 +95,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 	/* search the current directory for the element name */
 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
 
-retry:
 	ret = cachefiles_inject_read_error();
 	if (ret == 0)
 		subdir = lookup_one_len(dirname, dir, strlen(dirname));
@@ -130,7 +129,7 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 			goto mkdir_error;
 		ret = cachefiles_inject_write_error();
 		if (ret == 0)
-			ret = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
+			ret = vfs_mkdir_return(&nop_mnt_idmap, d_inode(dir), &subdir, 0700);
 		if (ret < 0) {
 			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
 						   cachefiles_trace_mkdir_error);
@@ -138,10 +137,6 @@ struct dentry *cachefiles_get_directory(struct cachefiles_cache *cache,
 		}
 		trace_cachefiles_mkdir(dir, subdir);
 
-		if (unlikely(d_unhashed(subdir))) {
-			cachefiles_put_directory(subdir);
-			goto retry;
-		}
 		ASSERT(d_backing_inode(subdir));
 
 		_debug("mkdir -> %pd{ino=%lu}",
diff --git a/fs/namei.c b/fs/namei.c
index 9d30c7aa9aa6..cdd1fc9d56a0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4315,6 +4315,69 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 }
 EXPORT_SYMBOL(vfs_mkdir);
 
+/**
+ * vfs_mkdir_return - create directory returning correct dentry
+ * @idmap:	idmap of the mount the inode was found from
+ * @dir:	inode of the parent directory
+ * @dentryp:	pointer to dentry of the child directory
+ * @mode:	mode of the child directory
+ *
+ * Create a directory.
+ *
+ * If the inode has been found through an idmapped mount the idmap of
+ * the vfsmount must be passed through @idmap. This function will then take
+ * care to map the inode according to @idmap before checking permissions.
+ * On non-idmapped mounts or if permission checking is to be performed on the
+ * raw inode simply pass @nop_mnt_idmap.
+ *
+ * The filesystem may not use the dentry that was passed in.  In that case
+ * the passed-in dentry is put and a new one is placed in *@dentryp.
+ * So on successful return *@dentryp will always be positive.
+ */
+int vfs_mkdir_return(struct mnt_idmap *idmap, struct inode *dir,
+		     struct dentry **dentryp, umode_t mode)
+{
+	struct dentry *dentry = *dentryp;
+	int error;
+	unsigned max_links = dir->i_sb->s_max_links;
+
+	error = may_create(idmap, dir, dentry);
+	if (error)
+		return error;
+
+	if (!dir->i_op->mkdir)
+		return -EPERM;
+
+	mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0);
+	error = security_inode_mkdir(dir, dentry, mode);
+	if (error)
+		return error;
+
+	if (max_links && dir->i_nlink >= max_links)
+		return -EMLINK;
+
+	error = dir->i_op->mkdir(idmap, dir, dentry, mode);
+	if (!error) {
+		fsnotify_mkdir(dir, dentry);
+		if (unlikely(d_unhashed(dentry))) {
+			struct dentry *d;
+			d = lookup_dcache((const struct qstr *)&dentry->d_name,
+					  dentry->d_parent, 0);
+			if (IS_ERR(d)) {
+				error = PTR_ERR(d);
+			} else if (unlikely(d_is_negative(d))) {
+				dput(d);
+				error = -ENOENT;
+			} else {
+				dput(dentry);
+				*dentryp = d;
+			}
+		}
+	}
+	return error;
+}
+EXPORT_SYMBOL(vfs_mkdir_return);
+
 int do_mkdirat(int dfd, struct filename *name, umode_t mode)
 {
 	struct dentry *dentry;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 29cb7b812d71..740332413138 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1488,26 +1488,11 @@ nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
 			nfsd_check_ignore_resizing(iap);
 		break;
 	case S_IFDIR:
-		host_err = vfs_mkdir(&nop_mnt_idmap, dirp, dchild, iap->ia_mode);
-		if (!host_err && unlikely(d_unhashed(dchild))) {
-			struct dentry *d;
-			d = lookup_one_len(dchild->d_name.name,
-					   dchild->d_parent,
-					   dchild->d_name.len);
-			if (IS_ERR(d)) {
-				host_err = PTR_ERR(d);
-				break;
-			}
-			if (unlikely(d_is_negative(d))) {
-				dput(d);
-				err = nfserr_serverfault;
-				goto out;
-			}
+		host_err = vfs_mkdir_return(&nop_mnt_idmap, dirp, &dchild, iap->ia_mode);
+		if (!host_err && unlikely(dchild != resfhp->fh_dentry)) {
 			dput(resfhp->fh_dentry);
-			resfhp->fh_dentry = dget(d);
+			resfhp->fh_dentry = dget(dchild);
 			err = fh_update(resfhp);
-			dput(dchild);
-			dchild = d;
 			if (err)
 				goto out;
 		}
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 08e683917d12..92a277ccc419 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -138,37 +138,6 @@ int ovl_cleanup_and_whiteout(struct ovl_fs *ofs, struct inode *dir,
 	goto out;
 }
 
-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode)
-{
-	int err;
-	struct dentry *d, *dentry = *newdentry;
-
-	err = ovl_do_mkdir(ofs, dir, dentry, mode);
-	if (err)
-		return err;
-
-	if (likely(!d_unhashed(dentry)))
-		return 0;
-
-	/*
-	 * vfs_mkdir() may succeed and leave the dentry passed
-	 * to it unhashed and negative. If that happens, try to
-	 * lookup a new hashed and positive dentry.
-	 */
-	d = ovl_lookup_upper(ofs, dentry->d_name.name, dentry->d_parent,
-			     dentry->d_name.len);
-	if (IS_ERR(d)) {
-		pr_warn("failed lookup after mkdir (%pd2, err=%i).\n",
-			dentry, err);
-		return PTR_ERR(d);
-	}
-	dput(dentry);
-	*newdentry = d;
-
-	return 0;
-}
-
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry, struct ovl_cattr *attr)
 {
@@ -191,7 +160,7 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 
 		case S_IFDIR:
 			/* mkdir is special... */
-			err = ovl_mkdir_real(ofs, dir, &newdentry, attr->mode);
+			err = ovl_do_mkdir(ofs, dir, &newdentry, attr->mode);
 			break;
 
 		case S_IFCHR:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index b361f35762be..9ff34f84fc50 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -242,11 +242,11 @@ static inline int ovl_do_create(struct ovl_fs *ofs,
 }
 
 static inline int ovl_do_mkdir(struct ovl_fs *ofs,
-			       struct inode *dir, struct dentry *dentry,
+			       struct inode *dir, struct dentry **dentry,
 			       umode_t mode)
 {
-	int err = vfs_mkdir(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
-	pr_debug("mkdir(%pd2, 0%o) = %i\n", dentry, mode, err);
+	int err = vfs_mkdir_return(ovl_upper_mnt_idmap(ofs), dir, dentry, mode);
+	pr_debug("mkdir(%pd2, 0%o) = %i\n", *dentry, mode, err);
 	return err;
 }
 
@@ -838,8 +838,8 @@ struct ovl_cattr {
 
 #define OVL_CATTR(m) (&(struct ovl_cattr) { .mode = (m) })
 
-int ovl_mkdir_real(struct ovl_fs *ofs, struct inode *dir,
-		   struct dentry **newdentry, umode_t mode);
+int ovl_do_mkdir(struct ovl_fs *ofs, struct inode *dir,
+		 struct dentry **newdentry, umode_t mode);
 struct dentry *ovl_create_real(struct ovl_fs *ofs, struct inode *dir,
 			       struct dentry *newdentry,
 			       struct ovl_cattr *attr);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index fe511192f83c..24c44f7c1c8b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -309,7 +309,7 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs,
 			goto retry;
 		}
 
-		err = ovl_mkdir_real(ofs, dir, &work, attr.ia_mode);
+		err = ovl_do_mkdir(ofs, dir, &work, attr.ia_mode);
 		if (err)
 			goto out_dput;
 
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 88d167a5f8b7..dfb0eee5f5f3 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -211,7 +211,7 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 {
 	struct mnt_idmap *idmap;
 	struct path path;
-	struct dentry *dentry;
+	struct dentry *dentry, *d;
 	int err;
 
 	dentry = ksmbd_vfs_kern_path_create(work, name,
@@ -227,25 +227,10 @@ int ksmbd_vfs_mkdir(struct ksmbd_work *work, const char *name, umode_t mode)
 
 	idmap = mnt_idmap(path.mnt);
 	mode |= S_IFDIR;
-	err = vfs_mkdir(idmap, d_inode(path.dentry), dentry, mode);
-	if (!err && d_unhashed(dentry)) {
-		struct dentry *d;
-
-		d = lookup_one(idmap, dentry->d_name.name, dentry->d_parent,
-			       dentry->d_name.len);
-		if (IS_ERR(d)) {
-			err = PTR_ERR(d);
-			goto out_err;
-		}
-		if (unlikely(d_is_negative(d))) {
-			dput(d);
-			err = -ENOENT;
-			goto out_err;
-		}
-
+	d = dentry;
+	err = vfs_mkdir_return(idmap, d_inode(path.dentry), &dentry, mode);
+	if (!err && dentry != d)
 		ksmbd_vfs_inherit_owner(work, d_inode(path.dentry), d_inode(d));
-		dput(d);
-	}
 
 out_err:
 	done_path_create(&path, dentry);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7e29433c5ecc..406887d0394e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1923,6 +1923,8 @@ int vfs_create(struct mnt_idmap *, struct inode *,
 	       struct dentry *, umode_t, bool);
 int vfs_mkdir(struct mnt_idmap *, struct inode *,
 	      struct dentry *, umode_t);
+int vfs_mkdir_return(struct mnt_idmap *, struct inode *,
+		     struct dentry **, umode_t);
 int vfs_mknod(struct mnt_idmap *, struct inode *, struct dentry *,
 	      umode_t, dev_t);
 int vfs_symlink(struct mnt_idmap *, struct inode *,

From patchwork Fri Dec 20 02:54:20 2024
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916084
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 02/11] VFS: add _shared versions of the various directory
 modifying inode_operations
Date: Fri, 20 Dec 2024 13:54:20 +1100
Message-ID: <20241220030830.272429-3-neilb@suse.de>
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

These "_shared" versions of various inode operations are not guaranteed
an exclusive lock on the directory but are guaranteed an exclusive lock
on the dentry within the directory.

i_rwsem *may* be held exclusively or *may* be held shared, in which case
an exclusive lock will be held on the dentry - provided by a later
patch.

This will allow a graceful transition from exclusive to shared locking
for directory updates.

mkdir_shared is a bit different as it optionally returns a new dentry
for cases when the filesystem is not able to use the original dentry.
This allows vfs_mkdir_return() to avoid the need for an extra lookup.
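
The mkdir_shared return convention described above can be sketched in
user space roughly as follows.  This is only an illustration under
stated assumptions, not kernel code: struct dentry, dput() and the
ERR_PTR helpers are simplified stand-ins, and toy_mkdir_return(),
struct dir_ops, the TOY_* constants and the fs_* callbacks are
hypothetical names that do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for kernel types and helpers (assumptions, not kernel code). */
struct dentry {
	int refcount;
	int positive;	/* non-zero once an "inode" is attached */
};

#define MAX_ERRNO	4095
#define TOY_EPERM	1
#define TOY_EIO		5

static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	/* The top MAX_ERRNO addresses encode negative error numbers. */
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static void dput(struct dentry *d) { d->refcount--; }

/* Hypothetical reduced operations table: one legacy op, one _shared op. */
struct dir_ops {
	int (*mkdir)(struct dentry *);
	struct dentry *(*mkdir_shared)(struct dentry *);
};

/*
 * Prefer ->mkdir_shared when present.  It returns NULL if it used the
 * dentry it was given, a replacement dentry if it could not, or an
 * ERR_PTR() value on failure.
 */
static int toy_mkdir_return(const struct dir_ops *ops, struct dentry **dentryp)
{
	struct dentry *de;

	if (!ops->mkdir_shared)
		return ops->mkdir ? ops->mkdir(*dentryp) : -TOY_EPERM;

	de = ops->mkdir_shared(*dentryp);
	if (IS_ERR(de))
		return (int)PTR_ERR(de);
	if (de) {
		dput(*dentryp);		/* drop the caller's unused dentry */
		*dentryp = de;		/* hand back the replacement */
	}
	return 0;
}

/* Three example "filesystems" exercising each outcome. */
static struct dentry replacement = { .refcount = 1, .positive = 1 };

static struct dentry *fs_uses_original(struct dentry *d)
{
	d->positive = 1;
	return NULL;
}

static struct dentry *fs_returns_new(struct dentry *d)
{
	(void)d;
	return &replacement;
}

static struct dentry *fs_fails(struct dentry *d)
{
	(void)d;
	return ERR_PTR(-TOY_EIO);
}
```

In this sketch the caller never needs a second lookup: after a
successful return, *dentryp is whichever dentry the filesystem made
positive.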

Signed-off-by: NeilBrown
---
 Documentation/filesystems/locking.rst |  28 ++++++-
 Documentation/filesystems/porting.rst |  10 +++
 Documentation/filesystems/vfs.rst     |  24 ++++++
 fs/namei.c                            | 108 +++++++++++++++++++-------
 include/linux/fs.h                    |  16 ++++
 5 files changed, 158 insertions(+), 28 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index f5e3676db954..7cacff59356f 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -57,15 +57,24 @@ inode_operations
 prototypes::
 
 	int (*create) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, bool);
+	int (*create_shared) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, bool);
 	struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
+	int (*link_shared) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
+	int (*unlink_shared) (struct inode *,struct dentry *);
 	int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
+	int (*symlink_shared) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
 	int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
+	struct dentry * (*mkdir_shared) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
 	int (*rmdir) (struct inode *,struct dentry *);
+	int (*rmdir_shared) (struct inode *,struct dentry *);
 	int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
+	int (*mknod_shared) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
 	int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
 			struct inode *, struct dentry *, unsigned int);
+	int (*rename_shared) (struct mnt_idmap *, struct inode *, struct dentry *,
+			struct inode *, struct dentry *, unsigned int);
 	int (*readlink) (struct dentry *, char __user *,int);
 	const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *);
 	void (*truncate) (struct inode *);
@@ -79,6 +88,9 @@ prototypes::
 	int (*atomic_open)(struct inode *, struct dentry *,
 				struct file *, unsigned open_flag,
 				umode_t create_mode);
+	int (*atomic_open_shared)(struct inode *, struct dentry *,
+				struct file *, unsigned open_flag,
+				umode_t create_mode);
 	int (*tmpfile) (struct mnt_idmap *, struct inode *,
 			struct file *, umode_t);
 	int (*fileattr_set)(struct mnt_idmap *idmap,
@@ -90,18 +102,29 @@ prototypes::
 locking rules:
 	all may block
 
+A "mixed" lock means that either i_rwsem on the directory is held
+exclusively, or it is held as a shared lock, and an exclusive lock is held
+on the dentry in that directory.
 ============== ==================================================
 ops            i_rwsem(inode)
 ============== ==================================================
 lookup:        shared
 create:        exclusive
+create_shared: mixed
 link:          exclusive (both)
+link_shared:   exclusive on source, mixed on target
 mknod:         exclusive
+mknod_shared:  mixed
 symlink:       exclusive
+symlink_shared: mixed
 mkdir:         exclusive
+mkdir_shared:  mixed
 unlink:        exclusive (both)
+unlink_shared: exclusive on object, mixed on directory/name
 rmdir:         exclusive (both)(see below)
+rmdir_shared:  exclusive on object, mixed on directory/name (see below)
 rename:        exclusive (both parents, some children) (see below)
+rename_shared: mixed (both parents) exclusive (some children) (see below)
 readlink:      no
 get_link:      no
 setattr:       exclusive
@@ -113,6 +136,7 @@ listxattr: no
 fiemap:        no
 update_time:   no
 atomic_open:   shared (exclusive if O_CREAT is set in open flags)
+atomic_open_shared: mixed (if O_CREAT is not set, then may not have exclusive lock on name)
 tmpfile:       no
 fileattr_get:  no or exclusive
 fileattr_set:  exclusive
@@ -120,8 +144,8 @@ get_offset_ctx no
 ============== ==================================================
 
 
-	Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
-	exclusive on victim.
+	Additionally, ->rmdir(), ->unlink() and ->rename(), as well as _shared
+	versions, have ->i_rwsem exclusive on victim.
 	cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
 	->unlink() and ->rename() have ->i_rwsem exclusive on all non-directories
 	involved.
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index 9ab2a3d6f2b4..c7f3825f280c 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1141,3 +1141,13 @@ pointer are gone.
 
 set_blocksize() takes opened struct file instead of struct block_device now
 and it *must* be opened exclusive.
+
+---
+
+**recommended**
+
+create_shared, link_shared, unlink_shared, rmdir_shared, mknod_shared,
+rename_shared, atomic_open_shared can be provided instead of the
+corresponding inode_operations without the "_shared" suffix.  Multiple
+_shared operations can be performed in a given directory concurrently,
+but never on the same name.
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 0b18af3f954e..c4860597975a 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -491,15 +491,24 @@ As of kernel 2.6.22, the following members are defined:
 
 	struct inode_operations {
 		int (*create) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool);
+		int (*create_shared) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool);
 		struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
 		int (*link) (struct dentry *,struct inode *,struct dentry *);
+		int (*link_shared) (struct dentry *,struct inode *,struct dentry *);
 		int (*unlink) (struct inode *,struct dentry *);
+		int (*unlink_shared) (struct inode *,struct dentry *);
 		int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
+		int (*symlink_shared) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
 		int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
+		struct dentry * (*mkdir_shared) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
 		int (*rmdir) (struct inode *,struct dentry *);
+		int (*rmdir_shared) (struct inode *,struct dentry *);
 		int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
+		int (*mknod_shared) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
 		int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
 			       struct inode *, struct dentry *, unsigned int);
+		int (*rename_shared) (struct mnt_idmap *, struct inode *, struct dentry *,
+			       struct inode *, struct dentry *, unsigned int);
 		int (*readlink) (struct dentry *, char __user *,int);
 		const char *(*get_link) (struct dentry *, struct inode *,
 					 struct delayed_call *);
@@ -511,6 +520,8 @@ As of kernel 2.6.22, the following members are defined:
 		void (*update_time)(struct inode *, struct timespec *, int);
 		int (*atomic_open)(struct inode *, struct dentry *, struct file *,
 				   unsigned open_flag, umode_t create_mode);
+		int (*atomic_open_shared)(struct inode *, struct dentry *, struct file *,
+				   unsigned open_flag, umode_t create_mode);
 		int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t);
 		struct posix_acl * (*get_acl)(struct mnt_idmap *, struct dentry *, int);
 		int (*set_acl)(struct mnt_idmap *, struct dentry *, struct posix_acl *, int);
@@ -524,6 +535,7 @@ Again, all methods are called without any locks being held, unless
 otherwise noted.
 
 ``create``
+``create_shared``
 	called by the open(2) and creat(2) system calls.  Only required
 	if you want to support regular files.  The dentry you get should
 	not have an inode (i.e. it should be a negative dentry).  Here
@@ -546,29 +558,39 @@ otherwise noted.
 	directory inode semaphore held
 
 ``link``
+``link_shared``
 	called by the link(2) system call.  Only required if you want to
 	support hard links.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
 ``unlink``
+``unlink_shared``
 	called by the unlink(2) system call.  Only required if you want
 	to support deleting inodes
 
 ``symlink``
+``symlink_shared``
 	called by the symlink(2) system call.  Only required if you want
 	to support symlinks.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
 ``mkdir``
+``mkdir_shared``
 	called by the mkdir(2) system call.  Only required if you want
 	to support creating subdirectories.  You will probably need to
 	call d_instantiate() just as you would in the create() method
 
+	mkdir_shared can return an alternate dentry, much like lookup.
+	In this case the original dentry will still be negative and will
+	be unhashed.
+
 ``rmdir``
+``rmdir_shared``
 	called by the rmdir(2) system call.  Only required if you want
 	to support deleting subdirectories
 
 ``mknod``
+``mknod_shared``
 	called by the mknod(2) system call to create a device (char,
 	block) inode or a named pipe (FIFO) or socket.  Only required if
 	you want to support creating these types of inodes.  You will
@@ -576,6 +598,7 @@ otherwise noted.
 	create() method
 
 ``rename``
+``rename_shared``
 	called by the rename(2) system call to rename the object to have
 	the parent and name given by the second inode and dentry.
 
@@ -647,6 +670,7 @@ otherwise noted.
 	itself and call mark_inode_dirty_sync.
 
 ``atomic_open``
+``atomic_open_shared``
 	called on the last component of an open.  Using this optional
 	method the filesystem can look up, possibly create and open the
 	file in one atomic operation.  If it wants to leave actual
diff --git a/fs/namei.c b/fs/namei.c
index cdd1fc9d56a0..65082378dc60 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3338,14 +3338,17 @@ int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	if (!dir->i_op->create)
+	if (!dir->i_op->create && !dir->i_op->create_shared)
 		return -EACCES;	/* shouldn't it be ENOSYS? */
 
 	mode = vfs_prepare_mode(idmap, dir, mode, S_IALLUGO, S_IFREG);
 	error = security_inode_create(dir, dentry, mode);
 	if (error)
 		return error;
-	error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
+	if (dir->i_op->create_shared)
+		error = dir->i_op->create_shared(idmap, dir, dentry, mode, want_excl);
+	else
+		error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
@@ -3506,8 +3509,12 @@ static struct dentry *atomic_open(struct nameidata *nd, struct dentry *dentry,
 
 	file->f_path.dentry = DENTRY_NOT_SET;
 	file->f_path.mnt = nd->path.mnt;
-	error = dir->i_op->atomic_open(dir, dentry, file,
-				       open_to_namei_flags(open_flag), mode);
+	if (dir->i_op->atomic_open_shared)
+		error = dir->i_op->atomic_open_shared(dir, dentry, file,
+						      open_to_namei_flags(open_flag), mode);
+	else
+		error = dir->i_op->atomic_open(dir, dentry, file,
+					       open_to_namei_flags(open_flag), mode);
 	d_lookup_done(dentry);
 	if (!error) {
 		if (file->f_mode & FMODE_OPENED) {
@@ -3616,7 +3623,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	}
 	if (create_error)
 		open_flag &= ~O_CREAT;
-	if (dir_inode->i_op->atomic_open) {
+	if (dir_inode->i_op->atomic_open || dir_inode->i_op->atomic_open_shared) {
 		dentry = atomic_open(nd, dentry, file, open_flag, mode);
 		if (unlikely(create_error) && dentry == ERR_PTR(-ENOENT))
 			dentry = ERR_PTR(create_error);
@@ -3641,13 +3648,17 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	if (!dentry->d_inode && (open_flag & O_CREAT)) {
 		file->f_mode |= FMODE_CREATED;
 		audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
-		if (!dir_inode->i_op->create) {
-			error = -EACCES;
-			goto out_dput;
-		}
 
-		error = dir_inode->i_op->create(idmap, dir_inode, dentry,
-						mode, open_flag & O_EXCL);
+		if (dir_inode->i_op->create_shared)
+			error = dir_inode->i_op->create_shared(idmap, dir_inode,
+							       dentry, mode,
+							       open_flag & O_EXCL);
+		else if (dir_inode->i_op->create)
+			error = dir_inode->i_op->create(idmap, dir_inode,
+							dentry, mode,
+							open_flag & O_EXCL);
+		else
+			error = -EACCES;
 		if (error)
 			goto out_dput;
 	}
@@ -4174,7 +4185,7 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	    !capable(CAP_MKNOD))
 		return -EPERM;
 
-	if (!dir->i_op->mknod)
+	if (!dir->i_op->mknod && !dir->i_op->mknod_shared)
 		return -EPERM;
 
 	mode = vfs_prepare_mode(idmap, dir, mode, mode, mode);
@@ -4186,7 +4197,10 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
+	if (dir->i_op->mknod_shared)
+		error = dir->i_op->mknod_shared(idmap, dir, dentry, mode, dev);
+	else
+		error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
@@ -4297,7 +4311,7 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	if (!dir->i_op->mkdir)
+	if (!dir->i_op->mkdir && !dir->i_op->mkdir_shared)
 		return -EPERM;
 
 	mode = vfs_prepare_mode(idmap, dir, mode, S_IRWXUGO | S_ISVTX, 0);
@@ -4308,7 +4322,16 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (max_links && dir->i_nlink >= max_links)
 		return -EMLINK;
 
-	error = dir->i_op->mkdir(idmap, dir, dentry, mode);
+	if (dir->i_op->mkdir_shared) {
+		struct dentry *de;
+		de = dir->i_op->mkdir_shared(idmap, dir, dentry, mode);
+		if (IS_ERR(de))
+			error = PTR_ERR(de);
+		else if (de)
+			dput(de);
+	} else {
+		error = dir->i_op->mkdir(idmap, dir, dentry, mode);
+	}
 	if (!error)
 		fsnotify_mkdir(dir, dentry);
 	return error;
@@ -4356,6 +4379,20 @@ int vfs_mkdir_return(struct mnt_idmap *idmap, struct inode *dir,
 	if (max_links && dir->i_nlink >= max_links)
 		return -EMLINK;
 
+	if (dir->i_op->mkdir_shared) {
+		struct dentry *de;
+
+		de = dir->i_op->mkdir_shared(idmap, dir, dentry, mode);
+		if (IS_ERR(de))
+			return PTR_ERR(de);
+		if (de) {
+			dput(dentry);
+			*dentryp = de;
+		}
+		fsnotify_mkdir(dir, dentry);
+		return 0;
+	}
+
 	error = dir->i_op->mkdir(idmap, dir, dentry,
mode); if (!error) { fsnotify_mkdir(dir, dentry); @@ -4439,7 +4476,7 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->rmdir) + if (!dir->i_op->rmdir && !dir->i_op->rmdir_shared) return -EPERM; dget(dentry); @@ -4454,7 +4491,10 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir, if (error) goto out; - error = dir->i_op->rmdir(dir, dentry); + if (dir->i_op->rmdir_shared) + error = dir->i_op->rmdir_shared(dir, dentry); + else + error = dir->i_op->rmdir(dir, dentry); if (error) goto out; @@ -4569,7 +4609,7 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->unlink) + if (!dir->i_op->unlink && !dir->i_op->unlink_shared) return -EPERM; inode_lock(target); @@ -4583,7 +4623,10 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir, error = try_break_deleg(target, delegated_inode); if (error) goto out; - error = dir->i_op->unlink(dir, dentry); + if (dir->i_op->unlink_shared) + error = dir->i_op->unlink_shared(dir, dentry); + else + error = dir->i_op->unlink(dir, dentry); if (!error) { dont_mount(dentry); detach_mounts(dentry); @@ -4722,14 +4765,17 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir, if (error) return error; - if (!dir->i_op->symlink) + if (!dir->i_op->symlink && !dir->i_op->symlink_shared) return -EPERM; error = security_inode_symlink(dir, dentry, oldname); if (error) return error; - error = dir->i_op->symlink(idmap, dir, dentry, oldname); + if (dir->i_op->symlink_shared) + error = dir->i_op->symlink_shared(idmap, dir, dentry, oldname); + else + error = dir->i_op->symlink(idmap, dir, dentry, oldname); if (!error) fsnotify_create(dir, dentry); return error; @@ -4835,7 +4881,7 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap, */ if (HAS_UNMAPPED_ID(idmap, inode)) return -EPERM; - if (!dir->i_op->link) + if (!dir->i_op->link && !dir->i_op->link_shared) return -EPERM; if (S_ISDIR(inode->i_mode)) return -EPERM; @@ 
-4852,7 +4898,11 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap, error = -EMLINK; else { error = try_break_deleg(inode, delegated_inode); - if (!error) + if (error) + ; + else if (dir->i_op->link_shared) + error = dir->i_op->link_shared(old_dentry, dir, new_dentry); + else error = dir->i_op->link(old_dentry, dir, new_dentry); } @@ -5044,7 +5094,7 @@ int vfs_rename(struct renamedata *rd) if (error) return error; - if (!old_dir->i_op->rename) + if (!old_dir->i_op->rename && !old_dir->i_op->rename_shared) return -EPERM; /* @@ -5127,8 +5177,14 @@ int vfs_rename(struct renamedata *rd) if (error) goto out; } - error = old_dir->i_op->rename(rd->new_mnt_idmap, old_dir, old_dentry, - new_dir, new_dentry, flags); + if (old_dir->i_op->rename_shared) + error = old_dir->i_op->rename_shared(rd->new_mnt_idmap, + old_dir, old_dentry, + new_dir, new_dentry, flags); + else + error = old_dir->i_op->rename(rd->new_mnt_idmap, + old_dir, old_dentry, + new_dir, new_dentry, flags); if (error) goto out; diff --git a/include/linux/fs.h b/include/linux/fs.h index 406887d0394e..68eba181175b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2147,17 +2147,30 @@ struct inode_operations { int (*create) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool); + int (*create_shared) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t, bool); int (*link) (struct dentry *,struct inode *,struct dentry *); + int (*link_shared) (struct dentry *,struct inode *,struct dentry *); int (*unlink) (struct inode *,struct dentry *); + int (*unlink_shared) (struct inode *,struct dentry *); int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *, const char *); + int (*symlink_shared) (struct mnt_idmap *, struct inode *,struct dentry *, + const char *); int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t); + struct dentry * (*mkdir_shared) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t); int (*rmdir) (struct 
inode *,struct dentry *); + int (*rmdir_shared) (struct inode *,struct dentry *); int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t,dev_t); + int (*mknod_shared) (struct mnt_idmap *, struct inode *,struct dentry *, + umode_t,dev_t); int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *, struct inode *, struct dentry *, unsigned int); + int (*rename_shared) (struct mnt_idmap *, struct inode *, struct dentry *, + struct inode *, struct dentry *, unsigned int); int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *); int (*getattr) (struct mnt_idmap *, const struct path *, struct kstat *, u32, unsigned int); @@ -2168,6 +2181,9 @@ struct inode_operations { int (*atomic_open)(struct inode *, struct dentry *, struct file *, unsigned open_flag, umode_t create_mode); + int (*atomic_open_shared)(struct inode *, struct dentry *, + struct file *, unsigned open_flag, + umode_t create_mode); int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t); struct posix_acl *(*get_acl)(struct mnt_idmap *, struct dentry *, From patchwork Fri Dec 20 02:54:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13916085 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD4CF3FB3B; Fri, 20 Dec 2024 03:09:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664145; cv=none; b=KfIB6HcIKDt6mo3hBL9wsJC+6K/IRxI/qH3KWAuM0ag1hjybbakX9Jo9Sa9OhH/LxHQfrbfAcYbhSISRzk1u4bNb4yXfCebBo+DiUxdH7ut2oJzRrnokY9HXP+Lfjy77+hiSdwsaXa4Pmk7RXb3s5zZvuLAlT2Sb+lBYXrBaYwY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1734664145; c=relaxed/simple; bh=w33MPL6ovjMTnedTGYecGBOG8LDHcQhiopmZSXpIHlI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Q0pGaHetqlJlOMghGQk46z/lUXuaQ2j3EL+3Ynwk02qdp5JnYcZhxpB5aa14mthYf/qfxSkht1T+ASubI16YuDTnSTPrifUBmptJZwuNlmVMVVsNF5qoINqmldPbRBpX3L9KYbgwZef/EUXGnyTrnGh7Z+GWJ48rOGn2fs5yiOY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=vIPS/TLX; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=DBf6EhuH; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=OGFxWMZi; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=NVW/9eU9; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="vIPS/TLX"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="DBf6EhuH"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="OGFxWMZi"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="NVW/9eU9" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id E90501F385; Fri, 20 Dec 2024 03:08:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664140; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=qTO73NcFFqKlwypAnojPyMpjZxGX6UJrklaQRP2Du4k=; b=vIPS/TLX4Xk2lsT5xQUzadNEK52n9cVPpX39TjA/dnSocyrUJ0akI+r+b9VnrTjIRv1fKc 2gpd8L74uH3E+edQToKHYuvKuBKt2Jx6cyGhWSZ4sMv5keYOzhQTU/xUArcRIAPnnYelle 9LQHWHtuWO03HLf+KMWQFYj+tvXDDNQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664140; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qTO73NcFFqKlwypAnojPyMpjZxGX6UJrklaQRP2Du4k=; b=DBf6EhuHNIqKg2jhRz/VT/HumbLi7bONxVKUPCPCEyvZcmoGVvLrYMN2iRoLC4NafbTGWo x89oYeUqSt7gkvAA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664139; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qTO73NcFFqKlwypAnojPyMpjZxGX6UJrklaQRP2Du4k=; b=OGFxWMZiIySE6bx0Sv8KTPxY8vRK6NJf3P19RfotvVtH2u13SOUlF/fiXrXBXQ/MFKrhh4 7GPxKr2ULVhjN9x9msqNdPclUdqk2BrnPXCbJqs4ouElViJwh84jVbjkj3ywOI7mTmU42X NDCcvA9Bm4vrIawSrcgT++iskD3Huzs= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664139; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qTO73NcFFqKlwypAnojPyMpjZxGX6UJrklaQRP2Du4k=; b=NVW/9eU9L3b0ffPUy+WKw/HwMBoKLBFJbak0VLddP92W4hmyf3c6W5uv8nug/qieIQBQtD ms2agmb84LO3M2BQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 03/11] VFS: use global wait-queue table for d_alloc_parallel()
Date: Fri, 20 Dec 2024 13:54:21 +1100
Message-ID: <20241220030830.272429-4-neilb@suse.de>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

d_alloc_parallel() currently requires a wait_queue_head to be passed in.
This must have a lifetime which extends until the lookup is completed.

Future proposed patches will use d_alloc_parallel() for names being
created/unlinked etc. Some filesystems combine lookup with create,
making a longer code path that the wq needs to live for. If it is still
to be allocated on-stack this can be cumbersome.
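
As a rough illustration of the table-plus-key idea in this patch, here is a minimal userspace sketch in C. It is not kernel code and not the patch's implementation: `struct waiter`, `struct wait_queue`, `pick_queue()`, `add_waiter()`, `wake_keyed()` and `MAX_WAITERS` are hypothetical names invented for the model. Only `PAR_LOOKUP_WQS` and the `pid % PAR_LOOKUP_WQS` selection are taken from the diff in this mail.

```c
#include <assert.h>
#include <stddef.h>

#define PAR_LOOKUP_WQS 256	/* table size taken from the patch */
#define MAX_WAITERS 8		/* arbitrary bound, only for this model */

/* Hypothetical stand-in for a wait-queue entry that carries a key. */
struct waiter {
	const void *key;	/* the dentry this waiter is waiting for */
	int woken;
};

/* Hypothetical stand-in for wait_queue_head_t. */
struct wait_queue {
	struct waiter *entries[MAX_WAITERS];
	size_t count;
};

static struct wait_queue par_wait_table[PAR_LOOKUP_WQS];

/* Choose a shared queue from the low bits of the waiter's pid. */
static struct wait_queue *pick_queue(unsigned int pid)
{
	return &par_wait_table[pid % PAR_LOOKUP_WQS];
}

/* Queue a waiter; returns 0 on success, -1 if the model's queue is full. */
static int add_waiter(struct wait_queue *q, struct waiter *w)
{
	assert(q && w);
	if (q->count == MAX_WAITERS)
		return -1;
	q->entries[q->count++] = w;
	return 0;
}

/* Wake only the waiters whose key matches; returns how many were woken. */
static int wake_keyed(struct wait_queue *q, const void *key)
{
	int woken = 0;

	for (size_t i = 0; i < q->count; i++) {
		struct waiter *w = q->entries[i];

		if (w->key == key && !w->woken) {
			w->woken = 1;
			woken++;
		}
	}
	return woken;
}
```

In this model, tasks with pids 1 and 257 land on the same queue, yet a wake-up keyed on one dentry leaves the waiter for a different dentry asleep.
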

This patch replaces the on-stack wqs with a global array of wqs which
are used as needed. A wq is NOT allocated when a dentry is first
created but only when a second thread attempts to use the same name and
so is forced to wait. At this moment a wq is chosen using the
least-significant bits of the task's pid and that wq is assigned to
->d_wait. The ->d_lock is then dropped and the task waits.

When the dentry is finally moved out of "in_lookup" a wake up is only
sent if ->d_wait is not NULL. This avoids an (uncontended) spin
lock/unlock which saves a couple of atomic operations in a common case.

The wake up passes the dentry that the wake up is for as the "key" and
the waiter will only wake processes waiting on the same key. This means
that when these global waitqueues are shared (which is inevitable
though unlikely to be frequent), a task will not be woken prematurely.

Signed-off-by: NeilBrown
---
 fs/afs/dir_silly.c      |  4 +--
 fs/dcache.c             | 69 +++++++++++++++++++++++++++++++++--------
 fs/fuse/readdir.c       |  3 +-
 fs/namei.c              |  6 ++--
 fs/nfs/dir.c            |  6 ++--
 fs/nfs/unlink.c         |  3 +-
 fs/proc/base.c          |  3 +-
 fs/proc/proc_sysctl.c   |  3 +-
 fs/smb/client/readdir.c |  3 +-
 include/linux/dcache.h  |  3 +-
 include/linux/nfs_xdr.h |  1 -
 11 files changed, 67 insertions(+), 37 deletions(-)

diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c
index a1e581946b93..aa4363a1c6fa 100644
--- a/fs/afs/dir_silly.c
+++ b/fs/afs/dir_silly.c
@@ -239,13 +239,11 @@ int afs_silly_iput(struct dentry *dentry, struct inode *inode)
 	struct dentry *alias;
 	int ret;

-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-
 	_enter("%p{%pd},%llx", dentry, dentry, vnode->fid.vnode);

 	down_read(&dvnode->rmdir_lock);

-	alias = d_alloc_parallel(dentry->d_parent, &dentry->d_name, &wq);
+	alias = d_alloc_parallel(dentry->d_parent, &dentry->d_name);
 	if (IS_ERR(alias)) {
 		up_read(&dvnode->rmdir_lock);
 		return 0;
diff --git a/fs/dcache.c b/fs/dcache.c
index b4d5e9e1e43d..ebe849474bd8 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2078,8 +2078,7 @@ struct dentry *d_add_ci(struct dentry *dentry, struct inode *inode,
 		return found;
 	}
 	if (d_in_lookup(dentry)) {
-		found = d_alloc_parallel(dentry->d_parent, name,
-					 dentry->d_wait);
+		found = d_alloc_parallel(dentry->d_parent, name);
 		if (IS_ERR(found) || !d_in_lookup(found)) {
 			iput(inode);
 			return found;
@@ -2089,7 +2088,7 @@ struct dentry *d_add_ci(struct dentry *dentry, struct inode *inode,
 		if (!found) {
 			iput(inode);
 			return ERR_PTR(-ENOMEM);
-		}
+		}
 	}
 	res = d_splice_alias(inode, found);
 	if (res) {
@@ -2459,30 +2458,70 @@ static inline unsigned start_dir_add(struct inode *dir)
 }

 static inline void end_dir_add(struct inode *dir, unsigned int n,
-			       wait_queue_head_t *d_wait)
+			       wait_queue_head_t *d_wait, struct dentry *de)
 {
 	smp_store_release(&dir->i_dir_seq, n + 2);
 	preempt_enable_nested();
-	wake_up_all(d_wait);
+	if (d_wait)
+		__wake_up(d_wait, TASK_NORMAL, 0, de);
+}
+
+#define PAR_LOOKUP_WQS 256
+static wait_queue_head_t par_wait_table[PAR_LOOKUP_WQS] __cacheline_aligned;
+
+static int __init par_wait_init(void)
+{
+	int i;
+
+	for (i = 0; i < PAR_LOOKUP_WQS; i++)
+		init_waitqueue_head(&par_wait_table[i]);
+	return 0;
+}
+fs_initcall(par_wait_init);
+
+struct par_wait_key {
+	struct dentry *de;
+	struct wait_queue_entry wqe;
+};
+
+static int d_wait_wake_fn(struct wait_queue_entry *wq_entry,
+			  unsigned mode, int sync, void *key)
+{
+	struct par_wait_key *pwk = container_of(wq_entry,
+						 struct par_wait_key, wqe);
+	if (pwk->de == key)
+		return default_wake_function(wq_entry, mode, sync, key);
+	return 0;
 }

 static void d_wait_lookup(struct dentry *dentry)
 {
 	if (d_in_lookup(dentry)) {
-		DECLARE_WAITQUEUE(wait, current);
-		add_wait_queue(dentry->d_wait, &wait);
+		struct par_wait_key wk = {
+			.de = dentry,
+			.wqe = {
+				.private = current,
+				.func = d_wait_wake_fn,
+			},
+		};
+		struct wait_queue_head *wq;
+		if (!dentry->d_wait)
+			dentry->d_wait = &par_wait_table[current->pid %
+							 PAR_LOOKUP_WQS];
+		wq = dentry->d_wait;
+		add_wait_queue(wq, &wk.wqe);
 		do {
 			set_current_state(TASK_UNINTERRUPTIBLE);
 			spin_unlock(&dentry->d_lock);
 			schedule();
 			spin_lock(&dentry->d_lock);
 		} while (d_in_lookup(dentry));
+		remove_wait_queue(wq, &wk.wqe);
 	}
 }

 struct dentry *d_alloc_parallel(struct dentry *parent,
-				const struct qstr *name,
-				wait_queue_head_t *wq)
+				const struct qstr *name)
 {
 	unsigned int hash = name->hash;
 	struct hlist_bl_head *b = in_lookup_hash(parent, hash);
@@ -2579,7 +2618,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent,
 	rcu_read_unlock();
 	/* we can't take ->d_lock here; it's OK, though. */
 	new->d_flags |= DCACHE_PAR_LOOKUP;
-	new->d_wait = wq;
+	new->d_wait = NULL;
 	hlist_bl_add_head(&new->d_u.d_in_lookup_hash, b);
 	hlist_bl_unlock(b);
 	return new;
@@ -2616,8 +2655,12 @@ static wait_queue_head_t *__d_lookup_unhash(struct dentry *dentry)

 void __d_lookup_unhash_wake(struct dentry *dentry)
 {
+	wait_queue_head_t *d_wait;
+
 	spin_lock(&dentry->d_lock);
-	wake_up_all(__d_lookup_unhash(dentry));
+	d_wait = __d_lookup_unhash(dentry);
+	if (d_wait)
+		__wake_up(d_wait, TASK_NORMAL, 0, dentry);
 	spin_unlock(&dentry->d_lock);
 }
 EXPORT_SYMBOL(__d_lookup_unhash_wake);
@@ -2645,7 +2688,7 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
 	}
 	__d_rehash(dentry);
 	if (dir)
-		end_dir_add(dir, n, d_wait);
+		end_dir_add(dir, n, d_wait, dentry);
 	spin_unlock(&dentry->d_lock);
 	if (inode)
 		spin_unlock(&inode->i_lock);
@@ -2863,7 +2906,7 @@ static void __d_move(struct dentry *dentry, struct dentry *target,
 	write_seqcount_end(&dentry->d_seq);

 	if (dir)
-		end_dir_add(dir, n, d_wait);
+		end_dir_add(dir, n, d_wait, target);

 	if (dentry->d_parent != old_parent)
 		spin_unlock(&dentry->d_parent->d_lock);
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index 17ce9636a2b1..c6b646a3f1bd 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -160,7 +160,6 @@ static int fuse_direntplus_link(struct file *file,
 	struct inode *dir = d_inode(parent);
 	struct fuse_conn *fc;
 	struct inode *inode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);

 	if (!o->nodeid) {
 		/*
@@ -195,7 +194,7 @@ static int fuse_direntplus_link(struct file *file,
 	dentry = d_lookup(parent, &name);
 	if (!dentry) {
 retry:
-		dentry = d_alloc_parallel(parent, &name, &wq);
+		dentry = d_alloc_parallel(parent, &name);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 	}
diff --git a/fs/namei.c b/fs/namei.c
index 65082378dc60..174e6693304e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1767,13 +1767,12 @@ static struct dentry *__lookup_slow(const struct qstr *name,
 {
 	struct dentry *dentry, *old;
 	struct inode *inode = dir->d_inode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);

 	/* Don't go there if it's already dead */
 	if (unlikely(IS_DEADDIR(inode)))
 		return ERR_PTR(-ENOENT);
 again:
-	dentry = d_alloc_parallel(dir, name, &wq);
+	dentry = d_alloc_parallel(dir, name);
 	if (IS_ERR(dentry))
 		return dentry;
 	if (unlikely(!d_in_lookup(dentry))) {
@@ -3566,7 +3565,6 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	struct dentry *dentry;
 	int error, create_error = 0;
 	umode_t mode = op->mode;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);

 	if (unlikely(IS_DEADDIR(dir_inode)))
 		return ERR_PTR(-ENOENT);
@@ -3575,7 +3573,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 	dentry = d_lookup(dir, &nd->last);
 	for (;;) {
 		if (!dentry) {
-			dentry = d_alloc_parallel(dir, &nd->last, &wq);
+			dentry = d_alloc_parallel(dir, &nd->last);
 			if (IS_ERR(dentry))
 				return dentry;
 		}
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 492cffd9d3d8..531bf586501f 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -725,7 +725,6 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
 		unsigned long dir_verifier)
 {
 	struct qstr filename = QSTR_INIT(entry->name, entry->len);
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	struct dentry *dentry;
 	struct dentry *alias;
 	struct inode *inode;
@@ -754,7 +753,7 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
 	dentry = d_lookup(parent, &filename);
 again:
 	if (!dentry) {
-		dentry = d_alloc_parallel(parent, &filename, &wq);
+		dentry = d_alloc_parallel(parent, &filename);
 		if (IS_ERR(dentry))
 			return;
 	}
@@ -2069,7 +2068,6 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 		    struct file *file, unsigned open_flags,
 		    umode_t mode)
 {
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	struct nfs_open_context *ctx;
 	struct dentry *res;
 	struct iattr attr = { .ia_valid = ATTR_OPEN };
@@ -2125,7 +2123,7 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
 		d_drop(dentry);
 		switched = true;
 		dentry = d_alloc_parallel(dentry->d_parent,
-					  &dentry->d_name, &wq);
+					  &dentry->d_name);
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
 		if (unlikely(!d_in_lookup(dentry)))
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index bf77399696a7..d44162d3a8f1 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -124,7 +124,7 @@ static int nfs_call_unlink(struct dentry *dentry, struct inode *inode, struct nf
 	struct dentry *alias;

 	down_read_non_owner(&NFS_I(dir)->rmdir_sem);
-	alias = d_alloc_parallel(dentry->d_parent, &data->args.name, &data->wq);
+	alias = d_alloc_parallel(dentry->d_parent, &data->args.name);
 	if (IS_ERR(alias)) {
 		up_read_non_owner(&NFS_I(dir)->rmdir_sem);
 		return 0;
@@ -185,7 +185,6 @@ nfs_async_unlink(struct dentry *dentry, const struct qstr *name)

 	data->cred = get_current_cred();
 	data->res.dir_attr = &data->dir_attr;
-	init_waitqueue_head(&data->wq);

 	status = -EBUSY;
 	spin_lock(&dentry->d_lock);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 0edf14a9840e..0c2d5583aef3 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2125,8 +2125,7 @@ bool proc_fill_cache(struct file *file, struct dir_context *ctx,

 	child = d_hash_and_lookup(dir, &qname);
 	if (!child) {
-		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-		child = d_alloc_parallel(dir, &qname, &wq);
+		child = d_alloc_parallel(dir, &qname);
 		if (IS_ERR(child))
 			goto end_instantiate;
 		if (d_in_lookup(child)) {
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 27a283d85a6e..cd7999439aa9 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -693,8 +693,7 @@ static bool proc_sys_fill_cache(struct file *file,

 	child = d_lookup(dir, &qname);
 	if (!child) {
-		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-		child = d_alloc_parallel(dir, &qname, &wq);
+		child = d_alloc_parallel(dir, &qname);
 		if (IS_ERR(child))
 			return false;
 		if (d_in_lookup(child)) {
diff --git a/fs/smb/client/readdir.c b/fs/smb/client/readdir.c
index 273358d20a46..2c4c9b5ce3a2 100644
--- a/fs/smb/client/readdir.c
+++ b/fs/smb/client/readdir.c
@@ -73,7 +73,6 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 	struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
 	bool posix = cifs_sb_master_tcon(cifs_sb)->posix_extensions;
 	bool reparse_need_reval = false;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	int rc;

 	cifs_dbg(FYI, "%s: for %s\n", __func__, name->name);
@@ -105,7 +104,7 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 		    (fattr->cf_flags & CIFS_FATTR_NEED_REVAL))
 			return;

-		dentry = d_alloc_parallel(parent, name, &wq);
+		dentry = d_alloc_parallel(parent, name);
 	}
 	if (IS_ERR(dentry))
 		return;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index bff956f7b2b9..b64c0260e4be 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -235,8 +235,7 @@ extern void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op
 /* allocate/de-allocate */
 extern struct dentry * d_alloc(struct dentry *, const struct qstr *);
 extern struct dentry * d_alloc_anon(struct super_block *);
-extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *,
-					wait_queue_head_t *);
+extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *);
 extern struct dentry * d_splice_alias(struct inode *, struct dentry *);
 extern struct dentry * d_add_ci(struct dentry *, struct inode *, struct qstr *);
 extern bool d_same_name(const struct dentry *dentry, const struct dentry *parent,
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 559273a0f16d..7b807b6c1e81 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1735,7 +1735,6 @@ struct nfs_unlinkdata {
 	struct nfs_removeargs args;
 	struct nfs_removeres res;
 	struct dentry *dentry;
-	wait_queue_head_t wq;
 	const struct cred *cred;
 	struct nfs_fattr dir_attr;
 	long timeout;

From patchwork Fri Dec 20 02:54:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916086
header.d=suse.de header.s=susede2_rsa header.b="L9vPEMp/"; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=iKz9WiNl DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664145; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7SySF3mKOgCcyFExR0bWEbGojqlSsrDz567ZlnXh0aQ=; b=L9vPEMp/UxYN+/TvCtANB7QEIAIwECEGukTeCEbqMvpqmcPFRh5DqTxE2rjbnexf/xpP1x n6DzMhA2/W0DXYI7nb0vhnZYVKRHEL8TKVZ+i0FI9IxwPaZfdizYvqXWd6PAeCFnCyVu0S 2/4FlWWeZQwRQpWcX8tyPf8cXRaYWeQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664145; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7SySF3mKOgCcyFExR0bWEbGojqlSsrDz567ZlnXh0aQ=; b=iKz9WiNlUyZaC1s+qMBX9OEtt7CTp4R2wPOrXM8iP76DyH5zZuZ3j1hoKbbdRmHzB9i9OX 9oGgVEa3FGl2IOCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 4BB9813A32; Fri, 20 Dec 2024 03:09:03 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id p52qAM/fZGdNGAAAD6G6ig (envelope-from ); Fri, 20 Dec 2024 03:09:03 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 04/11] VFS: use d_alloc_parallel() in lookup_one_qstr_excl() Date: Fri, 20 Dec 2024 13:54:22 +1100 Message-ID: <20241220030830.272429-5-neilb@suse.de> X-Mailer: git-send-email 2.47.0 In-Reply-To: 
<20241220030830.272429-1-neilb@suse.de> References: <20241220030830.272429-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 73D6E1F385 X-Spam-Level: X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; FROM_EQ_ENVFROM(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCPT_COUNT_FIVE(0.00)[6]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns,suse.de:email,suse.de:dkim,suse.de:mid] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 X-Spam-Flag: NO lookup_one_qstr_excl() is used for lookups prior to directory modifications, whether create, unlink, rename, or whatever. To prepare for allowing modifications to happen in parallel, change lookup_one_qstr_excl() to use d_alloc_parallel(). If any of the "intent" LOOKUP flags are passed, the caller must ensure d_lookup_done() is called at an appropriate time. If none are passed then we can be sure ->lookup() will do a real lookup and d_lookup_done() is called internally.
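The calling convention described above can be sketched as a small userspace model. This is only an illustration, not kernel code: the toy_* names, struct toy_dentry, and the TOY_* flag values are made up for the sketch, and the single in_lookup flag stands in for the real in-lookup state of a dentry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the LOOKUP_* intent flags (illustration only). */
#define TOY_LOOKUP_OPEN          0x0100
#define TOY_LOOKUP_CREATE        0x0200
#define TOY_LOOKUP_EXCL          0x0400
#define TOY_LOOKUP_RENAME_TARGET 0x0800
#define TOY_LOOKUP_INTENT_FLAGS  (TOY_LOOKUP_OPEN | TOY_LOOKUP_CREATE | \
				  TOY_LOOKUP_EXCL | TOY_LOOKUP_RENAME_TARGET)

/* Toy dentry: only tracks whether it is still "in lookup". */
struct toy_dentry {
	bool in_lookup;
};

/* Models d_lookup_done(): safe to call whether or not the lookup is pending. */
void toy_lookup_done(struct toy_dentry *d)
{
	d->in_lookup = false;
}

/* Models the new lookup_one_qstr_excl() behaviour for a freshly allocated dentry. */
void toy_lookup_excl(struct toy_dentry *d, unsigned int flags)
{
	d->in_lookup = true;	/* as if d_alloc_parallel() returned a new dentry */
	/* ->lookup() would run here */
	if ((flags & TOY_LOOKUP_INTENT_FLAGS) == 0)
		toy_lookup_done(d);	/* no intent: lookup gave the final answer */
}

/* Models a caller that passes intent flags and therefore owns the cleanup. */
bool toy_create_leaves_in_lookup(unsigned int flags)
{
	struct toy_dentry d;

	toy_lookup_excl(&d, flags);
	/* the intended operation (create, rename, ...) would run here */
	toy_lookup_done(&d);	/* required when intent flags were passed */
	return d.in_lookup;
}
```

With no intent flags the dentry leaves toy_lookup_excl() with in_lookup cleared; with any intent flag it stays set until the caller calls toy_lookup_done(), which is the obligation the d_lookup_done() calls added to filename_create(), do_renameat2() and ksmbd_vfs_rename() appear to satisfy.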
Signed-off-by: NeilBrown --- fs/namei.c | 21 ++++++++++++++------- fs/smb/server/vfs.c | 1 + include/linux/namei.h | 3 +++ 3 files changed, 18 insertions(+), 7 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 174e6693304e..395bfbc8fc92 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1664,11 +1664,9 @@ static struct dentry *lookup_dcache(const struct qstr *name, } /* - * Parent directory has inode locked exclusive. This is one - * and only case when ->lookup() gets called on non in-lookup - * dentries - as the matter of fact, this only gets called - * when directory is guaranteed to have no in-lookup children - * at all. + * Parent directory has inode locked exclusive. + * If @flags contains any LOOKUP_INTENT_FLAGS then d_lookup_done() + * must be called after the intended operation is performed - or aborted. */ struct dentry *lookup_one_qstr_excl(const struct qstr *name, struct dentry *base, @@ -1685,15 +1683,22 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name, if (unlikely(IS_DEADDIR(dir))) return ERR_PTR(-ENOENT); - dentry = d_alloc(base, name); - if (unlikely(!dentry)) + dentry = d_alloc_parallel(base, name); + if (unlikely(IS_ERR_OR_NULL(dentry))) return ERR_PTR(-ENOMEM); + if (!d_in_lookup(dentry)) + /* Raced with another thread which did the lookup */ + return dentry; old = dir->i_op->lookup(dir, dentry, flags); if (unlikely(old)) { + d_lookup_done(dentry); dput(dentry); dentry = old; } + if ((flags & LOOKUP_INTENT_FLAGS) == 0) + /* ->lookup must have given final answer */ + d_lookup_done(dentry); return dentry; } EXPORT_SYMBOL(lookup_one_qstr_excl); @@ -4112,6 +4117,7 @@ static struct dentry *filename_create(int dfd, struct filename *name, } return dentry; fail: + d_lookup_done(dentry); dput(dentry); dentry = ERR_PTR(error); unlock: @@ -5340,6 +5346,7 @@ int do_renameat2(int olddfd, struct filename *from, int newdfd, rd.flags = flags; error = vfs_rename(&rd); exit5: + d_lookup_done(new_dentry); dput(new_dentry); exit4: dput(old_dentry); 
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c index dfb0eee5f5f3..83131f08bfb4 100644 --- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -772,6 +772,7 @@ int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path, ksmbd_debug(VFS, "vfs_rename failed err %d\n", err); out4: + d_lookup_done(new_dentry); dput(new_dentry); out3: dput(old_parent); diff --git a/include/linux/namei.h b/include/linux/namei.h index 8ec8fed3bce8..15118992f745 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -34,6 +34,9 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT}; #define LOOKUP_EXCL 0x0400 /* ... in exclusive creation */ #define LOOKUP_RENAME_TARGET 0x0800 /* ... in destination of rename() */ +#define LOOKUP_INTENT_FLAGS (LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_EXCL | \ + LOOKUP_RENAME_TARGET) + /* internal use only */ #define LOOKUP_PARENT 0x0010 From patchwork Fri Dec 20 02:54:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13916087 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A5D831632C8; Fri, 20 Dec 2024 03:09:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664154; cv=none; b=JPNEjJIlVx6g+4tH4BBLLUtA1pdJkPU4vZH4hmjCm9NI/gUyQi/UMGNUV1AbjWD6x+imx1tnOzlDJqd7shSGi5mAJWkAha6j1SfjyMhmi4wkgyuJJ19H85yATSXshPwt2oNDgLIHyXFo0YJkAkkeTFD8SRG4s7ows1++MISy9JY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664154; c=relaxed/simple; bh=igjLZU34N84JAJHbkG3DJsr5Py53lspSM/nGwQzLgm8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=uZtnEZtQ17bH9wwL8xrp9GFtaZzeCh2Q4W0cpD7yJrlINTkXx+3Oqs/kKkV83GD3tfvUUnTQQdECFpeQN/O2EXAfDO3LXsApWrTaNa7F7pdJZaZmLx5bzlX32pdVyYET1xdKTCobERA5vz2U01XoM743CIB8UQc7yJvKngLbAek= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=LXDUHvSK; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=FUUGJQu8; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=RjTNp9b2; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=DOH2YnTk; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="LXDUHvSK"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="FUUGJQu8"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="RjTNp9b2"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="DOH2YnTk" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id E42AD1F385; Fri, 20 Dec 2024 03:09:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664151; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g78TC/Zbx8kwsLCDh78LnjHG14nWVEMWb4e/lUvSCGI=; 
b=LXDUHvSKAxVzHleis25dnUqV8IpKRuDk5GO/8ijcmuYt5KDBG+gh6KyQsHq3YD1YJyGy7e gprXAs2Sg5z+dftCPBFuzb4UVVaZZvnd7tVhd9yv6H3oJ3fdj9rb0pZuqYYLfiAOsmtITj 1wWfPt0ieIToiZy/3UTDCdpBYz0990M= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664151; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g78TC/Zbx8kwsLCDh78LnjHG14nWVEMWb4e/lUvSCGI=; b=FUUGJQu85zdXk/lvyhN6CF/EDkh8S1i6ui0yCk0VRbW1OWXlWEUuNp5d/G2QVeLzyFQDRN zk82QMPDlK3raLDA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=RjTNp9b2; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=DOH2YnTk DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664150; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g78TC/Zbx8kwsLCDh78LnjHG14nWVEMWb4e/lUvSCGI=; b=RjTNp9b27DSfXom89DQt6DbUZnL1ES/qS28LHCKFsmL2np1bC5cq7qY418T860GG8R6q8H sOdEpsf9LZFwFYQGtdaw7OgoRC7COVLCLA/9ZzKazKjpOfPXKYguQY+x8OnZPDj1X4IoF9 1+vrGk8Of06PIP3DO9aZMTvi5mTvel4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664150; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g78TC/Zbx8kwsLCDh78LnjHG14nWVEMWb4e/lUvSCGI=; b=DOH2YnTkN5bh0qVRjo/XzqC582Il62loqkNL3QhPPDJSm+60ceCoUXMNI3iu9WJCTYLtu7 MMXJZnVk43laxPBg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by 
imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id CC20913A32; Fri, 20 Dec 2024 03:09:08 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 42lDINTfZGdTGAAAD6G6ig (envelope-from ); Fri, 20 Dec 2024 03:09:08 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 05/11] VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry Date: Fri, 20 Dec 2024 13:54:23 +1100 Message-ID: <20241220030830.272429-6-neilb@suse.de> X-Mailer: git-send-email 2.47.0 In-Reply-To: <20241220030830.272429-1-neilb@suse.de> References: <20241220030830.272429-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: E42AD1F385 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; FROM_EQ_ENVFROM(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCPT_COUNT_FIVE(0.00)[6]; 
DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo,suse.de:dkim,suse.de:mid,suse.de:email] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: No callers of kern_path_locked() or user_path_locked_at() want a negative dentry. So change them to return -ENOENT instead. This simplifies callers. This results in a subtle change to bcachefs in that an ioctl will now return -ENOENT in preference to -EXDEV. I believe this restores the behaviour to what it was prior to Commit bbe6a7c899e7 ("bch2_ioctl_subvolume_destroy(): fix locking") Signed-off-by: NeilBrown --- drivers/base/devtmpfs.c | 65 +++++++++++++++++++---------------------- fs/bcachefs/fs-ioctl.c | 4 --- fs/namei.c | 4 +++ kernel/audit_watch.c | 12 ++++---- 4 files changed, 40 insertions(+), 45 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index b848764ef018..c9e34842139f 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -245,15 +245,12 @@ static int dev_rmdir(const char *name) dentry = kern_path_locked(name, &parent); if (IS_ERR(dentry)) return PTR_ERR(dentry); - if (d_really_is_positive(dentry)) { - if (d_inode(dentry)->i_private == &thread) - err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry), - dentry); - else - err = -EPERM; - } else { - err = -ENOENT; - } + if (d_inode(dentry)->i_private == &thread) + err = vfs_rmdir(&nop_mnt_idmap, d_inode(parent.dentry), + dentry); + else + err = -EPERM; + dput(dentry); inode_unlock(d_inode(parent.dentry)); path_put(&parent); @@ -310,6 +307,8 @@ static int handle_remove(const char *nodename, struct device *dev) { struct path parent; struct dentry *dentry; + struct kstat stat; + struct path p; int deleted = 0; int err; @@ -317,32 +316,28 @@ static int handle_remove(const char *nodename, struct device *dev) if (IS_ERR(dentry)) return PTR_ERR(dentry); - if (d_really_is_positive(dentry)) { - struct kstat stat; - struct path p = {.mnt = parent.mnt, .dentry = dentry}; 
- err = vfs_getattr(&p, &stat, STATX_TYPE | STATX_MODE, - AT_STATX_SYNC_AS_STAT); - if (!err && dev_mynode(dev, d_inode(dentry), &stat)) { - struct iattr newattrs; - /* - * before unlinking this node, reset permissions - * of possible references like hardlinks - */ - newattrs.ia_uid = GLOBAL_ROOT_UID; - newattrs.ia_gid = GLOBAL_ROOT_GID; - newattrs.ia_mode = stat.mode & ~0777; - newattrs.ia_valid = - ATTR_UID|ATTR_GID|ATTR_MODE; - inode_lock(d_inode(dentry)); - notify_change(&nop_mnt_idmap, dentry, &newattrs, NULL); - inode_unlock(d_inode(dentry)); - err = vfs_unlink(&nop_mnt_idmap, d_inode(parent.dentry), - dentry, NULL); - if (!err || err == -ENOENT) - deleted = 1; - } - } else { - err = -ENOENT; + p.mnt = parent.mnt; + p.dentry = dentry; + err = vfs_getattr(&p, &stat, STATX_TYPE | STATX_MODE, + AT_STATX_SYNC_AS_STAT); + if (!err && dev_mynode(dev, d_inode(dentry), &stat)) { + struct iattr newattrs; + /* + * before unlinking this node, reset permissions + * of possible references like hardlinks + */ + newattrs.ia_uid = GLOBAL_ROOT_UID; + newattrs.ia_gid = GLOBAL_ROOT_GID; + newattrs.ia_mode = stat.mode & ~0777; + newattrs.ia_valid = + ATTR_UID|ATTR_GID|ATTR_MODE; + inode_lock(d_inode(dentry)); + notify_change(&nop_mnt_idmap, dentry, &newattrs, NULL); + inode_unlock(d_inode(dentry)); + err = vfs_unlink(&nop_mnt_idmap, d_inode(parent.dentry), + dentry, NULL); + if (!err || err == -ENOENT) + deleted = 1; } dput(dentry); inode_unlock(d_inode(parent.dentry)); diff --git a/fs/bcachefs/fs-ioctl.c b/fs/bcachefs/fs-ioctl.c index 405cf08bda34..c5464219b23f 100644 --- a/fs/bcachefs/fs-ioctl.c +++ b/fs/bcachefs/fs-ioctl.c @@ -516,10 +516,6 @@ static long bch2_ioctl_subvolume_destroy(struct bch_fs *c, struct file *filp, ret = -EXDEV; goto err; } - if (!d_is_positive(victim)) { - ret = -ENOENT; - goto err; - } ret = __bch2_unlink(dir, victim, true); if (!ret) { fsnotify_rmdir(dir, victim); diff --git a/fs/namei.c b/fs/namei.c index 395bfbc8fc92..8780406cb4d7 100644 --- 
a/fs/namei.c +++ b/fs/namei.c @@ -2743,6 +2743,10 @@ static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct } inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT); d = lookup_one_qstr_excl(&last, path->dentry, 0); + if (!IS_ERR(d) && d_is_negative(d)) { + dput(d); + d = ERR_PTR(-ENOENT); + } if (IS_ERR(d)) { inode_unlock(path->dentry->d_inode); path_put(path); diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c index 7f358740e958..e3130675ee6b 100644 --- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -350,11 +350,10 @@ static int audit_get_nd(struct audit_watch *watch, struct path *parent) struct dentry *d = kern_path_locked(watch->path, parent); if (IS_ERR(d)) return PTR_ERR(d); - if (d_is_positive(d)) { - /* update watch filter fields */ - watch->dev = d->d_sb->s_dev; - watch->ino = d_backing_inode(d)->i_ino; - } + /* update watch filter fields */ + watch->dev = d->d_sb->s_dev; + watch->ino = d_backing_inode(d)->i_ino; + inode_unlock(d_backing_inode(parent->dentry)); dput(d); return 0; @@ -419,7 +418,7 @@ int audit_add_watch(struct audit_krule *krule, struct list_head **list) /* caller expects mutex locked */ mutex_lock(&audit_filter_mutex); - if (ret) { + if (ret && ret != -ENOENT) { audit_put_watch(watch); return ret; } @@ -438,6 +437,7 @@ int audit_add_watch(struct audit_krule *krule, struct list_head **list) h = audit_hash_ino((u32)watch->ino); *list = &audit_inode_hash[h]; + ret = 0; error: path_put(&parent_path); audit_put_watch(watch); From patchwork Fri Dec 20 02:54:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13916088 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 227AD17E8E2; Fri, 20 Dec 2024 03:09:17 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664159; cv=none; b=QHnWhWXyVbCHH2mbw8+35F2F+URm2FdsoWHovsUixD85D2N/aLeg/TiguGynG5/zQS/RiW10fbkzF6EqRzDOWrhXZEu/zgXDP8JIUPas0QdbFu3Wj+pqlUCqL8Wif4OTkcBouuYI/qyOXpN/O9iIpIlZ3b8M9CUyQ7vrM56Yym0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664159; c=relaxed/simple; bh=C9RupBWquoHEOawqo9DZe76e8vReOGkDekoEbmi6HAo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MbluqHSjMR5A1GNL0tIjc86NtqookUfd757Wn0KrzSoZSOxrYYiuLHD/bSU79ToDQnqaW+oqDeKDRAHu9I5zn8odNFUj2PBwdzkuupodOtqrV6xHDiluufTJxvNvcDWAhcESVxiujcBaGhKnVL72fcdz/GZ0iXt7fc4/3oj/zkk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=O8Cc1fgf; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=B53r9Jh1; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=O8Cc1fgf; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=B53r9Jh1; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="O8Cc1fgf"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="B53r9Jh1"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="O8Cc1fgf"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="B53r9Jh1" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) 
key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 6A5AD1F385; Fri, 20 Dec 2024 03:09:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664156; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8BPrK/dwiAh/i96umzwTDF3eWf8F9yePnwIjP2Sj9Ck=; b=O8Cc1fgfSpoDvGLWxEf3FaqRzQAFpHUYDnycCEvNSkyGeeyHBSksTUSZ/ADqc0mdxnwZQn qUAPMQdmiQ6zQMiZf26QoxknPfutevEcL30baPjuACCJ73pFjd4i9j8YJtNDpc10c2oI5y SFHK1aY5zhT4xNDtkQx1dC4D2N+CsLY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664156; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8BPrK/dwiAh/i96umzwTDF3eWf8F9yePnwIjP2Sj9Ck=; b=B53r9Jh1KtlbypEc7hKpH7T1Z4e57FvbSbmENRPlvZvaQG699TkiTSC53vSRxlkhyFMuCw 7hZ9SKww8gxG7LDw== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=O8Cc1fgf; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=B53r9Jh1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664156; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8BPrK/dwiAh/i96umzwTDF3eWf8F9yePnwIjP2Sj9Ck=; b=O8Cc1fgfSpoDvGLWxEf3FaqRzQAFpHUYDnycCEvNSkyGeeyHBSksTUSZ/ADqc0mdxnwZQn qUAPMQdmiQ6zQMiZf26QoxknPfutevEcL30baPjuACCJ73pFjd4i9j8YJtNDpc10c2oI5y SFHK1aY5zhT4xNDtkQx1dC4D2N+CsLY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664156; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8BPrK/dwiAh/i96umzwTDF3eWf8F9yePnwIjP2Sj9Ck=; b=B53r9Jh1KtlbypEc7hKpH7T1Z4e57FvbSbmENRPlvZvaQG699TkiTSC53vSRxlkhyFMuCw 7hZ9SKww8gxG7LDw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 5007F13A32; Fri, 20 Dec 2024 03:09:14 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id gMPhAdrfZGdZGAAAD6G6ig (envelope-from ); Fri, 20 Dec 2024 03:09:14 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 06/11] VFS: introduce done_lookup_and_lock() Date: Fri, 20 Dec 2024 13:54:24 +1100 Message-ID: <20241220030830.272429-7-neilb@suse.de> X-Mailer: git-send-email 2.47.0 In-Reply-To: <20241220030830.272429-1-neilb@suse.de> References: <20241220030830.272429-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Rspamd-Queue-Id: 6A5AD1F385 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; 
MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:97:from]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; FROM_EQ_ENVFROM(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.de:+]; RCPT_COUNT_FIVE(0.00)[6]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo,suse.de:dkim,suse.de:mid,suse.de:email] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Flag: NO X-Spam-Level: Callers of kern_path_locked() and user_path_locked_at() should now call done_lookup_and_lock() to unlock the directory and dput() the dentry. This will allow the locking rules to be changed in a central place. Signed-off-by: NeilBrown --- drivers/base/devtmpfs.c | 7 +++---- fs/bcachefs/fs-ioctl.c | 3 +-- fs/namei.c | 10 ++++++++-- include/linux/namei.h | 1 + kernel/audit_fsnotify.c | 3 +-- kernel/audit_watch.c | 3 +-- 6 files changed, 15 insertions(+), 12 deletions(-) diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c index c9e34842139f..bb6d26338b6c 100644 --- a/drivers/base/devtmpfs.c +++ b/drivers/base/devtmpfs.c @@ -251,8 +251,8 @@ static int dev_rmdir(const char *name) else err = -EPERM; - dput(dentry); - inode_unlock(d_inode(parent.dentry)); + done_lookup_and_lock(parent.dentry, dentry); + path_put(&parent); return err; } @@ -339,8 +339,7 @@ static int handle_remove(const char *nodename, struct device *dev) if (!err || err == -ENOENT) deleted = 1; } - dput(dentry); - inode_unlock(d_inode(parent.dentry)); + done_lookup_and_lock(parent.dentry, dentry); path_put(&parent); if (deleted && strchr(nodename, '/')) diff --git a/fs/bcachefs/fs-ioctl.c b/fs/bcachefs/fs-ioctl.c index c5464219b23f..d51c86e24bef 100644 --- a/fs/bcachefs/fs-ioctl.c +++ b/fs/bcachefs/fs-ioctl.c @@ -522,8 +522,7 @@ static long 
bch2_ioctl_subvolume_destroy(struct bch_fs *c, struct file *filp, d_delete(victim); } err: - inode_unlock(dir); - dput(victim); + done_lookup_and_lock(path.dentry, victim); path_put(&path); return ret; } diff --git a/fs/namei.c b/fs/namei.c index 8780406cb4d7..29f86df4b9dc 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2773,6 +2773,13 @@ struct dentry *user_path_locked_at(int dfd, const char __user *name, struct path } EXPORT_SYMBOL(user_path_locked_at); +void done_lookup_and_lock(struct dentry *parent, struct dentry *child) +{ + dput(child); + inode_unlock(d_inode(parent)); +} +EXPORT_SYMBOL(done_lookup_and_lock); + int kern_path(const char *name, unsigned int flags, struct path *path) { struct filename *filename = getname_kernel(name); @@ -4146,8 +4153,7 @@ EXPORT_SYMBOL(kern_path_create); void done_path_create(struct path *path, struct dentry *dentry) { - dput(dentry); - inode_unlock(path->dentry->d_inode); + done_lookup_and_lock(path->dentry, dentry); mnt_drop_write(path->mnt); path_put(path); } diff --git a/include/linux/namei.h b/include/linux/namei.h index 15118992f745..898fc8ba37e1 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -65,6 +65,7 @@ extern struct dentry *user_path_create(int, const char __user *, struct path *, extern void done_path_create(struct path *, struct dentry *); extern struct dentry *kern_path_locked(const char *, struct path *); extern struct dentry *user_path_locked_at(int , const char __user *, struct path *); +extern void done_lookup_and_lock(struct dentry *parent, struct dentry *child); int vfs_path_parent_lookup(struct filename *filename, unsigned int flags, struct path *parent, struct qstr *last, int *type, const struct path *root); diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c index c565fbf66ac8..db2c03caa74d 100644 --- a/kernel/audit_fsnotify.c +++ b/kernel/audit_fsnotify.c @@ -86,7 +86,6 @@ struct audit_fsnotify_mark *audit_alloc_mark(struct audit_krule *krule, char *pa if (IS_ERR(dentry)) 
return ERR_CAST(dentry); /* returning an error */ inode = path.dentry->d_inode; - inode_unlock(inode); audit_mark = kzalloc(sizeof(*audit_mark), GFP_KERNEL); if (unlikely(!audit_mark)) { @@ -107,7 +106,7 @@ struct audit_fsnotify_mark *audit_alloc_mark(struct audit_krule *krule, char *pa audit_mark = ERR_PTR(ret); } out: - dput(dentry); + done_lookup_and_lock(path.dentry, dentry); path_put(&path); return audit_mark; } diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c index e3130675ee6b..e1137ea9294b 100644 --- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -354,8 +354,7 @@ static int audit_get_nd(struct audit_watch *watch, struct path *parent) watch->dev = d->d_sb->s_dev; watch->ino = d_backing_inode(d)->i_ino; - inode_unlock(d_backing_inode(parent->dentry)); - dput(d); + done_lookup_and_lock(parent->dentry, d); return 0; } From patchwork Fri Dec 20 02:54:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13916089 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEBC6198A38; Fri, 20 Dec 2024 03:09:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664165; cv=none; b=b9GIjk0+pF1+XquMHTBTsGBZN+ug5XgLpAzjzi7oMataYLS1ZLt2EwU2uSZQlDiFhFo3l4PitOrzJZqQ7OVVbTlN1EWhOzER8oYwZesBpU6AYL7iIoR5TlHDImU56k1shq3Hlu45GR4BtEAvw9wPk20jpLmUH9nQY21uPN78s8Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664165; c=relaxed/simple; bh=P88JwG11tRsTydZ18sfuVVbZNGEKEJgROeBGZiDCPYU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 07/11] VFS: introduce lookup_and_lock()
Date: Fri, 20 Dec 2024 13:54:25 +1100
Message-ID: <20241220030830.272429-8-neilb@suse.de>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

lookup_and_lock() combines locking the directory and performing a lookup
prior to a change to the directory. Abstracting this prepares for
changing the locking requirements.

done_lookup_and_lock() will be called by all callers of lookup_and_lock()
to unlock the directory and dput() the dentry.

lookup_and_lock() returns -ENOENT if LOOKUP_CREATE was NOT given and the
name cannot be found, and returns -EEXIST if LOOKUP_EXCL WAS given and
the name CAN be found. This is what callers want.

These functions replace all uses of lookup_one_qstr_excl() in namei.c
except for those used for rename.

The name might seem backwards as the lock happens before the lookup.
A future patch will change this so that only a shared lock is taken
before the lookup, and an exclusive lock on the dentry is taken after a
successful lookup. So the order "lookup" then "lock" will make sense.

This functionality is exported as lookup_and_lock_one() which takes a
name and len rather than a qstr.
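
The return-value rules described above can be modelled outside the
kernel. What follows is a minimal userspace sketch, not kernel code: the
SKETCH_* flag values and sketch_lookup_result() are hypothetical names
introduced only for illustration, and a plain bool stands in for
d_is_positive() on the dentry that the lookup returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's LOOKUP_* constants differ. */
#define SKETCH_LOOKUP_CREATE	0x1u
#define SKETCH_LOOKUP_EXCL	0x2u

/*
 * Model of the checks lookup_and_lock() applies after the lookup:
 * returns 0 when the dentry would be handed back to the caller,
 * otherwise the negative errno that would be returned instead.
 */
static int sketch_lookup_result(unsigned int flags, bool positive)
{
	/* The caller asked for exclusivity but the name already exists. */
	if ((flags & SKETCH_LOOKUP_EXCL) && positive)
		return -EEXIST;
	/* The caller is not creating and the name does not exist. */
	if (!(flags & SKETCH_LOOKUP_CREATE) && !positive)
		return -ENOENT;
	return 0;
}
```

With only the EXCL flag set, which is what filename_create() passes
after clearing LOOKUP_CREATE for a trailing '/', a missing name gives
-ENOENT and an existing name gives -EEXIST.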

Signed-off-by: NeilBrown
---
 fs/namei.c            | 118 ++++++++++++++++++++++--------------------
 include/linux/namei.h |   3 ++
 2 files changed, 65 insertions(+), 56 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 29f86df4b9dc..371c80902c59 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1703,6 +1703,33 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 }
 EXPORT_SYMBOL(lookup_one_qstr_excl);
 
+static struct dentry *lookup_and_lock(const struct qstr *last,
+				      struct dentry *base,
+				      unsigned int lookup_flags)
+{
+	struct dentry *dentry;
+	int err;
+
+	inode_lock_nested(base->d_inode, I_MUTEX_PARENT);
+	dentry = lookup_one_qstr_excl(last, base, lookup_flags);
+	if (IS_ERR(dentry))
+		goto out;
+	err = -EEXIST;
+	if ((lookup_flags & LOOKUP_EXCL) && d_is_positive(dentry))
+		goto err;
+	err = -ENOENT;
+	if (!(lookup_flags & LOOKUP_CREATE) && d_is_negative(dentry))
+		goto err;
+	return dentry;
+
+err:
+	dput(dentry);
+	dentry = ERR_PTR(err);
+out:
+	inode_unlock(base->d_inode);
+	return dentry;
+}
+
 /**
  * lookup_fast - do fast lockless (but racy) lookup of a dentry
  * @nd: current nameidata
@@ -2741,16 +2768,9 @@ static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct
 		path_put(path);
 		return ERR_PTR(-EINVAL);
 	}
-	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
-	d = lookup_one_qstr_excl(&last, path->dentry, 0);
-	if (!IS_ERR(d) && d_is_negative(d)) {
-		dput(d);
-		d = ERR_PTR(-ENOENT);
-	}
-	if (IS_ERR(d)) {
-		inode_unlock(path->dentry->d_inode);
+	d = lookup_and_lock(&last, path->dentry, 0);
+	if (IS_ERR(d))
 		path_put(path);
-	}
 	return d;
 }
 
@@ -3051,6 +3071,22 @@ struct dentry *lookup_positive_unlocked(const char *name,
 }
 EXPORT_SYMBOL(lookup_positive_unlocked);
 
+struct dentry *lookup_and_lock_one(struct mnt_idmap *idmap,
+				   const char *name, int len, struct dentry *base,
+				   unsigned int lookup_flags)
+{
+	struct qstr this;
+	int err;
+
+	if (!idmap)
+		idmap = &nop_mnt_idmap;
+	err = lookup_one_common(idmap, name, base, len, &this);
+	if (err)
+		return ERR_PTR(err);
+	return lookup_and_lock(&this, base, lookup_flags);
+}
+EXPORT_SYMBOL(lookup_and_lock_one);
+
 #ifdef CONFIG_UNIX98_PTYS
 int path_pts(struct path *path)
 {
@@ -4080,7 +4116,6 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	unsigned int reval_flag = lookup_flags & LOOKUP_REVAL;
 	unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL;
 	int type;
-	int err2;
 	int error;
 
 	error = filename_parentat(dfd, name, reval_flag, path, &last, &type);
@@ -4092,50 +4127,31 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	 * (foo/., foo/.., /////)
 	 */
 	if (unlikely(type != LAST_NORM))
-		goto out;
+		goto put;
 
 	/* don't fail immediately if it's r/o, at least try to report other errors */
-	err2 = mnt_want_write(path->mnt);
+	error = mnt_want_write(path->mnt);
 	/*
 	 * Do the final lookup. Suppress 'create' if there is a trailing
 	 * '/', and a directory wasn't requested.
 	 */
 	if (last.name[last.len] && !want_dir)
-		create_flags = 0;
-	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path->dentry,
-				      reval_flag | create_flags);
+		create_flags &= ~LOOKUP_CREATE;
+	dentry = lookup_and_lock(&last, path->dentry, reval_flag | create_flags);
 	if (IS_ERR(dentry))
-		goto unlock;
+		goto drop;
 
-	error = -EEXIST;
-	if (d_is_positive(dentry))
-		goto fail;
-
-	/*
-	 * Special case - lookup gave negative, but... we had foo/bar/
-	 * From the vfs_mknod() POV we just have a negative dentry -
-	 * all is fine. Let's be bastards - you had / on the end, you've
-	 * been asking for (non-existent) directory. -ENOENT for you.
-	 */
-	if (unlikely(!create_flags)) {
-		error = -ENOENT;
-		goto fail;
-	}
-	if (unlikely(err2)) {
-		error = err2;
+	if (unlikely(error))
 		goto fail;
-	}
 	return dentry;
 fail:
 	d_lookup_done(dentry);
-	dput(dentry);
+	done_lookup_and_lock(path->dentry, dentry);
 	dentry = ERR_PTR(error);
-unlock:
-	inode_unlock(path->dentry->d_inode);
-	if (!err2)
+drop:
+	if (!error)
 		mnt_drop_write(path->mnt);
-out:
+put:
 	path_put(path);
 	return dentry;
 }
@@ -4555,23 +4571,18 @@ int do_rmdir(int dfd, struct filename *name)
 	if (error)
 		goto exit2;
 
-	inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
+	dentry = lookup_and_lock(&last, path.dentry, lookup_flags);
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto exit3;
-	if (!dentry->d_inode) {
-		error = -ENOENT;
-		goto exit4;
-	}
+
 	error = security_path_rmdir(&path, dentry);
 	if (error)
 		goto exit4;
 	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
 exit4:
-	dput(dentry);
+	done_lookup_and_lock(path.dentry, dentry);
 exit3:
-	inode_unlock(path.dentry->d_inode);
 	mnt_drop_write(path.mnt);
 exit2:
 	path_put(&path);
@@ -4691,13 +4702,11 @@ int do_unlinkat(int dfd, struct filename *name)
 	if (error)
 		goto exit2;
 retry_deleg:
-	inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
+	dentry = lookup_and_lock(&last, path.dentry, lookup_flags);
 	error = PTR_ERR(dentry);
 	if (!IS_ERR(dentry)) {
 
-		/* Why not before? Because we want correct error value */
-		if (last.name[last.len] || d_is_negative(dentry))
+		if (last.name[last.len])
 			goto slashes;
 		inode = dentry->d_inode;
 		ihold(inode);
@@ -4707,9 +4716,8 @@ int do_unlinkat(int dfd, struct filename *name)
 		error = vfs_unlink(mnt_idmap(path.mnt), path.dentry->d_inode,
 				   dentry, &delegated_inode);
 exit3:
-		dput(dentry);
+		done_lookup_and_lock(path.dentry, dentry);
 	}
-	inode_unlock(path.dentry->d_inode);
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
@@ -4731,9 +4739,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	return error;
 
 slashes:
-	if (d_is_negative(dentry))
-		error = -ENOENT;
-	else if (d_is_dir(dentry))
+	if (d_is_dir(dentry))
 		error = -EISDIR;
 	else
 		error = -ENOTDIR;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 898fc8ba37e1..f882874a7b00 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -83,6 +83,9 @@ struct dentry *lookup_one_unlocked(struct mnt_idmap *idmap,
 struct dentry *lookup_one_positive_unlocked(struct mnt_idmap *idmap,
 					    const char *name,
 					    struct dentry *base, int len);
+struct dentry *lookup_and_lock_one(struct mnt_idmap *idmap,
+				   const char *name, int len, struct dentry *base,
+				   unsigned int lookup_flags);
 
 extern int follow_down_one(struct path *);
 extern int follow_down(struct path *path, unsigned int flags);

From patchwork Fri Dec 20 02:54:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916090
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 08/11] VFS: add inode_dir_lock/unlock
Date: Fri, 20 Dec 2024 13:54:26 +1100
Message-ID: <20241220030830.272429-9-neilb@suse.de>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

During the transition from exclusive locking on the directory for
directory-modifying operations to exclusive locking only on the dentry,
with a shared lock on the directory, we need an alternate way to provide
exclusion on the directory for file systems which haven't been
converted. This is provided by inode_lock_dir() and inode_unlock_dir().

This uses a bit in i_state for locking, and wait_var_event_spinlock()
for waiting.

Signed-off-by: NeilBrown
---
 fs/inode.c         |  3 ++
 fs/namei.c         | 81 +++++++++++++++++++++++++++++++++++++---------
 include/linux/fs.h |  5 +++
 3 files changed, 74 insertions(+), 15 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 6b4c77268fc0..9ba69837aa56 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -492,6 +492,8 @@ EXPORT_SYMBOL(address_space_init_once);
  */
 void inode_init_once(struct inode *inode)
 {
+	static struct lock_class_key __key;
+
 	memset(inode, 0, sizeof(*inode));
 	INIT_HLIST_NODE(&inode->i_hash);
 	INIT_LIST_HEAD(&inode->i_devices);
@@ -501,6 +503,7 @@ void inode_init_once(struct inode *inode)
 	INIT_LIST_HEAD(&inode->i_sb_list);
 	__address_space_init_once(&inode->i_data);
 	i_size_ordered_init(inode);
+	lockdep_init_map(&inode->i_dirlock_map, "I_DIR_LOCKED", &__key, 0);
 }
 EXPORT_SYMBOL(inode_init_once);
 
diff --git a/fs/namei.c b/fs/namei.c
index 371c80902c59..68750b15dbf4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3364,6 +3364,34 @@ static inline umode_t vfs_prepare_mode(struct mnt_idmap *idmap,
 	return mode;
 }
 
+static bool check_dir_locked(struct inode *dir)
+{
+	if (dir->i_state & I_DIR_LOCKED) {
+		dir->i_state |= I_DIR_LOCK_WAITER;
+		return true;
+	}
+	return false;
+}
+
+static void inode_lock_dir(struct inode *dir)
+{
+	lock_acquire_exclusive(&dir->i_dirlock_map, 0, 0, NULL, _THIS_IP_);
+	spin_lock(&dir->i_lock);
+	wait_var_event_spinlock(dir, !check_dir_locked(dir),
+				&dir->i_lock);
+	dir->i_state |= I_DIR_LOCKED;
+	spin_unlock(&dir->i_lock);
+}
+
+static void inode_unlock_dir(struct inode *dir)
+{
+	lock_map_release(&dir->i_dirlock_map);
+	spin_lock(&dir->i_lock);
+	dir->i_state &= ~(I_DIR_LOCKED | I_DIR_LOCK_WAITER);
+	wake_up_var_locked(dir, &dir->i_lock);
+	spin_unlock(&dir->i_lock);
+}
+
 /**
  * vfs_create - create new file
  * @idmap:	idmap of the mount the inode was found from
@@ -3396,10 +3424,13 @@ int vfs_create(struct mnt_idmap *idmap, struct inode *dir,
 	error = security_inode_create(dir, dentry, mode);
 	if (error)
 		return error;
-	if (dir->i_op->create_shared)
+	if (dir->i_op->create_shared) {
 		error = dir->i_op->create_shared(idmap, dir, dentry, mode, want_excl);
-	else
+	} else {
+		inode_lock_dir(dir);
 		error = dir->i_op->create(idmap, dir, dentry, mode, want_excl);
+		inode_unlock_dir(dir);
+	}
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
@@ -3699,16 +3730,19 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
 
 		file->f_mode |= FMODE_CREATED;
 		audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
-		if (dir_inode->i_op->create_shared)
+		if (dir_inode->i_op->create_shared) {
 			error = dir_inode->i_op->create_shared(idmap, dir_inode,
 							       dentry, mode,
 							       open_flag & O_EXCL);
-		else if (dir_inode->i_op->create)
+		} else if (dir_inode->i_op->create) {
+			inode_lock_dir(dir_inode);
 			error = dir_inode->i_op->create(idmap, dir_inode,
 							dentry, mode,
 							open_flag & O_EXCL);
-		else
+			inode_unlock_dir(dir_inode);
+		} else {
 			error = -EACCES;
+		}
 		if (error)
 			goto out_dput;
 	}
@@ -4227,10 +4261,13 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	if (dir->i_op->mknod_shared)
+	if (dir->i_op->mknod_shared) {
 		error = dir->i_op->mknod_shared(idmap, dir, dentry, mode, dev);
-	else
+	} else {
+		inode_lock_dir(dir);
 		error = dir->i_op->mknod(idmap, dir, dentry, mode, dev);
+		inode_unlock_dir(dir);
+	}
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
@@ -4360,7 +4397,9 @@ int vfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 		else if (de)
 			dput(de);
 	} else {
+		inode_lock_dir(dir);
 		error = dir->i_op->mkdir(idmap, dir, dentry, mode);
+		inode_unlock_dir(dir);
 	}
 	if (!error)
 		fsnotify_mkdir(dir, dentry);
@@ -4521,10 +4560,13 @@ int vfs_rmdir(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		goto out;
 
-	if (dir->i_op->rmdir_shared)
+	if (dir->i_op->rmdir_shared) {
 		error = dir->i_op->rmdir_shared(dir, dentry);
-	else
+	} else {
+		inode_lock_dir(dir);
 		error = dir->i_op->rmdir(dir, dentry);
+		inode_unlock_dir(dir);
+	}
 	if (error)
 		goto out;
 
@@ -4648,10 +4690,13 @@ int vfs_unlink(struct mnt_idmap *idmap, struct inode *dir,
 			error = try_break_deleg(target, delegated_inode);
 			if (error)
 				goto out;
-			if (dir->i_op->unlink_shared)
+			if (dir->i_op->unlink_shared) {
 				error = dir->i_op->unlink_shared(dir, dentry);
-			else
+			} else {
+				inode_lock_dir(dir);
 				error = dir->i_op->unlink(dir, dentry);
+				inode_unlock_dir(dir);
+			}
 			if (!error) {
 				dont_mount(dentry);
 				detach_mounts(dentry);
@@ -4792,10 +4837,13 @@ int vfs_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	if (error)
 		return error;
 
-	if (dir->i_op->symlink_shared)
+	if (dir->i_op->symlink_shared) {
 		error = dir->i_op->symlink_shared(idmap, dir, dentry, oldname);
-	else
+	} else {
+		inode_lock_dir(dir);
 		error = dir->i_op->symlink(idmap, dir, dentry, oldname);
+		inode_unlock_dir(dir);
+	}
 	if (!error)
 		fsnotify_create(dir, dentry);
 	return error;
@@ -4920,10 +4968,13 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
 		error = try_break_deleg(inode, delegated_inode);
 		if (error)
 			;
-		else if (dir->i_op->link_shared)
+		else if (dir->i_op->link_shared) {
 			error = dir->i_op->link_shared(old_dentry, dir, new_dentry);
-		else
+		} else {
+			inode_lock_dir(dir);
 			error = dir->i_op->link(old_dentry, dir, new_dentry);
+			inode_unlock_dir(dir);
+		}
 	}
 
 	if (!error && (inode->i_state & I_LINKABLE)) {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 68eba181175b..3ca92a54f28e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -722,6 +722,8 @@ struct inode {
 		void (*free_inode)(struct inode *);
 	};
 	struct file_lock_context	*i_flctx;
+
+	struct lockdep_map	i_dirlock_map;	/* For tracking I_DIR_LOCKED locks */
 	struct address_space	i_data;
 	struct list_head	i_devices;
 	union {
@@ -2493,6 +2495,9 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
 #define I_SYNC_QUEUED		(1 << 16)
 #define I_PINNING_NETFS_WB	(1 << 17)
 
+#define I_DIR_LOCK_WAITER	(1 << 30)
+#define I_DIR_LOCKED		(1 << 31)
+
 #define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
 #define I_DIRTY (I_DIRTY_INODE | I_DIRTY_PAGES)
 #define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME)

From patchwork Fri Dec 20 02:54:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916091
From: NeilBrown
To: Alexander Viro, Christian Brauner, Jan Kara, Linus Torvalds
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 09/11] VFS: re-pack DENTRY_ flags.
Date: Fri, 20 Dec 2024 13:54:27 +1100
Message-ID: <20241220030830.272429-10-neilb@suse.de>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20241220030830.272429-1-neilb@suse.de>
References: <20241220030830.272429-1-neilb@suse.de>

Bits 13, 23, 24, and 27 are not used. Move all those holes to the end.
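
When bit assignments are shifted like this, one property worth checking
is that no two flags end up sharing a bit. What follows is a minimal
userspace sketch of such a check, not kernel code: SKETCH_BIT(),
sketch_masks_disjoint() and the two arrays are hypothetical names
introduced for illustration, and the first array lists only the upper
flags from the re-packed layout in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BIT(n)	(1u << (n))

/* Return true if no two masks in the array share a set bit. */
static bool sketch_masks_disjoint(const unsigned int *masks, size_t n)
{
	unsigned int seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (seen & masks[i])
			return false;
		seen |= masks[i];
	}
	return true;
}

/* Values copied from the re-packed layout in this patch. */
static const unsigned int sketch_repacked[] = {
	SKETCH_BIT(18),	/* DCACHE_LRU_LIST */
	7u << 19,	/* DCACHE_ENTRY_TYPE, bits 19..21 */
	SKETCH_BIT(22),	/* DCACHE_NOKEY_NAME */
	SKETCH_BIT(23),	/* DCACHE_OP_REAL */
	SKETCH_BIT(24),	/* DCACHE_PAR_LOOKUP */
	SKETCH_BIT(25),	/* DCACHE_DENTRY_CURSOR */
	SKETCH_BIT(26),	/* DCACHE_NORCU */
};

/* Deliberately broken example: a single-bit flag inside the type field. */
static const unsigned int sketch_overlapping[] = {
	SKETCH_BIT(19),
	7u << 19,
};
```

The type field is a three-bit value rather than a single flag, which is
why it is entered as the mask 7 << 19 instead of as one bit.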

Signed-off-by: NeilBrown
---
 include/linux/dcache.h | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index b64c0260e4be..fc7f571bd5bb 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -191,34 +191,34 @@ struct dentry_operations {
 #define DCACHE_NFSFS_RENAMED		BIT(12)
      /* this dentry has been "silly renamed" and has to be deleted on the last
       * dput() */
-#define DCACHE_FSNOTIFY_PARENT_WATCHED	BIT(14)
+#define DCACHE_FSNOTIFY_PARENT_WATCHED	BIT(13)
      /* Parent inode is watched by some fsnotify listener */
 
-#define DCACHE_DENTRY_KILLED		BIT(15)
+#define DCACHE_DENTRY_KILLED		BIT(14)
 
-#define DCACHE_MOUNTED			BIT(16) /* is a mountpoint */
-#define DCACHE_NEED_AUTOMOUNT		BIT(17) /* handle automount on this dir */
-#define DCACHE_MANAGE_TRANSIT		BIT(18) /* manage transit from this dirent */
+#define DCACHE_MOUNTED			BIT(15) /* is a mountpoint */
+#define DCACHE_NEED_AUTOMOUNT		BIT(16) /* handle automount on this dir */
+#define DCACHE_MANAGE_TRANSIT		BIT(17) /* manage transit from this dirent */
 #define DCACHE_MANAGED_DENTRY \
 	(DCACHE_MOUNTED|DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT)
 
-#define DCACHE_LRU_LIST			BIT(19)
+#define DCACHE_LRU_LIST			BIT(18)
 
-#define DCACHE_ENTRY_TYPE		(7 << 20) /* bits 20..22 are for storing type: */
-#define DCACHE_MISS_TYPE		(0 << 20) /* Negative dentry */
-#define DCACHE_WHITEOUT_TYPE		(1 << 20) /* Whiteout dentry (stop pathwalk) */
-#define DCACHE_DIRECTORY_TYPE		(2 << 20) /* Normal directory */
-#define DCACHE_AUTODIR_TYPE		(3 << 20) /* Lookupless directory (presumed automount) */
-#define DCACHE_REGULAR_TYPE		(4 << 20) /* Regular file type */
-#define DCACHE_SPECIAL_TYPE		(5 << 20) /* Other file type */
-#define DCACHE_SYMLINK_TYPE		(6 << 20) /* Symlink */
+#define DCACHE_ENTRY_TYPE		(7 << 19) /* bits 19..21 are for storing type: */
+#define DCACHE_MISS_TYPE		(0 << 19) /* Negative dentry */
+#define DCACHE_WHITEOUT_TYPE		(1 << 19) /* Whiteout dentry (stop pathwalk) */
+#define DCACHE_DIRECTORY_TYPE		(2 << 19) /* Normal directory */
+#define DCACHE_AUTODIR_TYPE		(3 << 19) /* Lookupless directory (presumed automount) */
+#define DCACHE_REGULAR_TYPE		(4 << 19) /* Regular file type */
+#define DCACHE_SPECIAL_TYPE		(5 << 19) /* Other file type */
+#define DCACHE_SYMLINK_TYPE		(6 << 19) /* Symlink */
 
-#define DCACHE_NOKEY_NAME		BIT(25) /* Encrypted name encoded without key */
-#define DCACHE_OP_REAL			BIT(26)
+#define DCACHE_NOKEY_NAME		BIT(22) /* Encrypted name encoded without key */
+#define DCACHE_OP_REAL			BIT(23)
 
-#define DCACHE_PAR_LOOKUP		BIT(28) /* being looked up (with parent locked shared) */
-#define DCACHE_DENTRY_CURSOR		BIT(29)
-#define DCACHE_NORCU			BIT(30) /* No RCU delay for freeing */
+#define DCACHE_PAR_LOOKUP		BIT(24) /* being looked up (with parent locked shared) */
+#define DCACHE_DENTRY_CURSOR		BIT(25)
+#define DCACHE_NORCU			BIT(26) /* No RCU delay for freeing */
 
 extern seqlock_t rename_lock;
 

From patchwork Fri Dec 20 02:54:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13916092
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CVULAVr5wNf8eflFVaVRDOCC/YQ8teM9Vlnoec8ITb+jC0aE2CgcfO4lVHqHTxrTgFXhBW7t72r0gfTz2ro+vrvqE2qLL3tOpWwsnMv1vOyhy/nbid8GFXjDmDcbMNmO4IbOhLG40mi89gWeD80KQYwmBcVdTANNk+++xTBsQSQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=lNJjEqPD; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=wH0WjbXQ; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=lNJjEqPD; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=wH0WjbXQ; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="lNJjEqPD"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="wH0WjbXQ"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="lNJjEqPD"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="wH0WjbXQ" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 77F3D1F38C; Fri, 20 Dec 2024 03:09:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xI6gwL32tvphmuKRm12RYhkNIgjqKyHE5XDqNpdqk98=; 
b=lNJjEqPDdKwr6prgG26AxIghOOdcy8cNKDuueupwxtqsUhKKdw9wKBQRsxlfUyIeMPm7EV qeGKI96gWFW2inmU9x2oetoM1GpxAmangwsGWr4LNSbv76u6hows4JF0Da0UPknVGYhTZc JETSzsC+f4i7lM0IFdZmF+CbcJ8Fu8w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xI6gwL32tvphmuKRm12RYhkNIgjqKyHE5XDqNpdqk98=; b=wH0WjbXQXVnjstbMy5vRGuIvCM+iLJSw4PYjFMFEMJtVRw0YFm79ocppkHv+OE65shVNV2 uPsNFZ+9AI73SXDg== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xI6gwL32tvphmuKRm12RYhkNIgjqKyHE5XDqNpdqk98=; b=lNJjEqPDdKwr6prgG26AxIghOOdcy8cNKDuueupwxtqsUhKKdw9wKBQRsxlfUyIeMPm7EV qeGKI96gWFW2inmU9x2oetoM1GpxAmangwsGWr4LNSbv76u6hows4JF0Da0UPknVGYhTZc JETSzsC+f4i7lM0IFdZmF+CbcJ8Fu8w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664190; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xI6gwL32tvphmuKRm12RYhkNIgjqKyHE5XDqNpdqk98=; b=wH0WjbXQXVnjstbMy5vRGuIvCM+iLJSw4PYjFMFEMJtVRw0YFm79ocppkHv+OE65shVNV2 uPsNFZ+9AI73SXDg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 5E2EE13A32; Fri, 20 Dec 2024 03:09:48 +0000 (UTC) Received: from dovecot-director2.suse.de 
([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 7T5EBfzfZGeHGAAAD6G6ig (envelope-from ); Fri, 20 Dec 2024 03:09:48 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 10/11] VFS: take a shared lock for create/remove directory operations. Date: Fri, 20 Dec 2024 13:54:28 +1100 Message-ID: <20241220030830.272429-11-neilb@suse.de> X-Mailer: git-send-email 2.47.0 In-Reply-To: <20241220030830.272429-1-neilb@suse.de> References: <20241220030830.272429-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Level: X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; TO_MATCH_ENVRCPT_ALL(0.00)[]; TO_DN_SOME(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; RCVD_TLS_ALL(0.00)[]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_FIVE(0.00)[6]; FROM_EQ_ENVFROM(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,suse.de:email,suse.de:mid] X-Spam-Score: -2.80 X-Spam-Flag: NO With this patch the VFS takes a shared lock on the directory (i_rwsem) when performing create or remove operations. Rename is as yet unchanged. Not all callers are changed, only the common ones in fs/namei.c. While the directory is only locked shared, the dentry being updated is locked exclusively using a bit in ->d_flags. Waiters use wait_var_event_spinlock(), and a wakeup is only sent in the unusual case that some other task is actually waiting, which is indicated by another d_flags bit.
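For readers who want to see the "lock bit plus waiter bit" idea in isolation, here is a minimal user-space sketch. It is an analogy only, not the kernel code: struct toy_dentry, the toy_* functions, PAR_UPDATE, PAR_WAITER and the wakeups counter are all hypothetical names, a pthread mutex stands in for ->d_lock, and a condition variable stands in for the wait_var_event_spinlock()/wake_up_var_locked() pair.

```c
#include <assert.h>
#include <pthread.h>

#define PAR_UPDATE (1u << 0)	/* stand-in for DCACHE_PAR_UPDATE */
#define PAR_WAITER (1u << 1)	/* stand-in for DCACHE_PAR_WAITER */

/* Hypothetical user-space model of the dentry fields involved. */
struct toy_dentry {
	pthread_mutex_t lock;	/* plays the role of ->d_lock */
	pthread_cond_t wake;	/* plays the role of the wait_var queue */
	unsigned int flags;	/* plays the role of ->d_flags */
	unsigned int wakeups;	/* wakeups actually sent, for illustration */
};

static void toy_init(struct toy_dentry *d)
{
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->wake, NULL);
	d->flags = 0;
	d->wakeups = 0;
}

/* Take the exclusive "update" lock, sleeping while someone else holds it. */
static void toy_update_lock(struct toy_dentry *d)
{
	pthread_mutex_lock(&d->lock);
	while (d->flags & PAR_UPDATE) {
		/* Tell the holder that a wakeup is needed on release. */
		d->flags |= PAR_WAITER;
		pthread_cond_wait(&d->wake, &d->lock);
	}
	d->flags |= PAR_UPDATE;
	pthread_mutex_unlock(&d->lock);
}

/* Drop the "update" lock; only wake others if a waiter announced itself. */
static void toy_update_unlock(struct toy_dentry *d)
{
	pthread_mutex_lock(&d->lock);
	assert(d->flags & PAR_UPDATE);
	if (d->flags & PAR_WAITER) {
		pthread_cond_broadcast(&d->wake);
		d->wakeups++;
	}
	d->flags &= ~(PAR_UPDATE | PAR_WAITER);
	pthread_mutex_unlock(&d->lock);
}
```

In the uncontended case the unlock path never touches the wait queue, which is the point of the second bit. Every woken waiter re-checks the flag under the lock and sets the waiter bit again if it loses the race.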
Once the exclusive "update" lock is obtained on the dentry we must make sure it wasn't unlinked or renamed while we slept. If it was we repeat the lookup. The filesystem operations that expect an exclusive lock are still provided with exclusion, but this is handled by inode_dir_lock(). Signed-off-by: NeilBrown --- fs/dcache.c | 9 ++++++- fs/namei.c | 53 ++++++++++++++++++++++++++++++++++++++---- include/linux/dcache.h | 4 ++++ 3 files changed, 60 insertions(+), 6 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index ebe849474bd8..3fb3af83add5 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1636,9 +1636,10 @@ EXPORT_SYMBOL(d_invalidate); * available. On a success the dentry is returned. The name passed in is * copied and the copy passed in may be reused after this call. */ - + static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) { + static struct lock_class_key __key; struct dentry *dentry; char *dname; int err; @@ -1697,6 +1698,8 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) INIT_HLIST_NODE(&dentry->d_sib); d_set_d_op(dentry, dentry->d_sb->s_d_op); + lockdep_init_map(&dentry->d_update_map, "DCACHE_PAR_UPDATE", &__key, 0); + if (dentry->d_op && dentry->d_op->d_init) { err = dentry->d_op->d_init(dentry); if (err) { @@ -3030,6 +3033,10 @@ static int __d_unalias(struct dentry *dentry, struct dentry *alias) * In that case, we know that the inode will be a regular file, and also this * will only occur during atomic_open. So we need to check for the dentry * being already hashed only in the final case. + * + * @dentry must have a valid ->d_parent and that directory must be + * locked (i_rwsem) either exclusively or shared. If shared then + * @dentry must have %DCACHE_PAR_LOOKUP or %DCACHE_PAR_UPDATE set. 
*/ struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry) { diff --git a/fs/namei.c b/fs/namei.c index 68750b15dbf4..fb40ae64dc8d 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1703,6 +1703,15 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name, } EXPORT_SYMBOL(lookup_one_qstr_excl); +static bool check_dentry_locked(struct dentry *de) +{ + if (de->d_flags & DCACHE_PAR_UPDATE) { + de->d_flags |= DCACHE_PAR_WAITER; + return true; + } + return false; +} + static struct dentry *lookup_and_lock(const struct qstr *last, struct dentry *base, unsigned int lookup_flags) @@ -1710,10 +1719,36 @@ static struct dentry *lookup_and_lock(const struct qstr *last, struct dentry *dentry; int err; - inode_lock_nested(base->d_inode, I_MUTEX_PARENT); + inode_lock_shared_nested(base->d_inode, I_MUTEX_PARENT); +retry: dentry = lookup_one_qstr_excl(last, base, lookup_flags); if (IS_ERR(dentry)) goto out; + lock_acquire_exclusive(&dentry->d_update_map, 0, 0, NULL, _THIS_IP_); + spin_lock(&dentry->d_lock); + wait_var_event_spinlock(&dentry->d_flags, + !check_dentry_locked(dentry), + &dentry->d_lock); + if (d_is_positive(dentry)) { + rcu_read_lock(); /* needed for d_same_name() */ + if ( + /* Was unlinked while we waited ?*/ + d_unhashed(dentry) || + /* Or was dentry renamed ?? 
*/ + dentry->d_parent != base || + dentry->d_name.hash != last->hash || + !d_same_name(dentry, base, last) + ) { + rcu_read_unlock(); + spin_unlock(&dentry->d_lock); + lock_map_release(&dentry->d_update_map); + dput(dentry); + goto retry; + } + rcu_read_unlock(); + } + dentry->d_flags |= DCACHE_PAR_UPDATE; + spin_unlock(&dentry->d_lock); err = -EEXIST; if ((lookup_flags & LOOKUP_EXCL) && d_is_positive(dentry)) goto err; @@ -1723,10 +1758,11 @@ static struct dentry *lookup_and_lock(const struct qstr *last, return dentry; err: - dput(dentry); - dentry = ERR_PTR(err); + done_lookup_and_lock(base, dentry); + return ERR_PTR(err); + out: - inode_unlock(base->d_inode); + inode_unlock_shared(base->d_inode); return dentry; } @@ -2795,8 +2831,15 @@ EXPORT_SYMBOL(user_path_locked_at); void done_lookup_and_lock(struct dentry *parent, struct dentry *child) { + lock_map_release(&child->d_update_map); + spin_lock(&child->d_lock); + if (child->d_flags & DCACHE_PAR_WAITER) + wake_up_var_locked(&child->d_flags, &child->d_lock); + child->d_flags &= ~(DCACHE_PAR_UPDATE | DCACHE_PAR_WAITER); + spin_unlock(&child->d_lock); + + inode_unlock_shared(parent->d_inode); dput(child); - inode_unlock(d_inode(parent)); } EXPORT_SYMBOL(done_lookup_and_lock); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index fc7f571bd5bb..6d404c296ac0 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -102,6 +102,8 @@ struct dentry { * possible! 
*/ + /* lockdep tracking of DCACHE_PAR_UPDATE locks */ + struct lockdep_map d_update_map; union { struct list_head d_lru; /* LRU list */ wait_queue_head_t *d_wait; /* in-lookup ones only */ @@ -220,6 +222,8 @@ struct dentry_operations { #define DCACHE_DENTRY_CURSOR BIT(25) #define DCACHE_NORCU BIT(26) /* No RCU delay for freeing */ +#define DCACHE_PAR_UPDATE BIT(27) /* Locked for update */ +#define DCACHE_PAR_WAITER BIT(28) /* someone is waiting for PAR_UPDATE */ extern seqlock_t rename_lock; /* From patchwork Fri Dec 20 02:54:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 13916093 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D27951A83EF; Fri, 20 Dec 2024 03:09:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664199; cv=none; b=n8zQzFIKSxctLxa5/EhqZtpkiWAxP174eDmtyivwfk73qUN88LD7MlrNkyYsEKbs7yeclKqb8K0XZ8X4RTArJUw35YPwTK9Lh6Yzsp0zDxP/4ZD/3X7+keXFqcoW01nHSv3O9C090gbsv4+/Vkz1tcP67xr5cQYgYmt8bU63hH8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734664199; c=relaxed/simple; bh=qmeWqCbcfrxnRSB0BSzquZCh8r6lbGcB7uCaJ4rrfO0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LtV0Fa7tBXmcxxsQ4yWPCFGegUcqPauNRLkx2J7wgh1ekQXbbfYNBWqPJ6DYZqcVgtICUJ0m3aWwzDkp20yOGSbZwaK7I8CGXqJOMiAmRiszuGaU1Y5nZrpJr4h85Zuam3MvlR4kObZcQDZed728umE7uZxIAOrdyjjRKnTnrqQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=JmTKbEUK; dkim=permerror (0-bit key) 
header.d=suse.de header.i=@suse.de header.b=jccQc5g6; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=JmTKbEUK; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=jccQc5g6; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="JmTKbEUK"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="jccQc5g6"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="JmTKbEUK"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="jccQc5g6" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 0D7961F385; Fri, 20 Dec 2024 03:09:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664196; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9MOfGsNLkxyU8h2zwoe18I6cYlAXQeQBcJEBsYCWOsY=; b=JmTKbEUKmOGMtCDEHzHW9b9KB8o63nuF3aEwy60JdX9S6EUeb9yBgrndrcW4TPzVWpWxD0 WER23Le/kf+M2SRJdP3BWOtDVY0HHaJfyJ+0Up6DK/vm5WVM2wQwCl+Q/DvyIIk4wrfn2i ObQjJUxagq0xAkd0AKh8ZJ6YnzRjgQg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664196; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=9MOfGsNLkxyU8h2zwoe18I6cYlAXQeQBcJEBsYCWOsY=; b=jccQc5g6gRHr17e/HDn6TwAZTy+QeRixKGXjsv8OoW1KH3DetWVXGd03jnO6i5NZMFH8vC dxDlIL8ZCM4TfFBA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1734664196; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9MOfGsNLkxyU8h2zwoe18I6cYlAXQeQBcJEBsYCWOsY=; b=JmTKbEUKmOGMtCDEHzHW9b9KB8o63nuF3aEwy60JdX9S6EUeb9yBgrndrcW4TPzVWpWxD0 WER23Le/kf+M2SRJdP3BWOtDVY0HHaJfyJ+0Up6DK/vm5WVM2wQwCl+Q/DvyIIk4wrfn2i ObQjJUxagq0xAkd0AKh8ZJ6YnzRjgQg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1734664196; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9MOfGsNLkxyU8h2zwoe18I6cYlAXQeQBcJEBsYCWOsY=; b=jccQc5g6gRHr17e/HDn6TwAZTy+QeRixKGXjsv8OoW1KH3DetWVXGd03jnO6i5NZMFH8vC dxDlIL8ZCM4TfFBA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id E7FA213A32; Fri, 20 Dec 2024 03:09:53 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id lur1JgHgZGePGAAAD6G6ig (envelope-from ); Fri, 20 Dec 2024 03:09:53 +0000 From: NeilBrown To: Alexander Viro , Christian Brauner , Jan Kara , Linus Torvalds Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 11/11] nfsd: use lookup_and_lock_one() Date: Fri, 20 Dec 2024 13:54:29 +1100 Message-ID: <20241220030830.272429-12-neilb@suse.de> X-Mailer: git-send-email 2.47.0 
In-Reply-To: <20241220030830.272429-1-neilb@suse.de> References: <20241220030830.272429-1-neilb@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Level: X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; RCVD_VIA_SMTP_AUTH(0.00)[]; MIME_TRACE(0.00)[0:+]; R_RATELIMIT(0.00)[from(RLewrxuus8mos16izbn)]; ARC_NA(0.00)[]; TO_DN_SOME(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_FIVE(0.00)[6]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,suse.de:email,suse.de:mid]; RCVD_TLS_ALL(0.00)[] X-Spam-Score: -2.80 X-Spam-Flag: NO nfsd now uses lookup_and_lock_one() when creating/removing names in the exported filesystem.
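The conversion changes the shape of the error paths: when the combined lookup fails, nothing is locked, so the caller must skip the unlock step. A small user-space mock of that calling pattern follows. It is a sketch under stated assumptions, not nfsd code: struct toy_dir, struct toy_child and the toy_* helpers are hypothetical stand-ins, and failure is reported with NULL where the kernel would use an ERR_PTR() value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the parent directory and the child dentry. */
struct toy_dir { int lock_depth; };
struct toy_child { int locked; };

static struct toy_child the_child;

/*
 * Mock of a combined "lookup and lock" helper: on failure it returns
 * NULL and leaves nothing locked; on success both the directory and
 * the child are locked.
 */
static struct toy_child *toy_lookup_and_lock(struct toy_dir *dir,
					     const char *name)
{
	if (name == NULL || name[0] == '\0')
		return NULL;
	dir->lock_depth++;
	the_child.locked = 1;
	return &the_child;
}

/* Mock of the matching release helper: undoes both locks at once. */
static void toy_done_lookup_and_lock(struct toy_dir *dir,
				     struct toy_child *child)
{
	assert(child->locked && dir->lock_depth > 0);
	child->locked = 0;
	dir->lock_depth--;
}

/* Caller with two exit labels, mirroring the out_unlock/out split. */
static int toy_create(struct toy_dir *dir, const char *name, int fail_midway)
{
	struct toy_child *child;
	int err = 0;

	child = toy_lookup_and_lock(dir, name);
	if (!child) {
		err = -1;
		goto out;		/* nothing is locked: skip the unlock */
	}
	if (fail_midway) {
		err = -2;
		goto out_unlock;	/* locked: must release */
	}
	/* ... the actual create would happen here ... */
out_unlock:
	toy_done_lookup_and_lock(dir, child);
out:
	return err;
}
```

Whatever path is taken, the directory lock depth returns to zero, which is the property each converted caller has to preserve.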
Signed-off-by: NeilBrown --- fs/nfsd/nfsproc.c | 12 ++++----- fs/nfsd/vfs.c | 67 +++++++++++++++-------------------------------- 2 files changed, 27 insertions(+), 52 deletions(-) diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index 6dda081eb24c..11ad710a0853 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -311,17 +311,16 @@ nfsd_proc_create(struct svc_rqst *rqstp) goto done; } - inode_lock_nested(dirfhp->fh_dentry->d_inode, I_MUTEX_PARENT); - dchild = lookup_one_len(argp->name, dirfhp->fh_dentry, argp->len); + dchild = lookup_and_lock_one(NULL, argp->name, argp->len, + dirfhp->fh_dentry, LOOKUP_CREATE); if (IS_ERR(dchild)) { resp->status = nfserrno(PTR_ERR(dchild)); - goto out_unlock; + goto put_write; } fh_init(newfhp, NFS_FHSIZE); resp->status = fh_compose(newfhp, dirfhp->fh_export, dchild, dirfhp); if (!resp->status && d_really_is_negative(dchild)) resp->status = nfserr_noent; - dput(dchild); if (resp->status) { if (resp->status != nfserr_noent) goto out_unlock; @@ -331,7 +330,7 @@ nfsd_proc_create(struct svc_rqst *rqstp) */ resp->status = nfserr_acces; if (!newfhp->fh_dentry) { - printk(KERN_WARNING + printk(KERN_WARNING "nfsd_proc_create: file handle not verified\n"); goto out_unlock; } @@ -427,7 +426,8 @@ nfsd_proc_create(struct svc_rqst *rqstp) } out_unlock: - inode_unlock(dirfhp->fh_dentry->d_inode); + done_lookup_and_lock(dirfhp->fh_dentry, dchild); +put_write: fh_drop_write(dirfhp); done: fh_put(dirfhp); diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 740332413138..011fb68bfa4b 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1551,19 +1551,13 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, if (host_err) return nfserrno(host_err); - inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT); - dchild = lookup_one_len(fname, dentry, flen); + dchild = lookup_and_lock_one(NULL, fname, flen, dentry, LOOKUP_CREATE); host_err = PTR_ERR(dchild); if (IS_ERR(dchild)) { err = nfserrno(host_err); - goto out_unlock; + goto out; } err = 
fh_compose(resfhp, fhp->fh_export, dchild, fhp); - /* - * We unconditionally drop our ref to dchild as fh_compose will have - * already grabbed its own ref for it. - */ - dput(dchild); if (err) goto out_unlock; err = fh_fill_pre_attrs(fhp); @@ -1572,7 +1566,8 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp, err = nfsd_create_locked(rqstp, fhp, attrs, type, rdev, resfhp); fh_fill_post_attrs(fhp); out_unlock: - inode_unlock(dentry->d_inode); + done_lookup_and_lock(dentry, dchild); +out: return err; } @@ -1656,8 +1651,7 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, } dentry = fhp->fh_dentry; - inode_lock_nested(dentry->d_inode, I_MUTEX_PARENT); - dnew = lookup_one_len(fname, dentry, flen); + dnew = lookup_and_lock_one(NULL, fname, flen, dentry, LOOKUP_CREATE); if (IS_ERR(dnew)) { err = nfserrno(PTR_ERR(dnew)); inode_unlock(dentry->d_inode); @@ -1673,11 +1667,11 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, nfsd_create_setattr(rqstp, fhp, resfhp, attrs); fh_fill_post_attrs(fhp); out_unlock: - inode_unlock(dentry->d_inode); + done_lookup_and_lock(dentry, dnew); if (!err) err = nfserrno(commit_metadata(fhp)); - dput(dnew); - if (err==0) err = cerr; + if (err==0) + err = cerr; out_drop_write: fh_drop_write(fhp); out: @@ -1721,43 +1715,35 @@ nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp, ddir = ffhp->fh_dentry; dirp = d_inode(ddir); - inode_lock_nested(dirp, I_MUTEX_PARENT); - - dnew = lookup_one_len(name, ddir, len); + dnew = lookup_and_lock_one(NULL, name, len, ddir, LOOKUP_CREATE); if (IS_ERR(dnew)) { - err = nfserrno(PTR_ERR(dnew)); - goto out_unlock; + err = PTR_ERR(dnew); + goto out_drop_write; } dold = tfhp->fh_dentry; err = nfserr_noent; if (d_really_is_negative(dold)) - goto out_dput; + goto out_unlock; err = fh_fill_pre_attrs(ffhp); if (err != nfs_ok) - goto out_dput; + goto out_unlock; host_err = vfs_link(dold, &nop_mnt_idmap, dirp, dnew, NULL); fh_fill_post_attrs(ffhp); - inode_unlock(dirp); - if (!host_err) { 
+out_unlock: + done_lookup_and_lock(ddir, dnew); + if (!err && !host_err) { err = nfserrno(commit_metadata(ffhp)); if (!err) err = nfserrno(commit_metadata(tfhp)); - } else { + } else if (!err) { err = nfserrno(host_err); } - dput(dnew); out_drop_write: fh_drop_write(tfhp); out: return err; - -out_dput: - dput(dnew); -out_unlock: - inode_unlock(dirp); - goto out_drop_write; } static void @@ -1943,18 +1929,11 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, dentry = fhp->fh_dentry; dirp = d_inode(dentry); - inode_lock_nested(dirp, I_MUTEX_PARENT); - - rdentry = lookup_one_len(fname, dentry, flen); + rdentry = lookup_and_lock_one(NULL, fname, flen, dentry, 0); host_err = PTR_ERR(rdentry); if (IS_ERR(rdentry)) - goto out_unlock; + goto out_drop_write; - if (d_really_is_negative(rdentry)) { - dput(rdentry); - host_err = -ENOENT; - goto out_unlock; - } rinode = d_inode(rdentry); err = fh_fill_pre_attrs(fhp); if (err != nfs_ok) @@ -1981,11 +1960,10 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, host_err = vfs_rmdir(&nop_mnt_idmap, dirp, rdentry); } fh_fill_post_attrs(fhp); - - inode_unlock(dirp); +out_unlock: + done_lookup_and_lock(dentry, rdentry); if (!host_err) host_err = commit_metadata(fhp); - dput(rdentry); iput(rinode); /* truncate the inode here */ out_drop_write: @@ -2001,9 +1979,6 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, } out: return err; -out_unlock: - inode_unlock(dirp); - goto out_drop_write; } /*