diff mbox series

[2/2] VFS: add common error checks to lookup_one_qstr_excl()

Message ID 20250207034040.3402438-3-neilb@suse.de (mailing list archive)
State Handled Elsewhere
Headers show
Series VFS: minor improvements to a couple of interfaces | expand

Commit Message

NeilBrown Feb. 7, 2025, 3:36 a.m. UTC
Callers of lookup_one_qstr_excl() often check if the result is negative or
positive.
These changes can easily be moved into lookup_one_qstr_excl() by checking the
lookup flags:
LOOKUP_CREATE means it is NOT an error if the name doesn't exist.
LOOKUP_EXCL means it IS an error if the name DOES exist.

This patch adds these checks, then removes error checks from callers,
and ensures that appropriate flags are passed.

This subtly changes the meaning of LOOKUP_EXCL.  Previously it could
only accompany LOOKUP_CREATE.  Now it can accompany LOOKUP_RENAME_TARGET
as well.  A couple of small changes are needed to accommodate this.  The
NFS change is functionally a no-op but ensures nfs_is_exclusive_create() does
exactly what the name says.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/namei.c          | 59 +++++++++++++++------------------------------
 fs/nfs/dir.c        |  3 ++-
 fs/smb/server/vfs.c | 26 ++++++++------------
 3 files changed, 31 insertions(+), 57 deletions(-)

Comments

Oliver Sang Feb. 12, 2025, 2:50 a.m. UTC | #1
Hello,

kernel test robot noticed "BUG:unable_to_handle_page_fault_for_address" on:

commit: 9a292bc4cbb25ca84f90dbacdf3064a9d6e7804f ("[PATCH 2/2] VFS: add common error checks to lookup_one_qstr_excl()")
url: https://github.com/intel-lab-lkp/linux/commits/NeilBrown/VFS-change-kern_path_locked-and-user_path_locked_at-to-never-return-negative-dentry/20250207-185417
base: https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.all
patch link: https://lore.kernel.org/all/20250207034040.3402438-3-neilb@suse.de/
patch subject: [PATCH 2/2] VFS: add common error checks to lookup_one_qstr_excl()

in testcase: trinity
version: 
with following parameters:

	runtime: 300s
	group: group-01
	nr_groups: 5



config: i386-randconfig-053-20250208
compiler: clang-19
test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202502121009.f7bc8e67-lkp@intel.com


[  163.118842][ T4188] BUG: unable to handle page fault for address: fffffffe
[  163.119485][ T4188] #PF: supervisor read access in kernel mode
[  163.120015][ T4188] #PF: error_code(0x0000) - not-present page
[  163.120523][ T4188] *pde = 026d3067 *pte = 00000000
[  163.120971][ T4188] Oops: Oops: 0000 [#1]
[  163.121339][ T4188] CPU: 0 UID: 65534 PID: 4188 Comm: trinity-c3 Tainted: G S                 6.14.0-rc1-00084-g9a292bc4cbb2 #1
[  163.122321][ T4188] Tainted: [S]=CPU_OUT_OF_SPEC
[  163.122717][ T4188] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 163.123520][ T4188] EIP: lookup_one_qstr_excl (include/linux/dcache.h:416 include/linux/dcache.h:421 include/linux/dcache.h:467 fs/namei.c:1696) 
[ 163.123973][ T4188] Code: 5e 89 d8 89 fa e8 62 ed 00 00 85 c0 74 58 8b 7e 18 89 c2 89 f0 89 d6 8b 4d f0 ff 17 85 c0 75 4d 89 f0 8b 75 f0 b9 00 00 38 00 <23> 08 89 f2 81 e2 00 00 02 00 09 ca 74 13 f7 c6 00 00 04 00 74 17
All code
========
   0:	5e                   	pop    %rsi
   1:	89 d8                	mov    %ebx,%eax
   3:	89 fa                	mov    %edi,%edx
   5:	e8 62 ed 00 00       	call   0xed6c
   a:	85 c0                	test   %eax,%eax
   c:	74 58                	je     0x66
   e:	8b 7e 18             	mov    0x18(%rsi),%edi
  11:	89 c2                	mov    %eax,%edx
  13:	89 f0                	mov    %esi,%eax
  15:	89 d6                	mov    %edx,%esi
  17:	8b 4d f0             	mov    -0x10(%rbp),%ecx
  1a:	ff 17                	call   *(%rdi)
  1c:	85 c0                	test   %eax,%eax
  1e:	75 4d                	jne    0x6d
  20:	89 f0                	mov    %esi,%eax
  22:	8b 75 f0             	mov    -0x10(%rbp),%esi
  25:	b9 00 00 38 00       	mov    $0x380000,%ecx
  2a:*	23 08                	and    (%rax),%ecx		<-- trapping instruction
  2c:	89 f2                	mov    %esi,%edx
  2e:	81 e2 00 00 02 00    	and    $0x20000,%edx
  34:	09 ca                	or     %ecx,%edx
  36:	74 13                	je     0x4b
  38:	f7 c6 00 00 04 00    	test   $0x40000,%esi
  3e:	74 17                	je     0x57

Code starting with the faulting instruction
===========================================
   0:	23 08                	and    (%rax),%ecx
   2:	89 f2                	mov    %esi,%edx
   4:	81 e2 00 00 02 00    	and    $0x20000,%edx
   a:	09 ca                	or     %ecx,%edx
   c:	74 13                	je     0x21
   e:	f7 c6 00 00 04 00    	test   $0x40000,%esi
  14:	74 17                	je     0x2d
[  163.125537][ T4188] EAX: fffffffe EBX: c3e7e540 ECX: 00380000 EDX: 273874b2
[  163.126145][ T4188] ESI: 00060000 EDI: fffffffe EBP: ebef9ef4 ESP: ebef9edc
[  163.126716][ T4188] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010246
[  163.127324][ T4188] CR0: 80050033 CR2: fffffffe CR3: 2b2df000 CR4: 00040690
[  163.127872][ T4188] Call Trace:
[ 163.128178][ T4188] ? __die_body (arch/x86/kernel/dumpstack.c:478 arch/x86/kernel/dumpstack.c:420) 
[ 163.128552][ T4188] ? __die (arch/x86/kernel/dumpstack.c:434) 
[ 163.128898][ T4188] ? page_fault_oops (arch/x86/mm/fault.c:710) 
[ 163.129329][ T4188] ? lock_acquire (kernel/locking/lockdep.c:5851) 
[ 163.129723][ T4188] ? kernelmode_fixup_or_oops (arch/x86/mm/fault.c:737) 
[ 163.130222][ T4188] ? __bad_area_nosemaphore (arch/x86/mm/fault.c:784) 
[ 163.130687][ T4188] ? bad_area_nosemaphore (arch/x86/mm/fault.c:833) 
[ 163.131129][ T4188] ? do_kern_addr_fault (arch/x86/mm/fault.c:1197) 
[ 163.131569][ T4188] ? exc_page_fault (arch/x86/mm/fault.c:1479) 
[ 163.132004][ T4188] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:? kernel/locking/lockdep.c:4408) 
[ 163.132482][ T4188] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
[ 163.132999][ T4188] ? handle_exception (arch/x86/entry/entry_32.S:1048) 
[ 163.133429][ T4188] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
[ 163.133946][ T4188] ? lookup_one_qstr_excl (include/linux/dcache.h:416 include/linux/dcache.h:421 include/linux/dcache.h:467 fs/namei.c:1696) 
[ 163.134390][ T4188] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1493) 
[ 163.134919][ T4188] ? lookup_one_qstr_excl (include/linux/dcache.h:416 include/linux/dcache.h:421 include/linux/dcache.h:467 fs/namei.c:1696) 
[ 163.135354][ T4188] filename_create (include/linux/err.h:67 fs/namei.c:4091) 
[ 163.135763][ T4188] do_symlinkat (fs/namei.c:4676) 
[ 163.136166][ T4188] __ia32_sys_symlinkat (fs/namei.c:4696) 
[ 163.136593][ T4188] ia32_sys_call (arch/x86/entry/syscall_32.c:44) 
[ 163.136967][ T4188] __do_fast_syscall_32 (arch/x86/entry/common.c:?) 
[ 163.137377][ T4188] do_fast_syscall_32 (arch/x86/entry/common.c:411) 
[ 163.137774][ T4188] do_SYSENTER_32 (arch/x86/entry/common.c:449) 
[ 163.138146][ T4188] entry_SYSENTER_32 (arch/x86/entry/entry_32.S:836) 
[  163.138542][ T4188] EIP: 0xb7f61539
[ 163.138845][ T4188] Code: 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 0f 1f 00 58 b8 77 00 00 00 cd 80 90 0f 1f
All code
========
   0:	03 74 b4 01          	add    0x1(%rsp,%rsi,4),%esi
   4:	10 07                	adc    %al,(%rdi)
   6:	03 74 b0 01          	add    0x1(%rax,%rsi,4),%esi
   a:	10 08                	adc    %cl,(%rax)
   c:	03 74 d8 01          	add    0x1(%rax,%rbx,8),%esi
	...
  20:	00 51 52             	add    %dl,0x52(%rcx)
  23:	55                   	push   %rbp
  24:*	89 e5                	mov    %esp,%ebp		<-- trapping instruction
  26:	0f 34                	sysenter
  28:	cd 80                	int    $0x80
  2a:	5d                   	pop    %rbp
  2b:	5a                   	pop    %rdx
  2c:	59                   	pop    %rcx
  2d:	c3                   	ret
  2e:	90                   	nop
  2f:	90                   	nop
  30:	90                   	nop
  31:	90                   	nop
  32:	0f 1f 00             	nopl   (%rax)
  35:	58                   	pop    %rax
  36:	b8 77 00 00 00       	mov    $0x77,%eax
  3b:	cd 80                	int    $0x80
  3d:	90                   	nop
  3e:	0f                   	.byte 0xf
  3f:	1f                   	(bad)

Code starting with the faulting instruction
===========================================
   0:	5d                   	pop    %rbp
   1:	5a                   	pop    %rdx
   2:	59                   	pop    %rcx
   3:	c3                   	ret
   4:	90                   	nop
   5:	90                   	nop
   6:	90                   	nop
   7:	90                   	nop
   8:	0f 1f 00             	nopl   (%rax)
   b:	58                   	pop    %rax
   c:	b8 77 00 00 00       	mov    $0x77,%eax
  11:	cd 80                	int    $0x80
  13:	90                   	nop
  14:	0f                   	.byte 0xf
  15:	1f                   	(bad)


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250212/202502121009.f7bc8e67-lkp@intel.com
Al Viro Feb. 12, 2025, 3:16 a.m. UTC | #2
On Fri, Feb 07, 2025 at 02:36:48PM +1100, NeilBrown wrote:
> @@ -1690,6 +1692,15 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
>  		dput(dentry);
>  		dentry = old;
>  	}
> +found:

... and if ->lookup() returns an error, this will blow up (as bot has just
reported).

> +	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
> +		dput(dentry);
> +		return ERR_PTR(-ENOENT);
> +	}
> +	if (d_is_positive(dentry) && (flags & LOOKUP_EXCL)) {
> +		dput(dentry);
> +		return ERR_PTR(-EEXIST);
> +	}


> @@ -4077,27 +4084,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
>  	 * '/', and a directory wasn't requested.
>  	 */
>  	if (last.name[last.len] && !want_dir)
> -		create_flags = 0;
> +		create_flags &= ~LOOKUP_CREATE;

See the patch I've posted in earlier thread; the entire "strip LOOKUP_CREATE"
thing is wrong.
Al Viro Feb. 12, 2025, 3:25 a.m. UTC | #3
On Wed, Feb 12, 2025 at 03:16:08AM +0000, Al Viro wrote:
> On Fri, Feb 07, 2025 at 02:36:48PM +1100, NeilBrown wrote:
> > @@ -1690,6 +1692,15 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
> >  		dput(dentry);
> >  		dentry = old;
> >  	}
> > +found:
> 
> ... and if ->lookup() returns an error, this will blow up (as bot has just
> reported).
> 
> > +	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
> > +		dput(dentry);
> > +		return ERR_PTR(-ENOENT);
> > +	}
> > +	if (d_is_positive(dentry) && (flags & LOOKUP_EXCL)) {
> > +		dput(dentry);
> > +		return ERR_PTR(-EEXIST);
> > +	}
> 
> 
> > @@ -4077,27 +4084,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
> >  	 * '/', and a directory wasn't requested.
> >  	 */
> >  	if (last.name[last.len] && !want_dir)
> > -		create_flags = 0;
> > +		create_flags &= ~LOOKUP_CREATE;
> 
> See the patch I've posted in earlier thread; the entire "strip LOOKUP_CREATE"
> thing is wrong.

On top of mainline that's

filename_create(): don't force handling trailing slashes into the common path

Only mkdir accepts pathnames that end with / - anything like mknod() (symlink(),
etc.) always fails on those.  Don't try to force that the common codepath -
all we are doing is a lookup and check for existence to determine which
error should it be.  Do that before bothering with mnt_want_write(), etc.;
as far as underlying filesystem is concerned it's just a lookup.  Simplifies
the normal codepath and kills the lookup intent dependency on more than
the call site.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/namei.c b/fs/namei.c
index 3ab9440c5b93..6189e54f767a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4054,13 +4054,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	struct dentry *dentry = ERR_PTR(-EEXIST);
 	struct qstr last;
 	bool want_dir = lookup_flags & LOOKUP_DIRECTORY;
-	unsigned int reval_flag = lookup_flags & LOOKUP_REVAL;
-	unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL;
 	int type;
 	int err2;
 	int error;
 
-	error = filename_parentat(dfd, name, reval_flag, path, &last, &type);
+	lookup_flags &= LOOKUP_REVAL;
+
+	error = filename_parentat(dfd, name, lookup_flags, path, &last, &type);
 	if (error)
 		return ERR_PTR(error);
 
@@ -4070,18 +4070,28 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	 */
 	if (unlikely(type != LAST_NORM))
 		goto out;
+	/*
+	 * mkdir foo/bar/ is OK, but for anything else a slash in the end
+	 * is always an error; the only question is which one.
+	 */
+	if (unlikely(last.name[last.len] && !want_dir)) {
+		dentry = lookup_dcache(&last, path->dentry, lookup_flags);
+		if (!dentry)
+			dentry = lookup_slow(&last, path->dentry, lookup_flags);
+		if (!IS_ERR(dentry)) {
+			error = d_is_positive(dentry) ? -EEXIST : -ENOENT;
+			dput(dentry);
+			dentry = ERR_PTR(error);
+		}
+		goto out;
+	}
 
 	/* don't fail immediately if it's r/o, at least try to report other errors */
 	err2 = mnt_want_write(path->mnt);
-	/*
-	 * Do the final lookup.  Suppress 'create' if there is a trailing
-	 * '/', and a directory wasn't requested.
-	 */
-	if (last.name[last.len] && !want_dir)
-		create_flags = 0;
+	/* do the final lookup */
 	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
 	dentry = lookup_one_qstr_excl(&last, path->dentry,
-				      reval_flag | create_flags);
+				lookup_flags | LOOKUP_CREATE | LOOKUP_EXCL);
 	if (IS_ERR(dentry))
 		goto unlock;
 
@@ -4089,16 +4099,6 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	if (d_is_positive(dentry))
 		goto fail;
 
-	/*
-	 * Special case - lookup gave negative, but... we had foo/bar/
-	 * From the vfs_mknod() POV we just have a negative dentry -
-	 * all is fine. Let's be bastards - you had / on the end, you've
-	 * been asking for (non-existent) directory. -ENOENT for you.
-	 */
-	if (unlikely(!create_flags)) {
-		error = -ENOENT;
-		goto fail;
-	}
 	if (unlikely(err2)) {
 		error = err2;
 		goto fail;
NeilBrown Feb. 12, 2025, 3:45 a.m. UTC | #4
On Wed, 12 Feb 2025, Al Viro wrote:
> On Wed, Feb 12, 2025 at 03:16:08AM +0000, Al Viro wrote:
> > On Fri, Feb 07, 2025 at 02:36:48PM +1100, NeilBrown wrote:
> > > @@ -1690,6 +1692,15 @@ struct dentry *lookup_one_qstr_excl(const struct qstr *name,
> > >  		dput(dentry);
> > >  		dentry = old;
> > >  	}
> > > +found:
> > 
> > ... and if ->lookup() returns an error, this will blow up (as bot has just
> > reported).

Yes, I need an early exit if (IS_ERR(dentry)).  Thanks.

> > 
> > > +	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
> > > +		dput(dentry);
> > > +		return ERR_PTR(-ENOENT);
> > > +	}
> > > +	if (d_is_positive(dentry) && (flags & LOOKUP_EXCL)) {
> > > +		dput(dentry);
> > > +		return ERR_PTR(-EEXIST);
> > > +	}
> > 
> > 
> > > @@ -4077,27 +4084,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
> > >  	 * '/', and a directory wasn't requested.
> > >  	 */
> > >  	if (last.name[last.len] && !want_dir)
> > > -		create_flags = 0;
> > > +		create_flags &= ~LOOKUP_CREATE;
> > 
> > See the patch I've posted in earlier thread; the entire "strip LOOKUP_CREATE"
> > thing is wrong.
> 
> On top of mainline that's
> 
> filename_create(): don't force handling trailing slashes into the common path
> 
> Only mkdir accepts pathnames that end with / - anything like mknod() (symlink(),
> etc.) always fails on those.  Don't try to force that the common codepath -
> all we are doing is a lookup and check for existence to determine which
> error should it be.  Do that before bothering with mnt_want_write(), etc.;
> as far as underlying filesystem is concerned it's just a lookup.  Simplifies
> the normal codepath and kills the lookup intent dependency on more than
> the call site.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/fs/namei.c b/fs/namei.c
> index 3ab9440c5b93..6189e54f767a 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4054,13 +4054,13 @@ static struct dentry *filename_create(int dfd, struct filename *name,
>  	struct dentry *dentry = ERR_PTR(-EEXIST);
>  	struct qstr last;
>  	bool want_dir = lookup_flags & LOOKUP_DIRECTORY;
> -	unsigned int reval_flag = lookup_flags & LOOKUP_REVAL;
> -	unsigned int create_flags = LOOKUP_CREATE | LOOKUP_EXCL;
>  	int type;
>  	int err2;
>  	int error;
>  
> -	error = filename_parentat(dfd, name, reval_flag, path, &last, &type);
> +	lookup_flags &= LOOKUP_REVAL;
> +
> +	error = filename_parentat(dfd, name, lookup_flags, path, &last, &type);
>  	if (error)
>  		return ERR_PTR(error);
>  
> @@ -4070,18 +4070,28 @@ static struct dentry *filename_create(int dfd, struct filename *name,
>  	 */
>  	if (unlikely(type != LAST_NORM))
>  		goto out;
> +	/*
> +	 * mkdir foo/bar/ is OK, but for anything else a slash in the end
> +	 * is always an error; the only question is which one.
> +	 */
> +	if (unlikely(last.name[last.len] && !want_dir)) {
> +		dentry = lookup_dcache(&last, path->dentry, lookup_flags);
> +		if (!dentry)
> +			dentry = lookup_slow(&last, path->dentry, lookup_flags);

I do see some value in the simplicity of this approach, though maybe not
as much value as you see.  But the above uses inode_lock_share(), rather
than the nested version, so lockdep will complain.
If you open-code a nested lock, or write a new helper, you get very
close to the sequence for calling lookup_one_qstr_excl() below.  So
it isn't clear to me that the benefit is worth the cost.

This current code in filename_create isn't actually wrong is it?

Thanks,
NeilBrown



> +		if (!IS_ERR(dentry)) {
> +			error = d_is_positive(dentry) ? -EEXIST : -ENOENT;
> +			dput(dentry);
> +			dentry = ERR_PTR(error);
> +		}
> +		goto out;
> +	}
>  
>  	/* don't fail immediately if it's r/o, at least try to report other errors */
>  	err2 = mnt_want_write(path->mnt);
> -	/*
> -	 * Do the final lookup.  Suppress 'create' if there is a trailing
> -	 * '/', and a directory wasn't requested.
> -	 */
> -	if (last.name[last.len] && !want_dir)
> -		create_flags = 0;
> +	/* do the final lookup */
>  	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
>  	dentry = lookup_one_qstr_excl(&last, path->dentry,
> -				      reval_flag | create_flags);
> +				lookup_flags | LOOKUP_CREATE | LOOKUP_EXCL);
>  	if (IS_ERR(dentry))
>  		goto unlock;
>  
> @@ -4089,16 +4099,6 @@ static struct dentry *filename_create(int dfd, struct filename *name,
>  	if (d_is_positive(dentry))
>  		goto fail;
>  
> -	/*
> -	 * Special case - lookup gave negative, but... we had foo/bar/
> -	 * From the vfs_mknod() POV we just have a negative dentry -
> -	 * all is fine. Let's be bastards - you had / on the end, you've
> -	 * been asking for (non-existent) directory. -ENOENT for you.
> -	 */
> -	if (unlikely(!create_flags)) {
> -		error = -ENOENT;
> -		goto fail;
> -	}
>  	if (unlikely(err2)) {
>  		error = err2;
>  		goto fail;
>
Al Viro Feb. 12, 2025, 4:06 a.m. UTC | #5
On Wed, Feb 12, 2025 at 02:45:04PM +1100, NeilBrown wrote:
> On Wed, 12 Feb 2025, Al Viro wrote:

> I do see some value in the simplicity of this approach, though maybe not
> as much value as you see.  But the above uses inode_lock_share(), rather
> than the nested version, so lockdep will complain.

IDGI...  It doesn't grab any ->i_rwsem inside that one, so what's there to
complain about?  And in that case it returns with no locks held, so...
NeilBrown Feb. 12, 2025, 4:40 a.m. UTC | #6
On Wed, 12 Feb 2025, Al Viro wrote:
> On Wed, Feb 12, 2025 at 02:45:04PM +1100, NeilBrown wrote:
> > On Wed, 12 Feb 2025, Al Viro wrote:
> 
> > I do see some value in the simplicity of this approach, though maybe not
> > as much value as you see.  But the above uses inode_lock_share(), rather
> > than the nested version, so lockdep will complain.
> 
> IDGI...  It doesn't grab any ->i_rwsem inside that one, so what's there to
> complain about?  And in that case it returns with no locks held, so...
> 

Sorry - my bad.  I saw the difference in nesting and jumped the wrong
way.

NeilBrown
diff mbox series

Patch

diff --git a/fs/namei.c b/fs/namei.c
index e3047db7b2b4..e0527f4b0586 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1665,6 +1665,8 @@  static struct dentry *lookup_dcache(const struct qstr *name,
  * dentries - as the matter of fact, this only gets called
  * when directory is guaranteed to have no in-lookup children
  * at all.
+ * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed.
+ * Will return -EEXIST if name is found and LOOKUP_EXCL was passed.
  */
 struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 				    struct dentry *base,
@@ -1675,7 +1677,7 @@  struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 	struct inode *dir = base->d_inode;
 
 	if (dentry)
-		return dentry;
+		goto found;
 
 	/* Don't create child dentry for a dead directory. */
 	if (unlikely(IS_DEADDIR(dir)))
@@ -1690,6 +1692,15 @@  struct dentry *lookup_one_qstr_excl(const struct qstr *name,
 		dput(dentry);
 		dentry = old;
 	}
+found:
+	if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) {
+		dput(dentry);
+		return ERR_PTR(-ENOENT);
+	}
+	if (d_is_positive(dentry) && (flags & LOOKUP_EXCL)) {
+		dput(dentry);
+		return ERR_PTR(-EEXIST);
+	}
 	return dentry;
 }
 EXPORT_SYMBOL(lookup_one_qstr_excl);
@@ -2736,10 +2747,6 @@  static struct dentry *__kern_path_locked(int dfd, struct filename *name, struct
 	}
 	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
 	d = lookup_one_qstr_excl(&last, path->dentry, 0);
-	if (!IS_ERR(d) && d_is_negative(d)) {
-		dput(d);
-		d = ERR_PTR(-ENOENT);
-	}
 	if (IS_ERR(d)) {
 		inode_unlock(path->dentry->d_inode);
 		path_put(path);
@@ -4077,27 +4084,13 @@  static struct dentry *filename_create(int dfd, struct filename *name,
 	 * '/', and a directory wasn't requested.
 	 */
 	if (last.name[last.len] && !want_dir)
-		create_flags = 0;
+		create_flags &= ~LOOKUP_CREATE;
 	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
 	dentry = lookup_one_qstr_excl(&last, path->dentry,
 				      reval_flag | create_flags);
 	if (IS_ERR(dentry))
 		goto unlock;
 
-	error = -EEXIST;
-	if (d_is_positive(dentry))
-		goto fail;
-
-	/*
-	 * Special case - lookup gave negative, but... we had foo/bar/
-	 * From the vfs_mknod() POV we just have a negative dentry -
-	 * all is fine. Let's be bastards - you had / on the end, you've
-	 * been asking for (non-existent) directory. -ENOENT for you.
-	 */
-	if (unlikely(!create_flags)) {
-		error = -ENOENT;
-		goto fail;
-	}
 	if (unlikely(err2)) {
 		error = err2;
 		goto fail;
@@ -4444,10 +4437,6 @@  int do_rmdir(int dfd, struct filename *name)
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto exit3;
-	if (!dentry->d_inode) {
-		error = -ENOENT;
-		goto exit4;
-	}
 	error = security_path_rmdir(&path, dentry);
 	if (error)
 		goto exit4;
@@ -4578,7 +4567,7 @@  int do_unlinkat(int dfd, struct filename *name)
 	if (!IS_ERR(dentry)) {
 
 		/* Why not before? Because we want correct error value */
-		if (last.name[last.len] || d_is_negative(dentry))
+		if (last.name[last.len])
 			goto slashes;
 		inode = dentry->d_inode;
 		ihold(inode);
@@ -4612,9 +4601,7 @@  int do_unlinkat(int dfd, struct filename *name)
 	return error;
 
 slashes:
-	if (d_is_negative(dentry))
-		error = -ENOENT;
-	else if (d_is_dir(dentry))
+	if (d_is_dir(dentry))
 		error = -EISDIR;
 	else
 		error = -ENOTDIR;
@@ -5114,7 +5101,8 @@  int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	struct qstr old_last, new_last;
 	int old_type, new_type;
 	struct inode *delegated_inode = NULL;
-	unsigned int lookup_flags = 0, target_flags = LOOKUP_RENAME_TARGET;
+	unsigned int lookup_flags = 0, target_flags =
+		LOOKUP_RENAME_TARGET | LOOKUP_CREATE;
 	bool should_retry = false;
 	int error = -EINVAL;
 
@@ -5127,6 +5115,8 @@  int do_renameat2(int olddfd, struct filename *from, int newdfd,
 
 	if (flags & RENAME_EXCHANGE)
 		target_flags = 0;
+	if (flags & RENAME_NOREPLACE)
+		target_flags |= LOOKUP_EXCL;
 
 retry:
 	error = filename_parentat(olddfd, from, lookup_flags, &old_path,
@@ -5168,23 +5158,12 @@  int do_renameat2(int olddfd, struct filename *from, int newdfd,
 	error = PTR_ERR(old_dentry);
 	if (IS_ERR(old_dentry))
 		goto exit3;
-	/* source must exist */
-	error = -ENOENT;
-	if (d_is_negative(old_dentry))
-		goto exit4;
 	new_dentry = lookup_one_qstr_excl(&new_last, new_path.dentry,
 					  lookup_flags | target_flags);
 	error = PTR_ERR(new_dentry);
 	if (IS_ERR(new_dentry))
 		goto exit4;
-	error = -EEXIST;
-	if ((flags & RENAME_NOREPLACE) && d_is_positive(new_dentry))
-		goto exit5;
 	if (flags & RENAME_EXCHANGE) {
-		error = -ENOENT;
-		if (d_is_negative(new_dentry))
-			goto exit5;
-
 		if (!d_is_dir(new_dentry)) {
 			error = -ENOTDIR;
 			if (new_last.name[new_last.len])
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 2b04038b0e40..56cf16a72334 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1532,7 +1532,8 @@  static int nfs_is_exclusive_create(struct inode *dir, unsigned int flags)
 {
 	if (NFS_PROTO(dir)->version == 2)
 		return 0;
-	return flags & LOOKUP_EXCL;
+	return (flags & (LOOKUP_CREATE | LOOKUP_EXCL)) ==
+		(LOOKUP_CREATE | LOOKUP_EXCL);
 }
 
 /*
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 6890016e1923..fe29acef5872 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -113,11 +113,6 @@  static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
 	if (IS_ERR(d))
 		goto err_out;
 
-	if (d_is_negative(d)) {
-		dput(d);
-		goto err_out;
-	}
-
 	path->dentry = d;
 	path->mnt = mntget(parent_path->mnt);
 
@@ -693,6 +688,7 @@  int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 	struct ksmbd_file *parent_fp;
 	int new_type;
 	int err, lookup_flags = LOOKUP_NO_SYMLINKS;
+	int target_lookup_flags = LOOKUP_RENAME_TARGET;
 
 	if (ksmbd_override_fsids(work))
 		return -ENOMEM;
@@ -703,6 +699,14 @@  int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 		goto revert_fsids;
 	}
 
+	/*
+	 * explicitly handle file overwrite case, for compatibility with
+	 * filesystems that may not support rename flags (e.g: fuse)
+	 */
+	if (flags & RENAME_NOREPLACE)
+		target_lookup_flags |= LOOKUP_EXCL;
+	flags &= ~(RENAME_NOREPLACE);
+
 retry:
 	err = vfs_path_parent_lookup(to, lookup_flags | LOOKUP_BENEATH,
 				     &new_path, &new_last, &new_type,
@@ -743,7 +747,7 @@  int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 	}
 
 	new_dentry = lookup_one_qstr_excl(&new_last, new_path.dentry,
-					  lookup_flags | LOOKUP_RENAME_TARGET);
+					  lookup_flags | target_lookup_flags);
 	if (IS_ERR(new_dentry)) {
 		err = PTR_ERR(new_dentry);
 		goto out3;
@@ -754,16 +758,6 @@  int ksmbd_vfs_rename(struct ksmbd_work *work, const struct path *old_path,
 		goto out4;
 	}
 
-	/*
-	 * explicitly handle file overwrite case, for compatibility with
-	 * filesystems that may not support rename flags (e.g: fuse)
-	 */
-	if ((flags & RENAME_NOREPLACE) && d_is_positive(new_dentry)) {
-		err = -EEXIST;
-		goto out4;
-	}
-	flags &= ~(RENAME_NOREPLACE);
-
 	if (old_child == trap) {
 		err = -EINVAL;
 		goto out4;