Message ID | 20220714200434.161818-1-jlayton@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | nfsd: close potential race between open and delegation | expand |
On Fri, 15 Jul 2022, Jeff Layton wrote: > This is a respin of the patchset that I sent earlier today. I hit a > deadlock with that one because of the ambiguous locking. > > This series is based on top of Neil's set entitled: > > [PATCH 0/8] NFSD: clean up locking. > > His patchset makes the locking in the nfsd4_open codepath much more > consistent, and this becomes a lot simpler to fix. Without that set > however, the state of the parent's i_rwsem is unclear after nfsd_lookup > is called, and I don't see a way to determine it reliably. I haven't examined these patch very closely, but a few initial thoughts are: 1/ Before my series, you can unambiguously tell if i_rwsem is held by checking fhp->fh_locked. In fact, just call "fh_lock()", and you can then be sure the fh is locked, whether or not it was locked before however... 2/ Do we really need to lock the parent? If a rename or unlink happens after the lease was taken, the lease will be broken. So take lease. repeat lookup (locklessly) Check if lease has been broken Should provide all you need. You don't *need* to lock the directory to open an existing file and with my pending parallel-updates patch set, you only need a shared lock on the directory to create a file. So I'd rather not be locking the directory at all to get a delegation 3/ When you vet the name you only do a lookup_one_len(), while nfsd_lookup_dentry() also calls nfsd_cross_mnt() as it is possible for a file to be mounted on. That means that if I did bind mount one file over another and export over NFSD, the file will never offer a delegation. This is a minor point, but I think it would be best to be as correct and consistent as possible. Thanks for working on this! NeilBrown > > Jeff Layton (2): > nfsd: drop fh argument from alloc_init_deleg > nfsd: vet the opened dentry after setting a delegation > > fs/nfsd/nfs4state.c | 54 +++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 45 insertions(+), 9 deletions(-) > > -- > 2.36.1 > >
On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote: > On Fri, 15 Jul 2022, Jeff Layton wrote: > > This is a respin of the patchset that I sent earlier today. I hit a > > deadlock with that one because of the ambiguous locking. > > > > This series is based on top of Neil's set entitled: > > > > [PATCH 0/8] NFSD: clean up locking. > > > > His patchset makes the locking in the nfsd4_open codepath much more > > consistent, and this becomes a lot simpler to fix. Without that set > > however, the state of the parent's i_rwsem is unclear after nfsd_lookup > > is called, and I don't see a way to determine it reliably. > > I haven't examined these patch very closely, but a few initial thoughts > are: > > 1/ Before my series, you can unambiguously tell if i_rwsem is held by > checking fhp->fh_locked. In fact, just call "fh_lock()", and you can > then be sure the fh is locked, whether or not it was locked before Thanks, good to know. I wasn't sure how reliable that bool is. I guess though that once you have a svc_fh, then you can more or less assume that you have exclusive access to it for the life of the RPC being processed. > however... > 2/ Do we really need to lock the parent? If a rename or unlink happens > after the lease was taken, the lease will be broken. So > take lease. > repeat lookup (locklessly) > Check if lease has been broken > Should provide all you need. > > You don't *need* to lock the directory to open an existing file and > with my pending parallel-updates patch set, you only need a shared > lock on the directory to create a file. So I'd rather not be locking > the directory at all to get a delegation > Yeah, we probably don't need to lock the dir. That said, after your patch series we already hold the i_rwsem on the parent at this point so lookup_one_len is fine in this instance. > 3/ When you vet the name you only do a lookup_one_len(), while > nfsd_lookup_dentry() also calls nfsd_cross_mnt() as it is possible > for a file to be mounted on. > That means that if I did bind mount one file over another and export > over NFSD, the file will never offer a delegation. > This is a minor point, but I think it would be best to be as correct > and consistent as possible. > Agreed, but that will take a bit more work. nfsd_lookup_dentry takes several parameters that we don't currently have access to in nfs4_set_delegation (e.g. the rqstp). Those will need to be plumbed through several functions. > Thanks for working on this! ...and thank you for the locking cleanup! Getting rid of fh_lock/_unlock is a really nice cleanup that makes it a lot more clear how this should work.
On Fri, 15 Jul 2022, Jeff Layton wrote: > On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote: > > > however... > > 2/ Do we really need to lock the parent? If a rename or unlink happens > > after the lease was taken, the lease will be broken. So > > take lease. > > repeat lookup (locklessly) > > Check if lease has been broken > > Should provide all you need. > > > > You don't *need* to lock the directory to open an existing file and > > with my pending parallel-updates patch set, you only need a shared > > lock on the directory to create a file. So I'd rather not be locking > > the directory at all to get a delegation > > > > Yeah, we probably don't need to lock the dir. That said, after your > patch series we already hold the i_rwsem on the parent at this point so > lookup_one_len is fine in this instance. But the only reason we hold i_rwsem at this point is to prevent renames in the "opened existing file" case. The "created new file" case holds it as well just be be consistent with the first case. If we "vet" the dentry, then we don't need the lock any more. We can then simplify nfsd_lookup_dentry() to always assume the dir is not locked - so the "locked" arg can go, and nfsd_lookup() can lose the "lock" arg and always return with the directory unlocked. I'm tempted to add your patch to the front of my series. The inconsistency in locking can be fix by unlocking the directory before we get even close to handing out a delegation - so the delegation never sees a locked directory. But right now I have a cold and don't trust myself to think clearly enough to create code worth posting. Hopefully I'll be thinking more clearly later in the week. While I'm here ... is "vet" a good word? The meaning is appropriate, but I wonder if it would cause our friends for whom English isn't their first language to stumble. There are about 5 uses in the kernel at present. Would validate or verify be better? Even though they are longer.. Thanks, NeilBrown
On Mon, 2022-07-18 at 13:21 +1000, NeilBrown wrote: > On Fri, 15 Jul 2022, Jeff Layton wrote: > > On Fri, 2022-07-15 at 09:59 +1000, NeilBrown wrote: > > > > > however... > > > 2/ Do we really need to lock the parent? If a rename or unlink happens > > > after the lease was taken, the lease will be broken. So > > > take lease. > > > repeat lookup (locklessly) > > > Check if lease has been broken > > > Should provide all you need. > > > > > > You don't *need* to lock the directory to open an existing file and > > > with my pending parallel-updates patch set, you only need a shared > > > lock on the directory to create a file. So I'd rather not be locking > > > the directory at all to get a delegation > > > > > > > Yeah, we probably don't need to lock the dir. That said, after your > > patch series we already hold the i_rwsem on the parent at this point so > > lookup_one_len is fine in this instance. > > But the only reason we hold i_rwsem at this point is to prevent renames > in the "opened existing file" case. The "created new file" case holds > it as well just be be consistent with the first case. > > If we "vet" the dentry, then we don't need the lock any more. We can > then simplify nfsd_lookup_dentry() to always assume the dir is not > locked - so the "locked" arg can go, and nfsd_lookup() can lose the > "lock" arg and always return with the directory unlocked. > > I'm tempted to add your patch to the front of my series. The > inconsistency in locking can be fix by unlocking the directory before we > get even close to handing out a delegation - so the delegation never > sees a locked directory. Hmm, ok. I suppose we don't necessarily have to care whether the thing is locked before calling into nfsd_lookup_dentry. I'll take another stab at fixing this in the kernel w/o your series. That'll make Chuck happy too. > But right now I have a cold and don't trust myself to think clearly > enough to create code worth posting. Hopefully I'll be thinking more > clearly later in the week. > > While I'm here ... is "vet" a good word? The meaning is appropriate, > but I wonder if it would cause our friends for whom English isn't their > first language to stumble. There are about 5 uses in the kernel at > present. > > Would validate or verify be better? Even though they are longer.. Good point. I'm all for helping out non-native English speakers. I'll plan to change it to something less esoteric.
> On Jul 17, 2022, at 11:21 PM, NeilBrown <neilb@suse.de> wrote: > > But right now I have a cold and don't trust myself to think clearly > enough to create code worth posting. Hopefully I'll be thinking more > clearly later in the week. Thanks for the update! Here are my plans: I'd like to finalize content of the first 5.20 NFSD pull request this week. If these patches are not ready by then, I can prepare a second PR later in the 5.20 merge window with your work, which should give you another two weeks. If they are still not ready, I'll get them in first thing for the next merge window. -- Chuck Lever
On Tue, 19 Jul 2022, Chuck Lever III wrote: > > > On Jul 17, 2022, at 11:21 PM, NeilBrown <neilb@suse.de> wrote: > > > > But right now I have a cold and don't trust myself to think clearly > > enough to create code worth posting. Hopefully I'll be thinking more > > clearly later in the week. > > Thanks for the update! > > Here are my plans: I'd like to finalize content of the first 5.20 > NFSD pull request this week. If these patches are not ready by > then, I can prepare a second PR later in the 5.20 merge window > with your work, which should give you another two weeks. If they > are still not ready, I'll get them in first thing for the next > merge window. FYI I've addressed the outstanding issues (I think), but have not yet done any testing or given the patches a final inspection. I hope to do that tomorrow, and will post the patches once it is done. If you want a preview you can find them on github.com/neilbrown/linux in the "nfsd" branch. Jeff's patches to validate delegations after getting the lease, so we don't have to hold the lock so long, comes first. NeilBrown