Message ID | 20191023235600.10880-1-trond.myklebust@hammerspace.com (mailing list archive) |
---|---|
Headers | show |
Series | Delegation bugfixes | expand |
Hi Trond, This patch set produces the following in my testing. Basically what I see the client is prevented from using a delegation at all. After I induce a race of DELEGRETURN/OPEN --- the racing OPEN gets a delegation (it returns the same seqid and other as the delegation being returned) but the client doesn't use it. --- the following (next) OPEN that also gets a delegation immediately has the client returning the given delegation. Disclaimer: in my testing the racing DELEGRETURN doesn't fail with OLD_STATEID, NetApp returns OK. On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote: > > The following patchset fixes up a number of issues with delegations, > but mainly attempts to fix a race condition between open and > delegreturn, where the open and the delegreturn get re-ordered so > that the delegreturn ends up revoking the delegation that was returned > by the open. > The root cause is that in certain circumstances, we may currently end > up freeing the delegation from delegreturn, so when we later receive > the reply to the open, we've lost track of the fact that the seqid > predates the one that was returned. > > This patchset fixes that case by ensuring that we always keep track > of the last delegation stateid that was returned for any given inode. > That way, if we later see a delegation stateid with the same opaque > field, but an older seqid, we know we cannot trust it, and so we > ask to replay the OPEN compound. > > Trond Myklebust (14): > NFSv4: Don't allow a cached open with a revoked delegation > NFSv4: Fix delegation handling in update_open_stateid() > NFSv4: nfs4_callback_getattr() should ignore revoked delegations > NFSv4: Delegation recalls should not find revoked delegations > NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was > revoked > NFS: Rename nfs_inode_return_delegation_noreclaim() > NFSv4: Don't remove the delegation from the super_list more than once > NFSv4: Hold the delegation spinlock when updating the seqid > NFSv4: Clear the NFS_DELEGATION_REVOKED flag in > nfs_update_inplace_delegation() > NFSv4: Update the stateid seqid in nfs_revoke_delegation() > NFSv4: Revoke the delegation on success in nfs4_delegreturn_done() > NFSv4: Ignore requests to return the delegation if it was revoked > NFSv4: Don't reclaim delegations that have been returned or revoked > NFSv4: Fix races between open and delegreturn > > fs/nfs/callback_proc.c | 2 +- > fs/nfs/delegation.c | 109 +++++++++++++++++++++++++++++------------ > fs/nfs/delegation.h | 4 +- > fs/nfs/nfs4proc.c | 13 ++--- > fs/nfs/nfs4super.c | 4 +- > 5 files changed, 88 insertions(+), 44 deletions(-) > > -- > 2.21.0 >
On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > Hi Trond, > > This patch set produces the following in my testing. Basically what I > see the client is prevented from using a delegation at all. > > After I induce a race of DELEGRETURN/OPEN > --- the racing OPEN gets a delegation (it returns the same seqid and > other as the delegation being returned) but the client doesn't use it. > --- the following (next) OPEN that also gets a delegation immediately > has the client returning the given delegation. > > Disclaimer: in my testing the racing DELEGRETURN doesn't fail with > OLD_STATEID, NetApp returns OK. Testing the same against Linux. It prevents the client from using future delegation stateid. On the induced DELEGRETURN/OPEN race, the linux server doesn't give a new read delegation. The following open gets a read delegation and returns it right away. > On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote: > > > > The following patchset fixes up a number of issues with delegations, > > but mainly attempts to fix a race condition between open and > > delegreturn, where the open and the delegreturn get re-ordered so > > that the delegreturn ends up revoking the delegation that was returned > > by the open. > > The root cause is that in certain circumstances, we may currently end > > up freeing the delegation from delegreturn, so when we later receive > > the reply to the open, we've lost track of the fact that the seqid > > predates the one that was returned. > > > > This patchset fixes that case by ensuring that we always keep track > > of the last delegation stateid that was returned for any given inode. > > That way, if we later see a delegation stateid with the same opaque > > field, but an older seqid, we know we cannot trust it, and so we > > ask to replay the OPEN compound. > > > > Trond Myklebust (14): > > NFSv4: Don't allow a cached open with a revoked delegation > > NFSv4: Fix delegation handling in update_open_stateid() > > NFSv4: nfs4_callback_getattr() should ignore revoked delegations > > NFSv4: Delegation recalls should not find revoked delegations > > NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was > > revoked > > NFS: Rename nfs_inode_return_delegation_noreclaim() > > NFSv4: Don't remove the delegation from the super_list more than once > > NFSv4: Hold the delegation spinlock when updating the seqid > > NFSv4: Clear the NFS_DELEGATION_REVOKED flag in > > nfs_update_inplace_delegation() > > NFSv4: Update the stateid seqid in nfs_revoke_delegation() > > NFSv4: Revoke the delegation on success in nfs4_delegreturn_done() > > NFSv4: Ignore requests to return the delegation if it was revoked > > NFSv4: Don't reclaim delegations that have been returned or revoked > > NFSv4: Fix races between open and delegreturn > > > > fs/nfs/callback_proc.c | 2 +- > > fs/nfs/delegation.c | 109 +++++++++++++++++++++++++++++------------ > > fs/nfs/delegation.h | 4 +- > > fs/nfs/nfs4proc.c | 13 ++--- > > fs/nfs/nfs4super.c | 4 +- > > 5 files changed, 88 insertions(+), 44 deletions(-) > > > > -- > > 2.21.0 > >
Hi Trond, Now that I recall the reason the Ontap server gives out the same delegation was to deal with the linux client behaviour who sends an open even though it holds a delegation (according to the server's view). So what server does is it gives out the same delegation. This patch series changes the semantics. Can you describe what you expect the server is supposed to do in this case? On Thu, Oct 31, 2019 at 11:49 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > > > Hi Trond, > > > > This patch set produces the following in my testing. Basically what I > > see the client is prevented from using a delegation at all. > > > > After I induce a race of DELEGRETURN/OPEN > > --- the racing OPEN gets a delegation (it returns the same seqid and > > other as the delegation being returned) but the client doesn't use it. > > --- the following (next) OPEN that also gets a delegation immediately > > has the client returning the given delegation. > > > > Disclaimer: in my testing the racing DELEGRETURN doesn't fail with > > OLD_STATEID, NetApp returns OK. > > Testing the same against Linux. It prevents the client from using > future delegation stateid. On the induced DELEGRETURN/OPEN race, the > linux server doesn't give a new read delegation. The following open > gets a read delegation and returns it right away. > > > > On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote: > > > > > > The following patchset fixes up a number of issues with delegations, > > > but mainly attempts to fix a race condition between open and > > > delegreturn, where the open and the delegreturn get re-ordered so > > > that the delegreturn ends up revoking the delegation that was returned > > > by the open. > > > The root cause is that in certain circumstances, we may currently end > > > up freeing the delegation from delegreturn, so when we later receive > > > the reply to the open, we've lost track of the fact that the seqid > > > predates the one that was returned. > > > > > > This patchset fixes that case by ensuring that we always keep track > > > of the last delegation stateid that was returned for any given inode. > > > That way, if we later see a delegation stateid with the same opaque > > > field, but an older seqid, we know we cannot trust it, and so we > > > ask to replay the OPEN compound. > > > > > > Trond Myklebust (14): > > > NFSv4: Don't allow a cached open with a revoked delegation > > > NFSv4: Fix delegation handling in update_open_stateid() > > > NFSv4: nfs4_callback_getattr() should ignore revoked delegations > > > NFSv4: Delegation recalls should not find revoked delegations > > > NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was > > > revoked > > > NFS: Rename nfs_inode_return_delegation_noreclaim() > > > NFSv4: Don't remove the delegation from the super_list more than once > > > NFSv4: Hold the delegation spinlock when updating the seqid > > > NFSv4: Clear the NFS_DELEGATION_REVOKED flag in > > > nfs_update_inplace_delegation() > > > NFSv4: Update the stateid seqid in nfs_revoke_delegation() > > > NFSv4: Revoke the delegation on success in nfs4_delegreturn_done() > > > NFSv4: Ignore requests to return the delegation if it was revoked > > > NFSv4: Don't reclaim delegations that have been returned or revoked > > > NFSv4: Fix races between open and delegreturn > > > > > > fs/nfs/callback_proc.c | 2 +- > > > fs/nfs/delegation.c | 109 +++++++++++++++++++++++++++++------------ > > > fs/nfs/delegation.h | 4 +- > > > fs/nfs/nfs4proc.c | 13 ++--- > > > fs/nfs/nfs4super.c | 4 +- > > > 5 files changed, 88 insertions(+), 44 deletions(-) > > > > > > -- > > > 2.21.0 > > >
Hi Olga On Thu, 2019-10-31 at 12:15 -0400, Olga Kornievskaia wrote: > Hi Trond, > > Now that I recall the reason the Ontap server gives out the same > delegation was to deal with the linux client behaviour who sends an > open even though it holds a delegation (according to the server's > view). So what server does is it gives out the same delegation. This > patch series changes the semantics. Can you describe what you expect > the server is supposed to do in this case? If the client sends out a second OPEN for the same file (presumably because an application request caused the file to be opened again before the client could process the reply to the first open), then I think it should be OK for the server to send the delegation stateid again. If the new open caused a state change (for instance the delegation was upgraded from READ to WRITE) then the seqid for the delegation stateid should be bumped. If there was not state change, it really should be up to the server whether or not it bumps the seqid. However on the client side, we want to be able to consider that if the 2 open replies return the exact same delegation stateid, then we consider the second reply to be a no-op as far as our update of the internal state is concerned. If the seqid was bumped, the client needs to update its internal state. So all this patch series is doing, is setting up rules to allow us to enforce that case, and in particular to ensure that _if_ we send a DELEGRETURN before we process the reply to the second open, then we recognise that a successful DELEGRETURN means we no longer hold delegation state, no matter what the contents of the reply to the second open tells us. > On Thu, Oct 31, 2019 at 11:49 AM Olga Kornievskaia <aglo@umich.edu> > wrote: > > On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu> > > wrote: > > > Hi Trond, > > > > > > This patch set produces the following in my testing. Basically > > > what I > > > see the client is prevented from using a delegation at all. > > > > > > After I induce a race of DELEGRETURN/OPEN > > > --- the racing OPEN gets a delegation (it returns the same seqid > > > and > > > other as the delegation being returned) but the client doesn't > > > use it. > > > --- the following (next) OPEN that also gets a delegation > > > immediately > > > has the client returning the given delegation. > > > > > > Disclaimer: in my testing the racing DELEGRETURN doesn't fail > > > with > > > OLD_STATEID, NetApp returns OK. > > > > Testing the same against Linux. It prevents the client from using > > future delegation stateid. On the induced DELEGRETURN/OPEN race, > > the > > linux server doesn't give a new read delegation. The following open > > gets a read delegation and returns it right away. > > > > > > > On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust < > > > trondmy@gmail.com> wrote: > > > > The following patchset fixes up a number of issues with > > > > delegations, > > > > but mainly attempts to fix a race condition between open and > > > > delegreturn, where the open and the delegreturn get re-ordered > > > > so > > > > that the delegreturn ends up revoking the delegation that was > > > > returned > > > > by the open. > > > > The root cause is that in certain circumstances, we may > > > > currently end > > > > up freeing the delegation from delegreturn, so when we later > > > > receive > > > > the reply to the open, we've lost track of the fact that the > > > > seqid > > > > predates the one that was returned. > > > > > > > > This patchset fixes that case by ensuring that we always keep > > > > track > > > > of the last delegation stateid that was returned for any given > > > > inode. > > > > That way, if we later see a delegation stateid with the same > > > > opaque > > > > field, but an older seqid, we know we cannot trust it, and so > > > > we > > > > ask to replay the OPEN compound. > > > > > > > > Trond Myklebust (14): > > > > NFSv4: Don't allow a cached open with a revoked delegation > > > > NFSv4: Fix delegation handling in update_open_stateid() > > > > NFSv4: nfs4_callback_getattr() should ignore revoked > > > > delegations > > > > NFSv4: Delegation recalls should not find revoked delegations > > > > NFSv4: fail nfs4_refresh_delegation_stateid() when the > > > > delegation was > > > > revoked > > > > NFS: Rename nfs_inode_return_delegation_noreclaim() > > > > NFSv4: Don't remove the delegation from the super_list more > > > > than once > > > > NFSv4: Hold the delegation spinlock when updating the seqid > > > > NFSv4: Clear the NFS_DELEGATION_REVOKED flag in > > > > nfs_update_inplace_delegation() > > > > NFSv4: Update the stateid seqid in nfs_revoke_delegation() > > > > NFSv4: Revoke the delegation on success in > > > > nfs4_delegreturn_done() > > > > NFSv4: Ignore requests to return the delegation if it was > > > > revoked > > > > NFSv4: Don't reclaim delegations that have been returned or > > > > revoked > > > > NFSv4: Fix races between open and delegreturn > > > > > > > > fs/nfs/callback_proc.c | 2 +- > > > > fs/nfs/delegation.c | 109 +++++++++++++++++++++++++++++-- > > > > ---------- > > > > fs/nfs/delegation.h | 4 +- > > > > fs/nfs/nfs4proc.c | 13 ++--- > > > > fs/nfs/nfs4super.c | 4 +- > > > > 5 files changed, 88 insertions(+), 44 deletions(-) > > > > > > > > -- > > > > 2.21.0 > > > >