Message ID | 20201124031609.67297-1-dai.ngo@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | NFSD: Fix 5 seconds delay when doing inter server copy |
On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
> seconds delay regardless of the size of the copy. The delay is from
> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>
> Fix by modifying the source server to return the stateid for COPY_NOTIFY
> request with seqid 1 instead of 0. This is also to conform with
> section 4.8 of RFC 7862.
>
> Here is the relevant paragraph from section 4.8 of RFC 7862:
>
>    A copy offload stateid's seqid MUST NOT be zero. In the context of a
>    copy offload operation, it is inappropriate to indicate "the most
>    recent copy offload operation" using a stateid with a seqid of zero
>    (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>    stateid refers to internal state in the server and there may be
>    several asynchronous COPY operations being performed in parallel on
>    the same file by the server. Therefore, a copy offload stateid with
>    a seqid of zero MUST be considered invalid.
>
> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d7f27ed6b794..33ee1a6961e3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>  	refcount_set(&cps->cp_stateid.sc_count, 1);
>  	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>  		goto out_free;
> +	cps->cp_stateid.stid.si_generation = 1;

This affects the stateid returned by COPY_NOTIFY, but not the one
returned by COPY. I think we want to add this to nfs4_init_cp_state()
and cover both.

--b.

>  	spin_lock(&nn->s2s_cp_lock);
>  	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>  	spin_unlock(&nn->s2s_cp_lock);
> --
> 2.9.5
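The review comment above asks for the seqid to be set in the shared initializer rather than in one caller, so that COPY and COPY_NOTIFY stateids are both covered. The following is only a rough, self-contained model of that idea, not the kernel code and not the patch that was eventually submitted: the `model_*` types and functions are hypothetical stand-ins, and only the field name `si_generation` and the value 1 come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stateid; only si_generation is from the patch. */
struct model_stateid {
	uint32_t si_generation;      /* the seqid seen by the client */
	unsigned char si_opaque[12]; /* placeholder for the opaque part */
};

/* The two kinds of copy stateid discussed in the thread. */
enum model_cp_kind { MODEL_COPY, MODEL_COPY_NOTIFY };

struct model_cp_state {
	struct model_stateid stid;
	enum model_cp_kind kind;
};

/*
 * Shared initializer (assumed shape): because both kinds of copy state go
 * through this one function, setting the seqid here gives every caller a
 * nonzero seqid. Returns 1 on success, 0 on failure.
 */
static int model_init_cp_state(struct model_cp_state *cps,
			       enum model_cp_kind kind)
{
	assert(kind == MODEL_COPY || kind == MODEL_COPY_NOTIFY);
	if (cps == NULL)
		return 0;
	memset(cps, 0, sizeof(*cps));
	cps->kind = kind;
	cps->stid.si_generation = 1; /* never hand out seqid 0 */
	return 1;
}
```

Under these assumptions, a caller that initializes a COPY state and a caller that initializes a COPY_NOTIFY state both end up with `si_generation == 1`, with no per-caller assignment needed.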
Hello Dai -

> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>> seconds delay regardless of the size of the copy. The delay is from
>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>>
>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>> request with seqid 1 instead of 0. This is also to conform with
>> section 4.8 of RFC 7862.
>>
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>
>>    A copy offload stateid's seqid MUST NOT be zero. In the context of a
>>    copy offload operation, it is inappropriate to indicate "the most
>>    recent copy offload operation" using a stateid with a seqid of zero
>>    (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>>    stateid refers to internal state in the server and there may be
>>    several asynchronous COPY operations being performed in parallel on
>>    the same file by the server. Therefore, a copy offload stateid with
>>    a seqid of zero MUST be considered invalid.
>>
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>  fs/nfsd/nfs4state.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>  	refcount_set(&cps->cp_stateid.sc_count, 1);
>>  	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>  		goto out_free;
>> +	cps->cp_stateid.stid.si_generation = 1;
>
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY. I think we want to add this to nfs4_init_cp_state()
> and cover both.

Since time is creeping on towards the next merge window, I assume
this particular fix needs to go there, but I don't see the final
version of it (with Bruce's suggested fix) on the list. Did I miss
it?

>>  	spin_lock(&nn->s2s_cp_lock);
>>  	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>  	spin_unlock(&nn->s2s_cp_lock);
>> --
>> 2.9.5

--
Chuck Lever
chucklever@gmail.com
Hi Chuck,

Sorry for the delay. I will update the patch, test it, and re-submit it
by end of today.

Thanks,
-Dai

On 11/30/20 9:57 AM, Chuck Lever wrote:
> Hello Dai -
>
>> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>>
>> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>>> seconds delay regardless of the size of the copy. The delay is from
>>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>>>
>>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>>> request with seqid 1 instead of 0. This is also to conform with
>>> section 4.8 of RFC 7862.
>>>
>>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>>
>>>    A copy offload stateid's seqid MUST NOT be zero. In the context of a
>>>    copy offload operation, it is inappropriate to indicate "the most
>>>    recent copy offload operation" using a stateid with a seqid of zero
>>>    (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>>>    stateid refers to internal state in the server and there may be
>>>    several asynchronous COPY operations being performed in parallel on
>>>    the same file by the server. Therefore, a copy offload stateid with
>>>    a seqid of zero MUST be considered invalid.
>>>
>>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>> ---
>>>  fs/nfsd/nfs4state.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index d7f27ed6b794..33ee1a6961e3 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>>  	refcount_set(&cps->cp_stateid.sc_count, 1);
>>>  	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>>  		goto out_free;
>>> +	cps->cp_stateid.stid.si_generation = 1;
>> This affects the stateid returned by COPY_NOTIFY, but not the one
>> returned by COPY. I think we want to add this to nfs4_init_cp_state()
>> and cover both.
> Since time is creeping on towards the next merge window, I assume
> this particular fix needs to go there, but I don't see the final
> version of it (with Bruce's suggested fix) on the list. Did I miss
> it?
>
>>>  	spin_lock(&nn->s2s_cp_lock);
>>>  	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>>  	spin_unlock(&nn->s2s_cp_lock);
>>> --
>>> 2.9.5
> --
> Chuck Lever
> chucklever@gmail.com
On 11/24/20 12:49 PM, J. Bruce Fields wrote:
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>> seconds delay regardless of the size of the copy. The delay is from
>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>>
>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>> request with seqid 1 instead of 0. This is also to conform with
>> section 4.8 of RFC 7862.
>>
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>
>>    A copy offload stateid's seqid MUST NOT be zero. In the context of a
>>    copy offload operation, it is inappropriate to indicate "the most
>>    recent copy offload operation" using a stateid with a seqid of zero
>>    (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>>    stateid refers to internal state in the server and there may be
>>    several asynchronous COPY operations being performed in parallel on
>>    the same file by the server. Therefore, a copy offload stateid with
>>    a seqid of zero MUST be considered invalid.
>>
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>  fs/nfsd/nfs4state.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>  	refcount_set(&cps->cp_stateid.sc_count, 1);
>>  	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>  		goto out_free;
>> +	cps->cp_stateid.stid.si_generation = 1;
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY. I think we want to add this to nfs4_init_cp_state()
> and cover both.

Hi Bruce, thank you for your suggestion. Updated patch tested and
submitted.

-Dai

P.S. Sorry for the delay, I was on leave for the last few days.

>
> --b.
>
>>  	spin_lock(&nn->s2s_cp_lock);
>>  	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>  	spin_unlock(&nn->s2s_cp_lock);
>> --
>> 2.9.5
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d7f27ed6b794..33ee1a6961e3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
 	refcount_set(&cps->cp_stateid.sc_count, 1);
 	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
 		goto out_free;
+	cps->cp_stateid.stid.si_generation = 1;
 	spin_lock(&nn->s2s_cp_lock);
 	list_add(&cps->cp_list, &p_stid->sc_cp_list);
 	spin_unlock(&nn->s2s_cp_lock);
Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
seconds delay regardless of the size of the copy. The delay is from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqid in both nfs4_state and nfs4_stateid are 0.

Fix by modifying the source server to return the stateid for COPY_NOTIFY
request with seqid 1 instead of 0. This is also to conform with
section 4.8 of RFC 7862.

Here is the relevant paragraph from section 4.8 of RFC 7862:

   A copy offload stateid's seqid MUST NOT be zero. In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server. Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.

Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfsd/nfs4state.c | 1 +
 1 file changed, 1 insertion(+)
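
To illustrate why a seqid of 0 triggers the delay described in the commit message, here is a simplified, self-contained model of a "sequential stateid" check. It is a sketch under stated assumptions, not the client's actual nfs_stateid_is_sequential: the `model_*` names are hypothetical, the rule "the first stateid must carry seqid 1, later ones must be exactly one greater" is an assumed simplification, seqid wraparound is ignored, and the 5-second figure is simply the delay reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the client's per-file open state. */
struct model_state {
	bool     has_open_stateid; /* false until a first stateid is accepted */
	uint32_t seqid;            /* seqid of the last accepted stateid */
};

/*
 * Assumed rule: with no stateid recorded yet, only seqid 1 is "next";
 * otherwise the new seqid must be exactly one greater than the current one.
 */
static bool model_stateid_is_sequential(const struct model_state *st,
					uint32_t new_seqid)
{
	assert(st != NULL);
	if (!st->has_open_stateid)
		return new_seqid == 1;
	return new_seqid == st->seqid + 1;
}

/*
 * Modeled wait in seconds: 0 when the stateid is accepted immediately,
 * 5 when the client would wait for an update that never arrives.
 */
static unsigned int model_wait_seconds(const struct model_state *st,
				       uint32_t new_seqid)
{
	return model_stateid_is_sequential(st, new_seqid) ? 0 : 5;
}
```

In this model, a fresh state (seqid 0) that receives a stateid with seqid 0 fails the check and pays the full wait, while the same state receiving seqid 1 is accepted at once, which matches the behavior the fix relies on.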