ceph: fix possible long time wait during umount

Message ID	20191204062718.56105-1-xiubli@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=EwtB=Z2=vger.kernel.org=ceph-devel-owner@kernel.org> From: xiubli@redhat.com To: jlayton@kernel.org Cc: sage@redhat.com, idryomov@gmail.com, zyan@redhat.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li <xiubli@redhat.com> Subject: [PATCH] ceph: fix possible long time wait during umount Date: Wed, 4 Dec 2019 01:27:18 -0500 Message-Id: <20191204062718.56105-1-xiubli@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk
Series	ceph: fix possible long time wait during umount \| expand ceph: fix possible long time wait during umount

Message ID

20191204062718.56105-1-xiubli@redhat.com (mailing list archive)

State

New, archived

Headers

From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: sage@redhat.com, idryomov@gmail.com, zyan@redhat.com,
        pdonnell@redhat.com, ceph-devel@vger.kernel.org,
        Xiubo Li <xiubli@redhat.com>
Subject: [PATCH] ceph: fix possible long time wait during umount
Date: Wed,  4 Dec 2019 01:27:18 -0500
Message-Id: <20191204062718.56105-1-xiubli@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk

Series

ceph: fix possible long time wait during umount | expand

Commit Message

Xiubo Li Dec. 4, 2019, 6:27 a.m. UTC

From: Xiubo Li <xiubli@redhat.com>

During umount, if there has no any unsafe request in the mdsc and
some requests still in-flight and not got reply yet, and if the
rest requets are all safe ones, after that even all of them in mdsc
are unregistered, the umount must wait until after mount_timeout
seconds anyway.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 fs/ceph/mds_client.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Jeff Layton Dec. 4, 2019, 2:26 p.m. UTC | #1

On Wed, 2019-12-04 at 01:27 -0500, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> During umount, if there has no any unsafe request in the mdsc and
> some requests still in-flight and not got reply yet, and if the
> rest requets are all safe ones, after that even all of them in mdsc
> are unregistered, the umount must wait until after mount_timeout
> seconds anyway.
> 
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 163b470f3000..39f4d8501df5 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -2877,6 +2877,10 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
>  		set_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags);
>  		__unregister_request(mdsc, req);
>  
> +		/* last request during umount? */
> +		if (mdsc->stopping && !__get_oldest_req(mdsc))
> +			complete_all(&mdsc->safe_umount_waiters);
> +
>  		if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {
>  			/*
>  			 * We already handled the unsafe response, now do the
> @@ -2887,9 +2891,6 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
>  			 */
>  			dout("got safe reply %llu, mds%d\n", tid, mds);
>  
> -			/* last unsafe request during umount? */
> -			if (mdsc->stopping && !__get_oldest_req(mdsc))
> -				complete_all(&mdsc->safe_umount_waiters);
>  			mutex_unlock(&mdsc->mutex);
>  			goto out;
>  		}

Looks reasonable. AIUI, the MDS is free to send a safe reply without
ever sending an unsafe one, so I don't see why we want to make that
conditional on receiving an earlier unsafe reply.

Xiubo Li Dec. 4, 2019, 2:46 p.m. UTC | #2

On 2019/12/4 22:26, Jeff Layton wrote:
> On Wed, 2019-12-04 at 01:27 -0500, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> During umount, if there has no any unsafe request in the mdsc and
>> some requests still in-flight and not got reply yet, and if the
>> rest requets are all safe ones, after that even all of them in mdsc
>> are unregistered, the umount must wait until after mount_timeout
>> seconds anyway.
>>
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>   fs/ceph/mds_client.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>> index 163b470f3000..39f4d8501df5 100644
>> --- a/fs/ceph/mds_client.c
>> +++ b/fs/ceph/mds_client.c
>> @@ -2877,6 +2877,10 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
>>   		set_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags);
>>   		__unregister_request(mdsc, req);
>>   
>> +		/* last request during umount? */
>> +		if (mdsc->stopping && !__get_oldest_req(mdsc))
>> +			complete_all(&mdsc->safe_umount_waiters);
>> +
>>   		if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {
>>   			/*
>>   			 * We already handled the unsafe response, now do the
>> @@ -2887,9 +2891,6 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
>>   			 */
>>   			dout("got safe reply %llu, mds%d\n", tid, mds);
>>   
>> -			/* last unsafe request during umount? */
>> -			if (mdsc->stopping && !__get_oldest_req(mdsc))
>> -				complete_all(&mdsc->safe_umount_waiters);
>>   			mutex_unlock(&mdsc->mutex);
>>   			goto out;
>>   		}
> Looks reasonable. AIUI, the MDS is free to send a safe reply without
> ever sending an unsafe one,

Yeah, that's the same from the test output.


>   so I don't see why we want to make that
> conditional on receiving an earlier unsafe reply.

 From the commit history I couldn't get why.

Thanks.

BRs

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 163b470f3000..39f4d8501df5 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2877,6 +2877,10 @@  static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
 		set_bit(CEPH_MDS_R_GOT_SAFE, &req->r_req_flags);
 		__unregister_request(mdsc, req);
 
+		/* last request during umount? */
+		if (mdsc->stopping && !__get_oldest_req(mdsc))
+			complete_all(&mdsc->safe_umount_waiters);
+
 		if (test_bit(CEPH_MDS_R_GOT_UNSAFE, &req->r_req_flags)) {
 			/*
 			 * We already handled the unsafe response, now do the
@@ -2887,9 +2891,6 @@  static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
 			 */
 			dout("got safe reply %llu, mds%d\n", tid, mds);
 
-			/* last unsafe request during umount? */
-			if (mdsc->stopping && !__get_oldest_req(mdsc))
-				complete_all(&mdsc->safe_umount_waiters);
 			mutex_unlock(&mdsc->mutex);
 			goto out;
 		}

ceph: fix possible long time wait during umount

Commit Message

Comments

Patch