diff mbox

[4/4] rbd: don't drop watch requests on completion

Message ID 51043F71.3050208@inktank.com (mailing list archive)
State New, archived
Headers show

Commit Message

Alex Elder Jan. 26, 2013, 8:41 p.m. UTC
The new request code arranges to get a callback for every osd
request we submit (this was not the case previously).

We register a lingering object watch request for the header object
for each mapped rbd image.

If a connection problem occurs, the osd client will re-submit
lingering requests.  And each time such a request is re-submitted,
its callback function will get called again.

We therefore need to ensure the object request associated with the
lingering osd request stays valid, and the way to do that is to have
an extra reference to the lingering osd request.

So when a request to initiate a watch has completed, do not drop a
reference as one normally would.  Instead, hold off dropping that
reference until the request to tear down that watch request is done.

Also, only set the rbd device's watch_request pointer after the
watch request has been completed successfully, and clear the
pointer once it's been torn down.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

 	ret = -ENOMEM;
@@ -1735,32 +1736,44 @@ static int rbd_dev_header_watch_sync(struct
rbd_device *rbd_dev, int start)
 	if (!obj_request->osd_req)
 		goto out_cancel;

-	if (start) {
+	if (start)
 		ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
-		rbd_dev->watch_request = obj_request;
-	} else {
+	else
 		ceph_osdc_unregister_linger_request(osdc,
 					rbd_dev->watch_request->osd_req);
-		rbd_dev->watch_request = NULL;
-	}
 	ret = rbd_obj_request_submit(osdc, obj_request);
 	if (ret)
 		goto out_cancel;
 	ret = rbd_obj_request_wait(obj_request);
 	if (ret)
 		goto out_cancel;
-
 	ret = obj_request->result;
 	if (ret)
 		goto out_cancel;

-	if (start)
-		goto done;	/* Done if setting up the watch request */
+	/*
+	 * Since a watch request is set to linger the osd client
+	 * will hang onto it in case it needs to be re-sent in the
+	 * event of connection loss.  If we're initiating the watch
+	 * we therefore do *not* want to drop our reference to the
+	 * object request now; we'll effectively transfer ownership
+	 * of it to the osd client instead.  Instead, we'll drop
+	 * that reference when the watch request gets torn down.
+	 */
+	if (start) {
+		rbd_dev->watch_request = obj_request;
+
+		return 0;
+	}
+
+	/* We have successfully torn down the watch request */
+
+	rbd_obj_request_put(rbd_dev->watch_request);
+	rbd_dev->watch_request = NULL;
 out_cancel:
 	/* Cancel the event if we're tearing down, or on error */
 	ceph_osdc_cancel_event(rbd_dev->watch_event);
 	rbd_dev->watch_event = NULL;
-done:
 	if (obj_request)
 		rbd_obj_request_put(obj_request);

Comments

Josh Durgin Jan. 30, 2013, 7:43 p.m. UTC | #1
On 01/26/2013 12:41 PM, Alex Elder wrote:
> The new request code arranges to get a callback for every osd
> request we submit (this was not the case previously).
>
> We register a lingering object watch request for the header object
> for each mapped rbd image.
>
> If a connection problem occurs, the osd client will re-submit
> lingering requests.  And each time such a request is re-submitted,
> its callback function will get called again.

I think this should be fixed in the osd_client - rbd should only get
the callback once, when the watch is first registered. Later we
could add a separate callback to handle re-registration if we need to.

> We therefore need to ensure the object request associated with the
> lingering osd request stays valid, and the way to do that is to have
> an extra reference to the lingering osd request.
>
> So when a request to initiate a watch has completed, do not drop a
> reference as one normally would.  Instead, hold off dropping that
> reference until the request to tear down that watch request is done.
>
> Also, only set the rbd device's watch_request pointer after the
> watch request has been completed successfully, and clear the
> pointer once it's been torn down.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   31 ++++++++++++++++++++++---------
>   1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 340773f..177ba0c 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -1716,6 +1716,7 @@ static int rbd_dev_header_watch_sync(struct
> rbd_device *rbd_dev, int start)
>   						&rbd_dev->watch_event);
>   		if (ret < 0)
>   			return ret;
> +		rbd_assert(rbd_dev->watch_event != NULL);
>   	}
>
>   	ret = -ENOMEM;
> @@ -1735,32 +1736,44 @@ static int rbd_dev_header_watch_sync(struct
> rbd_device *rbd_dev, int start)
>   	if (!obj_request->osd_req)
>   		goto out_cancel;
>
> -	if (start) {
> +	if (start)
>   		ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
> -		rbd_dev->watch_request = obj_request;
> -	} else {
> +	else
>   		ceph_osdc_unregister_linger_request(osdc,
>   					rbd_dev->watch_request->osd_req);
> -		rbd_dev->watch_request = NULL;
> -	}
>   	ret = rbd_obj_request_submit(osdc, obj_request);
>   	if (ret)
>   		goto out_cancel;
>   	ret = rbd_obj_request_wait(obj_request);
>   	if (ret)
>   		goto out_cancel;
> -
>   	ret = obj_request->result;
>   	if (ret)
>   		goto out_cancel;
>
> -	if (start)
> -		goto done;	/* Done if setting up the watch request */
> +	/*
> +	 * Since a watch request is set to linger the osd client
> +	 * will hang onto it in case it needs to be re-sent in the
> +	 * event of connection loss.  If we're initiating the watch
> +	 * we therefore do *not* want to drop our reference to the
> +	 * object request now; we'll effectively transfer ownership
> +	 * of it to the osd client instead.  Instead, we'll drop
> +	 * that reference when the watch request gets torn down.
> +	 */
> +	if (start) {
> +		rbd_dev->watch_request = obj_request;
> +
> +		return 0;
> +	}
> +
> +	/* We have successfully torn down the watch request */
> +
> +	rbd_obj_request_put(rbd_dev->watch_request);
> +	rbd_dev->watch_request = NULL;
>   out_cancel:
>   	/* Cancel the event if we're tearing down, or on error */
>   	ceph_osdc_cancel_event(rbd_dev->watch_event);
>   	rbd_dev->watch_event = NULL;
> -done:
>   	if (obj_request)
>   		rbd_obj_request_put(obj_request);
>

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elder Jan. 30, 2013, 8:38 p.m. UTC | #2
On 01/30/2013 01:43 PM, Josh Durgin wrote:
> On 01/26/2013 12:41 PM, Alex Elder wrote:
>> The new request code arranges to get a callback for every osd
>> request we submit (this was not the case previously).
>>
>> We register a lingering object watch request for the header object
>> for each mapped rbd image.
>>
>> If a connection problem occurs, the osd client will re-submit
>> lingering requests.  And each time such a request is re-submitted,
>> its callback function will get called again.
> 
> I think this should be fixed in the osd_client - rbd should only get
> the callback once, when the watch is first registered. Later we
> could add a separate callback to handle re-registration if we need to.

I agree.  Even so, I would like to maintain a reference
to this lingering object request as is done in this patch.
I think it makes sense even if we'll never get another
callback.

I would like to therefore address the multiple callback
from the osd client as a separate issue.  If I update
the comments here accordingly, and open a tracker issue
for the other thing, would that be OK with you?

					-Alex

>> We therefore need to ensure the object request associated with the
>> lingering osd request stays valid, and the way to do that is to have
>> an extra reference to the lingering osd request.
>>
>> So when a request to initiate a watch has completed, do not drop a
>> reference as one normally would.  Instead, hold off dropping that
>> reference until the request to tear down that watch request is done.
>>
>> Also, only set the rbd device's watch_request pointer after the
>> watch request has been completed successfully, and clear the
>> pointer once it's been torn down.
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>   drivers/block/rbd.c |   31 ++++++++++++++++++++++---------
>>   1 file changed, 22 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index 340773f..177ba0c 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -1716,6 +1716,7 @@ static int rbd_dev_header_watch_sync(struct
>> rbd_device *rbd_dev, int start)
>>                           &rbd_dev->watch_event);
>>           if (ret < 0)
>>               return ret;
>> +        rbd_assert(rbd_dev->watch_event != NULL);
>>       }
>>
>>       ret = -ENOMEM;
>> @@ -1735,32 +1736,44 @@ static int rbd_dev_header_watch_sync(struct
>> rbd_device *rbd_dev, int start)
>>       if (!obj_request->osd_req)
>>           goto out_cancel;
>>
>> -    if (start) {
>> +    if (start)
>>           ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
>> -        rbd_dev->watch_request = obj_request;
>> -    } else {
>> +    else
>>           ceph_osdc_unregister_linger_request(osdc,
>>                       rbd_dev->watch_request->osd_req);
>> -        rbd_dev->watch_request = NULL;
>> -    }
>>       ret = rbd_obj_request_submit(osdc, obj_request);
>>       if (ret)
>>           goto out_cancel;
>>       ret = rbd_obj_request_wait(obj_request);
>>       if (ret)
>>           goto out_cancel;
>> -
>>       ret = obj_request->result;
>>       if (ret)
>>           goto out_cancel;
>>
>> -    if (start)
>> -        goto done;    /* Done if setting up the watch request */
>> +    /*
>> +     * Since a watch request is set to linger the osd client
>> +     * will hang onto it in case it needs to be re-sent in the
>> +     * event of connection loss.  If we're initiating the watch
>> +     * we therefore do *not* want to drop our reference to the
>> +     * object request now; we'll effectively transfer ownership
>> +     * of it to the osd client instead.  Instead, we'll drop
>> +     * that reference when the watch request gets torn down.
>> +     */
>> +    if (start) {
>> +        rbd_dev->watch_request = obj_request;
>> +
>> +        return 0;
>> +    }
>> +
>> +    /* We have successfully torn down the watch request */
>> +
>> +    rbd_obj_request_put(rbd_dev->watch_request);
>> +    rbd_dev->watch_request = NULL;
>>   out_cancel:
>>       /* Cancel the event if we're tearing down, or on error */
>>       ceph_osdc_cancel_event(rbd_dev->watch_event);
>>       rbd_dev->watch_event = NULL;
>> -done:
>>       if (obj_request)
>>           rbd_obj_request_put(obj_request);
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josh Durgin Jan. 30, 2013, 8:39 p.m. UTC | #3
On 01/30/2013 12:38 PM, Alex Elder wrote:
> On 01/30/2013 01:43 PM, Josh Durgin wrote:
>> On 01/26/2013 12:41 PM, Alex Elder wrote:
>>> The new request code arranges to get a callback for every osd
>>> request we submit (this was not the case previously).
>>>
>>> We register a lingering object watch request for the header object
>>> for each mapped rbd image.
>>>
>>> If a connection problem occurs, the osd client will re-submit
>>> lingering requests.  And each time such a request is re-submitted,
>>> its callback function will get called again.
>>
>> I think this should be fixed in the osd_client - rbd should only get
>> the callback once, when the watch is first registered. Later we
>> could add a separate callback to handle re-registration if we need to.
>
> I agree.  Even so, I would like to maintain a reference
> to this lingering object request as is done in this patch.
> I think it makes sense even if we'll never get another
> callback.
>
> I would like to therefore address the multiple callback
> from the osd client as a separate issue.  If I update
> the comments here accordingly, and open a tracker issue
> for the other thing, would that be OK with you?

That's fine with me.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elder Jan. 30, 2013, 9:54 p.m. UTC | #4
On 01/30/2013 02:39 PM, Josh Durgin wrote:
> On 01/30/2013 12:38 PM, Alex Elder wrote:
>> On 01/30/2013 01:43 PM, Josh Durgin wrote:
>>> On 01/26/2013 12:41 PM, Alex Elder wrote:
>>>> The new request code arranges to get a callback for every osd
>>>> request we submit (this was not the case previously).
>>>>
>>>> We register a lingering object watch request for the header object
>>>> for each mapped rbd image.
>>>>
>>>> If a connection problem occurs, the osd client will re-submit
>>>> lingering requests.  And each time such a request is re-submitted,
>>>> its callback function will get called again.
>>>
>>> I think this should be fixed in the osd_client - rbd should only get
>>> the callback once, when the watch is first registered. Later we
>>> could add a separate callback to handle re-registration if we need to.
>>
>> I agree.  Even so, I would like to maintain a reference
>> to this lingering object request as is done in this patch.
>> I think it makes sense even if we'll never get another
>> callback.
>>
>> I would like to therefore address the multiple callback
>> from the osd client as a separate issue.  If I update
>> the comments here accordingly, and open a tracker issue
>> for the other thing, would that be OK with you?
> 
> That's fine with me.

http://tracker.ceph.com/issues/3967

					-Alex

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 340773f..177ba0c 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1716,6 +1716,7 @@  static int rbd_dev_header_watch_sync(struct
rbd_device *rbd_dev, int start)
 						&rbd_dev->watch_event);
 		if (ret < 0)
 			return ret;
+		rbd_assert(rbd_dev->watch_event != NULL);
 	}