From patchwork Fri Jul 17 10:36:40 2015
X-Patchwork-Submitter: Ilya Dryomov
X-Patchwork-Id: 6814811
From: Ilya Dryomov
To: ceph-devel@vger.kernel.org
Cc: Alex Elder
Subject: [PATCH] rbd: fix copyup completion race
Date: Fri, 17 Jul 2015 13:36:40 +0300
Message-Id: <1437129400-12392-1-git-send-email-idryomov@gmail.com>

For write/discard obj_requests that involved a copyup method call, the
opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is
rbd_img_obj_copyup_callback().  The latter frees copyup pages, sets
->xferred and delegates to rbd_img_obj_callback(), the "normal" image
object callback, for reporting to block layer and putting refs.

rbd_osd_req_callback() however treats CEPH_OSD_OP_CALL as a trivial op,
which means obj_request is marked done in rbd_osd_trivial_callback(),
*before* ->callback is invoked and rbd_img_obj_copyup_callback() has
a chance to run.
Marking obj_request done essentially means giving rbd_img_obj_callback()
a license to end it at any moment, so if another obj_request from the
same img_request is being completed concurrently,
rbd_img_obj_end_request() may very well be called on such a prematurely
marked done request:

handle_reply()
  rbd_osd_req_callback()
    rbd_osd_trivial_callback()
    rbd_obj_request_complete()
      rbd_img_obj_copyup_callback()
        rbd_img_obj_callback()
                                        handle_reply()
                                          rbd_osd_req_callback()
                                            rbd_osd_trivial_callback()
          for_each_obj_request(obj_request->img_request) {
            rbd_img_obj_end_request(obj_request-1/2)
            rbd_img_obj_end_request(obj_request-2/2) <--
          }

Calling rbd_img_obj_end_request() on such a request leads to trouble,
in particular because its ->xferred is 0.  We report 0 to the block
layer with blk_update_request(), get back 1 for "this request has more
data in flight" and then trip on

    rbd_assert(more ^ (which == img_request->obj_request_count));

with rhs (which == ...) being 1 because rbd_img_obj_end_request() has
been called for both requests and lhs (more) being 1 because we haven't
got a chance to set ->xferred in rbd_img_obj_copyup_callback() yet.

To fix this, leverage that rbd wants to call class methods in only two
cases: one is a generic method call wrapper (obj_request is standalone)
and the other is a copyup (obj_request is part of an img_request).  So
make a dedicated handler for CEPH_OSD_OP_CALL and directly invoke
rbd_img_obj_copyup_callback() from it if obj_request is part of an
img_request, similar to how the CEPH_OSD_OP_READ handler invokes
rbd_img_obj_request_read_callback().

Cc: Alex Elder
Cc: stable@vger.kernel.org # 3.10+, needs backporting for < 3.18
Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder
---
 drivers/block/rbd.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index d94529d5c8e9..007e5d0cadaf 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -523,6 +523,7 @@ void rbd_warn(struct rbd_device *rbd_dev, const char *fmt, ...)
 #  define rbd_assert(expr)	((void) 0)
 #endif /* !RBD_DEBUG */
 
+static void rbd_img_obj_copyup_callback(struct rbd_obj_request *obj_request);
 static int rbd_img_obj_request_submit(struct rbd_obj_request *obj_request);
 static void rbd_img_parent_read(struct rbd_obj_request *obj_request);
 static void rbd_dev_remove_parent(struct rbd_device *rbd_dev);
@@ -1818,6 +1819,16 @@ static void rbd_osd_stat_callback(struct rbd_obj_request *obj_request)
 	obj_request_done_set(obj_request);
 }
 
+static void rbd_osd_call_callback(struct rbd_obj_request *obj_request)
+{
+	dout("%s: obj %p\n", __func__, obj_request);
+
+	if (obj_request_img_data_test(obj_request))
+		rbd_img_obj_copyup_callback(obj_request);
+	else
+		obj_request_done_set(obj_request);
+}
+
 static void rbd_osd_req_callback(struct ceph_osd_request *osd_req,
 				struct ceph_msg *msg)
 {
@@ -1866,6 +1877,8 @@ static void rbd_osd_req_callback(struct ceph_osd_request *osd_req,
 		rbd_osd_discard_callback(obj_request);
 		break;
 	case CEPH_OSD_OP_CALL:
+		rbd_osd_call_callback(obj_request);
+		break;
 	case CEPH_OSD_OP_NOTIFY_ACK:
 	case CEPH_OSD_OP_WATCH:
 		rbd_osd_trivial_callback(obj_request);
@@ -2537,6 +2550,8 @@ rbd_img_obj_copyup_callback(struct rbd_obj_request *obj_request)
 	struct page **pages;
 	u32 page_count;
 
+	dout("%s: obj %p\n", __func__, obj_request);
+
 	rbd_assert(obj_request->type == OBJ_REQUEST_BIO ||
 		obj_request->type == OBJ_REQUEST_NODATA);
 	rbd_assert(obj_request_img_data_test(obj_request));
@@ -2563,9 +2578,7 @@ rbd_img_obj_copyup_callback(struct rbd_obj_request *obj_request)
 	if (!obj_request->result)
 		obj_request->xferred = obj_request->length;
 
-	/* Finish up with the normal image object callback */
-
-	rbd_img_obj_callback(obj_request);
+	obj_request_done_set(obj_request);
 }
 
 static void
@@ -2650,7 +2663,6 @@ rbd_img_obj_parent_read_full_callback(struct rbd_img_request *img_request)
 
 	/* All set, send it off. */
 
-	orig_request->callback = rbd_img_obj_copyup_callback;
 	osdc = &rbd_dev->rbd_client->client->osdc;
 	img_result = rbd_obj_request_submit(osdc, orig_request);
 	if (!img_result)
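[Editor's note: for readers following the ordering argument outside the kernel
tree, here is a minimal, self-contained sketch of the underlying hazard.  It is
not part of the patch; all names (fake_obj_request, complete_buggy,
complete_fixed) are invented for illustration and do not exist in
drivers/block/rbd.c.  It only models why doing work after the "done" flag is
set can be observed too late by a concurrent completer, which is the race the
patch closes by letting rbd_img_obj_copyup_callback() finish before the
request can be ended.]

/*
 * Illustrative sketch only -- not kernel code and not part of this patch.
 * Once "done" is set, another completion path is allowed to end the
 * request, so anything filled in afterwards (here, xferred) may be seen
 * as still zero by that other path.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct fake_obj_request {
	bool done;       /* analogous to the obj_request "done" flag */
	size_t xferred;  /* bytes reported on completion */
	size_t length;
};

/* Buggy ordering: mark done first, finish the copyup-style work after. */
static void complete_buggy(struct fake_obj_request *req)
{
	req->done = true;
	/* window: a concurrent completer may end req with xferred == 0 here */
	req->xferred = req->length;
}

/* Fixed ordering: finish the work, only then mark done. */
static void complete_fixed(struct fake_obj_request *req)
{
	req->xferred = req->length;
	req->done = true;
}

int main(void)
{
	struct fake_obj_request a = { .length = 4096 };
	struct fake_obj_request b = { .length = 4096 };

	complete_buggy(&a);
	complete_fixed(&b);
	printf("buggy ordering: done=%d xferred=%zu (race window existed)\n",
	       a.done, a.xferred);
	printf("fixed ordering: done=%d xferred=%zu\n", b.done, b.xferred);
	return 0;
}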