From patchwork Wed Oct 12 14:03:28 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 9373243 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 9A35760839 for ; Wed, 12 Oct 2016 14:05:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8B86629E12 for ; Wed, 12 Oct 2016 14:05:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8055F29E14; Wed, 12 Oct 2016 14:05:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B9E729E12 for ; Wed, 12 Oct 2016 14:05:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755041AbcJLOEx (ORCPT ); Wed, 12 Oct 2016 10:04:53 -0400 Received: from mail-lf0-f67.google.com ([209.85.215.67]:34552 "EHLO mail-lf0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754998AbcJLOEi (ORCPT ); Wed, 12 Oct 2016 10:04:38 -0400 Received: by mail-lf0-f67.google.com with SMTP id x23so1585678lfi.1 for ; Wed, 12 Oct 2016 07:04:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=M5PK2TA1xtG0mfna+88EivxU4L2jkATgA5z1THdMx3g=; b=Gr4wqrwu4SwaZf9435whaLEa9529JjENQM3mA5SKTNte5o5sNpeJPyIgCfBK5qOw5U Le+w6oz9YDre+FIlfcep2+vlIpmOrFoiqZ+MBYRjrik/v6WwyKus2+A+PaCSaREx9zgH 5fDqfgTJ+N/PzTgNJShEi28udgerUE5QanmTezrJVnxSw8/31HD749K9wnr9dtPURRm6 QMAQxaLyw/ILbkhvSIJ32SzEtfWPaWySLDiBSYndlvTmtwoXpFF5EfYPZIamnX8asQPT jvPgyrRD5aQ7gcpy8zdrKJPm1wd731brtRWMsk5HBnqm8TZRgLdKCNgN7UcaJDz1LEqU SSKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=M5PK2TA1xtG0mfna+88EivxU4L2jkATgA5z1THdMx3g=; b=ClfzoHXXLJjmLhRiZwWWX4NKZdMKol30//u4LYtS4UasQsb6CDA9ksFpwyRzSsrt73 p+R/p/WMkHPt2bZAAjWoEiKJRCwr78Ozalnesasw+D9s/u90BYh/knccft1LPewz0oGI MaJrNv94I0Y3jDKWQe4LXxcdSFosjzoc1pSesBx22IfCVg3lepCLp1AYyZ72hrGh5C7b e1vH4tJhIdIBydnGqNpvYbuHIisZggU9o0vXnSAJVWFlEgkz8oXK6rbAQba/eyJu6Fg3 UFSjyKLJ9xm2MVvqKGvoktMyC/5+GQu2aZsicH0HXpnv6m7e0LlJBpzuDnlTNlEvAOyP Lqtg== X-Gm-Message-State: AA6/9Rkz1yC5wrpfxaLJZax0OdBRGFdOE6MXu3mWYBIBhEgHI7Uvgf89ZSoeS/kXMBLusQ== X-Received: by 10.194.243.104 with SMTP id wx8mr1707560wjc.229.1476281044093; Wed, 12 Oct 2016 07:04:04 -0700 (PDT) Received: from localhost.localdomain.com (ip-78-102-108-116.net.upcbroadband.cz. [78.102.108.116]) by smtp.gmail.com with ESMTPSA id ce6sm13146369wjc.27.2016.10.12.07.04.02 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Oct 2016 07:04:03 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Mike Christie Subject: [PATCH] rbd: don't wait for the lock forever if blacklisted Date: Wed, 12 Oct 2016 16:03:28 +0200 Message-Id: <1476281008-30556-1-git-send-email-idryomov@gmail.com> X-Mailer: git-send-email 2.4.3 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP -EBLACKLISTED from __rbd_register_watch() means that our ceph_client got blacklisted - we won't be able to restore the watch and reacquire the lock. Wake up and fail all outstanding requests waiting for the lock and arrange for all new requests that require the lock to fail immediately. Signed-off-by: Ilya Dryomov Tested-by: Mike Christie --- drivers/block/rbd.c | 50 +++++++++++++++++++++++++++++++++----------------- 1 file changed, 33 insertions(+), 17 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 7a9792cb6ef1..fbc025a87261 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -415,15 +415,15 @@ struct rbd_device { }; /* - * Flag bits for rbd_dev->flags. If atomicity is required, - * rbd_dev->lock is used to protect access. - * - * Currently, only the "removing" flag (which is coupled with the - * "open_count" field) requires atomic access. + * Flag bits for rbd_dev->flags: + * - REMOVING (which is coupled with rbd_dev->open_count) is protected + * by rbd_dev->lock + * - BLACKLISTED is protected by rbd_dev->lock_rwsem */ enum rbd_dev_flags { RBD_DEV_FLAG_EXISTS, /* mapped snapshot has not been deleted */ RBD_DEV_FLAG_REMOVING, /* this mapping is being removed */ + RBD_DEV_FLAG_BLACKLISTED, /* our ceph_client is blacklisted */ }; static DEFINE_MUTEX(client_mutex); /* Serialize client creation */ @@ -3926,6 +3926,7 @@ static void rbd_reregister_watch(struct work_struct *work) struct rbd_device *rbd_dev = container_of(to_delayed_work(work), struct rbd_device, watch_dwork); bool was_lock_owner = false; + bool need_to_wake = false; int ret; dout("%s rbd_dev %p\n", __func__, rbd_dev); @@ -3935,19 +3936,27 @@ static void rbd_reregister_watch(struct work_struct *work) was_lock_owner = rbd_release_lock(rbd_dev); mutex_lock(&rbd_dev->watch_mutex); - if (rbd_dev->watch_state != RBD_WATCH_STATE_ERROR) - goto fail_unlock; + if (rbd_dev->watch_state != RBD_WATCH_STATE_ERROR) { + mutex_unlock(&rbd_dev->watch_mutex); + goto out; + } ret = __rbd_register_watch(rbd_dev); if (ret) { rbd_warn(rbd_dev, "failed to reregister watch: %d", ret); - if (ret != -EBLACKLISTED) + if (ret == -EBLACKLISTED) { + set_bit(RBD_DEV_FLAG_BLACKLISTED, &rbd_dev->flags); + need_to_wake = true; + } else { queue_delayed_work(rbd_dev->task_wq, &rbd_dev->watch_dwork, RBD_RETRY_DELAY); - goto fail_unlock; + } + mutex_unlock(&rbd_dev->watch_mutex); + goto out; } + need_to_wake = true; rbd_dev->watch_state = RBD_WATCH_STATE_REGISTERED; rbd_dev->watch_cookie = rbd_dev->watch_handle->linger_id; mutex_unlock(&rbd_dev->watch_mutex); @@ -3963,13 +3972,10 @@ static void rbd_reregister_watch(struct work_struct *work) ret); } +out: up_write(&rbd_dev->lock_rwsem); - wake_requests(rbd_dev, true); - return; - -fail_unlock: - mutex_unlock(&rbd_dev->watch_mutex); - up_write(&rbd_dev->lock_rwsem); + if (need_to_wake) + wake_requests(rbd_dev, true); } /* @@ -4074,7 +4080,9 @@ static void rbd_wait_state_locked(struct rbd_device *rbd_dev) up_read(&rbd_dev->lock_rwsem); schedule(); down_read(&rbd_dev->lock_rwsem); - } while (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED); + } while (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED && + !test_bit(RBD_DEV_FLAG_BLACKLISTED, &rbd_dev->flags)); + finish_wait(&rbd_dev->lock_waitq, &wait); } @@ -4166,8 +4174,16 @@ static void rbd_queue_workfn(struct work_struct *work) if (must_be_locked) { down_read(&rbd_dev->lock_rwsem); - if (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED) + if (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED && + !test_bit(RBD_DEV_FLAG_BLACKLISTED, &rbd_dev->flags)) rbd_wait_state_locked(rbd_dev); + + WARN_ON((rbd_dev->lock_state == RBD_LOCK_STATE_LOCKED) ^ + !test_bit(RBD_DEV_FLAG_BLACKLISTED, &rbd_dev->flags)); + if (test_bit(RBD_DEV_FLAG_BLACKLISTED, &rbd_dev->flags)) { + result = -EBLACKLISTED; + goto err_unlock; + } } img_request = rbd_img_request_create(rbd_dev, offset, length, op_type,