From patchwork Thu Apr  6 21:02:04 2017
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 9668427
From: Josef Bacik
To: axboe@kernel.dk, nbd-general@lists.sourceforge.net,
	linux-block@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 09/12] nbd: handle dead connections
Date: Thu, 6 Apr 2017 17:02:04 -0400
Message-Id: <1491512527-4286-10-git-send-email-jbacik@fb.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1491512527-4286-1-git-send-email-jbacik@fb.com>
References: <1491512527-4286-1-git-send-email-jbacik@fb.com>
X-Mailing-List: linux-block@vger.kernel.org

Sometimes we like to upgrade our server without making all of our
clients freak out and reconnect.  This patch provides a way to specify
a dead connection timeout so we can pause all requests and wait for new
connections to be opened.  With this in place I can take the nbd server
down for less than the dead connection timeout, bring it back up, and
everything resumes gracefully.

Signed-off-by: Josef Bacik
---
 drivers/block/nbd.c              | 63 +++++++++++++++++++++++++++++++++++++---
 include/uapi/linux/nbd-netlink.h |  1 +
 2 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 70c5e75..fd3d535 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -77,9 +77,12 @@ struct link_dead_args {
 struct nbd_config {
 	u32 flags;
 	unsigned long runtime_flags;
+	u64 dead_conn_timeout;
 
 	struct nbd_sock **socks;
 	int num_connections;
+	atomic_t live_connections;
+	wait_queue_head_t conn_wait;
 
 	atomic_t recv_threads;
 	wait_queue_head_t recv_wq;
@@ -178,8 +181,10 @@ static void nbd_mark_nsock_dead(struct nbd_device *nbd, struct nbd_sock *nsock,
 			queue_work(system_wq, &args->work);
 		}
 	}
-	if (!nsock->dead)
+	if (!nsock->dead) {
 		kernel_sock_shutdown(nsock->sock, SHUT_RDWR);
+		atomic_dec(&nbd->config->live_connections);
+	}
 	nsock->dead = true;
 	nsock->pending = NULL;
 	nsock->sent = 0;
@@ -257,6 +262,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 		return BLK_EH_HANDLED;
 	}
 
+	/* If we are waiting on our dead timer then we could get timeout
+	 * callbacks for our request.  For this we just want to reset the timer
+	 * and let the queue side take care of everything.
+	 */
+	if (!completion_done(&cmd->send_complete)) {
+		nbd_config_put(nbd);
+		return BLK_EH_RESET_TIMER;
+	}
 	config = nbd->config;
 
 	if (config->num_connections > 1) {
@@ -665,6 +678,19 @@ static int find_fallback(struct nbd_device *nbd, int index)
 	return new_index;
 }
 
+static int wait_for_reconnect(struct nbd_device *nbd)
+{
+	struct nbd_config *config = nbd->config;
+	if (!config->dead_conn_timeout)
+		return 0;
+	if (test_bit(NBD_DISCONNECTED, &config->runtime_flags))
+		return 0;
+	wait_event_interruptible_timeout(config->conn_wait,
+					 atomic_read(&config->live_connections),
+					 config->dead_conn_timeout);
+	return atomic_read(&config->live_connections);
+}
+
 static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 {
 	struct request *req = blk_mq_rq_from_pdu(cmd);
@@ -691,12 +717,24 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	nsock = config->socks[index];
 	mutex_lock(&nsock->tx_lock);
 	if (nsock->dead) {
+		int old_index = index;
 		index = find_fallback(nbd, index);
+		mutex_unlock(&nsock->tx_lock);
 		if (index < 0) {
-			ret = -EIO;
-			goto out;
+			if (wait_for_reconnect(nbd)) {
+				index = old_index;
+				goto again;
+			}
+			/* All the sockets should already be down at this point,
+			 * we just want to make sure that DISCONNECTED is set so
+			 * any requests that come in that were queue'ed waiting
+			 * for the reconnect timer don't trigger the timer again
+			 * and instead just error out.
+			 */
+			sock_shutdown(nbd);
+			nbd_config_put(nbd);
+			return -EIO;
 		}
-		mutex_unlock(&nsock->tx_lock);
 		goto again;
 	}
 
@@ -809,6 +847,7 @@ static int nbd_add_socket(struct nbd_device *nbd, unsigned long arg,
 	nsock->sent = 0;
 	nsock->cookie = 0;
 	socks[config->num_connections++] = nsock;
+	atomic_inc(&config->live_connections);
 
 	return 0;
 }
@@ -860,6 +899,9 @@ static int nbd_reconnect_socket(struct nbd_device *nbd, unsigned long arg)
 		 * need to queue_work outside of the tx_mutex.
 		 */
 		queue_work(recv_workqueue, &args->work);
+
+		atomic_inc(&config->live_connections);
+		wake_up(&config->conn_wait);
 		return 0;
 	}
 	sockfd_put(sock);
@@ -1137,7 +1179,9 @@ static struct nbd_config *nbd_alloc_config(void)
 		return NULL;
 	atomic_set(&config->recv_threads, 0);
 	init_waitqueue_head(&config->recv_wq);
+	init_waitqueue_head(&config->conn_wait);
 	config->blksize = 1024;
+	atomic_set(&config->live_connections, 0);
 	try_module_get(THIS_MODULE);
 	return config;
 }
@@ -1449,6 +1493,7 @@ static struct nla_policy nbd_attr_policy[NBD_ATTR_MAX + 1] = {
 	[NBD_ATTR_SERVER_FLAGS]		= { .type = NLA_U64 },
 	[NBD_ATTR_CLIENT_FLAGS]		= { .type = NLA_U64 },
 	[NBD_ATTR_SOCKETS]		= { .type = NLA_NESTED},
+	[NBD_ATTR_DEAD_CONN_TIMEOUT]	= { .type = NLA_U64 },
 };
 
 static struct nla_policy nbd_sock_policy[NBD_SOCK_MAX + 1] = {
@@ -1535,6 +1580,11 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
 		nbd->tag_set.timeout = timeout * HZ;
 		blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
 	}
+	if (info->attrs[NBD_ATTR_DEAD_CONN_TIMEOUT]) {
+		config->dead_conn_timeout =
+			nla_get_u64(info->attrs[NBD_ATTR_DEAD_CONN_TIMEOUT]);
+		config->dead_conn_timeout *= HZ;
+	}
 	if (info->attrs[NBD_ATTR_SERVER_FLAGS])
 		config->flags =
 			nla_get_u64(info->attrs[NBD_ATTR_SERVER_FLAGS]);
@@ -1655,6 +1705,11 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
 		nbd->tag_set.timeout = timeout * HZ;
 		blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
 	}
+	if (info->attrs[NBD_ATTR_DEAD_CONN_TIMEOUT]) {
+		config->dead_conn_timeout =
+			nla_get_u64(info->attrs[NBD_ATTR_DEAD_CONN_TIMEOUT]);
+		config->dead_conn_timeout *= HZ;
+	}
 	if (info->attrs[NBD_ATTR_SOCKETS]) {
 		struct nlattr *attr;
 		int rem, fd;
diff --git a/include/uapi/linux/nbd-netlink.h b/include/uapi/linux/nbd-netlink.h
index b69105cc..c2209c75 100644
--- a/include/uapi/linux/nbd-netlink.h
+++ b/include/uapi/linux/nbd-netlink.h
@@ -32,6 +32,7 @@ enum {
 	NBD_ATTR_SERVER_FLAGS,
 	NBD_ATTR_CLIENT_FLAGS,
 	NBD_ATTR_SOCKETS,
+	NBD_ATTR_DEAD_CONN_TIMEOUT,
 	__NBD_ATTR_MAX,
 };
 #define NBD_ATTR_MAX (__NBD_ATTR_MAX - 1)
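
For anyone wiring this up from userspace, here is a minimal, untested sketch
(not part of this patch) of how a client built on libnl-3 might pass the new
NBD_ATTR_DEAD_CONN_TIMEOUT attribute along with a socket when connecting a
device.  It assumes the "nbd" generic netlink family, NBD_CMD_CONNECT and the
NBD_ATTR_SOCKETS/NBD_SOCK_ITEM/NBD_SOCK_FD layout introduced earlier in this
series; the helper name and the skipped error handling are illustrative only.

/* Hypothetical userspace sketch, not part of this patch. */
#include <stdint.h>
#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>
#include <linux/nbd-netlink.h>

static int nbd_connect_with_dead_timeout(uint32_t index, int sock_fd,
					 uint64_t dead_timeout_secs)
{
	struct nl_sock *nl = nl_socket_alloc();
	struct nlattr *socks, *item;
	struct nl_msg *msg;
	int driver_id, ret;

	genl_connect(nl);
	driver_id = genl_ctrl_resolve(nl, "nbd");

	msg = nlmsg_alloc();
	genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, driver_id, 0, 0,
		    NBD_CMD_CONNECT, 1);
	nla_put_u32(msg, NBD_ATTR_INDEX, index);
	/* Seconds to wait for a replacement connection before failing
	 * queued requests; the kernel converts this to jiffies. */
	nla_put_u64(msg, NBD_ATTR_DEAD_CONN_TIMEOUT, dead_timeout_secs);

	socks = nla_nest_start(msg, NBD_ATTR_SOCKETS);
	item = nla_nest_start(msg, NBD_SOCK_ITEM);
	nla_put_u32(msg, NBD_SOCK_FD, sock_fd);
	nla_nest_end(msg, item);
	nla_nest_end(msg, socks);

	ret = nl_send_sync(nl, msg);	/* sends and frees msg */
	nl_socket_free(nl);
	return ret;
}

With, say, a 30 second timeout, requests that hit a dead socket are held in
wait_for_reconnect() until a replacement connection shows up (e.g. via the
reconfigure command) or the timeout expires and they fail with -EIO.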