From patchwork Mon Jul 16 16:11:34 2018
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 10527159
From: Josef Bacik
To: axboe@kernel.dk, nbd@other.debian.org, linux-block@vger.kernel.org,
	kernel-team@fb.com
Subject: [PATCH 1/2] nbd: don't requeue the same request twice.
Date: Mon, 16 Jul 2018 12:11:34 -0400
Message-Id: <20180716161136.21222-1-josef@toxicpanda.com>
X-Mailer: git-send-email 2.14.3

We can race with the snd timeout and the per-request timeout and end up
requeuing the same request twice.
We can't use the send_complete completion to tell if everything is ok
because we hold the tx_lock during send, so the timeout handler will
block waiting to mark the socket dead, and the command could be marked
complete and still be requeued. Instead add a flag to the command so we
know whether it has been requeued yet.

Signed-off-by: Josef Bacik
---
 drivers/block/nbd.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 74a05561b620..f8cf7d4cca7f 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -112,12 +112,15 @@ struct nbd_device {
 	struct task_struct *task_setup;
 };
 
+#define NBD_CMD_REQUEUED	1
+
 struct nbd_cmd {
 	struct nbd_device *nbd;
 	int index;
 	int cookie;
 	struct completion send_complete;
 	blk_status_t status;
+	unsigned long flags;
 };
 
 #if IS_ENABLED(CONFIG_DEBUG_FS)
@@ -146,6 +149,14 @@ static inline struct device *nbd_to_dev(struct nbd_device *nbd)
 	return disk_to_dev(nbd->disk);
 }
 
+static void nbd_requeue_cmd(struct nbd_cmd *cmd)
+{
+	struct request *req = blk_mq_rq_from_pdu(cmd);
+
+	if (!test_and_set_bit(NBD_CMD_REQUEUED, &cmd->flags))
+		blk_mq_requeue_request(req, true);
+}
+
 static const char *nbdcmd_to_ascii(int cmd)
 {
 	switch (cmd) {
@@ -343,7 +354,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
 				nbd_mark_nsock_dead(nbd, nsock, 1);
 			mutex_unlock(&nsock->tx_lock);
 		}
-		blk_mq_requeue_request(req, true);
+		nbd_requeue_cmd(cmd);
 		nbd_config_put(nbd);
 		return BLK_EH_DONE;
 	}
@@ -500,6 +511,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 				nsock->pending = req;
 				nsock->sent = sent;
 			}
+			set_bit(NBD_CMD_REQUEUED, &cmd->flags);
 			return BLK_STS_RESOURCE;
 		}
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
@@ -541,6 +553,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
 					 */
 					nsock->pending = req;
 					nsock->sent = sent;
+					set_bit(NBD_CMD_REQUEUED, &cmd->flags);
 					return BLK_STS_RESOURCE;
 				}
 				dev_err(disk_to_dev(nbd->disk),
@@ -805,7 +818,7 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 	 */
 	blk_mq_start_request(req);
 	if (unlikely(nsock->pending && nsock->pending != req)) {
-		blk_mq_requeue_request(req, true);
+		nbd_requeue_cmd(cmd);
 		ret = 0;
 		goto out;
 	}
@@ -818,7 +831,7 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
 		dev_err_ratelimited(disk_to_dev(nbd->disk),
 				    "Request send failed, requeueing\n");
 		nbd_mark_nsock_dead(nbd, nsock, 1);
-		blk_mq_requeue_request(req, true);
+		nbd_requeue_cmd(cmd);
 		ret = 0;
 	}
 out:
@@ -843,6 +856,7 @@ static blk_status_t nbd_queue_rq(struct blk_mq_hw_ctx *hctx,
 	 * done sending everything over the wire.
 	 */
 	init_completion(&cmd->send_complete);
+	clear_bit(NBD_CMD_REQUEUED, &cmd->flags);
 
 	/* We can be called directly from the user space process, which means we
 	 * could possibly have signals pending so our sendmsg will fail. In
@@ -1460,6 +1474,7 @@ static int nbd_init_request(struct blk_mq_tag_set *set, struct request *rq,
 {
 	struct nbd_cmd *cmd = blk_mq_rq_to_pdu(rq);
 	cmd->nbd = set->driver_data;
+	cmd->flags = 0;
 	return 0;
 }
 
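
The once-per-dispatch guard used by nbd_requeue_cmd() can be tried
outside the kernel with a small userspace model. This is only a sketch
under stated assumptions, not the driver code: struct fake_cmd and every
model_* name are hypothetical, C11 atomic_fetch_or()/atomic_fetch_and()
stand in for the kernel's test_and_set_bit()/set_bit()/clear_bit(), and a
counter stands in for blk_mq_requeue_request(). The
model_send_interrupted() path reflects my reading of the patch: when
nbd_send_cmd() returns BLK_STS_RESOURCE the block layer requeues the
request itself, so the flag is set to keep a later timeout from
requeuing it again.

```c
/* Userspace model of the NBD_CMD_REQUEUED guard; all names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_CMD_REQUEUED 1UL	/* bit number, like NBD_CMD_REQUEUED */

struct fake_cmd {
	atomic_ulong flags;		/* models nbd_cmd.flags */
	atomic_int requeue_count;	/* models calls to blk_mq_requeue_request() */
};

/* Models nbd_init_request(): start with no flags set. */
static void model_init_cmd(struct fake_cmd *cmd)
{
	atomic_init(&cmd->flags, 0UL);
	atomic_init(&cmd->requeue_count, 0);
}

/* Atomically set bit nr and report whether it was already set. */
static bool model_test_and_set_bit(unsigned long nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Models the start of nbd_queue_rq(): a new dispatch re-arms the guard. */
static void model_queue_rq(struct fake_cmd *cmd)
{
	atomic_fetch_and(&cmd->flags, ~(1UL << MODEL_CMD_REQUEUED));
}

/* Models nbd_requeue_cmd(): only the first caller per dispatch requeues. */
static void model_requeue_cmd(struct fake_cmd *cmd)
{
	if (!model_test_and_set_bit(MODEL_CMD_REQUEUED, &cmd->flags))
		atomic_fetch_add(&cmd->requeue_count, 1);
}

/* Models the BLK_STS_RESOURCE returns in nbd_send_cmd(): the caller will
 * requeue, so mark the command without requeuing it here.
 */
static void model_send_interrupted(struct fake_cmd *cmd)
{
	atomic_fetch_or(&cmd->flags, 1UL << MODEL_CMD_REQUEUED);
}
```

Calling model_requeue_cmd() twice after one model_queue_rq() bumps the
counter once; the second call sees the bit already set and does nothing,
which is the behaviour the patch relies on when the snd timeout and the
per-request timeout both fire for the same request.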