From patchwork Thu Dec 28 00:47:43 2017
X-Patchwork-Submitter: Michael Lyle
X-Patchwork-Id: 10134339
From: Michael Lyle
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: Michael Lyle
Subject: [for-416 PATCH 2/3] bcache: writeback: properly order backing device IO
Date: Wed, 27 Dec 2017 16:47:43 -0800
Message-Id: <20171228004744.3522-2-mlyle@lyle.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20171228004744.3522-1-mlyle@lyle.org>
References: <20171228004744.3522-1-mlyle@lyle.org>

Writeback keys are presently iterated and dispatched for writeback in
order of the logical block address on the backing device.  Multiple keys
may be read from the cache device and written back in parallel,
especially when the I/O is contiguous.  However, there was no guarantee
with the existing code that the writes would be issued in LBA order,
because the reads from the cache device often complete out of order.  As
a result, when writing back quickly, the backing disk often has to seek
backwards, which slows writeback and increases utilization.

This patch introduces an ordering mechanism that guarantees that the
original order of issue is maintained for the write portion of the I/O.
Writeback performance is significantly improved when there are multiple
contiguous keys or high writeback rates.

Signed-off-by: Michael Lyle
---
 drivers/md/bcache/bcache.h    |  8 ++++++++
 drivers/md/bcache/writeback.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
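For readers outside the kernel tree, the ordering scheme is easiest to
see in a blocking, user-space analogue.  The sketch below is
illustrative only (ordered_write(), next_ticket, and struct job are
invented names, not bcache identifiers): a shared ticket counter plays
the role of writeback_sequence_next, and a condition variable stands in
for the closure waitlist.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ticket_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ticket_cond = PTHREAD_COND_INITIALIZER;
static unsigned next_ticket;	/* next write allowed to issue */

/* Issue one write, blocking until all earlier tickets have written. */
static void ordered_write(unsigned ticket, int lba)
{
	pthread_mutex_lock(&ticket_lock);
	while (next_ticket != ticket)	/* not our turn; wait, then re-check */
		pthread_cond_wait(&ticket_cond, &ticket_lock);
	printf("write LBA %d (ticket %u)\n", lba, ticket);
	next_ticket++;			/* hand the turn to the next ticket */
	pthread_cond_broadcast(&ticket_cond);
	pthread_mutex_unlock(&ticket_lock);
}

struct job { unsigned ticket; int lba; };

static void *read_completion(void *arg)
{
	struct job *j = arg;
	ordered_write(j->ticket, j->lba);
	return NULL;
}

int main(void)
{
	/* Reads "complete" in scrambled order; writes still issue 0,1,2,3. */
	struct job jobs[] = { {2, 300}, {0, 100}, {3, 400}, {1, 200} };
	pthread_t t[4];
	int i;

	for (i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, read_completion, &jobs[i]);
	for (i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return 0;
}

The while loop re-checks the ticket after every wakeup; that is the job
of the "edge case" branch in write_dirty() below.  A closure cannot
sleep on a condition variable, so it parks itself on
writeback_ordering_wait first, then re-reads writeback_sequence_next and
wakes the list itself if its turn arrived while it was being enqueued,
which closes the lost-wakeup window.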
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 1784e50eb857..3be0fcc19b1f 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -330,6 +330,14 @@ struct cached_dev {
 
 	struct keybuf		writeback_keys;
 
+	/*
+	 * Order the write-half of writeback operations strongly in dispatch
+	 * order. (Maintain LBA order; don't allow reads completing out of
+	 * order to re-order the writes...)
+	 */
+	struct closure_waitlist writeback_ordering_wait;
+	atomic_t		writeback_sequence_next;
+
 	/* For tracking sequential IO */
 #define RECENT_IO_BITS	7
 #define RECENT_IO	(1 << RECENT_IO_BITS)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 4e4836c6e7cf..4084586d5991 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -130,6 +130,7 @@ static unsigned writeback_delay(struct cached_dev *dc, unsigned sectors)
 struct dirty_io {
 	struct closure		cl;
 	struct cached_dev	*dc;
+	uint16_t		sequence;
 	struct bio		bio;
 };
 
@@ -208,6 +209,27 @@ static void write_dirty(struct closure *cl)
 {
 	struct dirty_io *io = container_of(cl, struct dirty_io, cl);
 	struct keybuf_key *w = io->bio.bi_private;
+	struct cached_dev *dc = io->dc;
+
+	uint16_t next_sequence;
+
+	if (atomic_read(&dc->writeback_sequence_next) != io->sequence) {
+		/* Not our turn to write; wait for a write to complete */
+		closure_wait(&dc->writeback_ordering_wait, cl);
+
+		if (atomic_read(&dc->writeback_sequence_next) == io->sequence) {
+			/*
+			 * Edge case-- it happened in indeterminate order
+			 * relative to when we were added to wait list..
+			 */
+			closure_wake_up(&dc->writeback_ordering_wait);
+		}
+
+		continue_at(cl, write_dirty, io->dc->writeback_write_wq);
+		return;
+	}
+
+	next_sequence = io->sequence + 1;
 
 	/*
 	 * IO errors are signalled using the dirty bit on the key.
@@ -225,6 +247,9 @@ static void write_dirty(struct closure *cl)
 		closure_bio_submit(&io->bio, cl);
 	}
 
+	atomic_set(&dc->writeback_sequence_next, next_sequence);
+	closure_wake_up(&dc->writeback_ordering_wait);
+
 	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
@@ -269,7 +294,10 @@ static void read_dirty(struct cached_dev *dc)
 	int nk, i;
 	struct dirty_io *io;
 	struct closure cl;
+	uint16_t sequence = 0;
 
+	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
+	atomic_set(&dc->writeback_sequence_next, sequence);
 	closure_init_stack(&cl);
 
 	/*
@@ -330,6 +358,7 @@ static void read_dirty(struct cached_dev *dc)
 
 		w->private	= io;
 		io->dc		= dc;
+		io->sequence	= sequence++;
 
 		dirty_init(w);
 		bio_set_op_attrs(&io->bio, REQ_OP_READ, 0);