From patchwork Thu Sep 21 19:54:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Coly Li
X-Patchwork-Id: 9964767
From: Coly Li
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: Coly Li, Kent Overstreet, Michael Lyle, Nix, Kai Krakow,
 Eric Wheeler, Junhui Tang, stable@vger.kernel.org
Subject: [PATCHv3] bcache: only permit to recovery read error when cache device is clean
Date: Fri, 22 Sep 2017 03:54:58 +0800
Message-Id: 
<20170921195458.53890-1-colyli@suse.de>
X-Mailer: git-send-email 2.13.5
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

When bcache handles read I/O, for example in writeback or writethrough
mode, and a read request on the cache device fails, bcache tries to
recover the request by reading from the cached device. If the data on
the cached device is not in sync with the cache device, the requester
gets stale data.

For critical storage systems such as databases, returning stale data
from recovery may result in application-level data corruption, which is
unacceptable.

With this patch, a failed but recoverable read request in writeback or
writethrough mode is only recovered when the cache device is clean, that
is, when all data on the cached device is up to date.

In the other bcache cache modes, read requests never reach
cached_dev_read_error(), so they do not need this patch.

Please note that, because the cache mode can be switched arbitrarily at
run time, a device in writethrough mode may have been switched from
writeback mode and may still have dirty data on the cache device.
Therefore checking dc->has_dirty in writethrough mode still makes sense.

Changelog:
v3: Kent Overstreet responded that returning stale data on recovery is
    a bug to fix and that an option to permit it is unnecessary, so the
    sysfs file is removed in this version.
v2: rename sysfs entry from allow_stale_data_on_failure to
    allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.

Signed-off-by: Coly Li
Reported-by: Arne Wolf
Cc: Kent Overstreet
Cc: Michael Lyle
Cc: Nix
Cc: Kai Krakow
Cc: Eric Wheeler
Cc: Junhui Tang
Cc: stable@vger.kernel.org
---
 drivers/md/bcache/request.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 681b4f12b05a..e7f769ff7234 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -697,8 +697,16 @@ static void cached_dev_read_error(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
+	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
-	if (s->recoverable) {
+	/*
+	 * If cache device is dirty (dc->has_dirty is non-zero), then
+	 * recovering a failed read request from cached device may get
+	 * stale data back. So read failure recovery is only permitted
+	 * when cache device is clean.
+	 */
+	if (s->recoverable &&
+	    (dc && !atomic_read(&dc->has_dirty))) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);
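
For readers who want to reason about the new condition outside the
kernel, the retry decision can be modelled in user space roughly as
below. This is only a sketch under stated assumptions, not the kernel
code: struct mock_cached_dev, struct mock_search and
may_retry_from_backing() are hypothetical names invented for
illustration, and C11 atomic_int stands in for the kernel's atomic_t.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct cached_dev: only the one field the
 * patch reads is modelled here.
 */
struct mock_cached_dev {
	atomic_int has_dirty;	/* non-zero while the cache holds dirty data */
};

/* Hypothetical stand-in for struct search. */
struct mock_search {
	bool recoverable;		/* request may be retried at all */
	struct mock_cached_dev *dc;	/* owning cached device, may be NULL */
};

/*
 * Mirrors the condition added by the patch: retry from the backing
 * (cached) device only when the request is recoverable and the cache
 * device is clean, so the backing device cannot return stale data.
 */
static bool may_retry_from_backing(const struct mock_search *s)
{
	return s->recoverable && s->dc && !atomic_load(&s->dc->has_dirty);
}
```

With this model, a recoverable request is retried only while has_dirty
is zero; setting has_dirty to a non-zero value, clearing recoverable, or
passing a NULL device all make the function return false, in which case
the read error would be reported instead of being retried.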