From patchwork Fri Nov 24 23:14:26 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Lyle <mlyle@lyle.org>
X-Patchwork-Id: 10074671
Return-Path: <linux-block-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	BBCA260383 for <patchwork-linux-block@patchwork.kernel.org>;
	Fri, 24 Nov 2017 23:14:51 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B284D2A210
	for <patchwork-linux-block@patchwork.kernel.org>;
	Fri, 24 Nov 2017 23:14:51 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id A61ED2A213; Fri, 24 Nov 2017 23:14:51 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B5562A210
	for <patchwork-linux-block@patchwork.kernel.org>;
	Fri, 24 Nov 2017 23:14:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753853AbdKXXOt (ORCPT
	<rfc822;patchwork-linux-block@patchwork.kernel.org>);
	Fri, 24 Nov 2017 18:14:49 -0500
Received: from mail-pg0-f54.google.com ([74.125.83.54]:39104 "EHLO
	mail-pg0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753841AbdKXXOr (ORCPT
	<rfc822;linux-block@vger.kernel.org>);
	Fri, 24 Nov 2017 18:14:47 -0500
Received: by mail-pg0-f54.google.com with SMTP id 70so16214728pgf.6
	for <linux-block@vger.kernel.org>;
	Fri, 24 Nov 2017 15:14:47 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=lyle-org.20150623.gappssmtp.com; s=20150623;
	h=from:to:cc:subject:date:message-id:in-reply-to:references;
	bh=Vhi9hdQWLXOadPFJYFVWTrDLe7AC6gDwwzC1I0+wQIE=;
	b=e4BgEI7/T+gvwf/+P3PShpq3hS4GCiXDuK5DNJH9YX1yLTQmukFnEdWV3DLWODITK/
	4YHabijJ8yHXRpMBJ2MzB8wM0uL0EL10idqlbzuV6WZsVeOW6zrStqwelVyKLbPps3VR
	y+BLuK3YMMKILmzqVaTB1/3omXbU4WHuDxty8gV9Satbxt+/GET1ajbe4uHVmxXZXObI
	L2NpJlPU3A4YJgq62FQenFbmj3d+DEcp3o2F1Qwljb4/cmKaKmDnxG75EV2x29hlLer3
	QPyjGnqvL3Rt7sa+vYYEp8O5vd59HOJxc0+kUU53U2HR3rmQHdlKGetHrqdFZPsUxPon
	j29g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references;
	bh=Vhi9hdQWLXOadPFJYFVWTrDLe7AC6gDwwzC1I0+wQIE=;
	b=akTNlYkIFYPomDE/vjedkTVw/YKS7wjy9TD6BuX9SxOFrXzn+z1zZqKVVJRxfrwNnU
	wEO3x0/dFN+Xbmfact6Kualo6J92tNVz4KsBjMBKQSZjXDEj649HuiybwNJ4yC35RKHK
	6R4XXkGSLzNHyOz9c0oc+Q3X1YNI8AaVh+R5xT1+cVa21D0wVWkzVd1sh8GENpn7OHky
	dItUtHjAd0JNrxKmNmfPsbmSoDwoxIAoRyxG+COuZIZt5a5lCBgIfxUbcwvL6wf2pOMD
	IJHksLH9OP1HiYbf8V7SiYRfLFCbarONuA6B2rPtS3edtwVKO4h6wKgoARyDfagZLjv9
	cWdQ==
X-Gm-Message-State: AJaThX7LX+eU8J4F7JNYkal0IT/1ojTQGasndUPT2Sh6+eVPD6kK3oAN
	mzobq1tmNPVlSTCMCirnabOHIg==
X-Google-Smtp-Source: 
 AGs4zMZlnse/XpQBDVXSAhKY3xAJYoR84rqWd1ueOJwJDijDIxfefeXL1Tet7gC28b8ljkOrOBeW6A==
X-Received: by 10.98.82.3 with SMTP id g3mr28906526pfb.189.1511565287024;
	Fri, 24 Nov 2017 15:14:47 -0800 (PST)
Received: from midnight.lan (68-189-67-104.dhcp.prtv.ca.charter.com.
	[68.189.67.104]) by smtp.gmail.com with ESMTPSA id
	y83sm41020277pfd.66.2017.11.24.15.14.45
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Fri, 24 Nov 2017 15:14:46 -0800 (PST)
From: Michael Lyle <mlyle@lyle.org>
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@kernel.dk, Rui Hua <huarui.dev@gmail.com>,
	stable@vger.kernel.org, Michael Lyle <mlyle@lyle.org>
Subject: [PATCH 3/4] bcache: recover data from backing when data is clean
Date: Fri, 24 Nov 2017 15:14:26 -0800
Message-Id: <20171124231427.9563-4-mlyle@lyle.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20171124231427.9563-1-mlyle@lyle.org>
References: <20171124231427.9563-1-mlyle@lyle.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Rui Hua <huarui.dev@gmail.com>

When a read request hits clean data on the cache device, bcache is subject
to a cache read race (see the comment at the end of cache_look_up(); the
following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s->iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)

It should be noted that a cache read race happens under normal
circumstances, not as a result of SSD failure. Such races are counted
and shown in /sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
cached_dev_read_error(). In this situation s->iop.error (= -EINTR) is
passed up, and the user ultimately receives -EINTR when the bio completes.
This is not appropriate, and it looks weird to upper-layer applications.

In this patch, we use s->read_dirty_data to judge whether the read
request hit dirty data on the cache device; it is safe to reread data from
the backing device when the read request hit clean data. This not only
handles the cache read race, but also recovers the data when a read
request from the cache device fails.

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Hua Rui <huarui.dev@gmail.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
---
 drivers/md/bcache/request.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)
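
For illustration only, the change in the retry condition can be sketched
as a small user-space model. This is a sketch under stated assumptions,
not kernel code: the model_* structures, old_may_retry(), new_may_retry()
and retry_rule_examples() are hypothetical names, and has_dirty is a plain
int here rather than the atomic_t read with atomic_read() in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the bcache structures. */
struct model_cached_dev {
	int has_dirty;		/* non-zero if the cache holds any dirty data */
};

struct model_search {
	bool recoverable;	/* the failed read may be retried */
	bool read_dirty_data;	/* this particular read hit dirty data */
};

/* Pre-patch rule: retry only if the whole cache device is clean. */
static bool old_may_retry(const struct model_search *s,
			  const struct model_cached_dev *dc)
{
	return s->recoverable && (dc && !dc->has_dirty);
}

/* Post-patch rule: retry unless this read itself hit dirty data. */
static bool new_may_retry(const struct model_search *s)
{
	return s->recoverable && !s->read_dirty_data;
}

/* Walks through the writeback-mode cases from the commit message. */
static void retry_rule_examples(void)
{
	/* Writeback mode: the cache device holds some dirty data. */
	struct model_cached_dev dc = { .has_dirty = 1 };
	struct model_search clean_hit = {
		.recoverable = true, .read_dirty_data = false,
	};
	struct model_search dirty_hit = {
		.recoverable = true, .read_dirty_data = true,
	};

	/* Read of clean data: the old rule refuses, the new rule retries. */
	assert(!old_may_retry(&clean_hit, &dc));
	assert(new_may_retry(&clean_hit));

	/* Read of dirty data: both rules refuse to retry. */
	assert(!old_may_retry(&dirty_hit, &dc));
	assert(!new_may_retry(&dirty_hit));
}
```

In this model, a recoverable read that hit only clean data is retried
under the new rule even while the cache device still holds dirty data
elsewhere, which is the case that previously surfaced -EINTR to the
caller.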

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 3a7aed7282b2..643c3021624f 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -708,16 +708,15 @@ static void cached_dev_read_error(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
-	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
 	/*
-	 * If cache device is dirty (dc->has_dirty is non-zero), then
-	 * recovery a failed read request from cached device may get a
-	 * stale data back. So read failure recovery is only permitted
-	 * when cache device is clean.
+	 * If the read request hit dirty data (s->read_dirty_data is true),
+	 * then recovering a failed read request from the cached device may
+	 * return stale data. So read failure recovery is only
+	 * permitted when the read request hit clean data in the cache
+	 * device, or when a cache read race happened.
 	 */
-	if (s->recoverable &&
-	    (dc && !atomic_read(&dc->has_dirty))) {
+	if (s->recoverable && !s->read_dirty_data) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);