From patchwork Thu Oct 12 20:11:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Doug Anderson <dianders@chromium.org>
X-Patchwork-Id: 10002757
Return-Path: 
 <linux-rockchip-bounces+patchwork-linux-rockchip=patchwork.kernel.org@lists.infradead.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	D32F760216 for <patchwork-linux-rockchip@patchwork.kernel.org>;
	Thu, 12 Oct 2017 20:12:09 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C4FD128E98
	for <patchwork-linux-rockchip@patchwork.kernel.org>;
	Thu, 12 Oct 2017 20:12:09 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id B9AD628EA0; Thu, 12 Oct 2017 20:12:09 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1
Received: from bombadil.infradead.org (bombadil.infradead.org
	[65.50.211.133])
	(using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 215CF28E98
	for <patchwork-linux-rockchip@patchwork.kernel.org>;
	Thu, 12 Oct 2017 20:12:09 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=lists.infradead.org; s=bombadil.20170209; h=Sender:
	Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe:
	List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:References:
	In-Reply-To:Message-Id:Date:Subject:To:From:Reply-To:Content-ID:
	Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
	:Resent-Message-ID:List-Owner;
	bh=rO0bSf4B9cynxQiSGkuqTRNS3LW4TTfHM/ELS5DyIUQ=;
	b=NyKQdTebNH8F2q5OoWQZk/W7m5
	tnflkTHZTYsip87rgnwPAwsPrBzVr/JEaEMtYSlqDMR8rxPPTvBj9cmdkFtSCN9yUL+VeaxBIbHWC
	9Kl/odmKessGxdRQ6NtvZlWCbt460H2ZOHbnOFcZKzPc/heMH+ocoUr7ibpiHPiVE/uYZ8oC1aEFV
	pZpAfMu3YWREjbC0GEIbgwxagUEkf6GUTBDV/+br/gntjuH/GP23gMrUPafWHdFN6ec1NVfueqh5P
	IAJ/dXt64dbW0fnU1MLQ0GjwBgQkato91BZYZBXoOQwAu+b75EjAgY5tG7RfM0N/A1F96aR/7JlH9
	36owVZ4w==;
Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org)
	by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux))
	id 1e2jpw-0002vh-GV; Thu, 12 Oct 2017 20:12:08 +0000
Received: from mail-pf0-x230.google.com ([2607:f8b0:400e:c00::230])
	by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux))
	id 1e2jpr-0002oB-Ir for linux-rockchip@lists.infradead.org;
	Thu, 12 Oct 2017 20:12:06 +0000
Received: by mail-pf0-x230.google.com with SMTP id a8so6001760pfc.0
	for <linux-rockchip@lists.infradead.org>;
	Thu, 12 Oct 2017 13:11:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org;
	s=google;
	h=from:to:cc:subject:date:message-id:in-reply-to:references;
	bh=MnI1n7s4UvvLziAZGWhi4jjAklv2RzyM5QSnYKQY9Yc=;
	b=MZ2gDd5fmlhWtUWTN975sNMQFLaMNU+rzo7J/euTp8zGR6LSP4CERBaFDFwjuzwvP2
	h3mBhBccumgx8JMhxM4cOZT6ERXYD1ktKyg8Rd+zPHMouwPYz+9VhqGgsOHFCyH+6XGw
	WsDFUnl54OgGHemSAbWSc2K3tVVbNZ7WZ26TY=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references;
	bh=MnI1n7s4UvvLziAZGWhi4jjAklv2RzyM5QSnYKQY9Yc=;
	b=LbM2d642O3Qjw7U14kpWXPlK+WI2+yS/L6/24aheWsFodvqPGJjGZI33/52te5GCzG
	3zeH9cebFvdH/r7p+CbbAM9zWbKmF88WdH6/OAMSo6kBLNDC+J92XsGgJlUahqX9l49V
	ZkC6yqYYkxRiRohrXKyOf+R8BG6p73RYhr/IhCO1OR/I4a18pmXzIXt7JHOYRvtEKvD6
	1kzQcdzsDo/5d+H03qBzvz2L/dAGXaalPS4OBdko62av8QvfboB/GdR2CnfXJxwkENFc
	Li0mEjO5S8sXZd0BKDwteg4bcxXE7R/5jRUsu0B+rkL/SDCUIFrQKkZc0kUKCHDOjfz0
	hJrw==
X-Gm-Message-State: AMCzsaXmXVodTW7xX5KKaffhKc5ZSyinssgl3FKG5EB9Nv4+2xOa+Zg7
	8vul5QfcbhV4a3CD1T+9gfdo5jGmxCI=
X-Google-Smtp-Source: 
 AOwi7QBLslLvp614jbroUXAi87hA0RlbOcJQON+WbYRW4FgVCCkg91S23JG8SGdrrrNKtBTu3dcdig==
X-Received: by 10.98.246.24 with SMTP id x24mr3024417pfh.223.1507839102753;
	Thu, 12 Oct 2017 13:11:42 -0700 (PDT)
Received: from tictac.mtv.corp.google.com ([172.22.112.154])
	by smtp.gmail.com with ESMTPSA id
	l3sm29749635pgn.36.2017.10.12.13.11.41
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
	Thu, 12 Oct 2017 13:11:41 -0700 (PDT)
From: Douglas Anderson <dianders@chromium.org>
To: jh80.chung@samsung.com, ulf.hansson@linaro.org, shawn.lin@rock-chips.com
Subject: [PATCH v2 3/5] mmc: dw_mmc: Add locking to the CTO timer
Date: Thu, 12 Oct 2017 13:11:16 -0700
Message-Id: <20171012201118.23570-4-dianders@chromium.org>
X-Mailer: git-send-email 2.15.0.rc0.271.g36b669edcc-goog
In-Reply-To: <20171012201118.23570-1-dianders@chromium.org>
References: <20171012201118.23570-1-dianders@chromium.org>
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
X-CRM114-CacheID: sfid-20171012_131203_798296_69099D85 
X-CRM114-Status: GOOD (  22.93  )
X-BeenThere: linux-rockchip@lists.infradead.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: Upstream kernel work for Rockchip platforms
	<linux-rockchip.lists.infradead.org>
List-Unsubscribe: <http://lists.infradead.org/mailman/options/linux-rockchip>,
	<mailto:linux-rockchip-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-rockchip/>
List-Post: <mailto:linux-rockchip@lists.infradead.org>
List-Help: <mailto:linux-rockchip-request@lists.infradead.org?subject=help>
List-Subscribe: <http://lists.infradead.org/mailman/listinfo/linux-rockchip>,
	<mailto:linux-rockchip-request@lists.infradead.org?subject=subscribe>
Cc: linux-samsung-soc@vger.kernel.org, kernel@esmil.dk,
	briannorris@chromium.org, xzy.xu@rock-chips.com,
	linux-mmc@vger.kernel.org,
	Douglas Anderson <dianders@chromium.org>, linux-kernel@vger.kernel.org,
	linux-rockchip@lists.infradead.org, amstan@chromium.org
MIME-Version: 1.0
Sender: "Linux-rockchip" <linux-rockchip-bounces@lists.infradead.org>
Errors-To: 
 linux-rockchip-bounces+patchwork-linux-rockchip=patchwork.kernel.org@lists.infradead.org
X-Virus-Scanned: ClamAV using ClamSMTP

This attempts to instill a bit of paranoia to the code dealing with
the CTO timer.  It's believed that this will make the CTO timer more
robust in the case that we're having very long interrupt latencies.

Note that I originally thought that perhaps this patch was being
overly paranoid and wasn't really needed, but then while I was running
mmc_test on an rk3399 board I saw one instance of the message:
  dwmmc_rockchip fe320000.dwmmc: Unexpected interrupt latency

I had debug prints in the CTO timer code and I found that it was
running CMD 13 at the time.

...so even though this patch seems like it might be overly paranoid,
maybe it really isn't?

Presumably the bad interrupt latency experienced was due to the fact
that I had serial console enabled as serial console is typically where
I place blame when I see absurdly large interrupt latencies.  In this
particular case there was an (unrelated) printout to the serial
console just before I saw the "Unexpected interrupt latency" printout.

...and actually, I managed to even reproduce the problems by running
"iw mlan0 scan > /dev/null" while mmc_test was running.  That not only
does a bunch of PCIe traffic but it also (on my system) outputs some
SELinux log spam.

Fixes: 03de19212ea3 ("mmc: dw_mmc: introduce timer for broken command transfer over scheme")
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v2:
- Removed extra "int i"

 drivers/mmc/host/dw_mmc.c | 91 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 81 insertions(+), 10 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
index 16516c528a88..50148991f30e 100644
--- a/drivers/mmc/host/dw_mmc.c
+++ b/drivers/mmc/host/dw_mmc.c
@@ -403,6 +403,7 @@ static inline void dw_mci_set_cto(struct dw_mci *host)
 	unsigned int cto_clks;
 	unsigned int cto_div;
 	unsigned int cto_ms;
+	unsigned long irqflags;
 
 	cto_clks = mci_readl(host, TMOUT) & 0xff;
 	cto_div = (mci_readl(host, CLKDIV) & 0xff) * 2;
@@ -413,8 +414,24 @@ static inline void dw_mci_set_cto(struct dw_mci *host)
 	/* add a bit spare time */
 	cto_ms += 10;
 
-	mod_timer(&host->cto_timer,
-		  jiffies + msecs_to_jiffies(cto_ms) + 1);
+	/*
+	 * The durations we're working with are fairly short so we have to be
+	 * extra careful about synchronization here.  Specifically in hardware a
+	 * command timeout is _at most_ 5.1 ms, so that means we expect an
+	 * interrupt (either command done or timeout) to come rather quickly
+	 * after the mci_writel.  ...but just in case we have a long interrupt
+	 * latency let's add a bit of paranoia.
+	 *
+	 * In general we'll assume that at least an interrupt will be asserted
+	 * in hardware by the time the cto_timer runs.  ...and if it hasn't
+	 * been asserted in hardware by that time then we'll assume it'll never
+	 * come.
+	 */
+	spin_lock_irqsave(&host->irq_lock, irqflags);
+	if (!test_bit(EVENT_CMD_COMPLETE, &host->pending_events))
+		mod_timer(&host->cto_timer,
+			jiffies + msecs_to_jiffies(cto_ms) + 1);
+	spin_unlock_irqrestore(&host->irq_lock, irqflags);
 }
 
 static void dw_mci_start_command(struct dw_mci *host,
@@ -429,11 +446,11 @@ static void dw_mci_start_command(struct dw_mci *host,
 	wmb(); /* drain writebuffer */
 	dw_mci_wait_while_busy(host, cmd_flags);
 
+	mci_writel(host, CMD, cmd_flags | SDMMC_CMD_START);
+
 	/* response expected command only */
 	if (cmd_flags & SDMMC_CMD_RESP_EXP)
 		dw_mci_set_cto(host);
-
-	mci_writel(host, CMD, cmd_flags | SDMMC_CMD_START);
 }
 
 static inline void send_stop_abort(struct dw_mci *host, struct mmc_data *data)
@@ -1930,6 +1947,24 @@ static void dw_mci_set_drto(struct dw_mci *host)
 	mod_timer(&host->dto_timer, jiffies + msecs_to_jiffies(drto_ms));
 }
 
+static bool dw_mci_clear_pending_cmd_complete(struct dw_mci *host)
+{
+	if (!test_bit(EVENT_CMD_COMPLETE, &host->pending_events))
+		return false;
+
+	/*
+	 * Really be certain that the timer has stopped.  This is a bit of
+	 * paranoia and could only really happen if we had really bad
+	 * interrupt latency and the interrupt routine and timeout were
+	 * running concurrently so that the del_timer() in the interrupt
+	 * handler couldn't run.
+	 */
+	WARN_ON(del_timer_sync(&host->cto_timer));
+	clear_bit(EVENT_CMD_COMPLETE, &host->pending_events);
+
+	return true;
+}
+
 static void dw_mci_tasklet_func(unsigned long priv)
 {
 	struct dw_mci *host = (struct dw_mci *)priv;
@@ -1956,8 +1991,7 @@ static void dw_mci_tasklet_func(unsigned long priv)
 
 		case STATE_SENDING_CMD11:
 		case STATE_SENDING_CMD:
-			if (!test_and_clear_bit(EVENT_CMD_COMPLETE,
-						&host->pending_events))
+			if (!dw_mci_clear_pending_cmd_complete(host))
 				break;
 
 			cmd = host->cmd;
@@ -2126,8 +2160,7 @@ static void dw_mci_tasklet_func(unsigned long priv)
 			/* fall through */
 
 		case STATE_SENDING_STOP:
-			if (!test_and_clear_bit(EVENT_CMD_COMPLETE,
-						&host->pending_events))
+			if (!dw_mci_clear_pending_cmd_complete(host))
 				break;
 
 			/* CMD error in data command */
@@ -2600,6 +2633,7 @@ static irqreturn_t dw_mci_interrupt(int irq, void *dev_id)
 	struct dw_mci *host = dev_id;
 	u32 pending;
 	struct dw_mci_slot *slot = host->slot;
+	unsigned long irqflags;
 
 	pending = mci_readl(host, MINTSTS); /* read-only mask reg */
 
@@ -2607,8 +2641,6 @@ static irqreturn_t dw_mci_interrupt(int irq, void *dev_id)
 		/* Check volt switch first, since it can look like an error */
 		if ((host->state == STATE_SENDING_CMD11) &&
 		    (pending & SDMMC_INT_VOLT_SWITCH)) {
-			unsigned long irqflags;
-
 			mci_writel(host, RINTSTS, SDMMC_INT_VOLT_SWITCH);
 			pending &= ~SDMMC_INT_VOLT_SWITCH;
 
@@ -2624,11 +2656,15 @@ static irqreturn_t dw_mci_interrupt(int irq, void *dev_id)
 		}
 
 		if (pending & DW_MCI_CMD_ERROR_FLAGS) {
+			spin_lock_irqsave(&host->irq_lock, irqflags);
+
 			del_timer(&host->cto_timer);
 			mci_writel(host, RINTSTS, DW_MCI_CMD_ERROR_FLAGS);
 			host->cmd_status = pending;
 			smp_wmb(); /* drain writebuffer */
 			set_bit(EVENT_CMD_COMPLETE, &host->pending_events);
+
+			spin_unlock_irqrestore(&host->irq_lock, irqflags);
 		}
 
 		if (pending & DW_MCI_DATA_ERROR_FLAGS) {
@@ -2668,8 +2704,12 @@ static irqreturn_t dw_mci_interrupt(int irq, void *dev_id)
 		}
 
 		if (pending & SDMMC_INT_CMD_DONE) {
+			spin_lock_irqsave(&host->irq_lock, irqflags);
+
 			mci_writel(host, RINTSTS, SDMMC_INT_CMD_DONE);
 			dw_mci_cmd_interrupt(host, pending);
+
+			spin_unlock_irqrestore(&host->irq_lock, irqflags);
 		}
 
 		if (pending & SDMMC_INT_CD) {
@@ -2943,7 +2983,35 @@ static void dw_mci_cmd11_timer(unsigned long arg)
 static void dw_mci_cto_timer(unsigned long arg)
 {
 	struct dw_mci *host = (struct dw_mci *)arg;
+	unsigned long irqflags;
+	u32 pending;
+
+	spin_lock_irqsave(&host->irq_lock, irqflags);
 
+	/*
+	 * If somehow we have very bad interrupt latency it's remotely possible
+	 * that the timer could fire while the interrupt is still pending or
+	 * while the interrupt is midway through running.  Let's be paranoid
+	 * and detect those two cases.  Note that this is paranoia is somewhat
+	 * justified because in this function we don't actually cancel the
+	 * pending command in the controller--we just assume it will never come.
+	 */
+	pending = mci_readl(host, MINTSTS); /* read-only mask reg */
+	if (pending & (DW_MCI_CMD_ERROR_FLAGS | SDMMC_INT_CMD_DONE)) {
+		/* The interrupt should fire; no need to act but we can warn */
+		dev_warn(host->dev, "Unexpected interrupt latency\n");
+		goto exit;
+	}
+	if (test_bit(EVENT_CMD_COMPLETE, &host->pending_events)) {
+		/* Presumably interrupt handler couldn't delete the timer */
+		dev_warn(host->dev, "CTO timeout when already completed\n");
+		goto exit;
+	}
+
+	/*
+	 * Continued paranoia to make sure we're in the state we expect.
+	 * This paranoia isn't really justified but it seems good to be safe.
+	 */
 	switch (host->state) {
 	case STATE_SENDING_CMD11:
 	case STATE_SENDING_CMD:
@@ -2962,6 +3030,9 @@ static void dw_mci_cto_timer(unsigned long arg)
 			 host->state);
 		break;
 	}
+
+exit:
+	spin_unlock_irqrestore(&host->irq_lock, irqflags);
 }
 
 static void dw_mci_dto_timer(unsigned long arg)