From patchwork Mon May 1 17:19:48 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 9706741 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id CE61B60387 for ; Mon, 1 May 2017 17:25:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B206827C0B for ; Mon, 1 May 2017 17:25:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A583E28066; Mon, 1 May 2017 17:25:45 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3D21627C0B for ; Mon, 1 May 2017 17:25:44 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id D8D5E21A04830; Mon, 1 May 2017 10:25:44 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id E14B021A04823 for ; Mon, 1 May 2017 10:25:43 -0700 (PDT) Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 May 2017 10:25:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.37,401,1488873600"; d="scan'208";a="255580135" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.125]) by fmsmga004.fm.intel.com with ESMTP; 01 May 2017 10:25:40 -0700 Subject: [PATCH] libnvdimm: restore "libnvdimm: band aid btt vs clear poison locking" From: Dan Williams To: linux-nvdimm@lists.01.org Date: Mon, 01 May 2017 10:19:48 -0700 Message-ID: <149365918855.23471.2493940794453999099.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP This continues the 4.11 status quo of disabling of error clearing from the BTT I/O path. Toshi found that even though we have eliminated all the libnvdimm sources of sleeping-while-atomic triggers, we still have sleeping operations that will occur in the path to send the ACPI DSM to the DIMM to clear the error: BUG: sleeping function called from invalid context at mm/slab.h:432 in_atomic(): 1, irqs_disabled(): 0, pid: 13353, name: dd Call Trace: dump_stack+0x86/0xc3 ___might_sleep+0x17d/0x250 __might_sleep+0x4a/0x80 __kmalloc+0x1c0/0x2e0 acpi_os_allocate_zeroed+0x2d/0x2f acpi_evaluate_object+0x59/0x3b1 acpi_evaluate_dsm+0xbd/0x10c acpi_nfit_ctl+0x1ef/0x7c0 [nfit] ? nsio_rw_bytes+0x152/0x280 nvdimm_clear_poison+0x77/0x140 nsio_rw_bytes+0x18f/0x280 btt_write_pg+0x1d4/0x3d0 [nd_btt] btt_make_request+0x119/0x2d0 [nd_btt] A solution for tracking and handling media errors natively in the BTT is needed. Cc: Jeff Moyer Cc: Dave Jiang Cc: Vishal Verma Reported-by: Toshi Kani Signed-off-by: Dan Williams --- drivers/nvdimm/claim.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c index 35b210dc1e56..6945e35058bf 100644 --- a/drivers/nvdimm/claim.c +++ b/drivers/nvdimm/claim.c @@ -250,7 +250,16 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns, } if (unlikely(is_bad_pmem(&nsio->bb, sector, sz_align))) { - if (IS_ALIGNED(offset, 512) && IS_ALIGNED(size, 512)) { + /* + * FIXME: nsio_rw_bytes() may be called from atomic + * context in the btt case and the ACPI DSM path for + * clearing the error takes sleeping locks and allocates + * memory. An explicit error clearing path, and support + * for tracking badblocks in BTT metadata is needed to + * work around this collision. + */ + if (IS_ALIGNED(offset, 512) && IS_ALIGNED(size, 512) + && (!ndns->claim || !is_nd_btt(ndns->claim))) { long cleared; cleared = nvdimm_clear_poison(&ndns->dev,