From patchwork Fri Jul 7 00:22:38 2017
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 9829249
From: "Williams, Dan J"
To: "torvalds@linux-foundation.org"
Cc: "linux-nvdimm@lists.01.org", "x86@kernel.org",
 "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org",
 "viro@zeniv.linux.org.uk", "linux-fsdevel@vger.kernel.org"
Subject: [GIT PULL] libnvdimm for 4.13
Date: Fri, 7 Jul 2017 00:22:38 +0000
Message-ID: <1499386957.6081.29.camel@intel.com>

Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.13

...to receive libnvdimm updates for the latest ACPI and UEFI
specifications.
This pull request also includes the new 'struct dax_operations'
infrastructure that lets us undo the abuse [1] of copy_user_nocache()
for copy operations to pmem. The dax work originally missed the 4.12
window so that concerns raised by Al could be addressed.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html

All of the commits in this pull request have appeared in one or more
-next releases with no errors reported. However, Stephen did report a
late merge conflict between nvdimm.git and the vfs.git tree. Stephen's
merge resolution is here:

  http://marc.info/?l=linux-kernel&m=149906411507301&w=2

...but to match Al's changes we appear to also need the incremental
change below. Please pull; I believe any straggling _flushcache()
feedback at this point can be fixed up post -rc1. I include commit
0aed55af8834 "x86, uaccess: introduce copy_from_iter_flushcache for
pmem / cache-bypass operations" at the end of this message for
reference.

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 2f46f8d4b508..073bb1feb0d0 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -97,6 +97,7 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
 size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
 bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i);
 size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
+size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
 bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i);
 
 static __always_inline __must_check
@@ -151,7 +152,14 @@ bool copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
  * IS_ENABLED(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) before assuming that the
  * destination is flushed from the cache on return.
  */
-size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
+static __always_inline __must_check
+size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	if (unlikely(!check_copy_size(addr, bytes, false)))
+		return bytes;
+	else
+		return _copy_from_iter_flushcache(addr, bytes, i);
+}
 #else
 static inline size_t copy_from_iter_flushcache(void *addr, size_t bytes,
 				       struct iov_iter *i)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index ee82300d98b9..0d18ede56a36 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -642,7 +642,7 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 EXPORT_SYMBOL(_copy_from_iter_nocache);
 
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(i->type & ITER_PIPE)) {
@@ -660,7 +660,7 @@ size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL_GPL(copy_from_iter_flushcache);
+EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
 #endif
 
 bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)

---

The following changes since commit 87085ff2e90ecfa91f8bb0cb0ce19ea661bd6f83:

  thermal: int340x_thermal: fix compile after the UUID API switch (2017-06-09 16:37:31 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.13

for you to fetch changes up to 9d92573fff3ec70785ef1815cc80573f70e7a921:

  Merge branch 'for-4.13/dax' into libnvdimm-for-next (2017-07-03 16:54:58 -0700)

----------------------------------------------------------------
libnvdimm for 4.13

* Introduce the _flushcache() family of memory copy helpers and use
  them for persistent memory write operations on x86. The _flushcache()
  semantic indicates that the cache is either bypassed for the copy
  operation (movnt) or any lines dirtied by the copy operation are
  written back (clwb, clflushopt, or clflush); a sketch of the intended
  write pattern follows this summary.

* Extend dax_operations with ->copy_from_iter() and ->flush()
  operations. These operations and other infrastructure updates allow
  all persistent memory specific dax functionality to be pushed into
  libnvdimm and the pmem driver directly. It also allows dax-specific
  sysfs attributes to be linked to a host device, for example:
  /sys/block/pmem0/dax/write_cache

* Add support for the new NVDIMM platform/firmware mechanisms
  introduced in ACPI 6.2 and UEFI 2.7. This support includes the v1.2
  namespace label format, extensions to the address-range-scrub command
  set, new error injection commands, and a new BTT
  (block-translation-table) layout. These updates support inter-OS and
  pre-OS compatibility.

* Fix a longstanding memory corruption bug in nfit_test.

* Make the pmem and nvdimm-region 'badblocks' sysfs files poll(2)
  capable.

* Miscellaneous fixes and small updates across libnvdimm and the nfit
  driver.
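As an aside, the _flushcache() write pattern called out in the first
bullet above looks roughly like the following. This is a minimal
sketch, not code from the branch: example_write_pmem() is a
hypothetical name, and the real consumer is drivers/nvdimm/pmem.c in
the commit quoted at the end of this message.

	#include <linux/string.h>	/* memcpy_flushcache() */

	/*
	 * Sketch: write @len bytes at @pmem_addr with the _flushcache()
	 * guarantee that no destination line is left dirty in the cpu
	 * cache on return (movnt bypass, or clwb/clflushopt/clflush
	 * write-back of dirtied lines).
	 */
	static void example_write_pmem(void *pmem_addr, const void *buf,
			size_t len)
	{
		memcpy_flushcache(pmem_addr, buf, len);

		/*
		 * Not yet persistent: the driver still fences prior
		 * writes, e.g. via nvdimm_flush() on a REQ_FUA / flush
		 * request (see the quoted commit below).
		 */
	}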
Acknowledgements that came after the branch was pushed:

commit 6aa734a2f38e "libnvdimm, region, pmem: fix 'badblocks'
sysfs_get_dirent() reference lifetime"
Reviewed-by: Toshi Kani

----------------------------------------------------------------
Arvind Yadav (1):
      acpi, nfit: constify *_attribute_group

Dan Williams (29):
      x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations
      dm: add ->copy_from_iter() dax operation support
      libnvdimm, label: add v1.2 nvdimm label definitions
      libnvdimm, label: add v1.2 interleave-set-cookie algorithm
      libnvdimm, label: honor the lba size specified in v1.2 labels
      libnvdimm, label: populate the type_guid property for v1.2 namespaces
      libnvdimm, label: populate 'isetcookie' for blk-aperture namespaces
      libnvdimm, label: update 'nlabel' and 'position' handling for local namespaces
      libnvdimm, label: add v1.2 label checksum support
      libnvdimm, label: add address abstraction identifiers
      libnvdimm, label: switch to using v1.2 labels by default
      filesystem-dax: convert to dax_copy_from_iter()
      dax, pmem: introduce an optional 'flush' dax_operation
      dm: add ->flush() dax operation support
      filesystem-dax: convert to dax_flush()
      x86, dax: replace clear_pmem() with open coded memset + dax_ops->flush
      x86, dax, libnvdimm: remove wb_cache_pmem() indirection
      x86, libnvdimm, pmem: move arch_invalidate_pmem() to libnvdimm
      x86, libnvdimm, pmem: remove global pmem api
      libnvdimm, pmem: fix persistence warning
      libnvdimm, nfit: enable support for volatile ranges
      dax: remove default copy_from_iter fallback
      dax: convert to bitmask for flags
      libnvdimm, pmem, dax: export a cache control attribute
      libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region
      acpi, nfit: quiet invalid block-aperture-region warnings
      libnvdimm, region, pmem: fix 'badblocks' sysfs_get_dirent() reference lifetime
      libnvdimm, namespace: record 'lbasize' for pmem namespaces
      Merge branch 'for-4.13/dax' into libnvdimm-for-next

Jerry Hoemann (5):
      libnvdimm: passthru functions clear to send
      acpi, nfit: Enable DSM pass thru for root functions.
      libnvdimm, acpi, nfit: Add bus level dsm mask for pass thru.
      acpi, nfit: Show bus_dsm_mask in sysfs
      libnvdimm: New ACPI 6.2 DSM functions

Toshi Kani (3):
      libnvdimm, pmem: Add sysfs notifications to badblocks
      acpi/nfit: Add support of NVDIMM memory error notification in ACPI 6.2
      acpi/nfit: Issue Start ARS to retrieve existing records

Vishal Verma (4):
      libnvdimm, btt: BTT updates for UEFI 2.7 format
      libnvdimm, btt: fix btt_rw_page not returning errors
      libnvdimm: fix the clear-error check in nsio_rw_bytes
      libnvdimm, btt: convert some info messages to warn/err

Yasunori Goto (1):
      tools/testing/nvdimm: fix nfit_test buffer overflow

 MAINTAINERS                       |   4 +-
 arch/powerpc/sysdev/axonram.c     |   8 ++
 arch/x86/Kconfig                  |   1 +
 arch/x86/include/asm/pmem.h       | 136 ------------------
 arch/x86/include/asm/string_64.h  |   5 +
 arch/x86/include/asm/uaccess_64.h |  11 ++
 arch/x86/lib/usercopy_64.c        | 134 ++++++++++++++++++
 arch/x86/mm/pageattr.c            |   6 +
 drivers/acpi/nfit/core.c          | 167 ++++++++++++++++++----
 drivers/acpi/nfit/mce.c           |   2 +-
 drivers/acpi/nfit/nfit.h          |   4 +-
 drivers/block/brd.c               |   8 ++
 drivers/dax/super.c               | 118 +++++++++++++++-
 drivers/md/dm-linear.c            |  30 ++++
 drivers/md/dm-stripe.c            |  40 ++++++
 drivers/md/dm.c                   |  45 ++++++
 drivers/nvdimm/btt.c              |  45 ++++--
 drivers/nvdimm/btt.h              |   2 +
 drivers/nvdimm/btt_devs.c         |  54 +++++++-
 drivers/nvdimm/bus.c              |  15 +-
 drivers/nvdimm/claim.c            |  38 ++++-
 drivers/nvdimm/core.c             |   5 +-
 drivers/nvdimm/dax_devs.c         |  10 +-
 drivers/nvdimm/dimm_devs.c        |  10 +-
 drivers/nvdimm/label.c            | 251 +++++++++++++++++++++++++++++----
 drivers/nvdimm/label.h            |  21 ++-
 drivers/nvdimm/namespace_devs.c   | 282 ++++++++++++++++++++++++++++++------
 drivers/nvdimm/nd-core.h          |   9 ++
 drivers/nvdimm/nd.h               |  17 ++-
 drivers/nvdimm/pfn_devs.c         |  12 +-
 drivers/nvdimm/pmem.c             |  63 ++++++++-
 drivers/nvdimm/pmem.h             |  15 ++
 drivers/nvdimm/region.c           |  17 ++-
 drivers/nvdimm/region_devs.c      |  88 ++++++++----
 drivers/s390/block/dcssblk.c      |   8 ++
 fs/dax.c                          |   9 +-
 include/linux/dax.h               |  12 ++
 include/linux/device-mapper.h     |   6 +
 include/linux/libnvdimm.h         |  11 +-
 include/linux/nd.h                |  13 ++
 include/linux/pmem.h              | 142 -------------------
 include/linux/string.h            |   6 +
 include/linux/uio.h               |  15 ++
 include/uapi/linux/ndctl.h        |  42 +++++-
 lib/Kconfig                       |   3 +
 lib/iov_iter.c                    |  22 +++
 tools/testing/nvdimm/test/nfit.c  |   2 +-
 47 files changed, 1504 insertions(+), 460 deletions(-)
 delete mode 100644 arch/x86/include/asm/pmem.h
 delete mode 100644 include/linux/pmem.h

---

commit 0aed55af88345b5d673240f90e671d79662fb01e
Author: Dan Williams
Date:   Mon May 29 12:22:50 2017 -0700

    x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations

    The pmem driver has a need to transfer data with a persistent memory
    destination and be able to rely on the fact that the destination
    writes are not cached. It is sufficient for the writes to be flushed
    to a cpu-store-buffer (non-temporal / "movnt" in x86 terms), as we
    expect userspace to call fsync() to ensure data-writes have reached a
    power-fail-safe zone in the platform. The fsync() triggers a REQ_FUA
    or REQ_FLUSH to the pmem driver which will turn around and fence
    previous writes with an "sfence".

    Implement __copy_from_user_inatomic_flushcache, memcpy_page_flushcache,
    and memcpy_flushcache, which guarantee that the destination buffer is
    not dirty in the cpu cache on completion. The new
    copy_from_iter_flushcache and sub-routines will be used to replace the
    "pmem api" (include/linux/pmem.h + arch/x86/include/asm/pmem.h).
    The availability of copy_from_iter_flushcache() and
    memcpy_flushcache() is gated by the
    CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE config symbol; they fall back to
    copy_from_iter_nocache() and plain memcpy() otherwise.

    This is meant to satisfy the concern from Linus that if a driver
    wants to do something beyond the normal nocache semantics it should
    be something private to that driver [1], and Al's concern that
    anything uaccess related belongs with the rest of the uaccess code
    [2].

    The first consumer of this interface is a new 'copy_from_iter' dax
    operation so that pmem can inject cache maintenance operations
    without imposing this overhead on other dax-capable drivers.

    [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
    [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html

    Cc:
    Cc: Jan Kara
    Cc: Jeff Moyer
    Cc: Ingo Molnar
    Cc: Christoph Hellwig
    Cc: Toshi Kani
    Cc: "H. Peter Anvin"
    Cc: Al Viro
    Cc: Thomas Gleixner
    Cc: Matthew Wilcox
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4ccfacc7232a..bb273b2f50b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,6 +54,7 @@ config X86
 	select ARCH_HAS_KCOV			if X86_64
 	select ARCH_HAS_MMIO_FLUSH
 	select ARCH_HAS_PMEM_API		if X86_64
+	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_SG_CHAIN
 	select ARCH_HAS_STRICT_KERNEL_RWX
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 733bae07fb29..1f22bc277c45 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -109,6 +109,11 @@ memcpy_mcsafe(void *dst, const void *src, size_t cnt)
 	return 0;
 }
 
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+#define __HAVE_ARCH_MEMCPY_FLUSHCACHE 1
+void memcpy_flushcache(void *dst, const void *src, size_t cnt);
+#endif
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_STRING_64_H */
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c5504b9a472e..b16f6a1d8b26 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -171,6 +171,10 @@ unsigned long raw_copy_in_user(void __user *dst, const void __user *src, unsigne
 extern long __copy_user_nocache(void *dst, const void __user *src,
 				unsigned size, int zerorest);
 
+extern long __copy_user_flushcache(void *dst, const void __user *src, unsigned size);
+extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
+			   size_t len);
+
 static inline int
 __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
 				  unsigned size)
@@ -179,6 +183,13 @@ __copy_from_user_inatomic_nocache(void *dst, const void __user *src,
 	return __copy_user_nocache(dst, src, size, 0);
 }
 
+static inline int
+__copy_from_user_flushcache(void *dst, const void __user *src, unsigned size)
+{
+	kasan_check_write(dst, size);
+	return __copy_user_flushcache(dst, src, size);
+}
+
 unsigned long
 copy_user_handle_tail(char *to, char *from, unsigned len);
 
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index 3b7c40a2e3e1..f42d2fd86ca3 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -7,6 +7,7 @@
  */
 #include <linux/module.h>
 #include <linux/uaccess.h>
+#include <linux/highmem.h>
 
 /*
  * Zero Userspace
@@ -73,3 +74,130 @@ copy_user_handle_tail(char *to, char *from, unsigned len)
 	clac();
 	return len;
 }
+
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+/**
+ * clean_cache_range - write back a cache range with CLWB
+ * @vaddr:	virtual start address
+ * @size:	number of bytes to write back
+ *
+ * Write back a cache range using the CLWB (cache line write back)
+ * instruction. Note that @size is internally rounded up to be cache
+ * line size aligned.
+ */
+static void clean_cache_range(void *addr, size_t size)
+{
+	u16 x86_clflush_size = boot_cpu_data.x86_clflush_size;
+	unsigned long clflush_mask = x86_clflush_size - 1;
+	void *vend = addr + size;
+	void *p;
+
+	for (p = (void *)((unsigned long)addr & ~clflush_mask);
+	     p < vend; p += x86_clflush_size)
+		clwb(p);
+}
+
+long __copy_user_flushcache(void *dst, const void __user *src, unsigned size)
+{
+	unsigned long flushed, dest = (unsigned long) dst;
+	long rc = __copy_user_nocache(dst, src, size, 0);
+
+	/*
+	 * __copy_user_nocache() uses non-temporal stores for the bulk
+	 * of the transfer, but we need to manually flush if the
+	 * transfer is unaligned. A cached memory copy is used when
+	 * destination or size is not naturally aligned. That is:
+	 *   - Require 8-byte alignment when size is 8 bytes or larger.
+	 *   - Require 4-byte alignment when size is 4 bytes.
+	 */
+	if (size < 8) {
+		if (!IS_ALIGNED(dest, 4) || size != 4)
+			clean_cache_range(dst, 1);
+	} else {
+		if (!IS_ALIGNED(dest, 8)) {
+			dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
+			clean_cache_range(dst, 1);
+		}
+
+		flushed = dest - (unsigned long) dst;
+		if (size > flushed && !IS_ALIGNED(size - flushed, 8))
+			clean_cache_range(dst + size - 1, 1);
+	}
+
+	return rc;
+}
+
+void memcpy_flushcache(void *_dst, const void *_src, size_t size)
+{
+	unsigned long dest = (unsigned long) _dst;
+	unsigned long source = (unsigned long) _src;
+
+	/* cache copy and flush to align dest */
+	if (!IS_ALIGNED(dest, 8)) {
+		unsigned len = min_t(unsigned, size, ALIGN(dest, 8) - dest);
+
+		memcpy((void *) dest, (void *) source, len);
+		clean_cache_range((void *) dest, len);
+		dest += len;
+		source += len;
+		size -= len;
+		if (!size)
+			return;
+	}
+
+	/* 4x8 movnti loop */
+	while (size >= 32) {
+		asm("movq    (%0), %%r8\n"
+		    "movq   8(%0), %%r9\n"
+		    "movq  16(%0), %%r10\n"
+		    "movq  24(%0), %%r11\n"
+		    "movnti  %%r8,   (%1)\n"
+		    "movnti  %%r9,  8(%1)\n"
+		    "movnti %%r10, 16(%1)\n"
+		    "movnti %%r11, 24(%1)\n"
+		    :: "r" (source), "r" (dest)
+		    : "memory", "r8", "r9", "r10", "r11");
+		dest += 32;
+		source += 32;
+		size -= 32;
+	}
+
+	/* 1x8 movnti loop */
+	while (size >= 8) {
+		asm("movq    (%0), %%r8\n"
+		    "movnti  %%r8,   (%1)\n"
+		    :: "r" (source), "r" (dest)
+		    : "memory", "r8");
+		dest += 8;
+		source += 8;
+		size -= 8;
+	}
+
+	/* 1x4 movnti loop */
+	while (size >= 4) {
+		asm("movl    (%0), %%r8d\n"
+		    "movnti  %%r8d,   (%1)\n"
+		    :: "r" (source), "r" (dest)
+		    : "memory", "r8");
+		dest += 4;
+		source += 4;
+		size -= 4;
+	}
+
+	/* cache copy for remaining bytes */
+	if (size) {
+		memcpy((void *) dest, (void *) source, size);
+		clean_cache_range((void *) dest, size);
+	}
+}
+EXPORT_SYMBOL_GPL(memcpy_flushcache);
+
+void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
+		size_t len)
+{
+	char *from = kmap_atomic(page);
+
+	memcpy_flushcache(to, from + offset, len);
+	kunmap_atomic(from);
+}
+#endif
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 656acb5d7166..cbd5596e7562 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1842,8 +1842,7 @@ static int acpi_nfit_blk_single_io(struct nfit_blk *nfit_blk,
 		}
 
 		if (rw)
-			memcpy_to_pmem(mmio->addr.aperture + offset,
-					iobuf + copied, c);
+			memcpy_flushcache(mmio->addr.aperture + offset, iobuf + copied, c);
 		else {
 			if (nfit_blk->dimm_flags & NFIT_BLK_READ_FLUSH)
 				mmio_flush_range((void __force *)
diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
index 7ceb5fa4f2a1..b8b9c8ca7862 100644
--- a/drivers/nvdimm/claim.c
+++ b/drivers/nvdimm/claim.c
@@ -277,7 +277,7 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
 			rc = -EIO;
 	}
 
-	memcpy_to_pmem(nsio->addr + offset, buf, size);
+	memcpy_flushcache(nsio->addr + offset, buf, size);
 	nvdimm_flush(to_nd_region(ndns->dev.parent));
 
 	return rc;
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index c544d466ea51..2f3aefe565c6 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -29,6 +29,7 @@
 #include <linux/pfn_t.h>
 #include <linux/slab.h>
 #include <linux/pmem.h>
+#include <linux/uio.h>
 #include <linux/dax.h>
 #include <linux/nd.h>
 #include "pmem.h"
@@ -80,7 +81,7 @@ static void write_pmem(void *pmem_addr, struct page *page,
 {
 	void *mem = kmap_atomic(page);
 
-	memcpy_to_pmem(pmem_addr, mem + off, len);
+	memcpy_flushcache(pmem_addr, mem + off, len);
 	kunmap_atomic(mem);
 }
 
@@ -235,8 +236,15 @@ static long pmem_dax_direct_access(struct dax_device *dax_dev,
 	return __pmem_direct_access(pmem, pgoff, nr_pages, kaddr, pfn);
 }
 
+static size_t pmem_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *i)
+{
+	return copy_from_iter_flushcache(addr, bytes, i);
+}
+
 static const struct dax_operations pmem_dax_ops = {
 	.direct_access = pmem_dax_direct_access,
+	.copy_from_iter = pmem_copy_from_iter,
 };
 
 static void pmem_release_queue(void *q)
@@ -294,7 +302,8 @@ static int pmem_attach_disk(struct device *dev,
 	dev_set_drvdata(dev, pmem);
 	pmem->phys_addr = res->start;
 	pmem->size = resource_size(res);
-	if (nvdimm_has_flush(nd_region) < 0)
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE)
+			|| nvdimm_has_flush(nd_region) < 0)
 		dev_warn(dev, "unable to guarantee persistence of writes\n");
 
 	if (!devm_request_mem_region(dev, res->start, resource_size(res),
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index b550edf2571f..985b0e11bd73 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1015,8 +1015,8 @@ void nvdimm_flush(struct nd_region *nd_region)
 	 * The first wmb() is needed to 'sfence' all previous writes
 	 * such that they are architecturally visible for the platform
 	 * buffer flush. Note that we've already arranged for pmem
-	 * writes to avoid the cache via arch_memcpy_to_pmem(). The
-	 * final wmb() ensures ordering for the NVDIMM flush write.
+	 * writes to avoid the cache via memcpy_flushcache(). The final
+	 * wmb() ensures ordering for the NVDIMM flush write.
 	 */
 	wmb();
 	for (i = 0; i < nd_region->ndr_mappings; i++)
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 5ec1f6c47716..bbe79ed90e2b 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -16,6 +16,9 @@ struct dax_operations {
 	 */
 	long (*direct_access)(struct dax_device *, pgoff_t, long,
 			void **, pfn_t *);
+	/* copy_from_iter: dax-driver override for default copy_from_iter */
+	size_t (*copy_from_iter)(struct dax_device *, pgoff_t, void *, size_t,
+			struct iov_iter *);
 };
 
 #if IS_ENABLED(CONFIG_DAX)
diff --git a/include/linux/string.h b/include/linux/string.h
index 537918f8a98e..7439d83eaa33 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -122,6 +122,12 @@ static inline __must_check int memcpy_mcsafe(void *dst, const void *src,
 	return 0;
 }
 #endif
+#ifndef __HAVE_ARCH_MEMCPY_FLUSHCACHE
+static inline void memcpy_flushcache(void *dst, const void *src, size_t cnt)
+{
+	memcpy(dst, src, cnt);
+}
+#endif
 void *memchr_inv(const void *s, int c, size_t n);
 char *strreplace(char *s, char old, char new);
 
diff --git a/include/linux/uio.h b/include/linux/uio.h
index f2d36a3d3005..55cd54a0e941 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -95,6 +95,21 @@ size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
 size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
 bool copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i);
 size_t copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+/*
+ * Note, users like pmem that depend on the stricter semantics of
+ * copy_from_iter_flushcache() than copy_from_iter_nocache() must check for
+ * IS_ENABLED(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) before assuming that the
+ * destination is flushed from the cache on return.
+ */
+size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
+#else
+static inline size_t copy_from_iter_flushcache(void *addr, size_t bytes,
+				       struct iov_iter *i)
+{
+	return copy_from_iter_nocache(addr, bytes, i);
+}
+#endif
 bool copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i);
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
diff --git a/lib/Kconfig b/lib/Kconfig
index 0c8b78a9ae2e..2d1c4b3a085c 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -548,6 +548,9 @@ config ARCH_HAS_SG_CHAIN
 config ARCH_HAS_PMEM_API
 	bool
 
+config ARCH_HAS_UACCESS_FLUSHCACHE
+	bool
+
 config ARCH_HAS_MMIO_FLUSH
 	bool
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index f835964c9485..c9a69064462f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -615,6 +615,28 @@ size_t copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 }
 EXPORT_SYMBOL(copy_from_iter_nocache);
 
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	if (unlikely(i->type & ITER_PIPE)) {
+		WARN_ON(1);
+		return 0;
+	}
+	iterate_and_advance(i, bytes, v,
+		__copy_from_user_flushcache((to += v.iov_len) - v.iov_len,
+					 v.iov_base, v.iov_len),
+		memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page,
+				 v.bv_offset, v.bv_len),
+		memcpy_flushcache((to += v.iov_len) - v.iov_len, v.iov_base,
+			v.iov_len)
+	)
+
+	return bytes;
+}
+EXPORT_SYMBOL_GPL(copy_from_iter_flushcache);
+#endif
+
 bool copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
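
For completeness, here is a sketch of how a write reaches the new op
stack after this series. It is an illustration, not code from the
branch: example_dax_write() is a hypothetical wrapper, while
dax_copy_from_iter() is the helper added by "filesystem-dax: convert
to dax_copy_from_iter()" in this pull.

	#include <linux/dax.h>
	#include <linux/uio.h>

	/* Illustration: route a write through the dax ->copy_from_iter() op */
	static size_t example_dax_write(struct dax_device *dax_dev,
			pgoff_t pgoff, void *kaddr, size_t bytes,
			struct iov_iter *iter)
	{
		/*
		 * Dispatches to the driver op, e.g. pmem_copy_from_iter()
		 * above, which uses copy_from_iter_flushcache() when
		 * CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE is set and the
		 * nocache fallback otherwise.
		 */
		return dax_copy_from_iter(dax_dev, pgoff, kaddr, bytes, iter);
	}

Persistence still depends on the subsequent flush request: fsync()
eventually drives nvdimm_flush(), whose wmb() fencing is shown in the
region_devs.c hunk above.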