[3/4] mm/memory-failure: Fix detection of memory_failure() handlers

Message ID 166153428781.2758201.1990616683438224741.stgit@dwillia2-xfh.jf.intel.com (mailing list archive)
State Deferred, archived
Series mm, xfs, dax: Fixes for memory_failure() handling

Commit Message

Dan Williams Aug. 26, 2022, 5:18 p.m. UTC
Some pagemap types, like MEMORY_DEVICE_GENERIC (device-dax), do not even
have pagemap ops, which results in crash signatures like this:

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 8000000205073067 P4D 8000000205073067 PUD 2062b3067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP PTI
  CPU: 22 PID: 4535 Comm: device-dax Tainted: G           OE    N 6.0.0-rc2+ #59
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:memory_failure+0x667/0xba0
 [..]
  Call Trace:
   <TASK>
   ? _printk+0x58/0x73
   do_madvise.part.0.cold+0xaf/0xc5

Check for ops before checking if the ops have a memory_failure()
handler.

Fixes: 33a8f7f2b3a3 ("pagemap,pmem: introduce ->memory_failure()")
Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/memremap.h |    5 +++++
 mm/memory-failure.c      |    2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)
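
The check described in the commit message can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel code: the two structs are cut-down stand-ins that keep only the fields involved, and demo_memory_failure() and handle_failure() are hypothetical names introduced for illustration. Only pgmap_has_memory_failure() mirrors the helper this patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_pagemap;

/* Cut-down stand-in for the kernel's ops table (assumption: one field). */
struct dev_pagemap_ops {
	int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
			      unsigned long nr_pages, int mf_flags);
};

/* Cut-down stand-in for struct dev_pagemap; ops may legitimately be NULL. */
struct dev_pagemap {
	const struct dev_pagemap_ops *ops;
};

/* Same logic as the helper added by the patch: test ops before the handler. */
static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap)
{
	return pgmap->ops && pgmap->ops->memory_failure;
}

/* Hypothetical driver handler, present only so one pagemap has a handler. */
static int demo_memory_failure(struct dev_pagemap *pgmap, unsigned long pfn,
			       unsigned long nr_pages, int mf_flags)
{
	(void)pgmap; (void)pfn; (void)nr_pages; (void)mf_flags;
	return 0;
}

/*
 * Hypothetical caller: returns 1 if the driver handler ran, 0 if the
 * caller should fall back to a generic path.
 */
static int handle_failure(struct dev_pagemap *pgmap, unsigned long pfn)
{
	assert(pgmap != NULL);
	if (pgmap_has_memory_failure(pgmap)) {
		pgmap->ops->memory_failure(pgmap, pfn, 1, 0);
		return 1;
	}
	return 0;
}
```

In this sketch, a pagemap whose ops pointer is NULL takes the fallback path instead of dereferencing NULL, which is the case the unguarded pgmap->ops->memory_failure test did not cover.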

Comments

HORIGUCHI NAOYA(堀口 直也) Aug. 29, 2022, 5:39 a.m. UTC | #1
On Fri, Aug 26, 2022 at 10:18:07AM -0700, Dan Williams wrote:
> Some pagemap types, like MEMORY_DEVICE_GENERIC (device-dax) do not even
> have pagemap ops which results in crash signatures like this:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000010
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 8000000205073067 P4D 8000000205073067 PUD 2062b3067 PMD 0
>   Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 22 PID: 4535 Comm: device-dax Tainted: G           OE    N 6.0.0-rc2+ #59
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:memory_failure+0x667/0xba0
>  [..]
>   Call Trace:
>    <TASK>
>    ? _printk+0x58/0x73
>    do_madvise.part.0.cold+0xaf/0xc5
> 
> Check for ops before checking if the ops have a memory_failure()
> handler.
> 
> Fixes: 33a8f7f2b3a3 ("pagemap,pmem: introduce ->memory_failure()")
> Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Darrick J. Wong <djwong@kernel.org>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
> Cc: Jane Chu <jane.chu@oracle.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Ritesh Harjani <riteshh@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Thank you for sending the patches; this looks fine to me.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

> ---
>  include/linux/memremap.h |    5 +++++
>  mm/memory-failure.c      |    2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 19010491a603..c3b4cc84877b 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -139,6 +139,11 @@ struct dev_pagemap {
>  	};
>  };
>  
> +static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap)
> +{
> +	return pgmap->ops && pgmap->ops->memory_failure;
> +}
> +
>  static inline struct vmem_altmap *pgmap_altmap(struct dev_pagemap *pgmap)
>  {
>  	if (pgmap->flags & PGMAP_ALTMAP_VALID)
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 14439806b5ef..8a4294afbfa0 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1928,7 +1928,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>  	 * Call driver's implementation to handle the memory failure, otherwise
>  	 * fall back to generic handler.
>  	 */
> -	if (pgmap->ops->memory_failure) {
> +	if (pgmap_has_memory_failure(pgmap)) {
>  		rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);
>  		/*
>  		 * Fall back to generic handler too if operation is not
Miaohe Lin Aug. 30, 2022, 2:49 a.m. UTC | #2
On 2022/8/27 1:18, Dan Williams wrote:
> Some pagemap types, like MEMORY_DEVICE_GENERIC (device-dax) do not even
> have pagemap ops which results in crash signatures like this:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000010
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 8000000205073067 P4D 8000000205073067 PUD 2062b3067 PMD 0
>   Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 22 PID: 4535 Comm: device-dax Tainted: G           OE    N 6.0.0-rc2+ #59
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>   RIP: 0010:memory_failure+0x667/0xba0
>  [..]
>   Call Trace:
>    <TASK>
>    ? _printk+0x58/0x73
>    do_madvise.part.0.cold+0xaf/0xc5
> 
> Check for ops before checking if the ops have a memory_failure()
> handler.
> 
> Fixes: 33a8f7f2b3a3 ("pagemap,pmem: introduce ->memory_failure()")
> Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Darrick J. Wong <djwong@kernel.org>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
> Cc: Jane Chu <jane.chu@oracle.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Ritesh Harjani <riteshh@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

LGTM. Thanks for fixing this.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks,
Miaohe Lin


> ---
>  include/linux/memremap.h |    5 +++++
>  mm/memory-failure.c      |    2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 19010491a603..c3b4cc84877b 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -139,6 +139,11 @@ struct dev_pagemap {
>  	};
>  };
>  
> +static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap)
> +{
> +	return pgmap->ops && pgmap->ops->memory_failure;
> +}
> +
>  static inline struct vmem_altmap *pgmap_altmap(struct dev_pagemap *pgmap)
>  {
>  	if (pgmap->flags & PGMAP_ALTMAP_VALID)
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 14439806b5ef..8a4294afbfa0 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1928,7 +1928,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>  	 * Call driver's implementation to handle the memory failure, otherwise
>  	 * fall back to generic handler.
>  	 */
> -	if (pgmap->ops->memory_failure) {
> +	if (pgmap_has_memory_failure(pgmap)) {
>  		rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);
>  		/*
>  		 * Fall back to generic handler too if operation is not
Christoph Hellwig Sept. 5, 2022, 2:45 p.m. UTC | #3
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Patch

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 19010491a603..c3b4cc84877b 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -139,6 +139,11 @@  struct dev_pagemap {
 	};
 };
 
+static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap)
+{
+	return pgmap->ops && pgmap->ops->memory_failure;
+}
+
 static inline struct vmem_altmap *pgmap_altmap(struct dev_pagemap *pgmap)
 {
 	if (pgmap->flags & PGMAP_ALTMAP_VALID)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 14439806b5ef..8a4294afbfa0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1928,7 +1928,7 @@  static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
 	 * Call driver's implementation to handle the memory failure, otherwise
 	 * fall back to generic handler.
 	 */
-	if (pgmap->ops->memory_failure) {
+	if (pgmap_has_memory_failure(pgmap)) {
 		rc = pgmap->ops->memory_failure(pgmap, pfn, 1, flags);
 		/*
 		 * Fall back to generic handler too if operation is not