[v2] io-mapping: Indicate mapping failure

Message ID 20200721171936.81563-1-michael.j.ruhl@intel.com (mailing list archive)
State New, archived

Commit Message

Michael J. Ruhl July 21, 2020, 5:19 p.m. UTC
The !ATOMIC_IOMAP version of io_mapping_init_wc() will always return
success, even when the ioremap fails.

Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error, this is unexpected.

During a device probe, where the ioremap failed, a crash can look
like this:

BUG: unable to handle page fault for address: 0000000000210000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 177 Comm:
 RIP: 0010:fill_page_dma [i915]
  gen8_ppgtt_create [i915]
  i915_ppgtt_create [i915]
  intel_gt_init [i915]
  i915_gem_init [i915]
  i915_driver_probe [i915]
  pci_device_probe
  really_probe
  driver_probe_device

The remap failure occurred much earlier in the probe.  If it had
been propagated, the driver would have exited with an error.

Return NULL on ioremap failure.

Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
---
v2: reflect review comments
---
 include/linux/io-mapping.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
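
For readers without the header at hand, the failure mode described in the commit message can be modelled in userspace. This is only a sketch: struct io_mapping_model, fake_ioremap_wc(), fake_ioremap_should_fail, init_wc_before() and init_wc_after() are hypothetical stand-ins invented for illustration, not kernel symbols, and the real struct io_mapping has more fields than are shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct io_mapping (subset of fields). */
struct io_mapping_model {
	unsigned long base;
	unsigned long size;
	void *iomem;
};

/* Stand-in for ioremap_wc(): returns NULL when the flag below is set. */
static int fake_ioremap_should_fail;
static char fake_window[64];

static void *fake_ioremap_wc(unsigned long base, unsigned long size)
{
	(void)base;
	(void)size;
	return fake_ioremap_should_fail ? NULL : fake_window;
}

/* Behaviour before the patch: the mapping result is never checked. */
static struct io_mapping_model *
init_wc_before(struct io_mapping_model *iomap,
	       unsigned long base, unsigned long size)
{
	assert(iomap != NULL);
	iomap->base = base;
	iomap->size = size;
	iomap->iomem = fake_ioremap_wc(base, size);

	return iomap;	/* non-NULL even when iomem is NULL */
}

/* Behaviour after the v2 patch: a failed mapping is reported as NULL. */
static struct io_mapping_model *
init_wc_after(struct io_mapping_model *iomap,
	      unsigned long base, unsigned long size)
{
	assert(iomap != NULL);
	iomap->base = base;
	iomap->size = size;
	iomap->iomem = fake_ioremap_wc(base, size);

	return iomap->iomem ? iomap : NULL;
}
```

With the failure flag set, a caller that tests only the return value sees success from init_wc_before() and goes on to use a NULL iomem, whereas init_wc_after() lets the same NULL check catch the error.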

Comments

Andrew Morton July 21, 2020, 8:56 p.m. UTC | #1
On Tue, 21 Jul 2020 13:19:36 -0400 "Michael J. Ruhl" <michael.j.ruhl@intel.com> wrote:

> The !ATOMIC_IOMAP version of io_mapping_init_wc() will always return
> success, even when the ioremap fails.
> 
> Since the ATOMIC_IOMAP version returns NULL when the init fails, and
> callers check for a NULL return on error, this is unexpected.
> 
> During a device probe, where the ioremap failed, a crash can look
> like this:
> 
> BUG: unable to handle page fault for address: 0000000000210000
>  #PF: supervisor write access in kernel mode
>  #PF: error_code(0x0002) - not-present page
>  Oops: 0002 [#1] PREEMPT SMP
>  CPU: 0 PID: 177 Comm:
>  RIP: 0010:fill_page_dma [i915]
>   gen8_ppgtt_create [i915]
>   i915_ppgtt_create [i915]
>   intel_gt_init [i915]
>   i915_gem_init [i915]
>   i915_driver_probe [i915]
>   pci_device_probe
>   really_probe
>   driver_probe_device
> 
> The remap failure occurred much earlier in the probe.  If it had
> been propagated, the driver would have exited with an error.
> 
> Return NULL on ioremap failure.
>
> ...
>
> --- a/include/linux/io-mapping.h
> +++ b/include/linux/io-mapping.h
> @@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
>  	iomap->prot = pgprot_noncached(PAGE_KERNEL);
>  #endif
>  
> -	return iomap;
> +	return iomap->iomem ? iomap : NULL;
>  }
>  
>  static inline void

LGTM.  However I do think it would be stylistically better/typical to
detect and handle the error early, rather than to blunder on,
pointlessly initializing things?

--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
+++ a/include/linux/io-mapping.h
@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
 		   resource_size_t base,
 		   unsigned long size)
 {
+	iomap->iomem = ioremap_wc(base, size);
+	if (!iomap->iomem)
+		return NULL;
+
 	iomap->base = base;
 	iomap->size = size;
-	iomap->iomem = ioremap_wc(base, size);
 #if defined(pgprot_noncached_wc) /* archs can't agree on a name ... */
 	iomap->prot = pgprot_noncached_wc(PAGE_KERNEL);
 #elif defined(pgprot_writecombine)
@@ -118,7 +121,7 @@ io_mapping_init_wc(struct io_mapping *io
 	iomap->prot = pgprot_noncached(PAGE_KERNEL);
 #endif
 
-	return iomap->iomem ? iomap : NULL;
+	return iomap;
 }
 
 static inline void
Michael J. Ruhl July 21, 2020, 9:02 p.m. UTC | #2
>-----Original Message-----
>From: Andrew Morton <akpm@linux-foundation.org>
>Sent: Tuesday, July 21, 2020 4:57 PM
>To: Ruhl, Michael J <michael.j.ruhl@intel.com>
>Cc: dri-devel@lists.freedesktop.org; Mike Rapoport <rppt@linux.ibm.com>;
>Andy Shevchenko <andriy.shevchenko@linux.intel.com>; Chris Wilson
><chris@chris-wilson.co.uk>; stable@vger.kernel.org
>Subject: Re: [PATCH v2] io-mapping: Indicate mapping failure
>
>On Tue, 21 Jul 2020 13:19:36 -0400 "Michael J. Ruhl"
><michael.j.ruhl@intel.com> wrote:
>
>> The !ATOMIC_IOMAP version of io_mapping_init_wc() will always return
>> success, even when the ioremap fails.
>>
>> Since the ATOMIC_IOMAP version returns NULL when the init fails, and
>> callers check for a NULL return on error, this is unexpected.
>>
>> During a device probe, where the ioremap failed, a crash can look
>> like this:
>>
>> BUG: unable to handle page fault for address: 0000000000210000
>>  #PF: supervisor write access in kernel mode
>>  #PF: error_code(0x0002) - not-present page
>>  Oops: 0002 [#1] PREEMPT SMP
>>  CPU: 0 PID: 177 Comm:
>>  RIP: 0010:fill_page_dma [i915]
>>   gen8_ppgtt_create [i915]
>>   i915_ppgtt_create [i915]
>>   intel_gt_init [i915]
>>   i915_gem_init [i915]
>>   i915_driver_probe [i915]
>>   pci_device_probe
>>   really_probe
>>   driver_probe_device
>>
>> The remap failure occurred much earlier in the probe.  If it had
>> been propagated, the driver would have exited with an error.
>>
>> Return NULL on ioremap failure.
>>
>> ...
>>
>> --- a/include/linux/io-mapping.h
>> +++ b/include/linux/io-mapping.h
>> @@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
>>  	iomap->prot = pgprot_noncached(PAGE_KERNEL);
>>  #endif
>>
>> -	return iomap;
>> +	return iomap->iomem ? iomap : NULL;
>>  }
>>
>>  static inline void
>
>LGTM.  However I do think it would be stylistically better/typical to
>detect and handle the error early, rather than to blunder on,
>pointlessly initializing things?

Yeah, I pondered that, and then didn't do it...

>--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
>+++ a/include/linux/io-mapping.h
>@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
> 		   resource_size_t base,
> 		   unsigned long size)
> {
>+	iomap->iomem = ioremap_wc(base, size);
>+	if (!iomap->iomem)
>+		return NULL;
>+

This does make more sense.

I am confused by the two follow up emails I just got.

Shall I resubmit, or is this path (if (!iomap->iomem) return NULL)
now in the tree?
Andrew Morton July 21, 2020, 9:24 p.m. UTC | #3
On Tue, 21 Jul 2020 21:02:44 +0000 "Ruhl, Michael J" <michael.j.ruhl@intel.com> wrote:

> >--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
> >+++ a/include/linux/io-mapping.h
> >@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
> > 		   resource_size_t base,
> > 		   unsigned long size)
> > {
> >+	iomap->iomem = ioremap_wc(base, size);
> >+	if (!iomap->iomem)
> >+		return NULL;
> >+
> 
> This does make more sense.
> 
> I am confused by the two follow up emails I just got.

One was your original patch, the other is my suggested alteration.

> Shall I resubmit, or is this path (if (!iomap->iomem) return NULL)
> now in the tree?

All is OK.  If my alteration is acceptable (and, preferably, tested!)
then when the time comes, I'll fold it into the base patch, add a
note indicating this change and shall then send it to Linus.
Daniel Vetter July 22, 2020, 5:25 a.m. UTC | #4
On Tue, Jul 21, 2020 at 11:24 PM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> On Tue, 21 Jul 2020 21:02:44 +0000 "Ruhl, Michael J" <michael.j.ruhl@intel.com> wrote:
>
> > >--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
> > >+++ a/include/linux/io-mapping.h
> > >@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
> > >                resource_size_t base,
> > >                unsigned long size)
> > > {
> > >+    iomap->iomem = ioremap_wc(base, size);
> > >+    if (!iomap->iomem)
> > >+            return NULL;
> > >+
> >
> > This does make more sense.
> >
> > I am confused by the two follow up emails I just got.
>
> One was your original patch, the other is my suggested alteration.
>
> > Shall I resubmit, or is this path (if (!iomap->iomem) return NULL)
> > now in the tree?
>
> All is OK.  If my alteration is acceptable (and, preferably, tested!)
> then when the time comes, I'll fold it into the base patch, add a
> note indicating this change and shall then send it to Linus.

Your alternative also matches the other implementation of
io_mapping_init_wc, I was kinda tempted to do that suggestion too just
because of that. But then didn't send out that email.
-Daniel
Michael J. Ruhl July 22, 2020, 11:54 a.m. UTC | #5
>-----Original Message-----
>From: Andrew Morton <akpm@linux-foundation.org>
>Sent: Tuesday, July 21, 2020 5:24 PM
>To: Ruhl, Michael J <michael.j.ruhl@intel.com>
>Cc: dri-devel@lists.freedesktop.org; Mike Rapoport <rppt@linux.ibm.com>;
>Andy Shevchenko <andriy.shevchenko@linux.intel.com>; Chris Wilson
><chris@chris-wilson.co.uk>; stable@vger.kernel.org
>Subject: Re: [PATCH v2] io-mapping: Indicate mapping failure
>
>On Tue, 21 Jul 2020 21:02:44 +0000 "Ruhl, Michael J"
><michael.j.ruhl@intel.com> wrote:
>
>> >--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure-fix
>> >+++ a/include/linux/io-mapping.h
>> >@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
>> > 		   resource_size_t base,
>> > 		   unsigned long size)
>> > {
>> >+	iomap->iomem = ioremap_wc(base, size);
>> >+	if (!iomap->iomem)
>> >+		return NULL;
>> >+
>>
>> This does make more sense.
>>
>> I am confused by the two follow up emails I just got.
>
>One was your original patch, the other is my suggested alteration.
>
>> Shall I resubmit, or is this path (if (!iomap->iomem) return NULL)
>> now in the tree?
>
>All is OK.  If my alteration is acceptable (and, preferably, tested!)
>then when the time comes, I'll fold it into the base patch, add a
>note indicating this change and shall then send it to Linus.

I am good with the change and have tested it.

Instead of the system crashing I get:

i915 0000:01:00.0: [drm] *ERROR* Failed to setup region(-5) type=1
i915 0000:01:00.0: Device initialization failed (-5)
i915: probe of 0000:01:00.0 failed with error -5

Which is the expected error.
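
The -5 in the log above is -EIO on Linux. How a NULL return from the init helper could turn into that probe error can be sketched as follows. This is a userspace model under assumptions: region_model, region_init_model(), setup_region_model() and probe_model() are hypothetical names, and the thread does not show the actual i915 call site, so mapping the NULL return to -EIO here is an assumption made to match the logged value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region that owns an io mapping. */
struct region_model {
	void *iomem;
};

/* Models the fixed init helper: NULL signals that the mapping failed. */
static struct region_model *
region_init_model(struct region_model *r, void *mapped)
{
	assert(r != NULL);
	r->iomem = mapped;

	return r->iomem ? r : NULL;
}

/* Assumed caller behaviour: translate the NULL return into -EIO. */
static int setup_region_model(struct region_model *r, void *mapped)
{
	if (!region_init_model(r, mapped))
		return -EIO;

	return 0;
}

/* Models the probe: the first error stops initialization and is returned. */
static int probe_model(void *mapped)
{
	struct region_model r;
	int err;

	err = setup_region_model(&r, mapped);
	if (err)
		return err;	/* probe fails cleanly instead of using r.iomem */

	return 0;
}
```

The point of the sketch is that the error surfaces at the first NULL check rather than much later as a page fault in unrelated code.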

If you would like this for the updated patch:

Tested-by: Michael J. Ruhl <michael.j.ruhl@intel.com>

Thanks!

Mike

Patch

diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@  io_mapping_init_wc(struct io_mapping *iomap,
 	iomap->prot = pgprot_noncached(PAGE_KERNEL);
 #endif
 
-	return iomap;
+	return iomap->iomem ? iomap : NULL;
 }
 
 static inline void