[REGRESSION,next-20170426] Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") causes oops in mvneta

Message ID bcdcc1cc-181c-8396-dd3c-dd40d0c7efc1@codeaurora.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Sricharan Ramabadhran April 28, 2017, 11:56 a.m. UTC
Hi Ralph,

<snip..>

>>>>>
>>>>> Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe
>>>>> time for platform/amba/pci bus devices") causes a kernel panic as
>>>>> in the log below on an armada-385. Reverting the commit fixes the
>>>>> issue.
>>>>>
>>>>> Regards
>>>>> Ralph    
>>>>
>>>> Somehow I am not getting an obvious clue about what's going wrong from the
>>>> logs below. From the log and from looking into the dts, the driver seems
>>>> to be the one for "marvell,armada-370-neta".
>>>
>>> Correct.
>>>   
>>>> The issue looks like the data from the DMA
>>>> has gone bad, and subsequently referring to the wrong data has resulted
>>>> in the crash. It looks like the DMA masks are what is going wrong.
>>>> Can I get some logs from mvneta_probe about dev->dma_mask,
>>>> dev->coherent_dma_mask and dev->dma_ops, with and without the patch,
>>>> to see what the difference is?
>>>
>>> Not sure I understood what exactly you are after. Might be faster to
>>> just send me a patch with all debug print statements you like to
>>> see. 
>>
>> Attached the patch with debug prints.
>>
>> Regards,
>>  Sricharan
>>
> 
> Hi Sricharan
> 
> With commit 09515ef5ddad
> 
> [    1.288962] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
> [    1.294827] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
> [    1.301472] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c0601460
> 
> [    1.322047] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
> [    1.327904] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
> [    1.334549] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c0601460
> 
> 
> With the patch reverted, the build that works
> 
> [    1.289001] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
> [    1.294866] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
> [    1.301511] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c06014a8
> 
> [    1.317005] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
> [    1.322867] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
> [    1.329508] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c06014a8
> 

My bad, I think the problem is that this patch [1] is missing; I have attached it as well.
In fact, it was in the series initially and was acked to be merged
separately, well before the series. I should have sent it to Russell.
I will do that now. If it fixes the issue,
I will take this patch separately while the series gets tested
on -next.

[1] https://patchwork.kernel.org/patch/9362113/

Comments

Ralph Sennhauser April 28, 2017, 12:25 p.m. UTC | #1
On Fri, 28 Apr 2017 17:26:41 +0530
Sricharan R <sricharan@codeaurora.org> wrote:

> Hi Ralph,
> 
> <snip..>
> 
> >>>>>
> >>>>> Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe
> >>>>> time for platform/amba/pci bus devices") causes a kernel panic
> >>>>> as in the log below on an armada-385. Reverting the commit
> >>>>> fixes the issue.
> >>>>>
> >>>>> Regards
> >>>>> Ralph      
> >>>>
> >>>> Somehow I am not getting an obvious clue about what's going wrong from the
> >>>> logs below. From the log and from looking into the dts, the driver seems
> >>>> to be the one for "marvell,armada-370-neta".
> >>>
> >>> Correct.
> >>>     
> >>>> The issue looks like the data from the DMA
> >>>> has gone bad, and subsequently referring to the wrong data has
> >>>> resulted in the crash. It looks like the DMA masks are what is going
> >>>> wrong. Can I get some logs from mvneta_probe about
> >>>> dev->dma_mask, dev->coherent_dma_mask and dev->dma_ops, with and
> >>>> without the patch, to see what the difference is?
> >>>
> >>> Not sure I understood what exactly you are after. Might be faster
> >>> to just send me a patch with all debug print statements you like
> >>> to see.   
> >>
> >> Attached the patch with debug prints.
> >>
> >> Regards,
> >>  Sricharan
> >>  
> > 
> > Hi Sricharan
> > 
> > With commit 09515ef5ddad
> > 
> > [    1.288962] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
> > [    1.294827] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
> > [    1.301472] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c0601460
> >
> > [    1.322047] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
> > [    1.327904] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
> > [    1.334549] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c0601460
> > 
> > 
> > With the patch reverted, the build that works
> > 
> > [    1.289001] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
> > [    1.294866] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
> > [    1.301511] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c06014a8
> >
> > [    1.317005] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
> > [    1.322867] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
> > [    1.329508] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c06014a8
> 
> My bad, I think the problem is that this patch [1] is missing; I have attached it as well.
> In fact, it was in the series initially and was acked to be merged
> separately, well before the series. I should have sent it to Russell.
> I will do that now. If it fixes the issue,
> I will take this patch separately while the series gets tested
> on -next.
> 
> [1] https://patchwork.kernel.org/patch/9362113/
> 

With the attached patch,
0001-arm-dma-mapping-Don-t-override-dma_ops-in-arch_setup.patch, on top
of next all is well again.

Thanks
Ralph

Sricharan Ramabadhran April 28, 2017, 1:18 p.m. UTC | #2
Hi Ralph,

On 4/28/2017 5:55 PM, Ralph Sennhauser wrote:
> On Fri, 28 Apr 2017 17:26:41 +0530
> Sricharan R <sricharan@codeaurora.org> wrote:
> 
>> Hi Ralph,
>>
>> <snip..>
>>
>>>>>>>
>>>>>>> Commit 09515ef5ddad ("of/acpi: Configure dma operations at probe
>>>>>>> time for platform/amba/pci bus devices") causes a kernel panic
>>>>>>> as in the log below on an armada-385. Reverting the commit
>>>>>>> fixes the issue.
>>>>>>>
>>>>>>> Regards
>>>>>>> Ralph      
>>>>>>
>>>>>> Somehow I am not getting an obvious clue about what's going wrong from the
>>>>>> logs below. From the log and from looking into the dts, the driver seems
>>>>>> to be the one for "marvell,armada-370-neta".
>>>>>
>>>>> Correct.
>>>>>     
>>>>>> The issue looks like the data from the DMA
>>>>>> has gone bad, and subsequently referring to the wrong data has
>>>>>> resulted in the crash. It looks like the DMA masks are what is going
>>>>>> wrong. Can I get some logs from mvneta_probe about
>>>>>> dev->dma_mask, dev->coherent_dma_mask and dev->dma_ops, with and
>>>>>> without the patch, to see what the difference is?
>>>>>
>>>>> Not sure I understood what exactly you are after. Might be faster
>>>>> to just send me a patch with all debug print statements you like
>>>>> to see.   
>>>>
>>>> Attached the patch with debug prints.
>>>>
>>>> Regards,
>>>>  Sricharan
>>>>  
>>>
>>> Hi Sricharan
>>>
>>> With commit 09515ef5ddad
>>>
>>> [    1.288962] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
>>> [    1.294827] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
>>> [    1.301472] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c0601460
>>>
>>> [    1.322047] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
>>> [    1.327904] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
>>> [    1.334549] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c0601460
>>>
>>>
>>> With the patch reverted, the build that works
>>>
>>> [    1.289001] mvneta f1070000.ethernet: dev->dma_mask 0xffffffff
>>> [    1.294866] mvneta f1070000.ethernet: dev->coherent_dma_mask 0xffffffff
>>> [    1.301511] mvneta f1070000.ethernet: dev->dma_ops 0x40b00c06014a8
>>>
>>> [    1.317005] mvneta f1034000.ethernet: dev->dma_mask 0xffffffff
>>> [    1.322867] mvneta f1034000.ethernet: dev->coherent_dma_mask 0xffffffff
>>> [    1.329508] mvneta f1034000.ethernet: dev->dma_ops 0x40b00c06014a8
>>
>> My bad, I think the problem is that this patch [1] is missing; I have attached it as well.
>> In fact, it was in the series initially and was acked to be merged
>> separately, well before the series. I should have sent it to Russell.
>> I will do that now. If it fixes the issue,
>> I will take this patch separately while the series gets tested
>> on -next.
>>
>> [1] https://patchwork.kernel.org/patch/9362113/
>>
> 
> With the attached patch,
> 0001-arm-dma-mapping-Don-t-override-dma_ops-in-arch_setup.patch, on top
> of next all is well again.

Thanks for testing.
Also, it probably makes more sense for this patch to go through the iommu tree now,
since it is for probe deferral.
Joerg, is that correct?

Regards,
 Sricharan

Joerg Roedel April 28, 2017, 3 p.m. UTC | #3
On Fri, Apr 28, 2017 at 06:48:33PM +0530, Sricharan R wrote:
> Also, it probably makes more sense for this patch to go through the iommu tree now,
> since it is for probe deferral.
> Joerg, is that correct?

Definitely. Please send the patch directly to me and I will put it in the
tree.

Thanks,

	Joerg

Patch

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0268584..c742dfd 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -2408,6 +2408,15 @@  void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 	const struct dma_map_ops *dma_ops;
 
 	dev->archdata.dma_coherent = coherent;
+
+	/*
+	 * Don't override the dma_ops if they have already been set. Ideally
+	 * this should be the only location where dma_ops are set, remove this
+	 * check when all other callers of set_dma_ops will have disappeared.
+	 */
+	if (dev->dma_ops)
+		return;
+
 	if (arm_setup_iommu_dma_ops(dev, dma_base, size, iommu))
 		dma_ops = arm_get_iommu_dma_map_ops(coherent);
 	else
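
The effect of the check added in the hunk above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct device`, `struct dma_map_ops` and `set_dma_ops()` are simplified stand-ins for the kernel definitions, and `default_ops`, `platform_ops`, `model_arch_setup_dma_ops()` and `ops_after_probe()` are hypothetical names introduced for illustration. It assumes that some earlier code has already installed DMA ops on the device before the probe-time setup runs, which is the situation the comment in the patch describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct dma_map_ops {
	const char *name;
};

struct device {
	const struct dma_map_ops *dma_ops;
	bool dma_coherent;
};

/* Hypothetical ops tables: one chosen by generic setup, one set earlier. */
static const struct dma_map_ops default_ops  = { "default" };
static const struct dma_map_ops platform_ops = { "platform" };

/* Same idea as the kernel helper of this name: a plain assignment. */
static void set_dma_ops(struct device *dev, const struct dma_map_ops *ops)
{
	assert(dev != NULL);
	dev->dma_ops = ops;
}

/*
 * Model of arch_setup_dma_ops(). The `guarded` flag toggles the
 * "if (dev->dma_ops) return;" check so both behaviours can be compared.
 */
static void model_arch_setup_dma_ops(struct device *dev, bool coherent,
				     bool guarded)
{
	dev->dma_coherent = coherent;

	/* The guard from the patch: keep ops that were already set. */
	if (guarded && dev->dma_ops)
		return;

	set_dma_ops(dev, &default_ops);
}

/* Assumed sequence: ops are set early, then probe-time setup runs. */
static const struct dma_map_ops *ops_after_probe(bool guarded)
{
	struct device dev = { 0 };

	set_dma_ops(&dev, &platform_ops);
	model_arch_setup_dma_ops(&dev, false, guarded);
	return dev.dma_ops;
}
```

In this model, `ops_after_probe(true)` returns `&platform_ops`, while `ops_after_probe(false)` returns `&default_ops`: without the guard, the probe-time call replaces ops that were installed earlier, which is the kind of override the patch is meant to prevent.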