[0/2] SWIOTLB fixes for 4.20

Message ID cover.1542722463.git.robin.murphy@arm.com (mailing list archive)

Message

Robin Murphy Nov. 20, 2018, 2:09 p.m. UTC
This is what I have so far, which at least resolves the most obvious
problems. I still haven't got very far with the USB corruption issue
I see on Juno with -rc1, but I'm yet to confirm whether that's actually
attributable to the SWIOTLB changes or something else entirely.

Robin.

Robin Murphy (2):
  swiotlb: Make DIRECT_MAPPING_ERROR viable
  swiotlb: Skip cache maintenance on map error

 include/linux/dma-direct.h | 2 +-
 kernel/dma/swiotlb.c       | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

Comments

Robin Murphy Nov. 20, 2018, 3:01 p.m. UTC | #1
On 20/11/2018 14:49, Konrad Rzeszutek Wilk wrote:
> On Tue, Nov 20, 2018 at 02:09:52PM +0000, Robin Murphy wrote:
>> With the overflow buffer removed, we no longer have a unique address
>> which is guaranteed not to be a valid DMA target to use as an error
>> token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
>> an unlikely DMA target, but unfortunately there are already SWIOTLB
>> users with DMA-able memory at physical address 0 which now gets falsely
>> treated as a mapping failure and leads to all manner of misbehaviour.
>>
>> The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
>> commonly-used all-bits-set value, since the last single byte of memory
>> is by far the least-likely-valid DMA target.
> 
> Are all the callers checking for DIRECT_MAPPING_ERROR or is it more of
> a comparison (as in if (!ret)) ?

dma_direct_map_page() and dma_direct_mapping_error() were already doing 
the right thing, and external callers must rely on the latter via 
dma_mapping_error() rather than trying to inspect the actual value 
themselves, since that varies between implementations anyway. AFAICS all 
the new return paths from swiotlb_map_page() are also robust in 
referencing the macro explicitly, so I think we're good.

Thanks,
Robin.
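
The difference between checking the error token through a helper and testing the returned address for zero can be sketched in userspace C. This is only an illustration under stated assumptions: every `sketch_` name and `fragile_check_failed()` are hypothetical stand-ins, not the kernel's `dma_addr_t`, `DIRECT_MAPPING_ERROR` or `dma_mapping_error()`, and the address type is assumed to be 64 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dma_addr_t (assumed 64-bit here). */
typedef uint64_t sketch_dma_addr_t;

/* All-bits-set error token, parenthesized so it expands safely anywhere. */
#define SKETCH_MAPPING_ERROR (~(sketch_dma_addr_t)0)

/* The cast matters: the complement is taken at the full address width. */
static_assert(SKETCH_MAPPING_ERROR == UINT64_MAX, "token must be all ones");

/* The one check callers are meant to use; the token value stays private. */
static bool sketch_mapping_error(sketch_dma_addr_t addr)
{
	return addr == SKETCH_MAPPING_ERROR;
}

/* Toy mapping: identity-maps the physical address unless told to fail. */
static sketch_dma_addr_t sketch_map(uint64_t phys, bool fail)
{
	return fail ? SKETCH_MAPPING_ERROR : (sketch_dma_addr_t)phys;
}

/* A fragile caller that treats address 0 as failure, as in "if (!ret)". */
static bool fragile_check_failed(sketch_dma_addr_t addr)
{
	return !addr;
}
```

With these definitions, a successful mapping of physical address 0 passes `sketch_mapping_error()` but is wrongly rejected by `fragile_check_failed()`, while a genuine failure is caught only by the helper.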

>> Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
>> Reported-by: John Stultz <john.stultz@linaro.org>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   include/linux/dma-direct.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
>> index bd73e7a91410..9de9c7ab39d6 100644
>> --- a/include/linux/dma-direct.h
>> +++ b/include/linux/dma-direct.h
>> @@ -5,7 +5,7 @@
>>   #include <linux/dma-mapping.h>
>>   #include <linux/mem_encrypt.h>
>>   
>> -#define DIRECT_MAPPING_ERROR		0
>> +#define DIRECT_MAPPING_ERROR		~(dma_addr_t)0
>>   
>>   #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
>>   #include <asm/dma-direct.h>
>> -- 
>> 2.19.1.dirty
>>

Christoph Hellwig Nov. 20, 2018, 4:07 p.m. UTC | #2
The subject line should say dma-direct.  This isn't really swiotlb
specific.

> +#define DIRECT_MAPPING_ERROR		~(dma_addr_t)0

Please add braces around the value like the other *MAPPING_ERROR
definitions so that it can be safely used in any context.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Christoph Hellwig Nov. 20, 2018, 4:07 p.m. UTC | #3
On Tue, Nov 20, 2018 at 02:09:53PM +0000, Robin Murphy wrote:
> If swiotlb_bounce_page() failed, calling arch_sync_dma_for_device() may
> lead to such delights as performing cache maintenance on whatever
> address phys_to_virt(SWIOTLB_MAP_ERROR) looks like, which is typically
> outside the kernel memory map and goes about as well as expected.
> 
> Don't do that.
> 
> Fixes: a4a4330db46a ("swiotlb: add support for non-coherent DMA")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
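
The guard that the quoted patch description calls for, skipping cache maintenance when the bounce step has failed, can be sketched in userspace C as follows. This is a minimal illustration, not the actual change to kernel/dma/swiotlb.c: every `sketch_` name is hypothetical, the bounce step just adds a fixed offset, and a counter stands in for real cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address type and error token for this sketch only. */
typedef uint64_t sketch_addr_t;
#define SKETCH_MAP_ERROR (~(sketch_addr_t)0)

/* Counts how often the stand-in cache maintenance routine ran. */
static int sketch_sync_calls;

/* Stand-in for cache maintenance; it must never be handed the error token. */
static void sketch_sync_for_device(sketch_addr_t addr)
{
	assert(addr != SKETCH_MAP_ERROR);
	sketch_sync_calls++;
}

/* Toy bounce step: returns a bounce address, or the token if the pool is full. */
static sketch_addr_t sketch_bounce(sketch_addr_t phys, bool pool_full)
{
	return pool_full ? SKETCH_MAP_ERROR : phys + 0x1000;
}

/* Map a page, syncing only for non-coherent devices and only on success. */
static sketch_addr_t sketch_map_page(sketch_addr_t phys, bool pool_full,
				     bool coherent)
{
	sketch_addr_t mapped = sketch_bounce(phys, pool_full);

	if (!coherent && mapped != SKETCH_MAP_ERROR)
		sketch_sync_for_device(mapped);

	return mapped;
}
```

A failed bounce returns the token to the caller without touching the sync routine, so the error is reported through the normal mapping-error path instead of triggering maintenance on a bogus address.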

Christoph Hellwig Nov. 20, 2018, 4:08 p.m. UTC | #4
On Tue, Nov 20, 2018 at 02:09:51PM +0000, Robin Murphy wrote:
> This is what I have so far, which at least resolves the most obvious
> problems. I still haven't got very far with the USB corruption issue
> I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> attributable to the SWIOTLB changes or something else entirely.

This looks good modulo the minor nitpicks.

Konrad, are you ok with me picking up both through the dma-mapping
tree?

Konrad Rzeszutek Wilk Nov. 20, 2018, 4:34 p.m. UTC | #5
On Tue, Nov 20, 2018 at 03:01:33PM +0000, Robin Murphy wrote:
> On 20/11/2018 14:49, Konrad Rzeszutek Wilk wrote:
> > On Tue, Nov 20, 2018 at 02:09:52PM +0000, Robin Murphy wrote:
> > > With the overflow buffer removed, we no longer have a unique address
> > > which is guaranteed not to be a valid DMA target to use as an error
> > > token. The DIRECT_MAPPING_ERROR value of 0 tries to at least represent
> > > an unlikely DMA target, but unfortunately there are already SWIOTLB
> > > users with DMA-able memory at physical address 0 which now gets falsely
> > > treated as a mapping failure and leads to all manner of misbehaviour.
> > > 
> > > The best we can do to mitigate that is flip DIRECT_MAPPING_ERROR to the
> > > commonly-used all-bits-set value, since the last single byte of memory
> > > is by far the least-likely-valid DMA target.
> > 
> > Are all the callers checking for DIRECT_MAPPING_ERROR or is it more of
> > a comparison (as in if (!ret)) ?
> 
> dma_direct_map_page() and dma_direct_mapping_error() were already doing the
> right thing, and external callers must rely on the latter via
> dma_mapping_error() rather than trying to inspect the actual value
> themselves, since that varies between implementations anyway. AFAICS all the
> new return paths from swiotlb_map_page() are also robust in referencing the
> macro explicitly, so I think we're good.

Cool! Thank you for checking.

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thank you!
> 
> Thanks,
> Robin.
> 
> > > Fixes: dff8d6c1ed58 ("swiotlb: remove the overflow buffer")
> > > Reported-by: John Stultz <john.stultz@linaro.org>
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > >   include/linux/dma-direct.h | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> > > index bd73e7a91410..9de9c7ab39d6 100644
> > > --- a/include/linux/dma-direct.h
> > > +++ b/include/linux/dma-direct.h
> > > @@ -5,7 +5,7 @@
> > >   #include <linux/dma-mapping.h>
> > >   #include <linux/mem_encrypt.h>
> > > -#define DIRECT_MAPPING_ERROR		0
> > > +#define DIRECT_MAPPING_ERROR		~(dma_addr_t)0
> > >   #ifdef CONFIG_ARCH_HAS_PHYS_TO_DMA
> > >   #include <asm/dma-direct.h>
> > > -- 
> > > 2.19.1.dirty
> > >

Konrad Rzeszutek Wilk Nov. 20, 2018, 4:34 p.m. UTC | #6
On Tue, Nov 20, 2018 at 05:08:18PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 20, 2018 at 02:09:51PM +0000, Robin Murphy wrote:
> > This is what I have so far, which at least resolves the most obvious
> > problems. I still haven't got very far with the USB corruption issue
> > I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> > attributable to the SWIOTLB changes or something else entirely.
> 
> This looks good modulo the minor nitpicks.
> 
> Konrad, are you ok with me picking up both through the dma-mapping
> tree?

Yes, albeit I would want Stefano to take a peek at patch #2 just in case.

John Stultz Nov. 20, 2018, 7:24 p.m. UTC | #7
On Tue, Nov 20, 2018 at 6:10 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> This is what I have so far, which at least resolves the most obvious
> problems. I still haven't got very far with the USB corruption issue
> I see on Juno with -rc1, but I'm yet to confirm whether that's actually
> attributable to the SWIOTLB changes or something else entirely.
>
> Robin.
>
> Robin Murphy (2):
>   swiotlb: Make DIRECT_MAPPING_ERROR viable
>   swiotlb: Skip cache maintenance on map error
>
>  include/linux/dma-direct.h | 2 +-
>  kernel/dma/swiotlb.c       | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

Thanks so much for chasing this down!

Unfortunately AOSP is giving me grief this week, so I've not been able
to test the full environment, but I don't seem to be hitting the io
hangs I was seeing earlier with this patch set.

For both:
  Tested-by: John Stultz <john.stultz@linaro.org>

thanks
-john

Christoph Hellwig Nov. 21, 2018, 1:03 p.m. UTC | #8
On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > Konrad, are you ok with me picking up both through the dma-mapping
> > tree?
> 
> Yes, albeit I would want Stefano to take a peek at patch #2 just in case.

Stefano, can you take a look asap?  This is a pretty trivial fix for
a nasty bug that breaks boot on real life systems.  I'd like to merge
it by tomorrow so that I can send it off to Linus for the next rc
(I will be offline for about two days after Friday morning)

Konrad Rzeszutek Wilk Nov. 21, 2018, 3:11 p.m. UTC | #9
On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > Konrad, are you ok with me picking up both through the dma-mapping
> > > tree?
> > 
> > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
> 
> Stefano, can you take a look asap?  This is a pretty trivial fix for
> a nasty bug that breaks boot on real life systems.  I'd like to merge
> it by tomorrow so that I can send it off to Linus for the next rc
> (I will be offline for about two days after Friday morning)

It is Turkey Day in US so he may be busy catching those pesky turkeys.
How about Tuesday? And I can also send the GIT PULL.

Stefano Stabellini Nov. 27, 2018, 7:07 p.m. UTC | #10
On Wed, 21 Nov 2018, Konrad Rzeszutek Wilk wrote:
> On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> > On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > Konrad, are you ok with me picking up both through the dma-mapping
> > > > tree?
> > > 
> > > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
> > 
> > Stefano, can you take a look asap?  This is a pretty trivial fix for
> > a nasty bug that breaks boot on real life systems.  I'd like to merge
> > it by tomorrow so that I can send it off to Linus for the next rc
> > (I will be offline for about two days after Friday morning)
> 
> It is Turkey Day in US so he may be busy catching those pesky turkeys.
> How about Tuesday? And I can also send the GIT PULL.

I have just come back to the office. I'll give it a look today.

Stefano Stabellini Nov. 27, 2018, 8:39 p.m. UTC | #11
On Tue, 27 Nov 2018, Stefano Stabellini wrote:
> On Wed, 21 Nov 2018, Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 21, 2018 at 02:03:31PM +0100, Christoph Hellwig wrote:
> > > On Tue, Nov 20, 2018 at 11:34:41AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > Konrad, are you ok with me picking up both through the dma-mapping
> > > > > tree?
> > > > 
> > > > Yes, albeit I would want Stefano to take a peek at patch #2 just in case.
> > > 
> > > Stefano, can you take a look asap?  This is a pretty trivial fix for
> > > a nasty bug that breaks boot on real life systems.  I'd like to merge
> > > it by tomorrow so that I can send it off to Linus for the next rc
> > > (I will be offline for about two days after Friday morning)
> > 
> > It is Turkey Day in US so he may be busy catching those pesky turkeys.
> > How about Tuesday? And I can also send the GIT PULL.
> 
> I have just come back to the office. I'll give it a look today.

Everything looks good to me, thanks for keeping me in the loop.