Message ID | 20190717153135.15507-4-nsaenzjulienne@suse.de (mailing list archive) |
---|---|
State | New, archived |
Series | Raspberry Pi 4 DMA addressing support |
On Wed, Jul 17, 2019 at 05:31:34PM +0200, Nicolas Saenz Julienne wrote:
> Historically devices with ZONE_DMA32 have been assumed to be able to
> address at least the lower 4GB of ram for DMA. This is still the default
> behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
> This has been observed to trigger failures in dma_direct_supported() as
> the 'min_mask' isn't properly set.
>
> We create 'dma_direct_min_mask' in order for the arch init code to be
> able to fine-tune dma direct's 'min_dma' mask.

Normally we use ZONE_DMA for that case.
On Thu, 2019-07-18 at 11:15 +0200, Christoph Hellwig wrote:
> On Wed, Jul 17, 2019 at 05:31:34PM +0200, Nicolas Saenz Julienne wrote:
> > Historically devices with ZONE_DMA32 have been assumed to be able to
> > address at least the lower 4GB of ram for DMA. This is still the default
> > behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
> > This has been observed to trigger failures in dma_direct_supported() as
> > the 'min_mask' isn't properly set.
> >
> > We create 'dma_direct_min_mask' in order for the arch init code to be
> > able to fine-tune dma direct's 'min_dma' mask.
>
> Normally we use ZONE_DMA for that case.

Fair enough, I didn't think of that possibility.

So would the arm64 maintainers be happy with something like this:

- ZONE_DMA: Follows standard definition, 16MB in size. ARCH_ZONE_DMA_BITS is
  left as is.
- ZONE_DMA32: Will honor the most constraining 'dma-ranges'. Which so far for
  most devices is 4G, except for RPi4.
- ZONE_NORMAL: The rest of the memory.
On Thu, 2019-07-18 at 13:18 +0200, Nicolas Saenz Julienne wrote:
> On Thu, 2019-07-18 at 11:15 +0200, Christoph Hellwig wrote:
> > On Wed, Jul 17, 2019 at 05:31:34PM +0200, Nicolas Saenz Julienne wrote:
> > > Historically devices with ZONE_DMA32 have been assumed to be able to
> > > address at least the lower 4GB of ram for DMA. This is still the default
> > > behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
> > > This has been observed to trigger failures in dma_direct_supported() as
> > > the 'min_mask' isn't properly set.
> > >
> > > We create 'dma_direct_min_mask' in order for the arch init code to be
> > > able to fine-tune dma direct's 'min_dma' mask.
> >
> > Normally we use ZONE_DMA for that case.
>
> Fair enough, I didn't think of that possibility.
>
> So would the arm64 maintainers be happy with something like this:
>
> - ZONE_DMA: Follows standard definition, 16MB in size. ARCH_ZONE_DMA_BITS is
>   left as is.
> - ZONE_DMA32: Will honor the most constraining 'dma-ranges'. Which so far for
>   most devices is 4G, except for RPi4.
> - ZONE_NORMAL: The rest of the memory.

Never mind this suggestion, I don't think it makes any sense. If anything arm64
seems to fit the ZONE_DMA usage pattern of arm and powerpc: where ZONE_DMA's
size is decided based on ram size and/or board configuration. It was actually
set-up like this until Christoph's ad67f5a6545f7 ("arm64: replace ZONE_DMA with
ZONE_DMA32").

So the easy solution would be to simply revert that commit. On one hand I feel
it would be a step backwards as most 64 bit architectures have been moving to
use ZONE_DMA32. On the other, current ZONE_DMA32 usage seems to be heavily
rooted on having a 32 bit DMA mask*, which will no longer be the case on arm64
if we want to support the RPi 4.

So the way I see it and lacking a better solution, the argument is stronger on
moving back arm64 to using ZONE_DMA. Any comments/opinions?

Note that I've been looking at all the DMA/CMA/swiotlb code to see if this would
break anything or change behaviors and couldn't find anything obvious. I also
tested the revert on my RPi4 and nothing seems to fail.

* A good example is dma-direct's implementation.
On Fri, Jul 19, 2019 at 03:08:52PM +0200, Nicolas Saenz Julienne wrote:
> On Thu, 2019-07-18 at 13:18 +0200, Nicolas Saenz Julienne wrote:
> > On Thu, 2019-07-18 at 11:15 +0200, Christoph Hellwig wrote:
> > > On Wed, Jul 17, 2019 at 05:31:34PM +0200, Nicolas Saenz Julienne wrote:
> > > > Historically devices with ZONE_DMA32 have been assumed to be able to
> > > > address at least the lower 4GB of ram for DMA. This is still the default
> > > > behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
> > > > This has been observed to trigger failures in dma_direct_supported() as
> > > > the 'min_mask' isn't properly set.
> > > >
> > > > We create 'dma_direct_min_mask' in order for the arch init code to be
> > > > able to fine-tune dma direct's 'min_dma' mask.
> > >
> > > Normally we use ZONE_DMA for that case.
> >
> > Fair enough, I didn't think of that possibility.
> >
> > So would the arm64 maintainers be happy with something like this:
> >
> > - ZONE_DMA: Follows standard definition, 16MB in size. ARCH_ZONE_DMA_BITS is
> >   left as is.
> > - ZONE_DMA32: Will honor the most constraining 'dma-ranges'. Which so far for
> >   most devices is 4G, except for RPi4.
> > - ZONE_NORMAL: The rest of the memory.
>
> Never mind this suggestion, I don't think it makes any sense. If anything arm64
> seems to fit the ZONE_DMA usage pattern of arm and powerpc: where ZONE_DMA's
> size is decided based on ram size and/or board configuration. It was actually
> set-up like this until Christoph's ad67f5a6545f7 ("arm64: replace ZONE_DMA with
> ZONE_DMA32").
>
> So the easy solution would be to simply revert that commit. On one hand I feel
> it would be a step backwards as most 64 bit architectures have been moving to
> use ZONE_DMA32. On the other, current ZONE_DMA32 usage seems to be heavily
> rooted on having a 32 bit DMA mask*, which will no longer be the case on arm64
> if we want to support the RPi 4.
>
> So the way I see it and lacking a better solution, the argument is stronger on
> moving back arm64 to using ZONE_DMA. Any comments/opinions?

As it was suggested in this or the previous thread, I'm not keen on
limiting ZONE_DMA32 to the smallest RPi4 can cover, as the naming
implies this zone should cover 32-bit devices that can deal with a full
32-bit mask.

I think it may be better if we have both ZONE_DMA and ZONE_DMA32 on
arm64. ZONE_DMA would be based on the smallest dma-ranges as described
in the DT while DMA32 covers the first naturally aligned 4GB of RAM
(unchanged). When a smaller ZONE_DMA is not needed, it could be expanded
to cover what would normally be ZONE_DMA32 (or could we have ZONE_DMA as
0-bytes? I don't think GFP_DMA can still allocate memory in this case).

We'd probably have to define ARCH_ZONE_DMA_BITS for arm64 to something
smaller than 32-bit but sufficient to cover the known platforms like
RPi4 (the current 24 is too small, so maybe 30). AFAICT,
__dma_direct_optimal_gfp_mask() figures out whether GFP_DMA or GFP_DMA32
should be passed.
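The zone choice described in the message above can be sketched in plain C. This is only a rough userspace model, loosely following the idea that a device mask selects GFP_DMA, GFP_DMA32, or neither; `pick_zone()` and `enum zone_hint` are hypothetical names, and the 30-bit ZONE_DMA width is the value floated in the thread, not a settled one.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set, in the style of the kernel's DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-ins for "allocate with GFP_DMA / GFP_DMA32 / no hint". */
enum zone_hint { HINT_NORMAL, HINT_DMA32, HINT_DMA };

/*
 * Assumption: ZONE_DMA covers the low zone_dma_bits bits of the physical
 * address space and ZONE_DMA32 covers the low 32 bits.
 */
static enum zone_hint pick_zone(uint64_t dev_mask, unsigned int zone_dma_bits)
{
	assert(zone_dma_bits >= 1 && zone_dma_bits <= 32);

	if (dev_mask <= DMA_BIT_MASK(zone_dma_bits))
		return HINT_DMA;	/* device cannot reach past ZONE_DMA */
	if (dev_mask <= DMA_BIT_MASK(32))
		return HINT_DMA32;	/* device is limited to the low 4GB */
	return HINT_NORMAL;		/* no zone restriction needed */
}
```

With `zone_dma_bits` set to 30, a device limited to a 30-bit mask maps to `HINT_DMA`, a 32-bit device to `HINT_DMA32`, and a 64-bit device to `HINT_NORMAL`.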
On Wed, Jul 24, 2019 at 02:51:24PM +0100, Catalin Marinas wrote:
> I think it may be better if we have both ZONE_DMA and ZONE_DMA32 on
> arm64. ZONE_DMA would be based on the smallest dma-ranges as described
> in the DT while DMA32 covers the first naturally aligned 4GB of RAM
> (unchanged). When a smaller ZONE_DMA is not needed, it could be expanded
> to cover what would normally be ZONE_DMA32 (or could we have ZONE_DMA as
> 0-bytes? I don't think GFP_DMA can still allocate memory in this case).
>
> We'd probably have to define ARCH_ZONE_DMA_BITS for arm64 to something
> smaller than 32-bit but sufficient to cover the known platforms like
> RPi4 (the current 24 is too small, so maybe 30). AFAICT,
> __dma_direct_optimal_gfp_mask() figures out whether GFP_DMA or GFP_DMA32
> should be passed.

ARCH_ZONE_DMA_BITS should probably become a variable. That way we can
just initialize it to the default 24 bits in kernel/dma/direct.c and
allow architectures to override it in their early boot code.
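A minimal sketch of the "make it a variable" idea, as a userspace model rather than kernel code. The names `zone_dma_bits`, `arch_set_zone_dma_bits()` and `zone_dma_limit()` are assumptions made for illustration; the thread does not name them.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set, in the style of the kernel's DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical replacement for the ARCH_ZONE_DMA_BITS macro, default 24. */
static unsigned int zone_dma_bits = 24;

/* Hypothetical hook an architecture would call from its early boot code. */
static void arch_set_zone_dma_bits(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	zone_dma_bits = bits;
}

/* Highest physical address ZONE_DMA covers under this model. */
static uint64_t zone_dma_limit(void)
{
	return DMA_BIT_MASK(zone_dma_bits);
}
```

Under this model the default limit is 0xFFFFFF (16MB), and a platform such as the RPi4 would call `arch_set_zone_dma_bits(30)` to raise it to 0x3FFFFFFF (1GB).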
On Wed, 2019-07-24 at 15:56 +0200, Christoph Hellwig wrote:
> On Wed, Jul 24, 2019 at 02:51:24PM +0100, Catalin Marinas wrote:
> > I think it may be better if we have both ZONE_DMA and ZONE_DMA32 on
> > arm64. ZONE_DMA would be based on the smallest dma-ranges as described
> > in the DT while DMA32 covers the first naturally aligned 4GB of RAM
> > (unchanged). When a smaller ZONE_DMA is not needed, it could be expanded
> > to cover what would normally be ZONE_DMA32 (or could we have ZONE_DMA as
> > 0-bytes? I don't think GFP_DMA can still allocate memory in this case).
> >
> > We'd probably have to define ARCH_ZONE_DMA_BITS for arm64 to something
> > smaller than 32-bit but sufficient to cover the known platforms like
> > RPi4 (the current 24 is too small, so maybe 30). AFAICT,
> > __dma_direct_optimal_gfp_mask() figures out whether GFP_DMA or GFP_DMA32
> > should be passed.
>
> ARCH_ZONE_DMA_BITS should probably become a variable. That way we can
> just initialize it to the default 24 bits in kernel/dma/direct.c and
> allow architectures to override it in their early boot code.

Thanks both for your feedback. I'll start preparing a proper series.
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index b90e1aede743..3c8cd730648b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -23,6 +23,8 @@
 #define ARCH_ZONE_DMA_BITS 24
 #endif
 
+u64 dma_direct_min_mask __ro_after_init = DMA_BIT_MASK(32);
+
 /*
  * For AMD SEV all DMA must be to unencrypted addresses.
  */
@@ -393,7 +395,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
 	if (IS_ENABLED(CONFIG_ZONE_DMA))
 		min_mask = DMA_BIT_MASK(ARCH_ZONE_DMA_BITS);
 	else
-		min_mask = DMA_BIT_MASK(32);
+		min_mask = dma_direct_min_mask;
 
 	min_mask = min_t(u64, min_mask, (max_pfn - 1) << PAGE_SHIFT);
Historically devices with ZONE_DMA32 have been assumed to be able to
address at least the lower 4GB of ram for DMA. This is still the default
behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
This has been observed to trigger failures in dma_direct_supported() as
the 'min_mask' isn't properly set.

We create 'dma_direct_min_mask' in order for the arch init code to be
able to fine-tune dma direct's 'min_dma' mask.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---
 kernel/dma/direct.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
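
To illustrate the failure this patch targets, here is a simplified userspace model of the check in the !CONFIG_ZONE_DMA case. It is a sketch under stated assumptions, not the kernel function: `direct_supported()` is a hypothetical name, pages are assumed to be 4KB, and physical-to-DMA address translation is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, in the style of the kernel's DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
#define PAGE_SHIFT 12	/* assumption: 4KB pages */

/* Default from the patch; arch init code may lower it before first use. */
static uint64_t dma_direct_min_mask = DMA_BIT_MASK(32);

/*
 * A device is accepted if its mask covers the minimum mask, after that
 * minimum has been clamped to the address of the last page of RAM.
 */
static bool direct_supported(uint64_t dev_mask, uint64_t max_pfn)
{
	uint64_t min_mask = dma_direct_min_mask;
	uint64_t ram_top;

	assert(max_pfn > 0);
	ram_top = (max_pfn - 1) << PAGE_SHIFT;
	if (ram_top < min_mask)
		min_mask = ram_top;

	return dev_mask >= min_mask;
}
```

On a board with 4GB of RAM (`max_pfn` of `1 << 20` with 4KB pages), a device with a 30-bit mask is rejected while `dma_direct_min_mask` is left at its 32-bit default, and accepted once the arch code sets it to `DMA_BIT_MASK(30)`.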