diff mbox series

[v2] Revert "swiotlb: panic if nslabs is too small"

Message ID 20220831063818.3902572-1-yuzhao@google.com (mailing list archive)
State Handled Elsewhere
Headers show
Series [v2] Revert "swiotlb: panic if nslabs is too small" | expand

Commit Message

Yu Zhao Aug. 31, 2022, 6:38 a.m. UTC
This reverts commit 0bf28fc40d89b1a3e00d1b79473bad4e9ca20ad1.

Reasons:
  1. new panic()s shouldn't be added [1].
  2. It does no "cleanup" but breaks MIPS [2].

v2: properly solved the conflict [3] with
commit 20347fca71a38 ("swiotlb: split up the global swiotlb lock")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

[1] https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld+pnerc4J2Ag990WwAA@mail.gmail.com/
[2] https://lore.kernel.org/r/20220820012031.1285979-1-yuzhao@google.com/
[3] https://lore.kernel.org/r/202208310701.LKr1WDCh-lkp@intel.com/

Fixes: 0bf28fc40d89b ("swiotlb: panic if nslabs is too small")
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 kernel/dma/swiotlb.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)


base-commit: dcf8e5633e2e69ad60b730ab5905608b756a032f

Comments

Dongli Zhang Aug. 31, 2022, 10:20 p.m. UTC | #1
Hi Yu,

As we discussed in the past, the swiotlb panic on purpose because the
mips/cavium-octeon/dma-octeon.c requests to allocate only PAGE_SIZE swiotlb
buffer. This is smaller than IO_TLB_MIN_SLABS.

The below comments mentioned that IO_TLB_MIN_SLABS is the "Minimum IO TLB size
to bother booting with".

56 /*
57  * Minimum IO TLB size to bother booting with.  Systems with mainly
58  * 64bit capable cards will only lightly use the swiotlb.  If we can't
59  * allocate a contiguous 1MB, we're probably in trouble anyway.
60  */
61 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)


The arm may create swiotlb conditionally. That is, the swiotlb is not
initialized if (1) CONFIG_ARM_LPAE is not set (line 273), or (2) max_pfn <=
arm_dma_pfn_limit (line 274).

arch/arm/mm/init.c

271 void __init mem_init(void)
272 {
273 #ifdef CONFIG_ARM_LPAE
274         swiotlb_init(max_pfn > arm_dma_pfn_limit, SWIOTLB_VERBOSE);
275 #endif
276
277         set_max_mapnr(pfn_to_page(max_pfn) - mem_map);


On x86, the swiotlb is not initialized if the memory is small (> MAX_DMA32_PFN,
at line 47), or the secure memory is not required.

44 static void __init pci_swiotlb_detect(void)
45 {
46         /* don't initialize swiotlb if iommu=off (no_iommu=1) */
47         if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
48                 x86_swiotlb_enable = true;
49
50         /*
51          * Set swiotlb to 1 so that bounce buffers are allocated and used for
52          * devices that can't support DMA to encrypted memory.
53          */
54         if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
55                 x86_swiotlb_enable = true;
56
57         /*
58          * Guest with guest memory encryption currently perform all DMA through
59          * bounce buffers as the hypervisor can't access arbitrary VM memory
60          * that is not explicitly shared with it.
61          */
62         if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
63                 x86_swiotlb_enable = true;
64                 x86_swiotlb_flags |= SWIOTLB_FORCE;
65         }
66 }


Regardless whether the current patch will be reverted, unless there is specific
reason (e.g., those PAGE_SIZE will be used), I do not think it is a good idea to
allocate <IO_TLB_MIN_SLABS swiotlb buffer. I will wait for the suggestion from
the swiotlb maintainer.

Since I do not have a mips environment, I am not able to test if the below makes
any trouble in your situation at arch/mips/cavium-octeon/dma-octeon.c.

@@ -234,6 +234,8 @@ void __init plat_swiotlb_setup(void)
                swiotlbsize = 64 * (1<<20);
 #endif

-       swiotlb_adjust_size(swiotlbsize);
-       swiotlb_init(true, SWIOTLB_VERBOSE);
+       if (swiotlbsize != PAGE_SIZE) {
+               swiotlb_adjust_size(swiotlbsize);
+               swiotlb_init(true, SWIOTLB_VERBOSE);
+       }
 }


Thank you very much!

Dongli Zhang

On 8/30/22 11:38 PM, Yu Zhao wrote:
> This reverts commit 0bf28fc40d89b1a3e00d1b79473bad4e9ca20ad1.
> 
> Reasons:
>   1. new panic()s shouldn't be added [1].
>   2. It does no "cleanup" but breaks MIPS [2].
> 
> v2: properly solved the conflict [3] with
> commit 20347fca71a38 ("swiotlb: split up the global swiotlb lock")
> Reported-by: kernel test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> [1] https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@mail.gmail.com/__;Kw!!ACWV5N9M2RV99hQ!PPVATbHVDT6TZ4sqoj5G6vfAJGPAEz-Lmp9njTsM2PPYPQqCP6aq5RF8FDmrXDlSzxJmTUUSgOW3yjKDtg$  
> [2] https://urldefense.com/v3/__https://lore.kernel.org/r/20220820012031.1285979-1-yuzhao@google.com/__;!!ACWV5N9M2RV99hQ!PPVATbHVDT6TZ4sqoj5G6vfAJGPAEz-Lmp9njTsM2PPYPQqCP6aq5RF8FDmrXDlSzxJmTUUSgOXQRsYjKQ$  
> [3] https://urldefense.com/v3/__https://lore.kernel.org/r/202208310701.LKr1WDCh-lkp@intel.com/__;!!ACWV5N9M2RV99hQ!PPVATbHVDT6TZ4sqoj5G6vfAJGPAEz-Lmp9njTsM2PPYPQqCP6aq5RF8FDmrXDlSzxJmTUUSgOW_tjcVMA$  
> 
> Fixes: 0bf28fc40d89b ("swiotlb: panic if nslabs is too small")
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
>  kernel/dma/swiotlb.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index c5a9190b218f..dd8863987e0c 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -326,9 +326,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>  		swiotlb_adjust_nareas(num_possible_cpus());
>  
>  	nslabs = default_nslabs;
> -	if (nslabs < IO_TLB_MIN_SLABS)
> -		panic("%s: nslabs = %lu too small\n", __func__, nslabs);
> -
>  	/*
>  	 * By default allocate the bounce buffer memory from low memory, but
>  	 * allow to pick a location everywhere for hypervisors with guest
> @@ -341,8 +338,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>  	else
>  		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
>  	if (!tlb) {
> -		pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
> -			__func__, bytes);
> +		pr_warn("%s: failed to allocate tlb structure\n", __func__);
>  		return;
>  	}
>  
> 
> base-commit: dcf8e5633e2e69ad60b730ab5905608b756a032f
>
Yu Zhao Sept. 1, 2022, 12:24 a.m. UTC | #2
On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>
> Hi Yu,
>
> As we discussed in the past, the swiotlb panic on purpose

We should not panic() at all, especially on a platform that has been
working well since at least 4.14.

Did you check out this link I previously shared with you?
https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld+pnerc4J2Ag990WwAA@mail.gmail.com/

> because the
> mips/cavium-octeon/dma-octeon.c requests to allocate only PAGE_SIZE swiotlb
> buffer.

What's wrong with PAGE_SIZE swiotlb?

> This is smaller than IO_TLB_MIN_SLABS.

IO_TLB_MIN_SLABS is an arbitrary number, and it's inherited from IA64.
So does the comment below. What exactly is the rationale behind it?

> The below comments mentioned that IO_TLB_MIN_SLABS is the "Minimum IO TLB size
> to bother booting with".
>
> 56 /*
> 57  * Minimum IO TLB size to bother booting with.  Systems with mainly
> 58  * 64bit capable cards will only lightly use the swiotlb.  If we can't
> 59  * allocate a contiguous 1MB, we're probably in trouble anyway.
> 60  */
> 61 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>
>
> The arm may create swiotlb conditionally. That is, the swiotlb is not
> initialized if (1) CONFIG_ARM_LPAE is not set (line 273), or (2) max_pfn <=
> arm_dma_pfn_limit (line 274).
>
> arch/arm/mm/init.c
>
> 271 void __init mem_init(void)
> 272 {
> 273 #ifdef CONFIG_ARM_LPAE
> 274         swiotlb_init(max_pfn > arm_dma_pfn_limit, SWIOTLB_VERBOSE);
> 275 #endif
> 276
> 277         set_max_mapnr(pfn_to_page(max_pfn) - mem_map);
>
>
> On x86, the swiotlb is not initialized if the memory is small (> MAX_DMA32_PFN,
> at line 47), or the secure memory is not required.
>
> 44 static void __init pci_swiotlb_detect(void)
> 45 {
> 46         /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> 47         if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
> 48                 x86_swiotlb_enable = true;
> 49
> 50         /*
> 51          * Set swiotlb to 1 so that bounce buffers are allocated and used for
> 52          * devices that can't support DMA to encrypted memory.
> 53          */
> 54         if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
> 55                 x86_swiotlb_enable = true;
> 56
> 57         /*
> 58          * Guest with guest memory encryption currently perform all DMA through
> 59          * bounce buffers as the hypervisor can't access arbitrary VM memory
> 60          * that is not explicitly shared with it.
> 61          */
> 62         if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
> 63                 x86_swiotlb_enable = true;
> 64                 x86_swiotlb_flags |= SWIOTLB_FORCE;
> 65         }
> 66 }

Thanks for the code snippets. But I failed to see a point.

> Regardless whether the current patch will be reverted, unless there is specific
> reason (e.g., those PAGE_SIZE will be used), I do not think it is a good idea to
> allocate <IO_TLB_MIN_SLABS swiotlb buffer.

For what specific reason? My current understanding is that you want to
be future/fool-proof. If so, then you got the priority wrong. We need
to make sure currently working systems continue to work first, then
future/fool-proof.

> I will wait for the suggestion from
> the swiotlb maintainer.

Chris is on vacation. I sure can wait.

But it sounds like you are unsure about what to do. If so, it's not
what you claimed "we have already understood everything related to
swiotlb" previously.

> Since I do not have a mips environment, I am not able to test if the below makes
> any trouble in your situation at arch/mips/cavium-octeon/dma-octeon.c.
>
> @@ -234,6 +234,8 @@ void __init plat_swiotlb_setup(void)
>                 swiotlbsize = 64 * (1<<20);
>  #endif
>
> -       swiotlb_adjust_size(swiotlbsize);
> -       swiotlb_init(true, SWIOTLB_VERBOSE);
> +       if (swiotlbsize != PAGE_SIZE) {
> +               swiotlb_adjust_size(swiotlbsize);
> +               swiotlb_init(true, SWIOTLB_VERBOSE);
> +       }
>  }

Please ask the MIPS/Octeon maintainers to check this first.
Dongli Zhang Sept. 1, 2022, 2:18 a.m. UTC | #3
Hi Yu,

On 8/31/22 5:24 PM, Yu Zhao wrote:
> On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>
>> Hi Yu,
>>
>> As we discussed in the past, the swiotlb panic on purpose
> 
> We should not panic() at all, especially on a platform that has been
> working well since at least 4.14.
> 
> Did you check out this link I previously shared with you?
> https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@mail.gmail.com/__;Kw!!ACWV5N9M2RV99hQ!PXzSLurBv7VqxI1451TV4zO3_-BYj4grk-HYBsXzSnA6nZcXaBzdsQ-rF2DAqlICSRPMt-efYv_Uu2A2CQ$  

Thanks for sharing! I used to read that in the past. To be honest I am still
confused on when to use BUG/panic and when to not, as I still see many usage in
some patches.

Just about swiotlb, it may panic in many cases during boot, e.g.,:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1955655

> 
>> because the
>> mips/cavium-octeon/dma-octeon.c requests to allocate only PAGE_SIZE swiotlb
>> buffer.
> 
> What's wrong with PAGE_SIZE swiotlb?

The swiotlb is like a software IOMMU. When the device supports only <=32-bit DMA
while the IOMMU is not available, it may do the below for each DMA operation:

1. The swiotlb inits and pre-allocates many 32-bit memory during boot.
2. For a DMA read/write, if the page is 64-bit but 32-bit DMA is required, the
kernel swaps the 64-bit DMA page with the pre-allocated swiotlb 32-bit memory.
3. After the DMA is complete, the 32-bit memory is swapped back to swiotlb.

The size of pre-allocated swiotlb is fixed during boot. The DMA will fail later
if the pre-allocated swiotlb is not enough.

That's why I think the objective to set swiotlb=PAGE_SIZE is just to minimize
the swiotlb pre-allocation when it is not required. I do not see a reason to
pre-allocate a PAGE_SIZE for DMA.

There was a discussion between Robin and Christoph whether it is a good idea to
disable swiotlb by allocating a small chunk of memory.

https://lore.kernel.org/all/d5016c1e-55d9-4224-278a-50377d4c6454@arm.com/

> 
>> This is smaller than IO_TLB_MIN_SLABS.
> 
> IO_TLB_MIN_SLABS is an arbitrary number, and it's inherited from IA64.
> So does the comment below. What exactly is the rationale behind it?

To be honest I do not have an answer as different people may have different
preferences. Some people may prefer to increase IO_TLB_MIN_SLABS, while some may
prefer to decrease.

That value of IO_TLB_MIN_SLABS was there since I started working on swiotlb.

> 
>> The below comments mentioned that IO_TLB_MIN_SLABS is the "Minimum IO TLB size
>> to bother booting with".
>>
>> 56 /*
>> 57  * Minimum IO TLB size to bother booting with.  Systems with mainly
>> 58  * 64bit capable cards will only lightly use the swiotlb.  If we can't
>> 59  * allocate a contiguous 1MB, we're probably in trouble anyway.
>> 60  */
>> 61 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>>
>>
>> The arm may create swiotlb conditionally. That is, the swiotlb is not
>> initialized if (1) CONFIG_ARM_LPAE is not set (line 273), or (2) max_pfn <=
>> arm_dma_pfn_limit (line 274).
>>
>> arch/arm/mm/init.c
>>
>> 271 void __init mem_init(void)
>> 272 {
>> 273 #ifdef CONFIG_ARM_LPAE
>> 274         swiotlb_init(max_pfn > arm_dma_pfn_limit, SWIOTLB_VERBOSE);
>> 275 #endif
>> 276
>> 277         set_max_mapnr(pfn_to_page(max_pfn) - mem_map);
>>
>>
>> On x86, the swiotlb is not initialized if the memory is small (> MAX_DMA32_PFN,
>> at line 47), or the secure memory is not required.
>>
>> 44 static void __init pci_swiotlb_detect(void)
>> 45 {
>> 46         /* don't initialize swiotlb if iommu=off (no_iommu=1) */
>> 47         if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
>> 48                 x86_swiotlb_enable = true;
>> 49
>> 50         /*
>> 51          * Set swiotlb to 1 so that bounce buffers are allocated and used for
>> 52          * devices that can't support DMA to encrypted memory.
>> 53          */
>> 54         if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
>> 55                 x86_swiotlb_enable = true;
>> 56
>> 57         /*
>> 58          * Guest with guest memory encryption currently perform all DMA through
>> 59          * bounce buffers as the hypervisor can't access arbitrary VM memory
>> 60          * that is not explicitly shared with it.
>> 61          */
>> 62         if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
>> 63                 x86_swiotlb_enable = true;
>> 64                 x86_swiotlb_flags |= SWIOTLB_FORCE;
>> 65         }
>> 66 }
> 
> Thanks for the code snippets. But I failed to see a point.
> 
>> Regardless whether the current patch will be reverted, unless there is specific
>> reason (e.g., those PAGE_SIZE will be used), I do not think it is a good idea to
>> allocate <IO_TLB_MIN_SLABS swiotlb buffer.
> 
> For what specific reason? My current understanding is that you want to
> be future/fool-proof. If so, then you got the priority wrong. We need
> to make sure currently working systems continue to work first, then
> future/fool-proof.

My objective was not to prove which option was correct/incorrect. I was just
going to share my understanding of the situation to people on this list, so that
a discussion can continue.

But I do agree with your point on 'priority'.

> 
>> I will wait for the suggestion from
>> the swiotlb maintainer.
> 
> Chris is on vacation. I sure can wait.
> 
> But it sounds like you are unsure about what to do. If so, it's not
> what you claimed "we have already understood everything related to
> swiotlb" previously.

The swiotlb is a component used by many arch (e.g., x86, arm/arm64, mips and
xen) ... while I believe I understand the swiotlb well, I do not understand what
other components are expecting from the swiotlb.

That's why my objective was to trigger the discussion but not to prove my
statement was correct.

The swiotlb is still evolving due to it is more widely used recently (e.g., AMD
SEV). The discussion helps people avoid introducing more issue because there are
many components that may use swiotlb now or in the future.

About what to do, I think it is to either (1) revert the patch or (2) avoid
initializing swiotlb if it is not going to be used. However, I think my thoughts
do not count that much, and that's why I would wait for the suggestion from more
experienced maintainers.

Thank you very much!

Dongli Zhang

> 
>> Since I do not have a mips environment, I am not able to test if the below makes
>> any trouble in your situation at arch/mips/cavium-octeon/dma-octeon.c.
>>
>> @@ -234,6 +234,8 @@ void __init plat_swiotlb_setup(void)
>>                 swiotlbsize = 64 * (1<<20);
>>  #endif
>>
>> -       swiotlb_adjust_size(swiotlbsize);
>> -       swiotlb_init(true, SWIOTLB_VERBOSE);
>> +       if (swiotlbsize != PAGE_SIZE) {
>> +               swiotlb_adjust_size(swiotlbsize);
>> +               swiotlb_init(true, SWIOTLB_VERBOSE);
>> +       }
>>  }
> 
> Please ask the MIPS/Octeon maintainers to check this first.
>
Robin Murphy Sept. 1, 2022, 12:24 p.m. UTC | #4
On 2022-09-01 03:18, Dongli Zhang wrote:
> Hi Yu,
> 
> On 8/31/22 5:24 PM, Yu Zhao wrote:
>> On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>>
>>> Hi Yu,
>>>
>>> As we discussed in the past, the swiotlb panic on purpose
>>
>> We should not panic() at all, especially on a platform that has been
>> working well since at least 4.14.
>>
>> Did you check out this link I previously shared with you?
>> https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@mail.gmail.com/__;Kw!!ACWV5N9M2RV99hQ!PXzSLurBv7VqxI1451TV4zO3_-BYj4grk-HYBsXzSnA6nZcXaBzdsQ-rF2DAqlICSRPMt-efYv_Uu2A2CQ$
> 
> Thanks for sharing! I used to read that in the past. To be honest I am still
> confused on when to use BUG/panic and when to not, as I still see many usage in
> some patches.
> 
> Just about swiotlb, it may panic in many cases during boot, e.g.,:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1955655

That's really a different thing, but frankly that panic is also bogus 
anyway - there is no good reason to have different behaviour for failing 
to allocate a buffer slot because the buffer is full vs. failing to 
allocate a buffer slot because there is no buffer. If we can fail 
gracefully some of the time we should fail gracefully all of the time. 
Yes, there's a slight difference in that one case has a chance of 
succeeding if retried in future while the other definitely doesn't, but 
it's not SWIOTLB's place to decide that the entire system is terminally 
unusable just because some device can't make a DMA mapping.

Similarly for the other panics at init time, which seem to have 
originated from a mechanical refactoring of the memblock API with the 
expectation of preserving the existing behaviour at the time. Those have 
then just been moved around without anyone thinking to question why 
they're there, and if memblock *does* now return usable error 
information, why don't we start handling that error properly like we do 
in other init paths?

Really there is no reason to panic anywhere in SWIOTLB. This is old 
code, things have moved on over the last 20 years, and we can and should 
do much better. I'll add this to my list of things to look at for 
cleanup once I find a bit of free time, unless anyone else fancies 
taking it on.

Thanks,
Robin.
Dongli Zhang Sept. 1, 2022, 4:46 p.m. UTC | #5
Hi Robin,

On 9/1/22 5:24 AM, Robin Murphy wrote:
> On 2022-09-01 03:18, Dongli Zhang wrote:
>> Hi Yu,
>>
>> On 8/31/22 5:24 PM, Yu Zhao wrote:
>>> On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@oracle.com> wrote:
>>>>
>>>> Hi Yu,
>>>>
>>>> As we discussed in the past, the swiotlb panic on purpose
>>>
>>> We should not panic() at all, especially on a platform that has been
>>> working well since at least 4.14.
>>>
>>> Did you check out this link I previously shared with you?
>>> https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@mail.gmail.com/__;Kw!!ACWV5N9M2RV99hQ!PXzSLurBv7VqxI1451TV4zO3_-BYj4grk-HYBsXzSnA6nZcXaBzdsQ-rF2DAqlICSRPMt-efYv_Uu2A2CQ$
>>>
>>
>> Thanks for sharing! I used to read that in the past. To be honest I am still
>> confused on when to use BUG/panic and when to not, as I still see many usage in
>> some patches.
>>
>> Just about swiotlb, it may panic in many cases during boot, e.g.,:
>>
>> https://urldefense.com/v3/__https://bugs.launchpad.net/ubuntu/*source/linux/*bug/1955655__;Kys!!ACWV5N9M2RV99hQ!MO4S9r0PisrW6Z-kZqUitAMbuGNMX6aceQd5VApqXllP6f1jaPCyL9x50um7chsn2uGL_pNqdBzTjZK5GebeKrI$ 
> 
> 
> That's really a different thing, but frankly that panic is also bogus anyway -
> there is no good reason to have different behaviour for failing to allocate a
> buffer slot because the buffer is full vs. failing to allocate a buffer slot
> because there is no buffer. If we can fail gracefully some of the time we should
> fail gracefully all of the time. Yes, there's a slight difference in that one

Thank you very much for the feedback!

Currently the swiotlb remap logic is more gracefully to handle failure, to
re-try with smaller nslabs (line 352), but will still panic at the end if the
value is reduced to smaller than IO_TLB_MIN_SLABS.

 337 retry:
 338         bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
 339         if (flags & SWIOTLB_ANY)
 340                 tlb = memblock_alloc(bytes, PAGE_SIZE);
 341         else
 342                 tlb = memblock_alloc_low(bytes, PAGE_SIZE);
 343         if (!tlb) {
 344                 pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
 345                         __func__, bytes);
 346                 return;
 347         }
 348
 349         if (remap && remap(tlb, nslabs) < 0) {
 350                 memblock_free(tlb, PAGE_ALIGN(bytes));
 351
 352                 nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
 353                 if (nslabs < IO_TLB_MIN_SLABS)
 354                         panic("%s: Failed to remap %zu bytes\n",
 355                               __func__, bytes);
 356                 goto retry;
 357         }

> case has a chance of succeeding if retried in future while the other definitely
> doesn't, but it's not SWIOTLB's place to decide that the entire system is
> terminally unusable just because some device can't make a DMA mapping.
Currently the swiotlb panic at line 735 on purpose if swiotlb initialization is
failed, but when any DMA requires a swiotlb mapped slot.

 724 phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 725                 size_t mapping_size, size_t alloc_size,
 726                 unsigned int alloc_align_mask, enum dma_data_direction dir,
 727                 unsigned long attrs)
 728 {
 729         struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 730         unsigned int offset = swiotlb_align_offset(dev, orig_addr);
 731         unsigned int i;
 732         int index;
 733         phys_addr_t tlb_addr;
 734
 735         if (!mem || !mem->nslabs)
 736                 panic("Can not allocate SWIOTLB buffer earlier and can't
now provide you with the DMA bounce buffer");

I agree it is a very interesting point whether it is a good idea to leave the
decision to swiotlb for a panic. One thing to take care is the driver should
gracefully handle the error returned by swiotlb. Otherwise, there may be IO hang
(e.g., hung_task) instead of an IO failure.

> 
> Similarly for the other panics at init time, which seem to have originated from
> a mechanical refactoring of the memblock API with the expectation of preserving
> the existing behaviour at the time. Those have then just been moved around
> without anyone thinking to question why they're there, and if memblock *does*
> now return usable error information, why don't we start handling that error
> properly like we do in other init paths?
> 
> Really there is no reason to panic anywhere in SWIOTLB. This is old code, things
> have moved on over the last 20 years, and we can and should do much better. I'll
> add this to my list of things to look at for cleanup once I find a bit of free
> time, unless anyone else fancies taking it on.

Thank you very much for the feedback. Looking forward to how this can be improved!

Dongli Zhang
Christoph Hellwig Sept. 7, 2022, 8:39 a.m. UTC | #6
Thanks, applied.
diff mbox series

Patch

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index c5a9190b218f..dd8863987e0c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -326,9 +326,6 @@  void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 		swiotlb_adjust_nareas(num_possible_cpus());
 
 	nslabs = default_nslabs;
-	if (nslabs < IO_TLB_MIN_SLABS)
-		panic("%s: nslabs = %lu too small\n", __func__, nslabs);
-
 	/*
 	 * By default allocate the bounce buffer memory from low memory, but
 	 * allow to pick a location everywhere for hypervisors with guest
@@ -341,8 +338,7 @@  void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 	else
 		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
 	if (!tlb) {
-		pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
-			__func__, bytes);
+		pr_warn("%s: failed to allocate tlb structure\n", __func__);
 		return;
 	}