| Field | Value |
|---|---|
| Message ID | 20221031081327.47089-1-aik@amd.com (mailing list archive) |
| State | New, archived |
| Series | [kernel,v2] swiotlb: Half the size if allocation failed |
On 10/31/2022 9:13 AM, Alexey Kardashevskiy wrote:
> At the moment the AMD encrypted platform reserves 6% of RAM for SWIOTLB
> or 1GB, whichever is less. However it is possible that there is no block
> big enough in the low memory, which makes SWIOTLB allocation fail and
> the kernel continues without DMA. In such a case a VM hangs on DMA.
>
> This moves alloc+remap to a helper and calls it from a loop where
> the size is halved on each iteration.
>
> This updates default_nslabs on successful allocation, which looks like
> an oversight fix as not updating it should have broken callers of
> swiotlb_size_or_default().
>
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> --
> Changes:
> v2:
> * moved alloc+remap to a helper as suggested
> * removed "x86" and "amd" from subj
>
> --
> I hit the problem with QEMU's "-m 16819M" where SWIOTLB was adjusted to
> 0x7e200 slabs == 1,058,013,184 bytes (slightly less than 1GB) while
> 0x7e180 still worked.
>
> With guest errors enabled, there are many unassigned accesses from
> virtio.
> ---
>  kernel/dma/swiotlb.c | 66 +++++++++++++++++++++++++++-----------------
>  1 file changed, 41 insertions(+), 25 deletions(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 339a990554e7..53fc6e7d3aa5 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -300,6 +300,36 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
>  	return;
>  }
> 
> +static void *swiotlb_memblock_alloc(unsigned long nslabs, unsigned int flags,
> +		int (*remap)(void *tlb, unsigned long nslabs))
> +{
> +	size_t bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
> +	void *tlb;
> +
> +	/*
> +	 * By default allocate the bounce buffer memory from low memory, but
> +	 * allow to pick a location everywhere for hypervisors with guest
> +	 * memory encryption.
> +	 */
> +	if (flags & SWIOTLB_ANY)
> +		tlb = memblock_alloc(bytes, PAGE_SIZE);
> +	else
> +		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
> +
> +	if (!tlb) {
> +		pr_warn("%s: Failed to allocate %zu bytes tlb structure\n", __func__, bytes);
> +		return NULL;
> +	}
> +
> +	if (remap && remap(tlb, nslabs) < 0) {
> +		memblock_free(tlb, PAGE_ALIGN(bytes));
> +		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
> +		return NULL;
> +	}
> +
> +	return tlb;
> +}
> +
>  /*
>   * Statically reserve bounce buffer space and initialize bounce buffer data
>   * structures for the software IO TLB used to implement the DMA API.
> @@ -310,7 +340,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>  	struct io_tlb_mem *mem = &io_tlb_default_mem;
>  	unsigned long nslabs;
>  	size_t alloc_size;
> -	size_t bytes;
>  	void *tlb;
> 
>  	if (!addressing_limit && !swiotlb_force_bounce)
> @@ -325,32 +354,19 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>  	if (!default_nareas)
>  		swiotlb_adjust_nareas(num_possible_cpus());
> 
> -	nslabs = default_nslabs;
> -	/*
> -	 * By default allocate the bounce buffer memory from low memory, but
> -	 * allow to pick a location everywhere for hypervisors with guest
> -	 * memory encryption.
> -	 */
> -retry:
> -	bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
> -	if (flags & SWIOTLB_ANY)
> -		tlb = memblock_alloc(bytes, PAGE_SIZE);
> -	else
> -		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
> -	if (!tlb) {
> -		pr_warn("%s: failed to allocate tlb structure\n", __func__);
> -		return;
> -	}
> -
> -	if (remap && remap(tlb, nslabs) < 0) {
> -		memblock_free(tlb, PAGE_ALIGN(bytes));
> -
> +	for (nslabs = default_nslabs;; ) {
> +		tlb = swiotlb_memblock_alloc(nslabs, flags, remap);
> +		if (tlb)
> +			break;
> +		if (nslabs <= IO_TLB_MIN_SLABS)
> +			return;
>  		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
> -		if (nslabs >= IO_TLB_MIN_SLABS)
> -			goto retry;
> +	}
> 
> -		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
> -		return;
> +	if (default_nslabs != nslabs) {
> +		pr_info("SWIOTLB bounce buffer size adjusted %lu -> %lu slabs",
> +			default_nslabs, nslabs);
> +		default_nslabs = nslabs;
>  	}
> 
>  	alloc_size = PAGE_ALIGN(array_size(sizeof(*mem->slots), nslabs));

With the memblock contiguous allocation under 4G falling back to a
lower-order allocation, this seems to fix the inconsistent state issue
when the buffer is not allocated at all. Feel free to add:

Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
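To make the resizing behaviour concrete, here is a minimal userspace
sketch of the halving logic above -- not the kernel code itself. The
constants mirror IO_TLB_SHIFT (11), IO_TLB_SEGSIZE (128) and
IO_TLB_MIN_SLABS (1MB worth of slabs) as defined in the kernel tree of
this era, and ALIGN() is re-implemented here with its usual
power-of-two round-up semantics:

#include <stdio.h>

#define IO_TLB_SHIFT		11
#define IO_TLB_SEGSIZE		128
#define IO_TLB_MIN_SLABS	((1UL << 20) >> IO_TLB_SHIFT)

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((unsigned long)(a) - 1))

int main(void)
{
	unsigned long nslabs = 0x7e200;	/* the failing size from the report */

	for (;;) {
		printf("trying %#7lx slabs = %10lu bytes\n",
		       nslabs, nslabs << IO_TLB_SHIFT);
		/* Pretend every swiotlb_memblock_alloc() fails ... */
		if (nslabs <= IO_TLB_MIN_SLABS) {
			puts("... at 1MB the kernel gives up");
			break;
		}
		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
	}
	return 0;
}

Starting from the reported 0x7e200 slabs the sequence runs 0x7e200,
0x3f100, 0x1f880, ... down to 512 slabs (1MB), at which point the loop
returns without a buffer.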
Thanks. I've applied this with minor edits (see below).

---
From 8d58aa484920c4f9be4834a7aeb446cdced21a37 Mon Sep 17 00:00:00 2001
From: Alexey Kardashevskiy <aik@amd.com>
Date: Mon, 31 Oct 2022 19:13:27 +1100
Subject: swiotlb: reduce the swiotlb buffer size on allocation failure

At the moment the AMD encrypted platform reserves 6% of RAM for SWIOTLB
or 1GB, whichever is less. However it is possible that there is no block
big enough in the low memory, which makes SWIOTLB allocation fail and
the kernel continues without DMA. In such a case a VM hangs on DMA.

This moves alloc+remap to a helper and calls it from a loop where
the size is halved on each iteration.

This updates default_nslabs on successful allocation, which looks like
an oversight fix as not updating it should have broken callers of
swiotlb_size_or_default().

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/swiotlb.c | 63 +++++++++++++++++++++++++++-----------------
 1 file changed, 39 insertions(+), 24 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 339a990554e7f..a34c38bbe28f1 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -300,6 +300,37 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 	return;
 }
 
+static void *swiotlb_memblock_alloc(unsigned long nslabs, unsigned int flags,
+		int (*remap)(void *tlb, unsigned long nslabs))
+{
+	size_t bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
+	void *tlb;
+
+	/*
+	 * By default allocate the bounce buffer memory from low memory, but
+	 * allow to pick a location everywhere for hypervisors with guest
+	 * memory encryption.
+	 */
+	if (flags & SWIOTLB_ANY)
+		tlb = memblock_alloc(bytes, PAGE_SIZE);
+	else
+		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
+
+	if (!tlb) {
+		pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
+			__func__, bytes);
+		return NULL;
+	}
+
+	if (remap && remap(tlb, nslabs) < 0) {
+		memblock_free(tlb, PAGE_ALIGN(bytes));
+		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
+		return NULL;
+	}
+
+	return tlb;
+}
+
 /*
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
@@ -310,7 +341,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 	struct io_tlb_mem *mem = &io_tlb_default_mem;
 	unsigned long nslabs;
 	size_t alloc_size;
-	size_t bytes;
 	void *tlb;
 
 	if (!addressing_limit && !swiotlb_force_bounce)
@@ -326,31 +356,16 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 		swiotlb_adjust_nareas(num_possible_cpus());
 
 	nslabs = default_nslabs;
-	/*
-	 * By default allocate the bounce buffer memory from low memory, but
-	 * allow to pick a location everywhere for hypervisors with guest
-	 * memory encryption.
-	 */
-retry:
-	bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
-	if (flags & SWIOTLB_ANY)
-		tlb = memblock_alloc(bytes, PAGE_SIZE);
-	else
-		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
-	if (!tlb) {
-		pr_warn("%s: failed to allocate tlb structure\n", __func__);
-		return;
-	}
-
-	if (remap && remap(tlb, nslabs) < 0) {
-		memblock_free(tlb, PAGE_ALIGN(bytes));
-
+	while ((tlb = swiotlb_memblock_alloc(nslabs, flags, remap)) == NULL) {
+		if (nslabs <= IO_TLB_MIN_SLABS)
+			return;
 		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
-		if (nslabs >= IO_TLB_MIN_SLABS)
-			goto retry;
+	}
 
-		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
-		return;
+	if (default_nslabs != nslabs) {
+		pr_info("SWIOTLB bounce buffer size adjusted %lu -> %lu slabs",
+			default_nslabs, nslabs);
+		default_nslabs = nslabs;
 	}
 
 	alloc_size = PAGE_ALIGN(array_size(sizeof(*mem->slots), nslabs));
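One detail worth spelling out: swiotlb_size_or_default() derives its
answer from default_nslabs alone (in trees of this era it is essentially
"return default_nslabs << IO_TLB_SHIFT;"), which is why the loop above
must write the shrunken value back. A toy userspace illustration of the
mismatch that the write-back prevents (values are hypothetical, picked
to match the report):

#include <stdio.h>

#define IO_TLB_SHIFT 11

static unsigned long default_nslabs = 0x7e200;	/* requested: ~1GB */

/* Mimics swiotlb_size_or_default(): the size callers get to see is
 * computed from default_nslabs, not from what was really allocated. */
static unsigned long size_or_default(void)
{
	return default_nslabs << IO_TLB_SHIFT;
}

int main(void)
{
	unsigned long got = 0x3f100;	/* suppose one halving succeeded */

	printf("callers would see  %lu bytes\n", size_or_default());
	printf("actually allocated %lu bytes\n", got << IO_TLB_SHIFT);

	/* The write-back in the patch keeps the two consistent: */
	default_nslabs = got;
	printf("after write-back:  %lu bytes\n", size_or_default());
	return 0;
}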
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 339a990554e7..53fc6e7d3aa5 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -300,6 +300,36 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
 	return;
 }
 
+static void *swiotlb_memblock_alloc(unsigned long nslabs, unsigned int flags,
+		int (*remap)(void *tlb, unsigned long nslabs))
+{
+	size_t bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
+	void *tlb;
+
+	/*
+	 * By default allocate the bounce buffer memory from low memory, but
+	 * allow to pick a location everywhere for hypervisors with guest
+	 * memory encryption.
+	 */
+	if (flags & SWIOTLB_ANY)
+		tlb = memblock_alloc(bytes, PAGE_SIZE);
+	else
+		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
+
+	if (!tlb) {
+		pr_warn("%s: Failed to allocate %zu bytes tlb structure\n", __func__, bytes);
+		return NULL;
+	}
+
+	if (remap && remap(tlb, nslabs) < 0) {
+		memblock_free(tlb, PAGE_ALIGN(bytes));
+		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
+		return NULL;
+	}
+
+	return tlb;
+}
+
 /*
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
@@ -310,7 +340,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 	struct io_tlb_mem *mem = &io_tlb_default_mem;
 	unsigned long nslabs;
 	size_t alloc_size;
-	size_t bytes;
 	void *tlb;
 
 	if (!addressing_limit && !swiotlb_force_bounce)
@@ -325,32 +354,19 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 	if (!default_nareas)
 		swiotlb_adjust_nareas(num_possible_cpus());
 
-	nslabs = default_nslabs;
-	/*
-	 * By default allocate the bounce buffer memory from low memory, but
-	 * allow to pick a location everywhere for hypervisors with guest
-	 * memory encryption.
-	 */
-retry:
-	bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
-	if (flags & SWIOTLB_ANY)
-		tlb = memblock_alloc(bytes, PAGE_SIZE);
-	else
-		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
-	if (!tlb) {
-		pr_warn("%s: failed to allocate tlb structure\n", __func__);
-		return;
-	}
-
-	if (remap && remap(tlb, nslabs) < 0) {
-		memblock_free(tlb, PAGE_ALIGN(bytes));
-
+	for (nslabs = default_nslabs;; ) {
+		tlb = swiotlb_memblock_alloc(nslabs, flags, remap);
+		if (tlb)
+			break;
+		if (nslabs <= IO_TLB_MIN_SLABS)
+			return;
 		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
-		if (nslabs >= IO_TLB_MIN_SLABS)
-			goto retry;
+	}
 
-		pr_warn("%s: Failed to remap %zu bytes\n", __func__, bytes);
-		return;
+	if (default_nslabs != nslabs) {
+		pr_info("SWIOTLB bounce buffer size adjusted %lu -> %lu slabs",
+			default_nslabs, nslabs);
+		default_nslabs = nslabs;
 	}
 
 	alloc_size = PAGE_ALIGN(array_size(sizeof(*mem->slots), nslabs));
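Note that the halving step in the hunk above rounds the new size up to
IO_TLB_SEGSIZE (128 slabs) rather than down, so the buffer always stays
divisible into whole segments. A two-line check of that round-up
behaviour, with ALIGN() re-implemented here with its usual power-of-two
semantics:

#include <stdio.h>

/* Same semantics as the kernel's ALIGN(): round x up to a multiple of
 * a, where a is a power of two. */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((unsigned long)(a) - 1))

int main(void)
{
	printf("%lu\n", ALIGN(8072UL, 128));	/* 16144 >> 1; prints 8192 */
	printf("%lu\n", ALIGN(8192UL, 128));	/* already aligned: 8192 */
	return 0;
}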
At the moment the AMD encrypted platform reserves 6% of RAM for SWIOTLB
or 1GB, whichever is less. However it is possible that there is no block
big enough in the low memory, which makes SWIOTLB allocation fail and
the kernel continues without DMA. In such a case a VM hangs on DMA.

This moves alloc+remap to a helper and calls it from a loop where
the size is halved on each iteration.

This updates default_nslabs on successful allocation, which looks like
an oversight fix as not updating it should have broken callers of
swiotlb_size_or_default().

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
--
Changes:
v2:
* moved alloc+remap to a helper as suggested
* removed "x86" and "amd" from subj

--
I hit the problem with QEMU's "-m 16819M" where SWIOTLB was adjusted to
0x7e200 slabs == 1,058,013,184 bytes (slightly less than 1GB) while
0x7e180 still worked.

With guest errors enabled, there are many unassigned accesses from
virtio.
---
 kernel/dma/swiotlb.c | 66 +++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 25 deletions(-)
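For reference, the numbers in the report line up: each slab is
1 << IO_TLB_SHIFT = 2KB (the shift value is taken from the kernel tree
of this era), so the quick userspace check below confirms that 0x7e200
slabs is just short of 1GiB:

#include <stdio.h>

#define IO_TLB_SHIFT 11	/* 2KB per slab, as in the kernel */

int main(void)
{
	unsigned long nslabs = 0x7e200;	/* 516608 slabs */

	/* prints: 0x7e200 slabs = 1058013184 bytes (~0.985 GiB) */
	printf("%#lx slabs = %lu bytes (~%.3f GiB)\n", nslabs,
	       nslabs << IO_TLB_SHIFT,
	       (double)(nslabs << IO_TLB_SHIFT) / (1UL << 30));
	return 0;
}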