[2/2] iommu/dma: Fix zero'ing of bounce buffer padding used by untrusted devices

Message ID 20240408041142.665563-2-mhklinux@outlook.com (mailing list archive)
State Accepted
Commit 2650073f1b5858008c32712f3d9e1e808ce7e967
Series [1/2] swiotlb: Remove alloc_size argument to swiotlb_tbl_map_single()

Commit Message

Michael Kelley April 8, 2024, 4:11 a.m. UTC
From: Michael Kelley <mhklinux@outlook.com>

iommu_dma_map_page() allocates swiotlb memory as a bounce buffer when
an untrusted device wants to map only part of the memory in a
granule. The goal is to disallow the untrusted device having
DMA access to unrelated kernel data that may be sharing the granule.
To meet this goal, the bounce buffer itself is zero'ed, and any
additional swiotlb memory up to alloc_size after the bounce buffer
end (i.e., "post-padding") is also zero'ed.

However, as of commit 901c7280ca0d ("Reinstate some of "swiotlb: rework
"fix info leak with DMA_FROM_DEVICE"""), swiotlb_tbl_map_single()
always initializes the contents of the bounce buffer to the original
memory. Zero'ing the bounce buffer is redundant and probably wrong per
the discussion in that commit. Only the post-padding needs to be
zero'ed.
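
To illustrate the point about the bounce buffer being zero'ed, here is a
rough userspace sketch of the range that the removed code handed to
memset(), reconstructed from the lines this patch deletes. The names
zero_range, old_zero_range(), align_up() and the enum are hypothetical
stand-ins for illustration, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's DMA direction values. */
enum dir { TO_DEVICE, FROM_DEVICE, BIDIRECTIONAL };

struct zero_range {
	size_t offset;	/* relative to the start of the bounce buffer */
	size_t len;
};

/* Power-of-two round-up, standing in for iova_align() with the granule. */
static size_t align_up(size_t x, size_t granule)
{
	return (x + granule - 1) & ~(granule - 1);
}

/* Range the removed code zeroed, expressed relative to the buffer start. */
static struct zero_range old_zero_range(size_t size, size_t granule,
					enum dir dir, bool skip_cpu_sync)
{
	struct zero_range r = { .offset = 0, .len = align_up(size, granule) };

	/* Only in this case did the old code skip over the buffer itself. */
	if (!skip_cpu_sync && (dir == TO_DEVICE || dir == BIDIRECTIONAL)) {
		r.offset += size;
		r.len -= size;
	}
	return r;
}

/* Example: a 7-byte mapping with a 16-byte granule. */
static void old_zero_range_example(void)
{
	struct zero_range r;

	/* FROM_DEVICE: bytes [0, 16) are zeroed, buffer contents included. */
	r = old_zero_range(7, 16, FROM_DEVICE, false);
	assert(r.offset == 0 && r.len == 16);

	/* TO_DEVICE: only the post-padding [7, 16) is zeroed. */
	r = old_zero_range(7, 16, TO_DEVICE, false);
	assert(r.offset == 7 && r.len == 9);
}
```

In this sketch the FROM_DEVICE case covers the buffer proper, which is
the part swiotlb_tbl_map_single() has already initialized from the
original memory.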

Also, when the DMA min_align_mask is non-zero, the allocated bounce
buffer space may not start on a granule boundary. The swiotlb memory
from the granule boundary to the start of the allocated bounce buffer
might belong to some unrelated bounce buffer. So as described in the
"second issue" in [1], it can't be zero'ed to protect against untrusted
devices. But as of commit XXXXXXXXXXXX ("swiotlb: extend buffer
pre-padding to alloc_align_mask if necessary"), swiotlb_tbl_map_single()
allocates pre-padding slots when necessary to meet min_align_mask
requirements, making it possible to zero the pre-padding area as well.

Finally, iommu_dma_map_page() uses the swiotlb for untrusted devices
and also for certain kmalloc() memory. Current code does the zero'ing
for both cases, but it is needed only for the untrusted device case.

Fix all of this by updating iommu_dma_map_page() to zero both the
pre-padding and post-padding areas, but not the actual bounce buffer.
Do this only in the case where the bounce buffer is used because
of an untrusted device.
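
The pre- and post-padding arithmetic described above can be sketched in
userspace as follows. This is only an illustration under stated
assumptions: the pool is a plain byte array whose first byte is taken to
sit on a granule boundary, offsets replace the kernel virtual addresses,
and align_up(), align_down() and zero_padding() are hypothetical helpers
standing in for iova_align(), iova_align_down() and the new code in
iommu_dma_map_page().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for ALIGN() and ALIGN_DOWN(); granule must be a power of two. */
static size_t align_up(size_t x, size_t granule)
{
	return (x + granule - 1) & ~(granule - 1);
}

static size_t align_down(size_t x, size_t granule)
{
	return x & ~(granule - 1);
}

/*
 * Zero the padding around the bounce buffer at pool[off .. off + size),
 * leaving the buffer itself untouched. Assumes pool[0] is granule aligned.
 */
static void zero_padding(unsigned char *pool, size_t off, size_t size,
			 size_t granule)
{
	size_t start;

	/* Pre-padding: from the granule boundary up to the buffer start. */
	start = align_down(off, granule);
	memset(pool + start, 0, off - start);

	/* Post-padding: from the buffer end up to the next granule boundary. */
	start = off + size;
	memset(pool + start, 0, align_up(start, granule) - start);
}

/* Example: a 7-byte buffer at offset 20 in a pool of 16-byte granules. */
static void zero_padding_example(void)
{
	unsigned char pool[64];
	size_t i;

	memset(pool, 0xAA, sizeof(pool));	/* stale data from earlier users */
	memset(pool + 20, 0x55, 7);		/* contents copied from the original */

	zero_padding(pool, 20, 7, 16);

	for (i = 0; i < sizeof(pool); i++) {
		if (i >= 20 && i < 27)
			assert(pool[i] == 0x55);	/* buffer proper preserved */
		else if (i >= 16 && i < 32)
			assert(pool[i] == 0x00);	/* pre- and post-padding */
		else
			assert(pool[i] == 0xAA);	/* other granules untouched */
	}
}
```

With these numbers the pre-padding is bytes [16, 20) and the post-padding
is bytes [27, 32). A buffer that already ends on a granule boundary gets
a zero-length post-padding memset().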

[1] https://lore.kernel.org/all/20210929023300.335969-1-stevensd@google.com/

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
I've wondered if this code for zero'ing the pre- and post-padding
should go in swiotlb_tbl_map_single(). The bounce buffer proper is
already being initialized there. But swiotlb_tbl_map_single()
would need to test for an untrusted device (or have a "zero the
padding" flag passed in as part of the "attrs" argument), which
adds complexity. Thoughts?

The commit ID of Petr's patch is X'ed out above because Petr's patch
hasn't gone into Linus' tree yet. We can add the real commit ID once
this patch is ready to go in.

Also, I haven't used any "Fixes:" tags. This patch really should
be backported only if all the other recent swiotlb fixes get
backported, and I'm unclear on whether that will happen.

 drivers/iommu/dma-iommu.c | 29 ++++++++++++++++-------------
 include/linux/iova.h      |  5 +++++
 2 files changed, 21 insertions(+), 13 deletions(-)

Comments

Petr Tesařík May 6, 2024, 3:48 p.m. UTC | #1
On Sun,  7 Apr 2024 21:11:42 -0700
mhkelley58@gmail.com wrote:

> From: Michael Kelley <mhklinux@outlook.com>
> 
> iommu_dma_map_page() allocates swiotlb memory as a bounce buffer when
> an untrusted device wants to map only part of the memory in a
> granule. The goal is to disallow the untrusted device having
> DMA access to unrelated kernel data that may be sharing the granule.
> To meet this goal, the bounce buffer itself is zero'ed, and any
> additional swiotlb memory up to alloc_size after the bounce buffer
> end (i.e., "post-padding") is also zero'ed.
> 
> However, as of commit 901c7280ca0d ("Reinstate some of "swiotlb: rework
> "fix info leak with DMA_FROM_DEVICE"""), swiotlb_tbl_map_single()
> always initializes the contents of the bounce buffer to the original
> memory. Zero'ing the bounce buffer is redundant and probably wrong per
> the discussion in that commit. Only the post-padding needs to be
> zero'ed.
> 
> Also, when the DMA min_align_mask is non-zero, the allocated bounce
> buffer space may not start on a granule boundary. The swiotlb memory
> from the granule boundary to the start of the allocated bounce buffer
> might belong to some unrelated bounce buffer. So as described in the
> "second issue" in [1], it can't be zero'ed to protect against untrusted
> devices. But as of commit XXXXXXXXXXXX ("swiotlb: extend buffer
> pre-padding to alloc_align_mask if necessary"), swiotlb_tbl_map_single()

This is now commit af133562d5af.

> allocates pre-padding slots when necessary to meet min_align_mask
> requirements, making it possible to zero the pre-padding area as well.
> 
> Finally, iommu_dma_map_page() uses the swiotlb for untrusted devices
> and also for certain kmalloc() memory. Current code does the zero'ing
> for both cases, but it is needed only for the untrusted device case.
> 
> Fix all of this by updating iommu_dma_map_page() to zero both the
> pre-padding and post-padding areas, but not the actual bounce buffer.
> Do this only in the case where the bounce buffer is used because
> of an untrusted device.
> 
> [1] https://lore.kernel.org/all/20210929023300.335969-1-stevensd@google.com/
> 
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> I've wondered if this code for zero'ing the pre- and post-padding
> should go in swiotlb_tbl_map_single(). The bounce buffer proper is
> already being initialized there. But swiotlb_tbl_map_single()
> would need to test for an untrusted device (or have a "zero the
> padding" flag passed in as part of the "attrs" argument), which
> adds complexity. Thoughts?

Historically, swiotlb has never cared about exposing data from a
previous user of a bounce buffer. I assume that's because it was
pointless to make an attempt at protecting system memory from a
malicious device that can do DMA to any address anyway. The situation
has changed with hardware IOMMUs, and that could be why the zeroing is
only done in the IOMMU path.

In short, if anybody can explain the value of concealing potentially
sensitive data from devices that are not behind an IOMMU, let's move
the zeroing to swiotlb. Otherwise, let's keep what we have.

Other than that (and the missing commit id), the patch looks good to me.

Reviewed-by: Petr Tesarik <petr@tesarici.cz>

Petr T

> 
> The commit ID of Petr's patch is X'ed out above because Petr's patch
> hasn't gone into Linus' tree yet. We can add the real commit ID once
> this patch is ready to go in.
> 
> Also, I haven't used any "Fixes:" tags. This patch really should
> be backported only if all the other recent swiotlb fixes get
> backported, and I'm unclear on whether that will happen.
> 
>  drivers/iommu/dma-iommu.c | 29 ++++++++++++++++-------------
>  include/linux/iova.h      |  5 +++++
>  2 files changed, 21 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index c21ef1388499..ecac39b3190d 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1154,9 +1154,6 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>  	 */
>  	if (dev_use_swiotlb(dev, size, dir) &&
>  	    iova_offset(iovad, phys | size)) {
> -		void *padding_start;
> -		size_t padding_size, aligned_size;
> -
>  		if (!is_swiotlb_active(dev)) {
>  			dev_warn_once(dev, "DMA bounce buffers are inactive, unable to map unaligned transaction.\n");
>  			return DMA_MAPPING_ERROR;
> @@ -1164,24 +1161,30 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>  
>  		trace_swiotlb_bounced(dev, phys, size);
>  
> -		aligned_size = iova_align(iovad, size);
>  		phys = swiotlb_tbl_map_single(dev, phys, size,
>  					      iova_mask(iovad), dir, attrs);
>  
>  		if (phys == DMA_MAPPING_ERROR)
>  			return DMA_MAPPING_ERROR;
>  
> -		/* Cleanup the padding area. */
> -		padding_start = phys_to_virt(phys);
> -		padding_size = aligned_size;
> +		/*
> +		 * Untrusted devices should not see padding areas with random
> +		 * leftover kernel data, so zero the pre- and post-padding.
> +		 * swiotlb_tbl_map_single() has initialized the bounce buffer
> +		 * proper to the contents of the original memory buffer.
> +		 */
> +		if (dev_is_untrusted(dev)) {
> +			size_t start, virt = (size_t)phys_to_virt(phys);
>  
> -		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
> -		    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) {
> -			padding_start += size;
> -			padding_size -= size;
> -		}
> +			/* Pre-padding */
> +			start = iova_align_down(iovad, virt);
> +			memset((void *)start, 0, virt - start);
>  
> -		memset(padding_start, 0, padding_size);
> +			/* Post-padding */
> +			start = virt + size;
> +			memset((void *)start, 0,
> +			       iova_align(iovad, start) - start);
> +		}
>  	}
>  
>  	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> diff --git a/include/linux/iova.h b/include/linux/iova.h
> index 83c00fac2acb..d2c4fd923efa 100644
> --- a/include/linux/iova.h
> +++ b/include/linux/iova.h
> @@ -65,6 +65,11 @@ static inline size_t iova_align(struct iova_domain *iovad, size_t size)
>  	return ALIGN(size, iovad->granule);
>  }
>  
> +static inline size_t iova_align_down(struct iova_domain *iovad, size_t size)
> +{
> +	return ALIGN_DOWN(size, iovad->granule);
> +}
> +
>  static inline dma_addr_t iova_dma_addr(struct iova_domain *iovad, struct iova *iova)
>  {
>  	return (dma_addr_t)iova->pfn_lo << iova_shift(iovad);

Christoph Hellwig May 7, 2024, 5:36 a.m. UTC | #2
On Sun, Apr 07, 2024 at 09:11:42PM -0700, mhkelley58@gmail.com wrote:
> I've wondered if this code for zero'ing the pre- and post-padding
> should go in swiotlb_tbl_map_single(). The bounce buffer proper is
> already being initialized there. But swiotlb_tbl_map_single()
> would need to test for an untrusted device (or have a "zero the
> padding" flag passed in as part of the "attrs" argument), which
> adds complexity. Thoughts?

If we want to go down that route it should be the latter.  I'm
not sure if it is an improvement, but we'd have to implement it
to see if it does.

> The commit ID of Petr's patch is X'ed out above because Petr's patch
> hasn't gone into Linus' tree yet. We can add the real commit ID once
> this patch is ready to go in.

I've fixed that up and committed the series.

Thanks a lot!

Patch

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c21ef1388499..ecac39b3190d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1154,9 +1154,6 @@  static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	 */
 	if (dev_use_swiotlb(dev, size, dir) &&
 	    iova_offset(iovad, phys | size)) {
-		void *padding_start;
-		size_t padding_size, aligned_size;
-
 		if (!is_swiotlb_active(dev)) {
 			dev_warn_once(dev, "DMA bounce buffers are inactive, unable to map unaligned transaction.\n");
 			return DMA_MAPPING_ERROR;
@@ -1164,24 +1161,30 @@  static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 
 		trace_swiotlb_bounced(dev, phys, size);
 
-		aligned_size = iova_align(iovad, size);
 		phys = swiotlb_tbl_map_single(dev, phys, size,
 					      iova_mask(iovad), dir, attrs);
 
 		if (phys == DMA_MAPPING_ERROR)
 			return DMA_MAPPING_ERROR;
 
-		/* Cleanup the padding area. */
-		padding_start = phys_to_virt(phys);
-		padding_size = aligned_size;
+		/*
+		 * Untrusted devices should not see padding areas with random
+		 * leftover kernel data, so zero the pre- and post-padding.
+		 * swiotlb_tbl_map_single() has initialized the bounce buffer
+		 * proper to the contents of the original memory buffer.
+		 */
+		if (dev_is_untrusted(dev)) {
+			size_t start, virt = (size_t)phys_to_virt(phys);
 
-		if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-		    (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) {
-			padding_start += size;
-			padding_size -= size;
-		}
+			/* Pre-padding */
+			start = iova_align_down(iovad, virt);
+			memset((void *)start, 0, virt - start);
 
-		memset(padding_start, 0, padding_size);
+			/* Post-padding */
+			start = virt + size;
+			memset((void *)start, 0,
+			       iova_align(iovad, start) - start);
+		}
 	}
 
 	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 83c00fac2acb..d2c4fd923efa 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -65,6 +65,11 @@  static inline size_t iova_align(struct iova_domain *iovad, size_t size)
 	return ALIGN(size, iovad->granule);
 }
 
+static inline size_t iova_align_down(struct iova_domain *iovad, size_t size)
+{
+	return ALIGN_DOWN(size, iovad->granule);
+}
+
 static inline dma_addr_t iova_dma_addr(struct iova_domain *iovad, struct iova *iova)
 {
 	return (dma_addr_t)iova->pfn_lo << iova_shift(iovad);