diff mbox series

arm: dma-mapping: don't call folio_next() beyond the requested region

Message ID 20230810091955.3579004-1-m.szyprowski@samsung.com (mailing list archive)
State New, archived
Headers show
Series arm: dma-mapping: don't call folio_next() beyond the requested region | expand

Commit Message

Marek Szyprowski Aug. 10, 2023, 9:19 a.m. UTC
Add a check for the non-zero offset case to avoid calling folio_next()
beyond the requested region and relying on its parameters.

Fixes: cc24e9c0895c ("arm: implement the new page table range API")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
---
 arch/arm/mm/dma-mapping.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Russell King (Oracle) Aug. 10, 2023, 11:06 a.m. UTC | #1
On Thu, Aug 10, 2023 at 11:19:55AM +0200, Marek Szyprowski wrote:
> Add a check for the non-zero offset case to avoid calling folio_next()
> beyond the requested region and relying on its parameters.
> 
> Fixes: cc24e9c0895c ("arm: implement the new page table range API")
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> ---
>  arch/arm/mm/dma-mapping.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 0474840224d9..6c952d6899f2 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -715,6 +715,8 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
>  
>  		if (offset) {
>  			left -= folio_size(folio) - offset;

In most cases, "offset" is masked by ~PAGE_MASK, the only exception
being when we're walking scatterlists, where it is whatever is in
sg->offset.

So, what is the range of values that sg->offset could take? If it is
guaranteed to be less than PAGE_SIZE (or folio_size() of the first
folio) then we're all good.

However, consider what happens with the above when offset is larger
than the first folio size. To show this, let's rewrite it:

	left += offset - folio_size(folio);

In that case, "left" becomes larger than the original size, which
surely is not what we want?

This wasn't a problem with the original code, because we guaranteed
that "off" was always less than PAGE_SIZE, so

	left -= PAGE_SIZE - off;

would always result in a reduction, but this is no longer the case
with folios.

The more I'm looking at this, the more I'm convinced that the original
conversion is wrong. Let's go back to the original code and see what
it is doing:

                size_t left = size;

                pfn = page_to_pfn(page) + off / PAGE_SIZE;

This positions pfn to be the page frame number to which the page and
offset passed into the function reference.

                off %= PAGE_SIZE;

This gives us the offset _within_ that page.

                if (off) {
                        pfn++;
                        left -= PAGE_SIZE - off;
                }

What this is doing is saying if the first page is a partial page, then
we skip marking it clean - only _full_ pages get marked clean.

                while (left >= PAGE_SIZE) {
                        page = pfn_to_page(pfn++);
                        set_bit(PG_dcache_clean, &page->flags);
                        left -= PAGE_SIZE;
                }

There, we iterate over the size for the number of _whole_ pages only
and not a final partial page.

Now, if we consider the folio version:

+               ssize_t left = size;

Casts an unsigned to a signed, which will not give expected results if
large.

+               size_t offset = offset_in_folio(folio, paddr);

paddr here is the physical address of the page plus the passed in
offset. If the offset is larger than a page, then it will be the
following pages. What if the offset is larger than the first folio
size - bearing in mind that there is no limit on the offset in a
scatterlist?

If offset is bounded to the size of the folio, then that can truncate
the original "offset" that was passed in and we'll end up marking
the wrong folios, because:

+
+               if (offset) {
+                       left -= folio_size(folio) - offset;
+                       folio = folio_next(folio);
                }

This only allows us to move to the next folio.

Moreover, if offset here _is_ allowed to be bigger than folio_size()
then we end up _increasing_ "left" as stated above, so we end up
marking _more_ folios as clean than the user of this function requested.

+
+               while (left >= (ssize_t)folio_size(folio)) {
+                       set_bit(PG_dcache_clean, &folio->flags);
+                       left -= folio_size(folio);
+                       folio = folio_next(folio);
                }

So, in all, to me it looks like this conversion is basically wrong, and it
needs to be something like:

		size_t left = size;

		while (off >= folio_size(folio)) {
			off -= folio_size(folio);
			folio = folio_next(folio);
		}
		if (off) {
			/* Partial first folio */
			size_t first = folio_size(folio) - off;

			/* Size doesn't extend the full folio size, so exit */
			if (left < first)
				return;

			/* Truncate the size and move to the next folio */
			left -= first;
			folio = folio_next(folio);
		}

		while (left >= folio_size(folio)) {
			/* can't become negative */
			left -= folio_size(folio);
			set_bit(PG_dcache_clean, &folio->flags);
			if (!left)
				break;
			folio = folio_next(folio);
		}
Matthew Wilcox (Oracle) Aug. 22, 2023, 5:12 p.m. UTC | #2
On Thu, Aug 10, 2023 at 12:06:09PM +0100, Russell King (Oracle) wrote:
> However, consider what happens with the above when offset is larger
> than the first folio size. To show this, let's rewrite it:

Hmm.  I thought 'off' had to be smaller than PAGE_SIZE.

> So, in all, to me it looks like this conversion is basically wrong, and it
> needs to be something like:
> 
> 		size_t left = size;
> 
> 		while (off >= folio_size(folio)) {
> 			off -= folio_size(folio);
> 			folio = folio_next(folio);
> 		}


We can jump straight to the first folio without iterating over the
folios in between.  Like so:

static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
        size_t size, enum dma_data_direction dir)
{
        phys_addr_t paddr = page_to_phys(page) + off;

...

        if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) {
                struct folio *folio = pfn_folio(paddr / PAGE_SIZE);
                size_t offset = offset_in_folio(folio, paddr);

                for (;;) {
                        size_t sz = folio_size(folio) - offset;

                        if (size < sz)
                                break;
                        if (!offset)
                                set_bit(PG_dcache_clean, &folio->flags);
                        offset = 0;
                        size -= sz;
                        if (!size)
                                break;
                        folio = folio_next(folio);
                }
        }

Advantages:
 * No more signed arithmetic
 * Not even an intended arithmetic overflow
 * Only one call to folio_size() per loop
 * Folded the first conditional into the loop

Disadvantages:
 * Some maintainers don't like a for (;;) loop, or a two-exit loop.
   (we could remove the for (;;) by moving 'sz' outside the loop and
    recalculating it at the end of the loop)
diff mbox series

Patch

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0474840224d9..6c952d6899f2 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -715,6 +715,8 @@  static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 
 		if (offset) {
 			left -= folio_size(folio) - offset;
+			if (left <= 0)
+				return;
 			folio = folio_next(folio);
 		}