Message ID | 8014a146e90282b8cbc5868cee0d01776670d9a6.1450457730.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Robin, On Fri, Dec 18, 2015 at 9:01 AM, Robin Murphy <robin.murphy@arm.com> wrote: > Doug reports that the equivalent page allocator on 32-bit ARM exhibits > particularly pathalogical behaviour under memory pressure when > fragmentation is high, where allocating a 4MB buffer takes tens of > seconds and the number of calls to alloc_pages() is over 9000![1] > > We can drastically improve that situation without losing the other > benefits of high-order allocations when they would succeed, by assuming > memory pressure is relatively constant over the course of an allocation, > and not retrying allocations at orders we know to have failed before. > This way, the best-case behaviour remains unchanged, and in the worst > case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts > before falling back to single pages for the remainder of the buffer. > > [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html > > Reported-by: Douglas Anderson <dianders@chromium.org> > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/dma-iommu.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) Presumably it makes sense to update this based on v2 of my patch to the original code? AKA: <https://patchwork.kernel.org/patch/7888851/> ?
On Fri, 2015-12-18 at 14:35 -0800, Doug Anderson wrote: > Robin, > > On Fri, Dec 18, 2015 at 9:01 AM, Robin Murphy <robin.murphy@arm.com> wrote: > > Doug reports that the equivalent page allocator on 32-bit ARM exhibits > > particularly pathalogical behaviour under memory pressure when > > fragmentation is high, where allocating a 4MB buffer takes tens of > > seconds and the number of calls to alloc_pages() is over 9000![1] > > > > We can drastically improve that situation without losing the other > > benefits of high-order allocations when they would succeed, by assuming > > memory pressure is relatively constant over the course of an allocation, > > and not retrying allocations at orders we know to have failed before. > > This way, the best-case behaviour remains unchanged, and in the worst > > case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts > > before falling back to single pages for the remainder of the buffer. > > > > [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html > > > > Reported-by: Douglas Anderson <dianders@chromium.org> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > > --- > > drivers/iommu/dma-iommu.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > Presumably it makes sense to update this based on v2 of my patch to > the original code? AKA: <https://patchwork.kernel.org/patch/7888851/> > ? Hi Robin, Thanks very much for this patch. And Douglas has sent his v2 which improve the flow of alloc_pages. Do you have any plan to update this patch according to this v2, or any concern about it? > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 03811e3..e37516a 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -194,6 +194,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp) { struct page **pages; unsigned int i = 0, array_size = count * sizeof(*pages); + unsigned int order = MAX_ORDER; if (array_size <= PAGE_SIZE) pages = kzalloc(array_size, GFP_KERNEL); @@ -207,14 +208,15 @@ static struct page **__iommu_dma_alloc_pages(unsigned int count, gfp_t gfp) while (count) { struct page *page = NULL; - int j, order = __fls(count); + int j; /* * Higher-order allocations are a convenience rather * than a necessity, hence using __GFP_NORETRY until * falling back to single-page allocations. */ - for (order = min(order, MAX_ORDER); order > 0; order--) { + for (order = min_t(unsigned int, order, __fls(count)); + order > 0; order--) { page = alloc_pages(gfp | __GFP_NORETRY, order); if (!page) continue;
Doug reports that the equivalent page allocator on 32-bit ARM exhibits particularly pathalogical behaviour under memory pressure when fragmentation is high, where allocating a 4MB buffer takes tens of seconds and the number of calls to alloc_pages() is over 9000![1] We can drastically improve that situation without losing the other benefits of high-order allocations when they would succeed, by assuming memory pressure is relatively constant over the course of an allocation, and not retrying allocations at orders we know to have failed before. This way, the best-case behaviour remains unchanged, and in the worst case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts before falling back to single pages for the remainder of the buffer. [1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html Reported-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/dma-iommu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)