| Message ID | 55141E87.8040506@compro.net (mailing list archive) |
|---|---|
| State | New, archived |
Hi Mark,

On Thu, Mar 26, 2015 at 10:58:15AM -0400, Mark Hounschell wrote:
> Sorry but CMA was still badly broken. I have a patch below that works.

In which way is it broken? What happens when you try to allocate memory
with dma_alloc_coherent?

> I've tested it with small (no CMA) and large (with CMA) dma's using
> dma_alloc_coherent. The patch below is just the "git diff" from your
> cloned tree piped to a file then copied into this email. If you require
> an official patch I can send one. Just let me know.

The main differences I can spot are that you change the order (first
CMA, then buddy) and you manually align the input size. I can see the
reason for the latter, but why does CMA need to be tried first?

> Also, in my opinion, this CMA thing is clearly a BUG, not a feature
> request. The AMD iommu clearly breaks CMA. I feel whatever fix
> you are happy with should be backported to stable.

It is not a BUG; the interface definition for dma_alloc_coherent does
not specify that it can allocate infinite amounts of memory. So this
patch does not qualify for stable.

	Joerg
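For reference, the "first CMA, then buddy" ordering in question reduces to roughly the following pattern. This is a simplified sketch, not the actual amd_iommu.c code: alloc_backing_pages() is a made-up helper name, while the functions it calls are the kernel APIs of this era.

```c
#include <linux/device.h>
#include <linux/dma-contiguous.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Sketch only: try the CMA area first when the caller may sleep,
 * and fall back to the buddy allocator otherwise or on CMA failure. */
static struct page *alloc_backing_pages(struct device *dev, size_t size,
					gfp_t flag)
{
	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	int order = get_order(size);
	struct page *page = NULL;

	/* CMA may have to migrate pages out of its area, which can
	 * sleep, so it is only tried in __GFP_WAIT context. */
	if (flag & __GFP_WAIT)
		page = dma_alloc_from_contiguous(dev, count, order);

	/* Atomic callers, or CMA exhaustion/absence, end up here. */
	if (!page)
		page = alloc_pages(flag, order);

	return page;
}
```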
On 03/26/2015 11:52 AM, Joerg Roedel wrote:
> Hi Mark,
>
> On Thu, Mar 26, 2015 at 10:58:15AM -0400, Mark Hounschell wrote:
>> Sorry but CMA was still badly broken. I have a patch below that works.
>
> In which way is it broken? What happens when you try to allocate memory
> with dma_alloc_coherent?

I got an Oops on both the alloc and the free, and I had to hit the
reset button.

>> I've tested it with small (no CMA) and large (with CMA) dma's using
>> dma_alloc_coherent. The patch below is just the "git diff" from your
>> cloned tree piped to a file then copied into this email. If you require
>> an official patch I can send one. Just let me know.
>
> The main differences I can spot are that you change the order (first
> CMA, then buddy) and you manually align the input size. I can see the
> reason for the latter, but why does CMA need to be tried first?

I believe that is how dma_alloc_coherent is supposed to work. If CMA is
present, it uses it, even for small allocations. If CMA is not present,
it does the best it can. That patch was derived by looking at the Intel
iommu driver, which works properly. Maybe you should take a look at
that.

>> Also, in my opinion, this CMA thing is clearly a BUG, not a feature
>> request. The AMD iommu clearly breaks CMA. I feel whatever fix
>> you are happy with should be backported to stable.
>
> It is not a BUG; the interface definition for dma_alloc_coherent does
> not specify that it can allocate infinite amounts of memory. So this
> patch does not qualify for stable.

I don't care what the "interface definition for dma_alloc_coherent"
says. If it does not describe its use with CMA, it is outdated. Maybe
the "interface definition for dma_alloc_coherent" was written before
CMA was implemented. Maybe you should be looking at the CMA
documentation. Unfortunately there does not yet seem to be any in the
Documentation dir, probably because CMA is supposed to be transparent
to the DMA API. You know, one should not need to know the details of
it, etc.

It is a BUG. To access CMA you use dma_alloc_coherent. You have
completely hijacked that function. You have broken CMA. PERIOD. That
is a BUG.

Regards
Mark
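For context on that last point: with CONFIG_DMA_CMA enabled, a driver reaches CMA only through the generic DMA API, which is why breaking dma_alloc_coherent breaks CMA. A minimal driver-side sketch follows; the function name and the 8 MiB size are made up for illustration.

```c
#include <linux/device.h>
#include <linux/dma-mapping.h>

/* 8 MiB is beyond the buddy allocator's MAX_ORDER limit on typical
 * x86 configs, so with CONFIG_DMA_CMA this request is expected to be
 * satisfied from the CMA region -- the driver never names CMA. */
#define EXAMPLE_BUF_SIZE (8 * 1024 * 1024)

static int example_large_dma_alloc(struct device *dev)
{
	dma_addr_t dma_handle;
	void *cpu_addr;

	cpu_addr = dma_alloc_coherent(dev, EXAMPLE_BUF_SIZE,
				      &dma_handle, GFP_KERNEL);
	if (!cpu_addr)
		return -ENOMEM;

	/* ... program the device with dma_handle, use cpu_addr ... */

	dma_free_coherent(dev, EXAMPLE_BUF_SIZE, cpu_addr, dma_handle);
	return 0;
}
```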
```diff
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a0197d0..5ea4fed 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2917,28 +2917,36 @@ static void *alloc_coherent(struct device *dev, size_t size,
 	u64 dma_mask = dev->coherent_dma_mask;
 	struct protection_domain *domain;
 	unsigned long flags;
-	struct page *page;
+	struct page *page = 0;
+	int order;
+	unsigned int count;
+
+	size = PAGE_ALIGN(size);
+	order = get_order(size);
+	count = size >> PAGE_SHIFT;
 
 	INC_STATS_COUNTER(cnt_alloc_coherent);
 
 	domain = get_domain(dev);
 	if (PTR_ERR(domain) == -EINVAL) {
-		page = alloc_pages(flag, get_order(size));
+		page = alloc_pages(flag, order);
 		*dma_addr = page_to_phys(page);
 		return page_address(page);
 	} else if (IS_ERR(domain))
 		return NULL;
 
-	dma_mask  = dev->coherent_dma_mask;
 	flag     &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
+	flag     |= __GFP_ZERO;
 
-	page = alloc_pages(flag, get_order(size));
+	if (flag & __GFP_WAIT) {
+		page = dma_alloc_from_contiguous(dev, count, order);
+		if (page && page_to_phys(page) + size > dma_mask) {
+			dma_release_from_contiguous(dev, page, count);
+			page = NULL;
+		}
+	}
 	if (!page) {
-		if (!(flag & __GFP_WAIT))
-			return NULL;
-
-		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-						 get_order(size));
+		page = alloc_pages(flag, order);
 		if (!page)
 			return NULL;
 	}
@@ -2951,7 +2959,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 	*dma_addr = __map_single(dev, domain->priv, page_to_phys(page),
 				 size, DMA_BIDIRECTIONAL, true, dma_mask);
 
-	if (*dma_addr == DMA_ERROR_CODE) {
+	if (!dma_addr || (*dma_addr == DMA_ERROR_CODE)) {
 		spin_unlock_irqrestore(&domain->lock, flags);
 		goto out_free;
 	}
@@ -2965,7 +2973,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 out_free:
 
 	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
-		__free_pages(page, get_order(size));
+		__free_pages(page, order);
 
 	return NULL;
 }
@@ -2979,6 +2987,11 @@ static void free_coherent(struct device *dev, size_t size,
 {
 	unsigned long flags;
 	struct protection_domain *domain;
+	int order;
+	struct page *page = virt_to_page(virt_addr);
+
+	size = PAGE_ALIGN(size);
+	order = get_order(size);
 
 	INC_STATS_COUNTER(cnt_free_coherent);
 
@@ -2995,7 +3008,9 @@ static void free_coherent(struct device *dev, size_t size,
 	spin_unlock_irqrestore(&domain->lock, flags);
 
 free_mem:
-	free_pages((unsigned long)virt_addr, get_order(size));
+	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+		__free_pages(page, order);
+
 }
 
 /*
```
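A note on the PAGE_ALIGN(size) added at the top of both functions: dma_alloc_from_contiguous() takes a page count while __free_pages() takes a buddy order, and for an unaligned size the two disagree unless size is rounded up first. Below is a standalone sketch of the arithmetic, assuming 4 KiB pages; the get_order() here is a userspace stand-in for the kernel macro.

```c
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_ALIGN(s) (((s) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Userspace stand-in for the kernel's get_order(): smallest order
 * such that 2^order pages cover 'size' bytes. */
static int get_order(unsigned long size)
{
	int order = 0;

	size = (size - 1) >> PAGE_SHIFT;
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}

int main(void)
{
	unsigned long size = 5000;	/* unaligned request */

	/* Unaligned: CMA count says 1 page, buddy order says 2 pages,
	 * so the alloc and free paths would disagree. */
	printf("count=%lu order=%d\n", size >> PAGE_SHIFT, get_order(size));

	/* Aligned up front, as in the patch: both agree on 2 pages. */
	size = PAGE_ALIGN(size);
	printf("count=%lu order=%d\n", size >> PAGE_SHIFT, get_order(size));
	return 0;
}
```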