@@ -411,6 +411,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
void *kaddr;
struct page *page = NULL;
int ret = -ENOMEM;
+ bool has_lock = false;
/* Be noisy about caller asking for unsupported flags. */
WARN_ON(unlikely(!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
@@ -451,17 +452,39 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
mutex_unlock(&cma->lock);
pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
- mutex_lock(&cma_mutex);
+
+ /*
+  * Mutual exclusion inside alloc_contig_range() is not strictly
+  * necessary, but it makes the allocation a little more likely to
+  * succeed, because it serializes simultaneous allocations on the
+  * same pageblock.  We cannot sleep on all paths, though, so first
+  * try the allocation speculatively; if we identify another thread
+  * using the same pageblock, fall back to the serialized path
+  * under the mutex when possible, or otherwise try another
+  * pageblock.
+  */
+ has_lock = mutex_trylock(&cma_mutex);
+retry:
ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
gfp_mask);
- mutex_unlock(&cma_mutex);
+
+ if (ret == -EAGAIN && (gfp_mask & __GFP_IO)) {
+ if (!has_lock) {
+ mutex_lock(&cma_mutex);
+ has_lock = true;
+ }
+ goto retry;
+ }
+ if (has_lock)
+ mutex_unlock(&cma_mutex);
+
if (ret == 0) {
page = pfn_to_page(pfn);
break;
}
cma_clear_bitmap(cma, pfn, count);
- if (ret != -EBUSY)
+ if (ret != -EBUSY && ret != -EAGAIN)
break;
pr_debug("%s(): memory range at %p is busy, retrying\n",
__func__, pfn_to_page(pfn));
/* try again with a bit different memory target */
start = bitmap_no + mask + 1;

Holding the mutex when calling alloc_contig_range() is not essential, because the bitmap reservation plus the implicit synchronization inside start_isolate_page_range() already prevent simultaneous allocations within the same pageblock. It is still beneficial to perform some kind of serialization on this path, though, so that a colliding thread can wait and then allocate from the same pageblock, if possible, instead of immediately jumping to another allocatable region.

Therefore, instead of serializing every CMA allocation, this patch speculatively tries the allocation without acquiring the mutex. If we race with another thread allocating on the same pageblock, we retry on the same region after waiting for the colliding allocation to finish.

The synchronization of aborted tasks is still done globally for the CMA allocator. Ideally, an aborted allocation would wait only for the migration of the colliding pageblock, but there is no easy way to track each pageblock isolation in a non-racy way without adding more code overhead. Thus, I believe the mutex mechanism is an acceptable compromise, as long as it does not bend the mutex semantics too much.

Finally, some code paths, like the writeback case, must not blindly sleep waiting for the mutex, because they may be a dependency of the allocation thread currently holding it, which would deadlock. This exact scenario was observed by Gael Portay: a GPU thread allocating CMA triggers a writeback, and a USB storage device on the same ARM board tries to satisfy that writeback with a NOIO CMA allocation [1]. For that reason, writeback threads are not allowed to wait on the pageblock; instead, they move on to a readily available contiguous memory region, which prevents the issue reported in [1].

[1] https://groups.google.com/a/lists.one-eyed-alien.net/forum/#!topic/usb-storage/BXpAsg-G1us

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/cma.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)
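As a note for reviewers (git-am ignores everything after the diffstat): the control flow above is essentially a trylock-then-upgrade retry. Below is a minimal userspace sketch of that pattern using plain pthreads; try_alloc(), alloc_region(), may_sleep and the collision flag are hypothetical stand-ins for alloc_contig_range(), cma_alloc(), __GFP_IO and the pageblock race, not kernel code.

/* Sketch only: models the trylock-then-upgrade retry of cma_alloc(). */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t serial_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for alloc_contig_range(): -EAGAIN models a collision on
 * the pageblock that resolves once the colliding thread is done.
 */
static int try_alloc(bool *collision)
{
	if (*collision) {
		*collision = false;
		return -EAGAIN;
	}
	return 0;
}

static int alloc_region(bool may_sleep, bool collision)
{
	bool has_lock = pthread_mutex_trylock(&serial_lock) == 0;
	int ret;

retry:
	ret = try_alloc(&collision);
	if (ret == -EAGAIN && may_sleep) {
		if (!has_lock) {
			/* Upgrade: wait for the colliding allocation. */
			pthread_mutex_lock(&serial_lock);
			has_lock = true;
		}
		goto retry;
	}
	if (has_lock)
		pthread_mutex_unlock(&serial_lock);
	return ret;	/* -EAGAIN: caller must pick another region */
}

int main(void)
{
	printf("sleepable caller: %d\n", alloc_region(true, true));
	printf("writeback (NOIO): %d\n", alloc_region(false, true));
	return 0;
}

Build with gcc -pthread. The second call returns -EAGAIN instead of blocking, which is exactly what lets the writeback path move on to another memory region.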
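The deadlock reported in [1] reduces to a two-thread dependency cycle that the trylock breaks. Another userspace sketch, with gpu_thread(), usb_thread() and writeback_done as made-up names for the GPU allocation, the USB storage writeback and the dependency between them:

/* Sketch only: the dependency cycle behind the writeback deadlock. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static pthread_mutex_t cma_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t writeback_done;

/* GPU thread: holds the lock while its allocation waits for writeback. */
static void *gpu_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&cma_lock);
	sem_wait(&writeback_done);	/* migration waits on writeback */
	pthread_mutex_unlock(&cma_lock);
	return NULL;
}

/* USB thread: satisfies the writeback with a NOIO allocation. */
static void *usb_thread(void *arg)
{
	(void)arg;
	/*
	 * Blocking on cma_lock here would deadlock: its holder is
	 * waiting for us.  Trylock lets us fall back to another
	 * memory region and finish the writeback instead.
	 */
	if (pthread_mutex_trylock(&cma_lock) == 0)
		pthread_mutex_unlock(&cma_lock);
	sem_post(&writeback_done);
	return NULL;
}

int main(void)
{
	pthread_t gpu, usb;

	sem_init(&writeback_done, 0, 0);
	pthread_create(&gpu, NULL, gpu_thread, NULL);
	pthread_create(&usb, NULL, usb_thread, NULL);
	pthread_join(gpu, NULL);
	pthread_join(usb, NULL);
	puts("no deadlock");
	return 0;
}

Replacing pthread_mutex_trylock() with pthread_mutex_lock() in usb_thread() makes the program hang: the holder of the lock is waiting for the very thread that wants to take it.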