From patchwork Wed Jul 26 14:53:04 2023
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13328208
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Roman Gushchin, Rik van Riel,
    Joonsoo Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: page_alloc: consume available CMA space first
Date: Wed, 26 Jul 2023 10:53:04 -0400
Message-ID: <20230726145304.1319046-1-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.41.0
MIME-Version: 1.0

On a memcache setup with heavy anon usage and no swap, we routinely
see premature OOM kills with multiple gigabytes of free space left:

Node 0 Normal free:4978632kB [...] free_cma:4893276kB

This free space turns out to be CMA. We set CMA regions aside for
potential hugetlb users on all of our machines, figuring that even if
there aren't any, the memory is available to userspace allocations.

When the OOMs trigger, it's from unmovable and reclaimable allocations
that aren't allowed to dip into CMA. The non-CMA regions, meanwhile,
are dominated by anon pages.

Movable pages can be migrated out of CMA when necessary, but we don't
have a mechanism to migrate them *into* CMA to make room for unmovable
allocations. The only recourse for these pages is reclaim, which is
unavailable in our case due to the lack of swap.

Because we have more options for CMA pages, change the policy to
always fill up CMA first. This reduces the risk of premature OOMs.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 44 ++++++++++++++++++++------------------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d3460c7a480..24b9102cd4f6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1635,13 +1635,13 @@ static int fallbacks[MIGRATE_TYPES][MIGRATE_PCPTYPES - 1] = {
 };
 
 #ifdef CONFIG_CMA
-static __always_inline struct page *__rmqueue_cma_fallback(struct zone *zone,
+static __always_inline struct page *__rmqueue_cma(struct zone *zone,
 					unsigned int order)
 {
 	return __rmqueue_smallest(zone, order, MIGRATE_CMA);
 }
 #else
-static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
+static inline struct page *__rmqueue_cma(struct zone *zone,
 					unsigned int order) { return NULL; }
 #endif
 
@@ -2124,29 +2124,25 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 {
 	struct page *page;
 
-	if (IS_ENABLED(CONFIG_CMA)) {
-		/*
-		 * Balance movable allocations between regular and CMA areas by
-		 * allocating from CMA when over half of the zone's free memory
-		 * is in the CMA area.
-		 */
-		if (alloc_flags & ALLOC_CMA &&
-		    zone_page_state(zone, NR_FREE_CMA_PAGES) >
-		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
-			page = __rmqueue_cma_fallback(zone, order);
-			if (page)
-				return page;
-		}
+	/*
+	 * Use up CMA first. Movable pages can be migrated out of CMA
+	 * if necessary, but they cannot migrate into it to make room
+	 * for unmovables elsewhere. The only recourse for them is
+	 * then reclaim, which might be unavailable without swap. We
+	 * want to reduce the risk of OOM with free CMA space left.
+	 */
+	if (IS_ENABLED(CONFIG_CMA) && (alloc_flags & ALLOC_CMA)) {
+		page = __rmqueue_cma(zone, order);
+		if (page)
+			return page;
 	}
-retry:
-	page = __rmqueue_smallest(zone, order, migratetype);
-	if (unlikely(!page)) {
-		if (alloc_flags & ALLOC_CMA)
-			page = __rmqueue_cma_fallback(zone, order);
-
-		if (!page && __rmqueue_fallback(zone, order, migratetype,
-								alloc_flags))
-			goto retry;
+
+	for (;;) {
+		page = __rmqueue_smallest(zone, order, migratetype);
+		if (page)
+			break;
+		if (!__rmqueue_fallback(zone, order, migratetype, alloc_flags))
+			break;
 	}
 	return page;
 }
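
To make the resulting order easier to see outside the kernel tree, below is a
minimal userspace sketch of what __rmqueue() does after this patch: take from
CMA first whenever ALLOC_CMA is set, then from the requested migratetype, then
from the non-CMA fallback types. It is a toy model only; zone_model, take_page,
fallback and rmqueue_model are hypothetical stand-ins invented for illustration,
not the real page allocator API.

/*
 * Toy model of the post-patch __rmqueue() ordering. Everything here is
 * a stand-in so the sketch compiles on its own; only the decision order
 * mirrors the patch.
 */
#include <stdbool.h>
#include <stdio.h>

enum migratetype { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, MT_CMA, MT_NR };

struct zone_model {
	long free[MT_NR];		/* free pages per migratetype */
};

#define ALLOC_CMA	0x1		/* caller may allocate from CMA */

static bool take_page(struct zone_model *z, int mt)
{
	if (z->free[mt] > 0) {
		z->free[mt]--;
		return true;
	}
	return false;
}

/* Fallback path: steal from another migratetype, but never from CMA. */
static bool fallback(struct zone_model *z, int mt)
{
	for (int other = 0; other < MT_CMA; other++)
		if (other != mt && take_page(z, other))
			return true;
	return false;
}

static bool rmqueue_model(struct zone_model *z, int mt, unsigned int flags)
{
	/* New policy: consume CMA first whenever the caller is allowed to. */
	if ((flags & ALLOC_CMA) && take_page(z, MT_CMA))
		return true;
	/* Then the requested migratetype, then the fallback list. */
	if (take_page(z, mt))
		return true;
	return fallback(z, mt);
}

int main(void)
{
	struct zone_model z = { .free = { [MT_MOVABLE] = 2, [MT_CMA] = 3 } };

	/* Three movable allocations are all satisfied from CMA first ... */
	for (int i = 0; i < 3; i++)
		rmqueue_model(&z, MT_MOVABLE, ALLOC_CMA);

	/* ... so non-CMA pages remain free for a later unmovable allocation. */
	bool ok = rmqueue_model(&z, MT_UNMOVABLE, 0);
	printf("unmovable alloc ok: %d, non-CMA free: %ld, CMA free: %ld\n",
	       ok,
	       z.free[MT_UNMOVABLE] + z.free[MT_MOVABLE] + z.free[MT_RECLAIMABLE],
	       z.free[MT_CMA]);
	return 0;
}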