From: Baolin Wang
To: akpm@linux-foundation.org, muchun.song@linux.dev
Cc: osalvador@suse.de, david@redhat.com, mhocko@kernel.org,
    baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v3] mm: hugetlb: improve the handling of hugetlb allocation
 failure for freed or in-use hugetlb
Date: Tue, 6 Feb 2024 11:08:11 +0800
Message-Id: <62890fd60b1ecd5bf1cdc476c973f60fe37aa0cb.1707181934.git.baolin.wang@linux.alibaba.com>

alloc_and_dissolve_hugetlb_folio() preallocates a new hugetlb page before
it takes hugetlb_lock.
In 3 out of 4 cases the page is not really used and therefore the newly
allocated page is just freed right away. This is wasteful and it might
cause premature failures in those cases.

Address that by moving the allocation down to the only case that needs it
(the hugetlb page is really in the free pages pool). We need to drop
hugetlb_lock to do so and therefore need to recheck the page state after
regaining it.

The patch is more of a cleanup than an actual fix to an existing
problem. There are no known reports about premature failures.

Signed-off-by: Baolin Wang
Acked-by: Michal Hocko
Reviewed-by: Muchun Song
---
Changes from v2:
 - Update the commit message as suggested by Michal.
 - Remove unnecessary comments.
Changes from v1:
 - Update the subject line per Muchun.
 - Move the allocation into the free hugetlb handling branch per Michal.
---
 mm/hugetlb.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d996fe4ecd9..a05507a2143f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3031,21 +3031,9 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 	int nid = folio_nid(old_folio);
-	struct folio *new_folio;
+	struct folio *new_folio = NULL;
 	int ret = 0;

-	/*
-	 * Before dissolving the folio, we need to allocate a new one for the
-	 * pool to remain stable. Here, we allocate the folio and 'prep' it
-	 * by doing everything but actually updating counters and adding to
-	 * the pool. This simplifies and let us do most of the processing
-	 * under the lock.
-	 */
-	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
-
 retry:
 	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(old_folio)) {
@@ -3075,6 +3063,16 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+		if (!new_folio) {
+			spin_unlock_irq(&hugetlb_lock);
+			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid,
+							      NULL, NULL);
+			if (!new_folio)
+				return -ENOMEM;
+			__prep_new_hugetlb_folio(h, new_folio);
+			goto retry;
+		}
+
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3102,9 +3100,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,

 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}

 	return ret;
 }
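
For readers who want to see the control flow in isolation, the "allocate
only in the free-page case, drop the lock to do so, then recheck" pattern
can be sketched in userspace. This is only an illustration and not kernel
code: struct pool, enum page_state, replace_free_page() and the nr_allocs
counter are hypothetical names, a pthread mutex stands in for hugetlb_lock,
malloc() stands in for alloc_buddy_hugetlb_folio(), the -EBUSY return for
the in-use case is an assumption of the sketch, and the transient
"free but not yet on the freelist" retry case is left out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical states standing in for the cases the function checks. */
enum page_state { PAGE_GONE, PAGE_IN_USE, PAGE_FREE };

struct pool {
	pthread_mutex_t lock;	/* stands in for hugetlb_lock */
	enum page_state state;	/* state of the old page, protected by lock */
	void *page;		/* the old page held by the pool */
};

/*
 * Replace the pooled page with a fresh one, allocating only when the old
 * page is really free. nr_allocs is caller-owned instrumentation that
 * counts how many replacement pages were allocated.
 */
static int replace_free_page(struct pool *p, size_t size, int *nr_allocs)
{
	void *new_page = NULL;
	int ret = 0;

retry:
	pthread_mutex_lock(&p->lock);
	if (p->state == PAGE_GONE) {
		/* Old page already left the pool: nothing to replace. */
	} else if (p->state == PAGE_IN_USE) {
		/* Old page is busy: report it, no allocation was needed. */
		ret = -EBUSY;
	} else {
		if (!new_page) {
			/* Allocate outside the lock, then recheck the state. */
			pthread_mutex_unlock(&p->lock);
			new_page = malloc(size);
			if (!new_page)
				return -ENOMEM;
			(*nr_allocs)++;
			goto retry;
		}
		/* Old page is still free after the recheck: swap it out. */
		free(p->page);
		p->page = new_page;
		new_page = NULL;
	}
	pthread_mutex_unlock(&p->lock);

	/* Only a page that was allocated but ended up unused is freed here. */
	if (new_page)
		free(new_page);
	return ret;
}
```

Because the state is rechecked from the top after the lock is retaken, a
page whose state changed while the lock was dropped takes one of the first
two branches, and the spare allocation is released on the way out.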