From patchwork Fri Aug 2 22:39:28 2019
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 11074219
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hillf Danton, Vlastimil Babka, Michal Hocko, Mel Gorman, Johannes Weiner, Andrea Arcangeli, David Rientjes, Andrew Morton, Mike Kravetz
Subject: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection
Date: Fri, 2 Aug 2019 15:39:28 -0700
Message-Id: <20190802223930.30971-2-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190802223930.30971-1-mike.kravetz@oracle.com>
References: <20190802223930.30971-1-mike.kravetz@oracle.com>

From: Hillf Danton

Address the issue of should_continue_reclaim() returning true too often for
__GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned. This could
happen during hugetlb page allocation, causing stalls for minutes or hours.

We can stop reclaiming pages if compaction reports it can make progress. A
code reshuffle is needed to do that. Stopping early has side-effects on
allocation latency in other cases, but continuing would come at the cost of
potential premature reclaim, which has consequences of its own. We can also
bail out of reclaiming pages if we know that there are not enough inactive
lru pages left to satisfy the costly allocation.

We can give up reclaiming pages too if we see a dryrun occur despite plenty
of inactive pages. IOW, with dryrun detected, we are sure we have reclaimed
as many pages as we could.
Cc: Mike Kravetz
Cc: Mel Gorman
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Johannes Weiner
Signed-off-by: Hillf Danton
Tested-by: Mike Kravetz
Acked-by: Mel Gorman
Acked-by: Vlastimil Babka
Signed-off-by: Vlastimil Babka
Acked-by: Mike Kravetz
---
 mm/vmscan.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 47aa2158cfac..a386c5351592 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2738,18 +2738,6 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 		return false;
 	}
 
-	/*
-	 * If we have not reclaimed enough pages for compaction and the
-	 * inactive lists are large enough, continue reclaiming
-	 */
-	pages_for_compaction = compact_gap(sc->order);
-	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
-	if (get_nr_swap_pages() > 0)
-		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
-	if (sc->nr_reclaimed < pages_for_compaction &&
-			inactive_lru_pages > pages_for_compaction)
-		return true;
-
 	/* If compaction would go ahead or the allocation would succeed, stop */
 	for (z = 0; z <= sc->reclaim_idx; z++) {
 		struct zone *zone = &pgdat->node_zones[z];
@@ -2765,7 +2753,21 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 			;
 		}
 	}
-	return true;
+
+	/*
+	 * If we have not reclaimed enough pages for compaction and the
+	 * inactive lists are large enough, continue reclaiming
+	 */
+	pages_for_compaction = compact_gap(sc->order);
+	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
+	if (get_nr_swap_pages() > 0)
+		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
+
+	return inactive_lru_pages > pages_for_compaction &&
+		/*
+		 * avoid dryrun with plenty of inactive pages
+		 */
+		nr_scanned && nr_reclaimed;
 }
 
 static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
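
[Editorial sketch, not part of the patch] The reordered decision above can be
modeled as a tiny standalone C program. This is illustrative only: the stub
below mirrors the names used by should_continue_reclaim(), but compaction
readiness and the LRU/scan counters are passed in as plain values instead of
being derived from pgdat and scan_control.

#include <stdbool.h>
#include <stdio.h>

static bool should_continue_reclaim(bool compaction_ready,
				    unsigned long inactive_lru_pages,
				    unsigned long pages_for_compaction,
				    unsigned long nr_scanned,
				    unsigned long nr_reclaimed)
{
	/* If compaction would go ahead or the allocation would succeed, stop. */
	if (compaction_ready)
		return false;

	/*
	 * Otherwise continue only while the inactive lists are large enough
	 * AND the last pass was not a dryrun, i.e. it both scanned and
	 * reclaimed at least one page.
	 */
	return inactive_lru_pages > pages_for_compaction &&
	       nr_scanned && nr_reclaimed;
}

int main(void)
{
	/* Dryrun: plenty of inactive pages but nothing scanned or reclaimed. */
	printf("%d\n", should_continue_reclaim(false, 1UL << 20, 512, 0, 0));
	/* Progress is still being made: keep reclaiming. */
	printf("%d\n", should_continue_reclaim(false, 1UL << 20, 512, 64, 32));
	return 0;
}

Compiled as ordinary C, this prints 0 for the dryrun case and 1 while progress
is still being made; before the patch both cases would have continued reclaim.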

From patchwork Fri Aug 2 22:39:29 2019
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 11074217
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hillf Danton, Vlastimil Babka, Michal Hocko, Mel Gorman, Johannes Weiner, Andrea Arcangeli, David Rientjes, Andrew Morton, Mike Kravetz
Subject: [PATCH 2/3] mm, compaction: raise compaction priority after it withdrawns
Date: Fri, 2 Aug 2019 15:39:29 -0700
Message-Id: <20190802223930.30971-3-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190802223930.30971-1-mike.kravetz@oracle.com>
References: <20190802223930.30971-1-mike.kravetz@oracle.com>

From: Vlastimil Babka

Mike Kravetz reports that "hugetlb allocations could stall for minutes or
hours when should_compact_retry() would return true more often than it
should. Specifically, this was in the case where compact_result was
COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no progress was being made."

The problem is that the compaction_withdrawn() test in should_compact_retry()
includes compaction outcomes that are only possible on low compaction
priority, and results in a retry without increasing the priority. This may
result in further reclaim and more incomplete compaction attempts.

With this patch, compaction priority is raised when possible, or
should_compact_retry() returns false.

The COMPACT_SKIPPED result doesn't really fit together with the other
outcomes in compaction_withdrawn(), as that's a result caused by insufficient
order-0 pages, not by low compaction priority. With this patch it is moved to
a new compaction_needs_reclaim() function, and for that outcome we keep the
current logic of retrying if it looks like reclaim will be able to help.

Reported-by: Mike Kravetz
Signed-off-by: Vlastimil Babka
Tested-by: Mike Kravetz
---
 include/linux/compaction.h | 22 +++++++++++++++++-----
 mm/page_alloc.c | 16 ++++++++++++----
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 9569e7c786d3..4b898cdbdf05 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -129,11 +129,8 @@ static inline bool compaction_failed(enum compact_result result)
 	return false;
 }
 
-/*
- * Compaction has backed off for some reason. It might be throttling or
- * lock contention. Retrying is still worthwhile.
- */
-static inline bool compaction_withdrawn(enum compact_result result)
+/* Compaction needs reclaim to be performed first, so it can continue. */
+static inline bool compaction_needs_reclaim(enum compact_result result)
 {
 	/*
 	 * Compaction backed off due to watermark checks for order-0
@@ -142,6 +139,16 @@ static inline bool compaction_withdrawn(enum compact_result result)
 	if (result == COMPACT_SKIPPED)
 		return true;
 
+	return false;
+}
+
+/*
+ * Compaction has backed off for some reason after doing some work or none
+ * at all. It might be throttling or lock contention. Retrying might be still
+ * worthwhile, but with a higher priority if allowed.
+ */
+static inline bool compaction_withdrawn(enum compact_result result)
+{
 	/*
 	 * If compaction is deferred for high-order allocations, it is
 	 * because sync compaction recently failed. If this is the case
@@ -207,6 +214,11 @@ static inline bool compaction_failed(enum compact_result result)
 	return false;
 }
 
+static inline bool compaction_needs_reclaim(enum compact_result result)
+{
+	return false;
+}
+
 static inline bool compaction_withdrawn(enum compact_result result)
 {
 	return true;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d3bb601c461b..af29c05e23aa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3965,15 +3965,23 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	if (compaction_failed(compact_result))
 		goto check_priority;
 
+	/*
+	 * compaction was skipped because there are not enough order-0 pages
+	 * to work with, so we retry only if it looks like reclaim can help.
+	 */
+	if (compaction_needs_reclaim(compact_result)) {
+		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
+		goto out;
+	}
+
 	/*
 	 * make sure the compaction wasn't deferred or didn't bail out early
 	 * due to locks contention before we declare that we should give up.
-	 * But do not retry if the given zonelist is not suitable for
-	 * compaction.
+	 * But the next retry should use a higher priority if allowed, so
+	 * we don't just keep bailing out endlessly.
 	 */
 	if (compaction_withdrawn(compact_result)) {
-		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
-		goto out;
+		goto check_priority;
 	}
 
 	/*
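
[Editorial sketch, not part of the patch] The resulting retry flow can be
modeled as a self-contained userspace program. The priority bookkeeping is a
deliberate simplification (a plain downward counter standing in for the
kernel's compact_priority levels), so treat this as a model of the control
flow, not of the kernel's actual types.

#include <stdbool.h>
#include <stdio.h>

enum compact_result {
	COMPACT_SKIPPED,		/* not enough order-0 pages; reclaim first */
	COMPACT_DEFERRED,		/* sync compaction recently failed */
	COMPACT_PARTIAL_SKIPPED,	/* compaction bailed out early */
	COMPACT_FAILED,
};

static bool compaction_needs_reclaim(enum compact_result r)
{
	return r == COMPACT_SKIPPED;
}

static bool compaction_withdrawn(enum compact_result r)
{
	return r == COMPACT_DEFERRED || r == COMPACT_PARTIAL_SKIPPED;
}

/*
 * Returns true when a retry is worthwhile. 'prio' counts down toward the
 * highest priority at 0; raising the priority here replaces the old
 * behaviour of retrying at the same priority indefinitely.
 */
static bool should_compact_retry(enum compact_result r, int *prio,
				 bool reclaim_can_help)
{
	if (compaction_needs_reclaim(r))
		return reclaim_can_help;	/* old logic, kept for SKIPPED */

	if (compaction_withdrawn(r) || r == COMPACT_FAILED) {
		if (*prio > 0) {
			(*prio)--;		/* retry at a higher priority */
			return true;
		}
		return false;			/* out of priorities: give up */
	}
	return false;
}

int main(void)
{
	int prio = 2;	/* lowest (async) priority */

	/* Before the patch this could loop indefinitely; now it terminates. */
	while (should_compact_retry(COMPACT_DEFERRED, &prio, true))
		printf("retrying compaction at priority %d\n", prio);
	return 0;
}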

From patchwork Fri Aug 2 22:39:30 2019
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 11074221
From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hillf Danton, Vlastimil Babka, Michal Hocko, Mel Gorman, Johannes Weiner, Andrea Arcangeli, David Rientjes, Andrew Morton, Mike Kravetz
Subject: [PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail
Date: Fri, 2 Aug 2019 15:39:30 -0700
Message-Id: <20190802223930.30971-4-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190802223930.30971-1-mike.kravetz@oracle.com>
References: <20190802223930.30971-1-mike.kravetz@oracle.com>

When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages, the pages
will be interleaved between all nodes of the system. If nodes are not equal
in size, it is quite possible for one node to fill up before the others. When
this happens, the code still attempts to allocate pages from the full node.
This results in calls to direct reclaim and compaction, which slow things
down considerably.

When allocating pool pages, note the state of the previous allocation for
each node. If the previous allocation failed, do not use the aggressive retry
algorithm on successive attempts. The allocation will still succeed if there
is memory available, but it will not try as hard to free up memory.

Signed-off-by: Mike Kravetz
---
 mm/hugetlb.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 76 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ede7e7f5d1ab..c707207e208f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1405,12 +1405,25 @@ pgoff_t __basepage_index(struct page *page)
 }
 
 static struct page *alloc_buddy_huge_page(struct hstate *h,
-		gfp_t gfp_mask, int nid, nodemask_t *nmask)
+		gfp_t gfp_mask, int nid, nodemask_t *nmask,
+		nodemask_t *node_alloc_noretry)
 {
 	int order = huge_page_order(h);
 	struct page *page;
+	bool alloc_try_hard = true;
 
-	gfp_mask |= __GFP_COMP|__GFP_RETRY_MAYFAIL|__GFP_NOWARN;
+	/*
+	 * By default we always try hard to allocate the page with
+	 * __GFP_RETRY_MAYFAIL flag. However, if we are allocating pages in
+	 * a loop (to adjust global huge page counts) and previous allocation
+	 * failed, do not continue to try hard on the same node. Use the
+	 * node_alloc_noretry bitmap to manage this state information.
+	 */
+	if (node_alloc_noretry && node_isset(nid, *node_alloc_noretry))
+		alloc_try_hard = false;
+	gfp_mask |= __GFP_COMP|__GFP_NOWARN;
+	if (alloc_try_hard)
+		gfp_mask |= __GFP_RETRY_MAYFAIL;
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 	page = __alloc_pages_nodemask(gfp_mask, order, nid, nmask);
@@ -1419,6 +1432,22 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
 	else
 		__count_vm_event(HTLB_BUDDY_PGALLOC_FAIL);
 
+	/*
+	 * If we did not specify __GFP_RETRY_MAYFAIL, but still got a page this
+	 * indicates an overall state change. Clear bit so that we resume
+	 * normal 'try hard' allocations.
+	 */
+	if (node_alloc_noretry && page && !alloc_try_hard)
+		node_clear(nid, *node_alloc_noretry);
+
+	/*
+	 * If we tried hard to get a page but failed, set bit so that
+	 * subsequent attempts will not try as hard until there is an
+	 * overall state change.
+	 */
+	if (node_alloc_noretry && !page && alloc_try_hard)
+		node_set(nid, *node_alloc_noretry);
+
 	return page;
 }
 
@@ -1427,7 +1456,8 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
  * should use this function to get new hugetlb pages
  */
 static struct page *alloc_fresh_huge_page(struct hstate *h,
-		gfp_t gfp_mask, int nid, nodemask_t *nmask)
+		gfp_t gfp_mask, int nid, nodemask_t *nmask,
+		nodemask_t *node_alloc_noretry)
 {
 	struct page *page;
 
@@ -1435,7 +1465,7 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 		page = alloc_gigantic_page(h, gfp_mask, nid, nmask);
 	else
 		page = alloc_buddy_huge_page(h, gfp_mask,
-				nid, nmask);
+				nid, nmask, node_alloc_noretry);
 	if (!page)
 		return NULL;
@@ -1450,14 +1480,16 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
  * Allocates a fresh page to the hugetlb allocator pool in the node interleaved
  * manner.
  */
-static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
+static int alloc_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
+				nodemask_t *node_alloc_noretry)
 {
 	struct page *page;
 	int nr_nodes, node;
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 
 	for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
-		page = alloc_fresh_huge_page(h, gfp_mask, node, nodes_allowed);
+		page = alloc_fresh_huge_page(h, gfp_mask, node, nodes_allowed,
+						node_alloc_noretry);
 		if (page)
 			break;
 	}
@@ -1601,7 +1633,7 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
 		goto out_unlock;
 	spin_unlock(&hugetlb_lock);
 
-	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask);
+	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
 	if (!page)
 		return NULL;
@@ -1637,7 +1669,7 @@ struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
 	if (hstate_is_gigantic(h))
 		return NULL;
 
-	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask);
+	page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
 	if (!page)
 		return NULL;
@@ -2207,13 +2239,31 @@ static void __init gather_bootmem_prealloc(void)
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
+	nodemask_t *node_alloc_noretry;
+
+	if (!hstate_is_gigantic(h)) {
+		/*
+		 * bit mask controlling how hard we retry per-node
+		 * allocations.
+		 */
+		node_alloc_noretry = kmalloc(sizeof(*node_alloc_noretry),
+						GFP_KERNEL | __GFP_NORETRY);
+	} else {
+		/* allocations done at boot time */
+		node_alloc_noretry = NULL;
+	}
+
+	/* bit mask controlling how hard we retry per-node allocations */
+	if (node_alloc_noretry)
+		nodes_clear(*node_alloc_noretry);
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
 			if (!alloc_bootmem_huge_page(h))
 				break;
 		} else if (!alloc_pool_huge_page(h,
-					 &node_states[N_MEMORY]))
+					 &node_states[N_MEMORY],
+					 node_alloc_noretry))
 			break;
 		cond_resched();
 	}
@@ -2225,6 +2275,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 			h->max_huge_pages, buf, i);
 		h->max_huge_pages = i;
 	}
+
+	kfree(node_alloc_noretry);
 }
 
 static void __init hugetlb_init_hstates(void)
@@ -2323,6 +2375,14 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 				nodemask_t *nodes_allowed)
 {
 	unsigned long min_count, ret;
+	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry,
+			GFP_KERNEL | __GFP_NORETRY);
+
+	/* bit mask controlling how hard we retry per-node allocations */
+	if (node_alloc_noretry)
+		nodes_clear(*node_alloc_noretry);
+	else
+		return -ENOMEM;
 
 	spin_lock(&hugetlb_lock);
@@ -2356,6 +2416,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	if (hstate_is_gigantic(h) && !IS_ENABLED(CONFIG_CONTIG_ALLOC)) {
 		if (count > persistent_huge_pages(h)) {
 			spin_unlock(&hugetlb_lock);
+			if (node_alloc_noretry)
+				NODEMASK_FREE(node_alloc_noretry);
 			return -EINVAL;
 		}
 		/* Fall through to decrease pool */
@@ -2388,7 +2450,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 		/* yield cpu to avoid soft lockup */
 		cond_resched();
 
-		ret = alloc_pool_huge_page(h, nodes_allowed);
+		ret = alloc_pool_huge_page(h, nodes_allowed,
+						node_alloc_noretry);
 		spin_lock(&hugetlb_lock);
 		if (!ret)
 			goto out;
@@ -2429,6 +2492,9 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	h->max_huge_pages = persistent_huge_pages(h);
 	spin_unlock(&hugetlb_lock);
 
+	if (node_alloc_noretry)
+		NODEMASK_FREE(node_alloc_noretry);
+
 	return 0;
 }
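
[Editorial sketch, not part of the patch] The set/clear protocol for
node_alloc_noretry can be condensed into a userspace sketch. The nodemask
helpers are reimplemented here as plain bit operations and the allocator is a
stub that pretends node 1 is full, so this demonstrates only the bookkeeping,
not real allocation behavior.

#include <stdbool.h>
#include <stdio.h>

typedef struct { unsigned long bits; } nodemask_t;

static bool node_isset(int nid, nodemask_t *m) { return m->bits & (1UL << nid); }
static void node_set(int nid, nodemask_t *m)   { m->bits |= 1UL << nid; }
static void node_clear(int nid, nodemask_t *m) { m->bits &= ~(1UL << nid); }

/* Stub allocator: pretend node 1 is full, so allocations there fail. */
static bool stub_alloc(int nid, bool try_hard) { (void)try_hard; return nid != 1; }

static bool alloc_pool_page(int nid, nodemask_t *noretry)
{
	bool try_hard = !node_isset(nid, noretry);
	bool ok = stub_alloc(nid, try_hard);

	if (ok && !try_hard)
		node_clear(nid, noretry);	/* state changed: try hard again */
	if (!ok && try_hard)
		node_set(nid, noretry);		/* stop trying hard on this node */
	return ok;
}

int main(void)
{
	nodemask_t noretry = { 0 };

	for (int round = 0; round < 2; round++)
		for (int nid = 0; nid < 2; nid++)
			printf("node %d: %s (try hard next time: %s)\n", nid,
			       alloc_pool_page(nid, &noretry) ? "ok" : "fail",
			       node_isset(nid, &noretry) ? "no" : "yes");
	return 0;
}

In the kernel the same protocol means a node whose hard (__GFP_RETRY_MAYFAIL)
allocation failed is subsequently probed only with fast-failing allocations,
until a success on that node clears its bit again.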