From patchwork Mon Oct 5 12:15:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11816567 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 08C0C112E for ; Mon, 5 Oct 2020 12:18:06 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C2B022078D for ; Mon, 5 Oct 2020 12:18:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="YgrQJJ2T" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C2B022078D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from list by lists.xenproject.org with outflank-mailman.2994.8606 (Exim 4.92) (envelope-from ) id 1kPPQ5-00073U-CV; Mon, 05 Oct 2020 12:16:45 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 2994.8606; Mon, 05 Oct 2020 12:16:45 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1kPPQ5-00073K-8u; Mon, 05 Oct 2020 12:16:45 +0000 Received: by outflank-mailman (input) for mailman id 2994; Mon, 05 Oct 2020 12:16:44 +0000 Received: from us1-rack-iad1.inumbo.com ([172.99.69.81]) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1kPPQ4-00072n-68 for xen-devel@lists.xenproject.org; Mon, 05 Oct 2020 12:16:44 +0000 Received: from us-smtp-delivery-124.mimecast.com (unknown [63.128.21.124]) by us1-rack-iad1.inumbo.com (Halon) with ESMTP id 39e3cea2-e6a0-4dbe-95a5-5909db546898; Mon, 05 Oct 2020 12:16:43 +0000 (UTC) Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-536-hdgD0PM4N7SehO3ysGnyiw-1; Mon, 05 Oct 2020 08:16:39 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 547CE18A8220; Mon, 5 Oct 2020 12:16:36 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-222.ams2.redhat.com [10.36.114.222]) by smtp.corp.redhat.com (Postfix) with ESMTP id C518D1A8EC; Mon, 5 Oct 2020 12:16:23 +0000 (UTC) Received: from us1-rack-iad1.inumbo.com ([172.99.69.81]) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1kPPQ4-00072n-68 for xen-devel@lists.xenproject.org; Mon, 05 Oct 2020 12:16:44 +0000 X-Inumbo-ID: 39e3cea2-e6a0-4dbe-95a5-5909db546898 Received: from us-smtp-delivery-124.mimecast.com (unknown [63.128.21.124]) by us1-rack-iad1.inumbo.com (Halon) with ESMTP id 39e3cea2-e6a0-4dbe-95a5-5909db546898; Mon, 05 Oct 2020 12:16:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1601900202; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=I03+HqTTiEYZ2UJ8Hm8w9UAbNI08TTVr15820ukCo2o=; b=YgrQJJ2TPmJuYA5hEQoZlIneITREmdjrt/71x9DMcH45Lw7P+QzzAwe4Lhk9kWMt3MEm/8 ApfXRmJFTf8WUNGWcj3CGnNtMD9qKbjNUMP0Yrr47JYFNiaW7nPWN5HB6PJWnxe6blfDvD eODvi9TKj7y8wTSd+3opgvw3sN3KoeY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-536-hdgD0PM4N7SehO3ysGnyiw-1; Mon, 05 Oct 2020 08:16:39 -0400 X-MC-Unique: hdgD0PM4N7SehO3ysGnyiw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 547CE18A8220; Mon, 5 Oct 2020 12:16:36 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-222.ams2.redhat.com [10.36.114.222]) by smtp.corp.redhat.com (Postfix) with ESMTP id C518D1A8EC; Mon, 5 Oct 2020 12:16:23 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, Andrew Morton , Matthew Wilcox , David Hildenbrand , Vlastimil Babka , Oscar Salvador , Pankaj Gupta , Wei Yang , Michal Hocko , Alexander Duyck , Mel Gorman , Michal Hocko , Dave Hansen , Mike Rapoport , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Wei Liu Subject: [PATCH v2 4/5] mm/page_alloc: place pages to tail in __free_pages_core() Date: Mon, 5 Oct 2020 14:15:33 +0200 Message-Id: <20201005121534.15649-5-david@redhat.com> In-Reply-To: <20201005121534.15649-1-david@redhat.com> References: <20201005121534.15649-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 __free_pages_core() is used when exposing fresh memory to the buddy during system boot and when onlining memory in generic_online_page(). generic_online_page() is used in two cases: 1. Direct memory onlining in online_pages(). 2. Deferred memory onlining in memory-ballooning-like mechanisms (HyperV balloon and virtio-mem), when parts of a section are kept fake-offline to be fake-onlined later on. In 1, we already place pages to the tail of the freelist. Pages will be freed to MIGRATE_ISOLATE lists first and moved to the tail of the freelists via undo_isolate_page_range(). In 2, we currently don't implement a proper rule. In case of virtio-mem, where we currently always online MAX_ORDER - 1 pages, the pages will be placed to the HEAD of the freelist - undesireable. While the hyper-v balloon calls generic_online_page() with single pages, usually it will call it on successive single pages in a larger block. The pages are fresh, so place them to the tail of the freelist and avoid the PCP. In __free_pages_core(), remove the now superflouos call to set_page_refcounted() and add a comment regarding page initialization and the refcount. Note: In 2. we currently don't shuffle. If ever relevant (page shuffling is usually of limited use in virtualized environments), we might want to shuffle after a sequence of generic_online_page() calls in the relevant callers. Reviewed-by: Vlastimil Babka Reviewed-by: Oscar Salvador Acked-by: Pankaj Gupta Reviewed-by: Wei Yang Acked-by: Michal Hocko Cc: Andrew Morton Cc: Alexander Duyck Cc: Mel Gorman Cc: Michal Hocko Cc: Dave Hansen Cc: Vlastimil Babka Cc: Wei Yang Cc: Oscar Salvador Cc: Mike Rapoport Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Wei Liu Signed-off-by: David Hildenbrand --- mm/page_alloc.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index b187e46cf640..3dadcc6d4009 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -275,7 +275,8 @@ bool pm_suspended_storage(void) unsigned int pageblock_order __read_mostly; #endif -static void __free_pages_ok(struct page *page, unsigned int order); +static void __free_pages_ok(struct page *page, unsigned int order, + fpi_t fpi_flags); /* * results with 256, 32 in the lowmem_reserve sysctl: @@ -687,7 +688,7 @@ static void bad_page(struct page *page, const char *reason) void free_compound_page(struct page *page) { mem_cgroup_uncharge(page); - __free_pages_ok(page, compound_order(page)); + __free_pages_ok(page, compound_order(page), FPI_NONE); } void prep_compound_page(struct page *page, unsigned int order) @@ -1423,14 +1424,14 @@ static void free_pcppages_bulk(struct zone *zone, int count, static void free_one_page(struct zone *zone, struct page *page, unsigned long pfn, unsigned int order, - int migratetype) + int migratetype, fpi_t fpi_flags) { spin_lock(&zone->lock); if (unlikely(has_isolate_pageblock(zone) || is_migrate_isolate(migratetype))) { migratetype = get_pfnblock_migratetype(page, pfn); } - __free_one_page(page, pfn, zone, order, migratetype, FPI_NONE); + __free_one_page(page, pfn, zone, order, migratetype, fpi_flags); spin_unlock(&zone->lock); } @@ -1508,7 +1509,8 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end) } } -static void __free_pages_ok(struct page *page, unsigned int order) +static void __free_pages_ok(struct page *page, unsigned int order, + fpi_t fpi_flags) { unsigned long flags; int migratetype; @@ -1520,7 +1522,8 @@ static void __free_pages_ok(struct page *page, unsigned int order) migratetype = get_pfnblock_migratetype(page, pfn); local_irq_save(flags); __count_vm_events(PGFREE, 1 << order); - free_one_page(page_zone(page), page, pfn, order, migratetype); + free_one_page(page_zone(page), page, pfn, order, migratetype, + fpi_flags); local_irq_restore(flags); } @@ -1530,6 +1533,11 @@ void __free_pages_core(struct page *page, unsigned int order) struct page *p = page; unsigned int loop; + /* + * When initializing the memmap, __init_single_page() sets the refcount + * of all pages to 1 ("allocated"/"not free"). We have to set the + * refcount of all involved pages to 0. + */ prefetchw(p); for (loop = 0; loop < (nr_pages - 1); loop++, p++) { prefetchw(p + 1); @@ -1540,8 +1548,12 @@ void __free_pages_core(struct page *page, unsigned int order) set_page_count(p, 0); atomic_long_add(nr_pages, &page_zone(page)->managed_pages); - set_page_refcounted(page); - __free_pages(page, order); + + /* + * Bypass PCP and place fresh pages right to the tail, primarily + * relevant for memory onlining. + */ + __free_pages_ok(page, order, FPI_TO_TAIL); } #ifdef CONFIG_NEED_MULTIPLE_NODES @@ -3168,7 +3180,8 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn) */ if (migratetype >= MIGRATE_PCPTYPES) { if (unlikely(is_migrate_isolate(migratetype))) { - free_one_page(zone, page, pfn, 0, migratetype); + free_one_page(zone, page, pfn, 0, migratetype, + FPI_NONE); return; } migratetype = MIGRATE_MOVABLE; @@ -4991,7 +5004,7 @@ static inline void free_the_page(struct page *page, unsigned int order) if (order == 0) /* Via pcp? */ free_unref_page(page); else - __free_pages_ok(page, order); + __free_pages_ok(page, order, FPI_NONE); } void __free_pages(struct page *page, unsigned int order)