From patchwork Wed May 23 08:26:24 2018
X-Patchwork-Submitter: "Huang, Ying" <ying.huang@intel.com>
X-Patchwork-Id: 10420629
From: "Huang, Ying" <ying.huang@intel.com>
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
	"Kirill A. Shutemov", Andrea Arcangeli, Michal Hocko,
	Johannes Weiner, Shaohua Li, Hugh Dickins, Minchan Kim,
	Rik van Riel, Dave Hansen, Naoya Horiguchi, Zi Yan
Subject: [PATCH -mm -V3 20/21] mm, THP, swap: create PMD swap mapping when unmap the THP
Date: Wed, 23 May 2018 16:26:24 +0800
Message-Id: <20180523082625.6897-21-ying.huang@intel.com>
X-Mailer: git-send-email 2.16.1
In-Reply-To: <20180523082625.6897-1-ying.huang@intel.com>
References: <20180523082625.6897-1-ying.huang@intel.com>

From: Huang Ying <ying.huang@intel.com>

This is the final step of the THP swapin support.  When reclaiming an
anonymous THP, after allocating a huge swap cluster and adding the THP
into the swap cache, the PMD page mapping is changed to a mapping to the
swap space.  Previously, the PMD page mapping was split before being
changed.  In this patch, the unmap code is enhanced not to split the PMD
mapping, but to create a PMD swap mapping to replace it instead.
So later, when the SWAP_HAS_CACHE flag is cleared in the last step of
swapout, the huge swap cluster is kept instead of being split, and on
swapin the huge swap cluster is read as a whole into a THP.  That is, the
THP is not split during swapout/swapin.  This eliminates the
splitting/collapsing overhead and reduces the page fault count, etc.  More
importantly, THP utilization improves greatly: many more THPs are kept
when swapping is used, so that we can take full advantage of THP,
including its high swapout/swapin performance.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
---
 include/linux/huge_mm.h | 11 +++++++++++
 mm/huge_memory.c        | 30 ++++++++++++++++++++++++++++++
 mm/rmap.c               | 43 +++++++++++++++++++++++++++++++++++++++++--
 mm/vmscan.c             |  6 +-----
 4 files changed, 83 insertions(+), 7 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5001c28b3d18..d03fcddcc42d 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -404,6 +404,8 @@ static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+struct page_vma_mapped_walk;
+
 #ifdef CONFIG_THP_SWAP
 extern void __split_huge_swap_pmd(struct vm_area_struct *vma,
 				  unsigned long haddr,
@@ -411,6 +413,8 @@ extern void __split_huge_swap_pmd(struct vm_area_struct *vma,
 extern int split_huge_swap_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			       unsigned long address, pmd_t orig_pmd);
 extern int do_huge_pmd_swap_page(struct vm_fault *vmf, pmd_t orig_pmd);
+extern bool set_pmd_swap_entry(struct page_vma_mapped_walk *pvmw,
+			       struct page *page, unsigned long address, pmd_t pmdval);
 
 static inline bool transparent_hugepage_swapin_enabled(
 	struct vm_area_struct *vma)
@@ -452,6 +456,13 @@ static inline int do_huge_pmd_swap_page(struct vm_fault *vmf, pmd_t orig_pmd)
 	return 0;
 }
 
+static inline bool set_pmd_swap_entry(struct page_vma_mapped_walk *pvmw,
+				      struct page *page, unsigned long address,
+				      pmd_t pmdval)
+{
+	return false;
+}
+
 static inline bool transparent_hugepage_swapin_enabled(
 	struct vm_area_struct *vma)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e80d03c2412a..88984e95b9b2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1876,6 +1876,36 @@ int do_huge_pmd_swap_page(struct vm_fault *vmf, pmd_t orig_pmd)
 	count_vm_event(THP_SWPIN_FALLBACK);
 	goto fallback;
 }
+
+bool set_pmd_swap_entry(struct page_vma_mapped_walk *pvmw, struct page *page,
+			unsigned long address, pmd_t pmdval)
+{
+	struct vm_area_struct *vma = pvmw->vma;
+	struct mm_struct *mm = vma->vm_mm;
+	pmd_t swp_pmd;
+	swp_entry_t entry = { .val = page_private(page) };
+
+	if (swap_duplicate(&entry, true) < 0) {
+		set_pmd_at(mm, address, pvmw->pmd, pmdval);
+		return false;
+	}
+	if (list_empty(&mm->mmlist)) {
+		spin_lock(&mmlist_lock);
+		if (list_empty(&mm->mmlist))
+			list_add(&mm->mmlist, &init_mm.mmlist);
+		spin_unlock(&mmlist_lock);
+	}
+	add_mm_counter(mm, MM_ANONPAGES, -HPAGE_PMD_NR);
+	add_mm_counter(mm, MM_SWAPENTS, HPAGE_PMD_NR);
+	swp_pmd = swp_entry_to_pmd(entry);
+	if (pmd_soft_dirty(pmdval))
+		swp_pmd = pmd_swp_mksoft_dirty(swp_pmd);
+	set_pmd_at(mm, address, pvmw->pmd, swp_pmd);
+
+	page_remove_rmap(page, true);
+	put_page(page);
+	return true;
+}
 #endif
 
 static inline void zap_deposited_table(struct mm_struct *mm, pmd_t *pmd)
diff --git a/mm/rmap.c b/mm/rmap.c
index 5f45d6325c40..4861b1a86e2a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1402,12 +1402,51 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			continue;
 		}
 
+		address = pvmw.address;
+
+#ifdef CONFIG_THP_SWAP
+		/* PMD-mapped THP swap entry */
+		if (thp_swap_supported() && !pvmw.pte && PageAnon(page)) {
+			pmd_t pmdval;
+
+			VM_BUG_ON_PAGE(PageHuge(page) ||
+				       !PageTransCompound(page), page);
+
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
+			mmu_notifier_invalidate_range_start(mm, address,
+					address + HPAGE_PMD_SIZE);
+			if (should_defer_flush(mm, flags)) {
+				/* check comments for PTE below */
+				pmdval = pmdp_huge_get_and_clear(mm, address,
+								 pvmw.pmd);
+				set_tlb_ubc_flush_pending(mm,
+							  pmd_dirty(pmdval));
+			} else
+				pmdval = pmdp_huge_clear_flush(vma, address,
+							       pvmw.pmd);
+
+			/*
+			 * Move the dirty bit to the page. Now the pmd
+			 * is gone.
+			 */
+			if (pmd_dirty(pmdval))
+				set_page_dirty(page);
+
+			/* Update high watermark before we lower rss */
+			update_hiwater_rss(mm);
+
+			ret = set_pmd_swap_entry(&pvmw, page, address, pmdval);
+			mmu_notifier_invalidate_range_end(mm, address,
+					address + HPAGE_PMD_SIZE);
+			continue;
+		}
+#endif
+
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_PAGE(!pvmw.pte, page);
 
 		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
-		address = pvmw.address;
-
 		if (IS_ENABLED(CONFIG_MIGRATION) &&
 		    (flags & TTU_MIGRATION) &&
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 50055d72f294..9f46047d4dee 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1148,11 +1148,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * processes. Try to unmap it here.
 		 */
 		if (page_mapped(page)) {
-			enum ttu_flags flags = ttu_flags | TTU_BATCH_FLUSH;
-
-			if (unlikely(PageTransHuge(page)))
-				flags |= TTU_SPLIT_HUGE_PMD;
-			if (!try_to_unmap(page, flags)) {
+			if (!try_to_unmap(page, ttu_flags | TTU_BATCH_FLUSH)) {
 				nr_unmap_fail++;
 				goto activate_locked;
 			}
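
For reference, below is a minimal, userspace-only sketch of the PMD swap
mapping construction that set_pmd_swap_entry() performs in the diff above:
pack a swap entry into a non-present PMD value and carry the soft-dirty bit
over (swp_entry_to_pmd() plus pmd_swp_mksoft_dirty()).  The bit layout, the
constants, and the bodies of the helpers here are made-up stand-ins for
illustration only; the real encoding is architecture specific and lives in
the kernel's pgtable headers.

/*
 * Simplified model of the PMD swap mapping built by set_pmd_swap_entry().
 * The encoding below is hypothetical; it only mirrors the shape of the
 * kernel helpers (swp_entry_to_pmd(), pmd_swp_mksoft_dirty()) used above.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SWP_TYPE_BITS		5		/* hypothetical layout */
#define PMD_SWP_SOFT_DIRTY	(1ULL << 0)	/* hypothetical flag bit */

typedef struct { uint64_t val; } swp_entry_t;
typedef struct { uint64_t pmd; } pmd_t;

/* Pack a (type, offset) pair into a swap entry, like swp_entry(). */
static swp_entry_t swp_entry(unsigned int type, uint64_t offset)
{
	swp_entry_t e = { .val = (offset << SWP_TYPE_BITS) | type };
	return e;
}

/* Model of swp_entry_to_pmd(): store the entry in a non-present PMD. */
static pmd_t swp_entry_to_pmd(swp_entry_t entry)
{
	pmd_t p = { .pmd = entry.val << 1 };	/* bit 0 reserved for soft-dirty */
	return p;
}

/* Model of pmd_swp_mksoft_dirty(): mark the swap PMD soft-dirty. */
static pmd_t pmd_swp_mksoft_dirty(pmd_t p)
{
	p.pmd |= PMD_SWP_SOFT_DIRTY;
	return p;
}

int main(void)
{
	/* One swap entry covers the whole THP (HPAGE_PMD_NR subpages). */
	swp_entry_t entry = swp_entry(1, 0x1234);
	bool was_soft_dirty = true;	/* stands in for pmd_soft_dirty(pmdval) */
	pmd_t swp_pmd;

	swp_pmd = swp_entry_to_pmd(entry);
	if (was_soft_dirty)
		swp_pmd = pmd_swp_mksoft_dirty(swp_pmd);

	printf("PMD swap mapping word: %#llx\n", (unsigned long long)swp_pmd.pmd);
	return 0;
}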