From patchwork Tue Jun 4 10:17:45 2024
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13685017
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
 ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com,
 shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com,
 da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/6] mm: memory: extend finish_fault() to support large folio
Date: Tue, 4 Jun 2024 18:17:45 +0800

Add large folio mapping establishment support to finish_fault() in
preparation for supporting multi-size THP allocation of anonymous
shmem pages in the following patches.

Keep the same behavior (per-page fault) for non-anon shmem to avoid
inflating the RSS unintentionally; we can discuss what size of mapping
to build when extending mTHP to control non-anon shmem in the future.
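For intuition, the fallback condition in the hunk below can be checked with
plain arithmetic: the whole folio is mapped in one go only if every one of its
pages falls inside the faulting VMA. A minimal userspace sketch of that check
(illustration only, not part of the patch; all values are made up):

#include <stdio.h>

/* Hypothetical demo of the finish_fault() range check: a large folio
 * may be mapped in one go only if it fits entirely inside the VMA.
 * All numbers below are illustrative, not taken from the kernel.
 */
int main(void)
{
	unsigned long nr_pages = 16;   /* a 64K folio with 4K pages */
	unsigned long idx = 3;         /* faulting page's index inside the folio */
	unsigned long vma_off = 5;     /* page offset of the fault address in the VMA */
	unsigned long vma_pages = 512; /* VMA length in pages */

	if (vma_off < idx || vma_off + (nr_pages - idx) > vma_pages)
		printf("folio crosses the VMA: fall back to a per-page fault\n");
	else
		printf("whole folio fits: map all %lu pages at once\n", nr_pages);
	return 0;
}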
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/memory.c | 57 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index eef4e482c0c2..1f7be4c6aac4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4831,9 +4831,12 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct page *page;
+	struct folio *folio;
 	vm_fault_t ret;
 	bool is_cow = (vmf->flags & FAULT_FLAG_WRITE) &&
 		      !(vma->vm_flags & VM_SHARED);
+	int type, nr_pages, i;
+	unsigned long addr = vmf->address;

 	/* Did we COW the page? */
 	if (is_cow)
@@ -4864,24 +4867,58 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 			return VM_FAULT_OOM;
 	}

+	folio = page_folio(page);
+	nr_pages = folio_nr_pages(folio);
+
+	/*
+	 * Using per-page fault to maintain the uffd semantics, and same
+	 * approach also applies to non-anonymous-shmem faults to avoid
+	 * inflating the RSS of the process.
+	 */
+	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma))) {
+		nr_pages = 1;
+	} else if (nr_pages > 1) {
+		pgoff_t idx = folio_page_idx(folio, page);
+		/* The page offset of vmf->address within the VMA. */
+		pgoff_t vma_off = vmf->pgoff - vmf->vma->vm_pgoff;
+
+		/*
+		 * Fallback to per-page fault in case the folio size in page
+		 * cache beyond the VMA limits.
+		 */
+		if (unlikely(vma_off < idx ||
+			     vma_off + (nr_pages - idx) > vma_pages(vma))) {
+			nr_pages = 1;
+		} else {
+			/* Now we can set mappings for the whole large folio. */
+			addr = vmf->address - idx * PAGE_SIZE;
+			page = &folio->page;
+		}
+	}
+
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-				       vmf->address, &vmf->ptl);
+				       addr, &vmf->ptl);
 	if (!vmf->pte)
 		return VM_FAULT_NOPAGE;

 	/* Re-check under ptl */
-	if (likely(!vmf_pte_changed(vmf))) {
-		struct folio *folio = page_folio(page);
-		int type = is_cow ? MM_ANONPAGES : mm_counter_file(folio);
-
-		set_pte_range(vmf, folio, page, 1, vmf->address);
-		add_mm_counter(vma->vm_mm, type, 1);
-		ret = 0;
-	} else {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
+	if (nr_pages == 1 && unlikely(vmf_pte_changed(vmf))) {
+		update_mmu_tlb(vma, addr, vmf->pte);
+		ret = VM_FAULT_NOPAGE;
+		goto unlock;
+	} else if (nr_pages > 1 && !pte_range_none(vmf->pte, nr_pages)) {
+		update_mmu_tlb_range(vma, addr, vmf->pte, nr_pages);
 		ret = VM_FAULT_NOPAGE;
+		goto unlock;
 	}

+	folio_ref_add(folio, nr_pages - 1);
+	set_pte_range(vmf, folio, page, nr_pages, addr);
+	type = is_cow ? MM_ANONPAGES : mm_counter_file(folio);
+	add_mm_counter(vma->vm_mm, type, nr_pages);
+	ret = 0;
+
+unlock:
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return ret;
 }
From patchwork Tue Jun 4 10:17:46 2024
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13685016
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v4 2/6] mm: shmem: add THP validation for PMD-mapped THP related statistics
Date: Tue, 4 Jun 2024 18:17:46 +0800
Message-Id: <337ad58a839cbdef4ecd446c22ffcf8c9dcfd9af.1717495894.git.baolin.wang@linux.alibaba.com>

In order to extend support for mTHP, add THP validation for the
PMD-mapped THP related statistics: only bump them when the folio is
really PMD-sized, to avoid statistical confusion.
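The THP_FILE_* events gated by this patch are exported through /proc/vmstat,
so the effect is easy to observe from userspace. A minimal watcher sketch
(illustrative only, not part of the patch):

#include <stdio.h>
#include <string.h>

/* Print the thp_file_* shmem THP counters that this patch now only
 * bumps for genuine PMD-sized allocations.
 */
int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f) {
		perror("/proc/vmstat");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "thp_file_", 9))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}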
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Barry Song
---
 mm/shmem.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 6868c0af3a69..ae358efc397a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1647,7 +1647,7 @@ static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
 			return ERR_PTR(-E2BIG);

 		folio = shmem_alloc_folio(gfp, HPAGE_PMD_ORDER, info, index);
-		if (!folio)
+		if (!folio && pages == HPAGE_PMD_NR)
 			count_vm_event(THP_FILE_FALLBACK);
 	} else {
 		pages = 1;
@@ -1665,7 +1665,7 @@ static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
 		if (xa_find(&mapping->i_pages, &index,
 				index + pages - 1, XA_PRESENT)) {
 			error = -EEXIST;
-		} else if (huge) {
+		} else if (pages == HPAGE_PMD_NR) {
 			count_vm_event(THP_FILE_FALLBACK);
 			count_vm_event(THP_FILE_FALLBACK_CHARGE);
 		}
@@ -2031,7 +2031,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 		folio = shmem_alloc_and_add_folio(huge_gfp,
 				inode, index, fault_mm, true);
 		if (!IS_ERR(folio)) {
-			count_vm_event(THP_FILE_ALLOC);
+			if (folio_test_pmd_mappable(folio))
+				count_vm_event(THP_FILE_ALLOC);
 			goto alloced;
 		}
 		if (PTR_ERR(folio) == -EEXIST)
From patchwork Tue Jun 4 10:17:47 2024
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13685015
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v4 3/6] mm: shmem: add multi-size THP sysfs interface for anonymous shmem
Date: Tue, 4 Jun 2024 18:17:47 +0800
Message-Id: <119966ae28bf2e2d362ae3d369ac1a1cd27ba866.1717495894.git.baolin.wang@linux.alibaba.com>

To support the use of mTHP with anonymous shmem, add a new sysfs
interface 'shmem_enabled' in the
'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/' directory for
each mTHP size, to control whether shmem is enabled for that size. Its
value is similar to the top-level 'shmem_enabled' and can be set to:
"always", "inherit (to inherit the top-level setting)", "within_size",
"advise", "never". The 'inherit' option is added to ensure compatibility
with the global settings, and the options 'force' and 'deny' are
dropped, as they are testing artifacts from the old days.

By default, PMD-sized hugepages have enabled="inherit" and all other
hugepage sizes have enabled="never" in
'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled'.

In addition, if the top-level value is 'force', only PMD-sized hugepages
may have enabled="inherit"; any other combination is rejected, and
conversely a non-PMD size cannot be set to 'inherit' while the top level
is 'force'. This means non-PMD-sized THP can no longer override the
global huge allocation policy.
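As a usage illustration, the new knob is driven like any other sysfs
attribute. A minimal sketch, assuming a kernel with this series applied and
that the hugepages-64kB and hugepages-2048kB directories exist on the running
architecture (both sizes are assumptions for the example):

#include <stdio.h>

/* Hypothetical example: enable 64K mTHP for anonymous shmem and let
 * the PMD size (2M on x86-64) follow the top-level shmem_enabled.
 * Paths follow the hugepages-<size>kB layout described above.
 */
static int write_knob(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%s\n", val);
	return fclose(f);
}

int main(void)
{
	const char *base = "/sys/kernel/mm/transparent_hugepage";
	char path[256];

	snprintf(path, sizeof(path), "%s/hugepages-64kB/shmem_enabled", base);
	if (write_knob(path, "always"))
		perror(path);

	snprintf(path, sizeof(path), "%s/hugepages-2048kB/shmem_enabled", base);
	if (write_knob(path, "inherit"))
		perror(path);
	return 0;
}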
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 Documentation/admin-guide/mm/transhuge.rst | 23 ++++++
 include/linux/huge_mm.h                    | 10 +++
 mm/huge_memory.c                           | 11 +--
 mm/shmem.c                                 | 96 ++++++++++++++++++++++
 4 files changed, 132 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index d414d3f5592a..b76d15e408b3 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -332,6 +332,29 @@ deny
 force
     Force the huge option on for all - very useful for testing;

+Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to control
+mTHP allocation: '/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled',
+and its value for each mTHP is essentially consistent with the global setting.
+An 'inherit' option is added to ensure compatibility with these global settings.
+Conversely, the options 'force' and 'deny' are dropped, which are rather testing
+artifacts from the old ages.
+
+always
+    Attempt to allocate huge pages every time we need a new page;
+
+inherit
+    Inherit the top-level "shmem_enabled" value. By default, PMD-sized hugepages
+    have enabled="inherit" and all other hugepage sizes have enabled="never";
+
+never
+    Do not allocate huge pages;
+
+within_size
+    Only allocate huge page if it will be fully within i_size.
+    Also respect fadvise()/madvise() hints;
+
+advise
+    Only allocate huge pages if requested with fadvise()/madvise();
+
 Need of application restart
 ===========================

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 020e2344eb86..fac21548c5de 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -6,6 +6,7 @@
 #include <linux/mm_types.h>

 #include <linux/fs.h> /* only for vma_is_dax() */
+#include <linux/kobject.h>

 vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
@@ -63,6 +64,7 @@ ssize_t single_hugepage_flag_show(struct kobject *kobj,
 				  struct kobj_attribute *attr, char *buf,
 				  enum transparent_hugepage_flag flag);
 extern struct kobj_attribute shmem_enabled_attr;
+extern struct kobj_attribute thpsize_shmem_enabled_attr;

 /*
  * Mask of all large folio orders supported for anonymous THP; all orders up to
@@ -265,6 +267,14 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 	return __thp_vma_allowable_orders(vma, vm_flags, tva_flags, orders);
 }

+struct thpsize {
+	struct kobject kobj;
+	struct list_head node;
+	int order;
+};
+
+#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
+
 enum mthp_stat_item {
 	MTHP_STAT_ANON_FAULT_ALLOC,
 	MTHP_STAT_ANON_FAULT_FALLBACK,

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e49f402d7c7..1360a1903b66 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -449,14 +449,6 @@ static void thpsize_release(struct kobject *kobj);
 static DEFINE_SPINLOCK(huge_anon_orders_lock);
 static LIST_HEAD(thpsize_list);

-struct thpsize {
-	struct kobject kobj;
-	struct list_head node;
-	int order;
-};
-
-#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
-
 static ssize_t thpsize_enabled_show(struct kobject *kobj,
 				    struct kobj_attribute *attr, char *buf)
 {
@@ -517,6 +509,9 @@ static struct kobj_attribute thpsize_enabled_attr =

 static struct attribute *thpsize_attrs[] = {
 	&thpsize_enabled_attr.attr,
+#ifdef CONFIG_SHMEM
+	&thpsize_shmem_enabled_attr.attr,
+#endif
 	NULL,
 };

diff --git a/mm/shmem.c b/mm/shmem.c
index ae358efc397a..643ff7516b4d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -131,6 +131,14 @@ struct shmem_options {
 #define SHMEM_SEEN_QUOTA 32
 };

+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static unsigned long huge_anon_shmem_orders_always __read_mostly;
+static unsigned long huge_anon_shmem_orders_madvise __read_mostly;
+static unsigned long huge_anon_shmem_orders_inherit __read_mostly;
+static unsigned long huge_anon_shmem_orders_within_size __read_mostly;
+static DEFINE_SPINLOCK(huge_anon_shmem_orders_lock);
+#endif
+
 #ifdef CONFIG_TMPFS
 static unsigned long shmem_default_max_blocks(void)
 {
@@ -4672,6 +4680,12 @@ void __init shmem_init(void)
 		SHMEM_SB(shm_mnt->mnt_sb)->huge = shmem_huge;
 	else
 		shmem_huge = SHMEM_HUGE_NEVER; /* just in case it was patched */
+
+	/*
+	 * Default to setting PMD-sized THP to inherit the global setting and
+	 * disable all other multi-size THPs, when anonymous shmem uses mTHP.
+	 */
+	huge_anon_shmem_orders_inherit = BIT(HPAGE_PMD_ORDER);
 #endif
 	return;

@@ -4731,6 +4745,11 @@ static ssize_t shmem_enabled_store(struct kobject *kobj,
 	    huge != SHMEM_HUGE_NEVER && huge != SHMEM_HUGE_DENY)
 		return -EINVAL;

+	/* Do not override huge allocation policy with non-PMD sized mTHP */
+	if (huge == SHMEM_HUGE_FORCE &&
+	    huge_anon_shmem_orders_inherit != BIT(HPAGE_PMD_ORDER))
+		return -EINVAL;
+
 	shmem_huge = huge;
 	if (shmem_huge > SHMEM_HUGE_DENY)
 		SHMEM_SB(shm_mnt->mnt_sb)->huge = shmem_huge;
@@ -4738,6 +4757,83 @@ static ssize_t shmem_enabled_store(struct kobject *kobj,
 }

 struct kobj_attribute shmem_enabled_attr = __ATTR_RW(shmem_enabled);
+
+static ssize_t thpsize_shmem_enabled_show(struct kobject *kobj,
+					  struct kobj_attribute *attr, char *buf)
+{
+	int order = to_thpsize(kobj)->order;
+	const char *output;
+
+	if (test_bit(order, &huge_anon_shmem_orders_always))
+		output = "[always] inherit within_size advise never";
+	else if (test_bit(order, &huge_anon_shmem_orders_inherit))
+		output = "always [inherit] within_size advise never";
+	else if (test_bit(order, &huge_anon_shmem_orders_within_size))
+		output = "always inherit [within_size] advise never";
+	else if (test_bit(order, &huge_anon_shmem_orders_madvise))
+		output = "always inherit within_size [advise] never";
+	else
+		output = "always inherit within_size advise [never]";
+
+	return sysfs_emit(buf, "%s\n", output);
+}
+
+static ssize_t thpsize_shmem_enabled_store(struct kobject *kobj,
+					   struct kobj_attribute *attr,
+					   const char *buf, size_t count)
+{
+	int order = to_thpsize(kobj)->order;
+	ssize_t ret = count;
+
+	if (sysfs_streq(buf, "always")) {
+		spin_lock(&huge_anon_shmem_orders_lock);
+		clear_bit(order, &huge_anon_shmem_orders_inherit);
+		clear_bit(order, &huge_anon_shmem_orders_madvise);
+		clear_bit(order, &huge_anon_shmem_orders_within_size);
+		set_bit(order, &huge_anon_shmem_orders_always);
+		spin_unlock(&huge_anon_shmem_orders_lock);
+	} else if (sysfs_streq(buf, "inherit")) {
+		/* Do not override huge allocation policy with non-PMD sized mTHP */
+		if (shmem_huge == SHMEM_HUGE_FORCE &&
+		    order != HPAGE_PMD_ORDER)
+			return -EINVAL;
+
+		spin_lock(&huge_anon_shmem_orders_lock);
+		clear_bit(order, &huge_anon_shmem_orders_always);
+		clear_bit(order, &huge_anon_shmem_orders_madvise);
+		clear_bit(order, &huge_anon_shmem_orders_within_size);
+		set_bit(order, &huge_anon_shmem_orders_inherit);
+		spin_unlock(&huge_anon_shmem_orders_lock);
+	} else if (sysfs_streq(buf, "within_size")) {
+		spin_lock(&huge_anon_shmem_orders_lock);
+		clear_bit(order, &huge_anon_shmem_orders_always);
+		clear_bit(order, &huge_anon_shmem_orders_inherit);
+		clear_bit(order, &huge_anon_shmem_orders_madvise);
+		set_bit(order, &huge_anon_shmem_orders_within_size);
+		spin_unlock(&huge_anon_shmem_orders_lock);
+	} else if (sysfs_streq(buf, "madvise")) {
+		spin_lock(&huge_anon_shmem_orders_lock);
+		clear_bit(order, &huge_anon_shmem_orders_always);
+		clear_bit(order, &huge_anon_shmem_orders_inherit);
+		clear_bit(order, &huge_anon_shmem_orders_within_size);
+		set_bit(order, &huge_anon_shmem_orders_madvise);
+		spin_unlock(&huge_anon_shmem_orders_lock);
+	} else if (sysfs_streq(buf, "never")) {
+		spin_lock(&huge_anon_shmem_orders_lock);
+		clear_bit(order, &huge_anon_shmem_orders_always);
+		clear_bit(order, &huge_anon_shmem_orders_inherit);
+		clear_bit(order, &huge_anon_shmem_orders_within_size);
+		clear_bit(order, &huge_anon_shmem_orders_madvise);
+		spin_unlock(&huge_anon_shmem_orders_lock);
+	} else {
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+struct kobj_attribute thpsize_shmem_enabled_attr =
+	__ATTR(shmem_enabled, 0644, thpsize_shmem_enabled_show, thpsize_shmem_enabled_store);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE && CONFIG_SYSFS */

 #else /* !CONFIG_SHMEM */
From patchwork Tue Jun 4 10:17:48 2024
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13685018
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v4 4/6] mm: shmem: add mTHP support for anonymous shmem
Date: Tue, 4 Jun 2024 18:17:48 +0800
Message-Id: <9be6eeacd0304c82a1cb1b7487977a3e14d2b5df.1717495894.git.baolin.wang@linux.alibaba.com>

Commit 19eaf44954df adds multi-size THP (mTHP) for anonymous pages,
which allows THP to be configured through the sysfs interface located at
'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/enabled'.
However, anonymous shmem ignores the anonymous mTHP rule configured
through that sysfs interface and can only use PMD-mapped THP, which is
not reasonable. Users expect the mTHP rule to apply to all anonymous
pages, including anonymous shmem, in order to enjoy the benefits of
mTHP: for example, lower allocation latency than PMD-mapped THP, less
memory bloat, and contiguous PTEs on ARM to reduce TLB misses.
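The new allocation path walks the enabled orders from highest to lowest via
highest_order()/next_order(). The following standalone sketch mirrors that
walk; the two helpers are re-implemented here purely for illustration and are
assumptions about the semantics, not the kernel code itself:

#include <stdio.h>

/* Simplified stand-ins for the kernel's highest_order()/next_order()
 * (valid for a non-zero bitmask); illustration only.
 */
static int highest_order(unsigned long orders)
{
	return 8 * sizeof(orders) - 1 - __builtin_clzl(orders);
}

static int next_order(unsigned long *orders, int prev)
{
	*orders &= ~(1UL << prev);
	return *orders ? highest_order(*orders) : 0;
}

int main(void)
{
	/* e.g. orders 9 (2M), 4 (64K) and 2 (16K) enabled */
	unsigned long orders = (1UL << 9) | (1UL << 4) | (1UL << 2);
	int order = highest_order(orders);

	while (orders) {
		printf("try order %d (%lu pages)\n", order, 1UL << order);
		/* on allocation failure, fall back to the next enabled order */
		order = next_order(&orders, order);
	}
	return 0;
}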
In addition, the mTHP interfaces can be extended to support all
shmem/tmpfs scenarios in the future, especially the shmem mmap() case.

The primary strategy is similar to supporting anonymous mTHP. Introduce
a new interface
'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled',
which can take almost the same values as the top-level
'/sys/kernel/mm/transparent_hugepage/shmem_enabled', adding a new
"inherit" option and dropping the testing options 'force' and 'deny'.
By default, all sizes are set to "never" except the PMD size, which is
set to "inherit". This ensures backward compatibility with the
top-level anonymous shmem enabling, while also allowing independent
control for each mTHP size.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 include/linux/huge_mm.h |  10 +++
 mm/shmem.c              | 187 +++++++++++++++++++++++++++++++-------
 2 files changed, 167 insertions(+), 30 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fac21548c5de..909cfc67521d 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -575,6 +575,16 @@ static inline bool thp_migration_supported(void)
 {
 	return false;
 }
+
+static inline int highest_order(unsigned long orders)
+{
+	return 0;
+}
+
+static inline int next_order(unsigned long *orders, int prev)
+{
+	return 0;
+}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

 static inline int split_folio_to_list_to_order(struct folio *folio,
diff --git a/mm/shmem.c b/mm/shmem.c
index 643ff7516b4d..9a8533482208 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1611,6 +1611,107 @@ static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp)
 	return result;
 }

+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static unsigned long anon_shmem_allowable_huge_orders(struct inode *inode,
+				struct vm_area_struct *vma, pgoff_t index,
+				bool global_huge)
+{
+	unsigned long mask = READ_ONCE(huge_anon_shmem_orders_always);
+	unsigned long within_size_orders = READ_ONCE(huge_anon_shmem_orders_within_size);
+	unsigned long vm_flags = vma->vm_flags;
+	/*
+	 * Check all the (large) orders below HPAGE_PMD_ORDER + 1 that
+	 * are enabled for this vma.
+	 */
+	unsigned long orders = BIT(PMD_ORDER + 1) - 1;
+	loff_t i_size;
+	int order;
+
+	if ((vm_flags & VM_NOHUGEPAGE) ||
+	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+		return 0;
+
+	/* If the hardware/firmware marked hugepage support disabled. */
+	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
+		return 0;
+
+	/*
+	 * Following the 'deny' semantics of the top level, force the huge
+	 * option off from all mounts.
+	 */
+	if (shmem_huge == SHMEM_HUGE_DENY)
+		return 0;
+
+	/*
+	 * Only allow inherit orders if the top-level value is 'force', which
+	 * means non-PMD sized THP can not override 'huge' mount option now.
+	 */
+	if (shmem_huge == SHMEM_HUGE_FORCE)
+		return READ_ONCE(huge_anon_shmem_orders_inherit);
+
+	/* Allow mTHP that will be fully within i_size. */
+	order = highest_order(within_size_orders);
+	while (within_size_orders) {
+		index = round_up(index + 1, order);
+		i_size = round_up(i_size_read(inode), PAGE_SIZE);
+		if (i_size >> PAGE_SHIFT >= index) {
+			mask |= within_size_orders;
+			break;
+		}
+
+		order = next_order(&within_size_orders, order);
+	}
+
+	if (vm_flags & VM_HUGEPAGE)
+		mask |= READ_ONCE(huge_anon_shmem_orders_madvise);
+
+	if (global_huge)
+		mask |= READ_ONCE(huge_anon_shmem_orders_inherit);
+
+	return orders & mask;
+}
+
+static unsigned long anon_shmem_suitable_orders(struct inode *inode, struct vm_fault *vmf,
+					struct address_space *mapping, pgoff_t index,
+					unsigned long orders)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	unsigned long pages;
+	int order;
+
+	orders = thp_vma_suitable_orders(vma, vmf->address, orders);
+	if (!orders)
+		return 0;
+
+	/* Find the highest order that can add into the page cache */
+	order = highest_order(orders);
+	while (orders) {
+		pages = 1UL << order;
+		index = round_down(index, pages);
+		if (!xa_find(&mapping->i_pages, &index,
+			     index + pages - 1, XA_PRESENT))
+			break;
+		order = next_order(&orders, order);
+	}
+
+	return orders;
+}
+#else
+static unsigned long anon_shmem_allowable_huge_orders(struct inode *inode,
+				struct vm_area_struct *vma, pgoff_t index,
+				bool global_huge)
+{
+	return 0;
+}
+
+static unsigned long anon_shmem_suitable_orders(struct inode *inode, struct vm_fault *vmf,
+					struct address_space *mapping, pgoff_t index,
+					unsigned long orders)
+{
+	return 0;
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 static struct folio *shmem_alloc_folio(gfp_t gfp, int order,
 		struct shmem_inode_info *info, pgoff_t index)
 {
@@ -1625,38 +1726,55 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, int order,
 	return folio;
 }

-static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
-		struct inode *inode, pgoff_t index,
-		struct mm_struct *fault_mm, bool huge)
+static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
+		gfp_t gfp, struct inode *inode, pgoff_t index,
+		struct mm_struct *fault_mm, unsigned long orders)
 {
 	struct address_space *mapping = inode->i_mapping;
 	struct shmem_inode_info *info = SHMEM_I(inode);
-	struct folio *folio;
+	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
+	unsigned long suitable_orders = 0;
+	struct folio *folio = NULL;
 	long pages;
-	int error;
+	int error, order;

 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
-		huge = false;
+		orders = 0;

-	if (huge) {
-		pages = HPAGE_PMD_NR;
-		index = round_down(index, HPAGE_PMD_NR);
+	if (orders > 0) {
+		if (vma && vma_is_anon_shmem(vma)) {
+			suitable_orders = anon_shmem_suitable_orders(inode, vmf,
+							mapping, index, orders);
+		} else if (orders & BIT(HPAGE_PMD_ORDER)) {
+			pages = HPAGE_PMD_NR;
+			suitable_orders = BIT(HPAGE_PMD_ORDER);
+			index = round_down(index, HPAGE_PMD_NR);

-		/*
-		 * Check for conflict before waiting on a huge allocation.
-		 * Conflict might be that a huge page has just been allocated
-		 * and added to page cache by a racing thread, or that there
-		 * is already at least one small page in the huge extent.
-		 * Be careful to retry when appropriate, but not forever!
-		 * Elsewhere -EEXIST would be the right code, but not here.
-		 */
-		if (xa_find(&mapping->i_pages, &index,
-				index + HPAGE_PMD_NR - 1, XA_PRESENT))
-			return ERR_PTR(-E2BIG);
+			/*
+			 * Check for conflict before waiting on a huge allocation.
+			 * Conflict might be that a huge page has just been allocated
+			 * and added to page cache by a racing thread, or that there
+			 * is already at least one small page in the huge extent.
+			 * Be careful to retry when appropriate, but not forever!
+			 * Elsewhere -EEXIST would be the right code, but not here.
+			 */
+			if (xa_find(&mapping->i_pages, &index,
+				    index + HPAGE_PMD_NR - 1, XA_PRESENT))
+				return ERR_PTR(-E2BIG);
+		}

-		folio = shmem_alloc_folio(gfp, HPAGE_PMD_ORDER, info, index);
-		if (!folio && pages == HPAGE_PMD_NR)
-			count_vm_event(THP_FILE_FALLBACK);
+		order = highest_order(suitable_orders);
+		while (suitable_orders) {
+			pages = 1UL << order;
+			index = round_down(index, pages);
+			folio = shmem_alloc_folio(gfp, order, info, index);
+			if (folio)
+				goto allocated;
+
+			if (pages == HPAGE_PMD_NR)
+				count_vm_event(THP_FILE_FALLBACK);
+			order = next_order(&suitable_orders, order);
+		}
 	} else {
 		pages = 1;
 		folio = shmem_alloc_folio(gfp, 0, info, index);
@@ -1664,6 +1782,7 @@ static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
 	if (!folio)
 		return ERR_PTR(-ENOMEM);

+allocated:
 	__folio_set_locked(folio);
 	__folio_set_swapbacked(folio);

@@ -1958,7 +2077,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 	struct mm_struct *fault_mm;
 	struct folio *folio;
 	int error;
-	bool alloced;
+	bool alloced, huge;
+	unsigned long orders = 0;

 	if (WARN_ON_ONCE(!shmem_mapping(inode->i_mapping)))
 		return -EINVAL;
@@ -2030,14 +2150,21 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 		return 0;
 	}

-	if (shmem_is_huge(inode, index, false, fault_mm,
-			  vma ? vma->vm_flags : 0)) {
+	huge = shmem_is_huge(inode, index, false, fault_mm,
+			     vma ? vma->vm_flags : 0);
+	/* Find hugepage orders that are allowed for anonymous shmem. */
+	if (vma && vma_is_anon_shmem(vma))
+		orders = anon_shmem_allowable_huge_orders(inode, vma, index, huge);
+	else if (huge)
+		orders = BIT(HPAGE_PMD_ORDER);
+
+	if (orders > 0) {
 		gfp_t huge_gfp;

 		huge_gfp = vma_thp_gfp_mask(vma);
 		huge_gfp = limit_gfp_mask(huge_gfp, gfp);
-		folio = shmem_alloc_and_add_folio(huge_gfp,
-				inode, index, fault_mm, true);
+		folio = shmem_alloc_and_add_folio(vmf, huge_gfp,
+				inode, index, fault_mm, orders);
 		if (!IS_ERR(folio)) {
 			if (folio_test_pmd_mappable(folio))
 				count_vm_event(THP_FILE_ALLOC);
@@ -2047,7 +2174,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 		goto repeat;
 	}

-	folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false);
+	folio = shmem_alloc_and_add_folio(vmf, gfp, inode, index, fault_mm, 0);
 	if (IS_ERR(folio)) {
 		error = PTR_ERR(folio);
 		if (error == -EEXIST)
@@ -2058,7 +2185,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 alloced:
 	alloced = true;
-	if (folio_test_pmd_mappable(folio) &&
+	if (folio_test_large(folio) &&
 	    DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE) <
 					folio_next_index(folio) - 1) {
 		struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
From patchwork Tue Jun 4 10:17:49 2024
X-Patchwork-Submitter: Baolin Wang <baolin.wang@linux.alibaba.com>
X-Patchwork-Id: 13685020
From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: [PATCH v4 5/6] mm: shmem: add mTHP size alignment in shmem_get_unmapped_area
Date: Tue, 4 Jun 2024 18:17:49 +0800
Message-Id: <1d2f719d558d5c789c4cceebdaac42a814e8f107.1717495894.git.baolin.wang@linux.alibaba.com>

Although the top-level hugepage allocation can be turned off, anonymous
shmem can still use mTHP by configuring the sysfs interface located at
'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled'.
Therefore, add mTHP size alignment so that shmem_get_unmapped_area()
returns a suitably aligned address.
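The address fix-up at the end of shmem_get_unmapped_area() is plain modular
arithmetic; a minimal sketch with made-up values showing how an address inside
the inflated range is nudged to the hugepage-size alignment:

#include <stdio.h>

/* Mirrors the offset/alignment computation in shmem_get_unmapped_area();
 * hpage_size would be the highest enabled mTHP size. Values are made up.
 */
int main(void)
{
	unsigned long hpage_size = 64 * 1024;           /* e.g. 64K mTHP */
	unsigned long offset = 0;                       /* file offset modulo hpage_size */
	unsigned long inflated_addr = 0x7f1234567000UL; /* some page-aligned address */
	unsigned long inflated_offset = inflated_addr & (hpage_size - 1);

	inflated_addr += offset - inflated_offset;
	if (inflated_offset > offset)
		inflated_addr += hpage_size;

	printf("aligned address: %#lx\n", inflated_addr);
	return 0;
}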
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Lance Yang
---
 mm/shmem.c | 40 +++++++++++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 9a8533482208..2ecc41521dbb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2394,6 +2394,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	unsigned long inflated_len;
 	unsigned long inflated_addr;
 	unsigned long inflated_offset;
+	unsigned long hpage_size;

 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -2412,8 +2413,6 @@ unsigned long shmem_get_unmapped_area(struct file *file,

 	if (shmem_huge == SHMEM_HUGE_DENY)
 		return addr;
-	if (len < HPAGE_PMD_SIZE)
-		return addr;
 	if (flags & MAP_FIXED)
 		return addr;
 	/*
@@ -2425,8 +2424,11 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (uaddr == addr)
 		return addr;

+	hpage_size = HPAGE_PMD_SIZE;
 	if (shmem_huge != SHMEM_HUGE_FORCE) {
 		struct super_block *sb;
+		unsigned long __maybe_unused hpage_orders;
+		int order = 0;

 		if (file) {
 			VM_BUG_ON(file->f_op != &shmem_file_operations);
@@ -2439,18 +2441,38 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 			if (IS_ERR(shm_mnt))
 				return addr;
 			sb = shm_mnt->mnt_sb;
+
+			/*
+			 * Find the highest mTHP order used for anonymous shmem to
+			 * provide a suitable alignment address.
+			 */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+			hpage_orders = READ_ONCE(huge_anon_shmem_orders_always);
+			hpage_orders |= READ_ONCE(huge_anon_shmem_orders_within_size);
+			hpage_orders |= READ_ONCE(huge_anon_shmem_orders_madvise);
+			if (SHMEM_SB(sb)->huge != SHMEM_HUGE_NEVER)
+				hpage_orders |= READ_ONCE(huge_anon_shmem_orders_inherit);
+
+			if (hpage_orders > 0) {
+				order = highest_order(hpage_orders);
+				hpage_size = PAGE_SIZE << order;
+			}
+#endif
 		}
-		if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER)
+		if (SHMEM_SB(sb)->huge == SHMEM_HUGE_NEVER && !order)
 			return addr;
 	}

-	offset = (pgoff << PAGE_SHIFT) & (HPAGE_PMD_SIZE-1);
-	if (offset && offset + len < 2 * HPAGE_PMD_SIZE)
+	if (len < hpage_size)
+		return addr;
+
+	offset = (pgoff << PAGE_SHIFT) & (hpage_size - 1);
+	if (offset && offset + len < 2 * hpage_size)
 		return addr;
-	if ((addr & (HPAGE_PMD_SIZE-1)) == offset)
+	if ((addr & (hpage_size - 1)) == offset)
 		return addr;

-	inflated_len = len + HPAGE_PMD_SIZE - PAGE_SIZE;
+	inflated_len = len + hpage_size - PAGE_SIZE;
 	if (inflated_len > TASK_SIZE)
 		return addr;
 	if (inflated_len < len)
@@ -2463,10 +2485,10 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (inflated_addr & ~PAGE_MASK)
 		return addr;

-	inflated_offset = inflated_addr & (HPAGE_PMD_SIZE-1);
+	inflated_offset = inflated_addr & (hpage_size - 1);
 	inflated_addr += offset - inflated_offset;
 	if (inflated_offset > offset)
-		inflated_addr += HPAGE_PMD_SIZE;
+		inflated_addr += hpage_size;

 	if (inflated_addr > TASK_SIZE - len)
 		return addr;
From patchwork Tue Jun 4 10:17:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13685019
From: Baolin Wang 
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
 ying.huang@intel.com, 21cnbao@gmail.com, ryan.roberts@arm.com,
 shy828301@gmail.com, ziy@nvidia.com, ioworker0@gmail.com,
 da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 6/6] mm: shmem: add mTHP counters for anonymous shmem
Date: Tue, 4 Jun 2024 18:17:50 +0800
Message-Id: 
X-Mailer: git-send-email 2.39.3
In-Reply-To: 
References: 
Add mTHP counters for anonymous shmem: file_alloc, file_fallback and
file_fallback_charge, accounted per supported mTHP size.

Signed-off-by: Baolin Wang 
---
 include/linux/huge_mm.h |  3 +++
 mm/huge_memory.c        |  6 ++++++
 mm/shmem.c              | 18 +++++++++++++++---
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 909cfc67521d..212cca384d7e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -281,6 +281,9 @@ enum mthp_stat_item {
 	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
 	MTHP_STAT_SWPOUT,
 	MTHP_STAT_SWPOUT_FALLBACK,
+	MTHP_STAT_FILE_ALLOC,
+	MTHP_STAT_FILE_FALLBACK,
+	MTHP_STAT_FILE_FALLBACK_CHARGE,
 	__MTHP_STAT_COUNT
 };
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1360a1903b66..3fbcd77f5957 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -555,6 +555,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
 DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
 DEFINE_MTHP_STAT_ATTR(swpout, MTHP_STAT_SWPOUT);
 DEFINE_MTHP_STAT_ATTR(swpout_fallback, MTHP_STAT_SWPOUT_FALLBACK);
+DEFINE_MTHP_STAT_ATTR(file_alloc, MTHP_STAT_FILE_ALLOC);
+DEFINE_MTHP_STAT_ATTR(file_fallback, MTHP_STAT_FILE_FALLBACK);
+DEFINE_MTHP_STAT_ATTR(file_fallback_charge, MTHP_STAT_FILE_FALLBACK_CHARGE);
 
 static struct attribute *stats_attrs[] = {
 	&anon_fault_alloc_attr.attr,
@@ -562,6 +565,9 @@ static struct attribute *stats_attrs[] = {
 	&anon_fault_fallback_charge_attr.attr,
 	&swpout_attr.attr,
 	&swpout_fallback_attr.attr,
+	&file_alloc_attr.attr,
+	&file_fallback_attr.attr,
+	&file_fallback_charge_attr.attr,
 	NULL,
 };
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 2ecc41521dbb..d9a11950c586 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1773,6 +1773,9 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
 
 			if (pages == HPAGE_PMD_NR)
 				count_vm_event(THP_FILE_FALLBACK);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+			count_mthp_stat(order, MTHP_STAT_FILE_FALLBACK);
+#endif
 			order = next_order(&suitable_orders, order);
 		}
 	} else {
@@ -1792,9 +1795,15 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
 		if (xa_find(&mapping->i_pages, &index,
 				index + pages - 1, XA_PRESENT)) {
 			error = -EEXIST;
-		} else if (pages == HPAGE_PMD_NR) {
-			count_vm_event(THP_FILE_FALLBACK);
-			count_vm_event(THP_FILE_FALLBACK_CHARGE);
+		} else if (pages > 1) {
+			if (pages == HPAGE_PMD_NR) {
+				count_vm_event(THP_FILE_FALLBACK);
+				count_vm_event(THP_FILE_FALLBACK_CHARGE);
+			}
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+			count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK);
+			count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK_CHARGE);
+#endif
 		}
 		goto unlock;
 	}
@@ -2168,6 +2177,9 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 	if (!IS_ERR(folio)) {
 		if (folio_test_pmd_mappable(folio))
 			count_vm_event(THP_FILE_ALLOC);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+		count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_ALLOC);
+#endif
 		goto alloced;
 	}
 	if (PTR_ERR(folio) == -EEXIST)
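
For completeness, a minimal sketch of reading the new counters from
userspace, not part of the patch. It assumes the counters appear per
mTHP size under the same 'stats' sysfs group as the existing
anon_fault_* and swpout* counters; the hugepage-64kB directory is an
assumed example size.

#include <stdio.h>

static void show(const char *name)
{
	char path[192];
	unsigned long val;
	FILE *f;

	/* Assumed location, mirroring the existing mTHP stats layout. */
	snprintf(path, sizeof(path),
		 "/sys/kernel/mm/transparent_hugepage/hugepage-64kB/stats/%s",
		 name);
	f = fopen(path, "r");
	if (!f)
		return;	/* size not present, or kernel lacks this patch */
	if (fscanf(f, "%lu", &val) == 1)
		printf("%s: %lu\n", name, val);
	fclose(f);
}

int main(void)
{
	show("file_alloc");
	show("file_fallback");
	show("file_fallback_charge");
	return 0;
}
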