From patchwork Fri Jun 7 02:31:16 2024
X-Patchwork-Submitter: "zhaoyang.huang"
X-Patchwork-Id: 13689223
From: "zhaoyang.huang" <zhaoyang.huang@unisoc.com>
To: Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes,
 Baoquan He, Thomas Gleixner, hailong liu, Zhaoyang Huang
Subject: [Resend PATCHv4 1/1] mm: fix incorrect vbq reference in purge_fragmented_block
Date: Fri, 7 Jun 2024 10:31:16 +0800
Message-ID: <20240607023116.1720640-1-zhaoyang.huang@unisoc.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

From: Zhaoyang Huang

The vmalloc area runs out on our ARM64 system during an erofs test
because vm_map_ram() fails [1]. Following the debug log, we find that
vm_map_ram()->vb_alloc() allocates a new vb->va, which corresponds to a
4MB vmalloc area, because list_for_each_entry_rcu() returns immediately
when vbq->free->next points to vbq->free. That is to say, 65536 page
faults after the list is broken will exhaust the whole vmalloc area.
This is most likely caused by a vbq->free->next that points back to
vbq->free, so that list_for_each_entry_rcu() cannot iterate the list and
find the BUG.

[1]
PID: 1      TASK: ffffff80802b4e00  CPU: 6   COMMAND: "init"
 #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
 #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
 #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
 #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
 #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
 #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
 #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
 #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
 #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
 #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0

Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")

For the detailed reason why the list gets broken, please refer to the
URL below:
https://lore.kernel.org/all/20240531024820.5507-1-hailong.liu@oppo.com/

Suggested-by: Hailong.Liu
Signed-off-by: Zhaoyang Huang
Reviewed-by: Uladzislau Rezki (Sony)
---
v2: introduce cpu in vmap_block to record the right CPU number
v3: use get_cpu/put_cpu to prevent scheduling between cores
v4: replace get_cpu/put_cpu with another API to avoid disabling preemption
---
 mm/vmalloc.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 22aa63f4ef63..89eb034f4ac6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2458,6 +2458,7 @@ struct vmap_block {
 	struct list_head free_list;
 	struct rcu_head rcu_head;
 	struct list_head purge;
+	unsigned int cpu;
 };
 
 /* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
@@ -2585,8 +2586,15 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 		free_vmap_area(va);
 		return ERR_PTR(err);
 	}
-
-	vbq = raw_cpu_ptr(&vmap_block_queue);
+	/*
+	 * list_add_tail_rcu could happened in another core
+	 * rather than vb->cpu due to task migration, which
+	 * is safe as list_add_tail_rcu will ensure the list's
+	 * integrity together with list_for_each_rcu from read
+	 * side.
+	 */
+	vb->cpu = raw_smp_processor_id();
+	vbq = per_cpu_ptr(&vmap_block_queue, vb->cpu);
 	spin_lock(&vbq->lock);
 	list_add_tail_rcu(&vb->free_list, &vbq->free);
 	spin_unlock(&vbq->lock);
@@ -2614,9 +2622,10 @@ static void free_vmap_block(struct vmap_block *vb)
 }
 
 static bool purge_fragmented_block(struct vmap_block *vb,
-		struct vmap_block_queue *vbq, struct list_head *purge_list,
-		bool force_purge)
+		struct list_head *purge_list, bool force_purge)
 {
+	struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, vb->cpu);
+
 	if (vb->free + vb->dirty != VMAP_BBMAP_BITS ||
 	    vb->dirty == VMAP_BBMAP_BITS)
 		return false;
@@ -2664,7 +2673,7 @@ static void purge_fragmented_blocks(int cpu)
 			continue;
 
 		spin_lock(&vb->lock);
-		purge_fragmented_block(vb, vbq, &purge, true);
+		purge_fragmented_block(vb, &purge, true);
 		spin_unlock(&vb->lock);
 	}
 	rcu_read_unlock();
@@ -2801,7 +2810,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
 			 * not purgeable, check whether there is dirty
 			 * space to be flushed.
 			 */
-			if (!purge_fragmented_block(vb, vbq, &purge_list, false) &&
+			if (!purge_fragmented_block(vb, &purge_list, false) &&
 			    vb->dirty_max && vb->dirty != VMAP_BBMAP_BITS) {
 				unsigned long va_start = vb->va->va_start;
 				unsigned long s, e;