From patchwork Sat Apr 1 22:19:20 2023
X-Patchwork-Submitter: Rongwei Wang
X-Patchwork-Id: 13197268
From: Rongwei Wang
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-stable@vger.kernel.org
Subject: [PATCH] mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
Date: Sun, 2 Apr 2023 06:19:20 +0800
Message-Id: <20230401221920.57986-1-rongwei.wang@linux.alibaba.com>

Without this modification, a core running swapoff will (most of the
time) wait for 'swap_info_struct->lock' right after it completes
'del_from_avail_list(p)'. In that window, another core can call
'add_to_avail_list()' as soon as it acquires the lock released by the
former, and add the same object back again. This is not the desired
result, but it does happen. The case can be described as below:

core 0                       core 1
swapoff

del_from_avail_list(p)       waiting

try lock p->lock             acquire swap_avail_lock
                             and add p into
                             swap_avail_head again

acquire p->lock, but miss
that p has already been
added again, and continue
to clear SWP_WRITEOK, etc.
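
The interleaving above can be replayed in a small user-space model. This
is only an illustrative sketch, not kernel code: 'struct model_si' and
the three functions are hypothetical names, the two booleans stand in
for SWP_WRITEOK and membership in swap_avail_head, and it assumes that
core 1 re-adds p only while SWP_WRITEOK is still set. The steps run
sequentially in the order shown in the diagram, so no real locking is
involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one swap_info_struct; not kernel code. */
struct model_si {
	bool writeok;		/* stands in for the SWP_WRITEOK flag */
	bool on_avail_list;	/* true while linked into swap_avail_head */
};

/* Core 1 (assumption): re-adds p only while it still looks writable. */
static void core1_add_if_writeok(struct model_si *p)
{
	assert(p);
	if (p->writeok)
		p->on_avail_list = true;
}

/* Old ordering: core 0 deletes p before it owns p->lock. */
static bool swapoff_old_order(void)
{
	struct model_si p = { .writeok = true, .on_avail_list = true };

	p.on_avail_list = false;	/* core 0: del_from_avail_list(p) */
	core1_add_if_writeok(&p);	/* core 1: runs while core 0 waits */
	p.writeok = false;		/* core 0: gets p->lock, clears flag */
	return p.on_avail_list;		/* is p still on the list afterwards? */
}

/*
 * Patched ordering: delete and flag clearing form one critical section,
 * so core 1 runs either entirely before it or entirely after it.
 */
static bool swapoff_new_order(bool core1_first)
{
	struct model_si p = { .writeok = true, .on_avail_list = true };

	if (core1_first)
		core1_add_if_writeok(&p);

	/* core 0: critical section under p->lock */
	p.on_avail_list = false;
	p.writeok = false;

	if (!core1_first)
		core1_add_if_writeok(&p);
	return p.on_avail_list;
}
```

Under these assumptions, swapoff_old_order() returns true (p is left on
the avail list after swapoff finishes), while swapoff_new_order()
returns false whichever core runs first.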

A large number of warning messages can easily be triggered inside
get_swap_pages() in some special cases. For example, we call
madvise(MADV_PAGEOUT) on blocks of touched memory concurrently while
running many swapon-swapoff operations (e.g. stress-ng-swap).

A worse consequence is that a panic can also be caused by the above
scenario. In swapoff(), p refers to one swap_info_struct variable. It
may be reinserted into swap_avail_head by 'reinsert_swap_info', or, as
we wanted, the swap device is turned off successfully. The worse case
is that swapoff() runs to the end of the function while p is still
linked in swap_avail_head[]. This has very bad effects: for example,
the memory used by p could be kept in swap_info[], which means that
reusing it will destroy the data.

A panic message caused (with CONFIG_PLIST_DEBUG enabled):

------------[ cut here ]------------
top: ffff001800875c00, n: ffff001800fdc6e0, p: ffff001800fdc6e0
prev: ffff001800875c00, n: ffff001800fdc6e0, p: ffff001800fdc6e0
next: ffff001800fdc6e0, n: ffff001800fdc6e0, p: ffff001800fdc6e0
WARNING: CPU: 21 PID: 1843 at lib/plist.c:60 plist_check_prev_next_node+0x50/0x70
Modules linked in: rfkill(E) crct10dif_ce(E)...
CPU: 21 PID: 1843 Comm: stress-ng Kdump: ... 5.10.134+
Hardware name: Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : plist_check_prev_next_node+0x50/0x70
lr : plist_check_prev_next_node+0x50/0x70
sp : ffff0018009d3c30
x29: ffff0018009d3c40 x28: ffff800011b32a98
x27: 0000000000000000 x26: ffff001803908000
x25: ffff8000128ea088 x24: ffff800011b32a48
x23: 0000000000000028 x22: ffff001800875c00
x21: ffff800010f9e520 x20: ffff001800875c00
x19: ffff001800fdc6e0 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: 0736076307640766 x14: 0730073007380731
x13: 0736076307640766 x12: 0730073007380731
x11: 000000000004058d x10: 0000000085a85b76
x9 : ffff8000101436e4 x8 : ffff800011c8ce08
x7 : 0000000000000000 x6 : 0000000000000001
x5 : ffff0017df9ed338 x4 : 0000000000000001
x3 : ffff8017ce62a000 x2 : ffff0017df9ed340
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 plist_check_prev_next_node+0x50/0x70
 plist_check_head+0x80/0xf0
 plist_add+0x28/0x140
 add_to_avail_list+0x9c/0xf0
 _enable_swap_info+0x78/0xb4
 __do_sys_swapon+0x918/0xa10
 __arm64_sys_swapon+0x20/0x30
 el0_svc_common+0x8c/0x220
 do_el0_svc+0x2c/0x90
 el0_svc+0x1c/0x30
 el0_sync_handler+0xa8/0xb0
 el0_sync+0x148/0x180
irq event stamp: 2082270

In this patch, we take p->lock before calling 'del_from_avail_list()'
so that other threads see the removal of the swap_info_struct object
and the clearing of SWP_WRITEOK together, and will not reinsert it.

We also find that this problem exists in stable 5.10.

Signed-off-by: Rongwei Wang
---
 mm/swapfile.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5af6b0f770de..4df77fef50b5 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2610,8 +2610,12 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 		spin_unlock(&swap_lock);
 		goto out_dput;
 	}
-	del_from_avail_list(p);
+	/*
+	 * Here lock is used to protect deleting and SWP_WRITEOK clearing
+	 * can be seen concurrently.
+	 */
 	spin_lock(&p->lock);
+	del_from_avail_list(p);
 	if (p->prio < 0) {
 		struct swap_info_struct *si = p;
 		int nid;