From patchwork Tue Nov 3 13:17:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yafang Shao X-Patchwork-Id: 11877459 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F18C86A2 for ; Tue, 3 Nov 2020 13:18:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9E57C216C4 for ; Tue, 3 Nov 2020 13:18:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="U4Sq0WIh" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9E57C216C4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9E95A6B0070; Tue, 3 Nov 2020 08:18:28 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 9985E6B0071; Tue, 3 Nov 2020 08:18:28 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 837396B0072; Tue, 3 Nov 2020 08:18:28 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0092.hostedemail.com [216.40.44.92]) by kanga.kvack.org (Postfix) with ESMTP id 4402E6B0070 for ; Tue, 3 Nov 2020 08:18:28 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id E51F58249980 for ; Tue, 3 Nov 2020 13:18:27 +0000 (UTC) X-FDA: 77443161054.22.crib34_3c16017272b8 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 
5460D18038E6F for ; Tue, 3 Nov 2020 13:18:27 +0000 (UTC) X-Spam-Summary: 1,0,0,a7029096f376e5eb,d41d8cd98f00b204,laoar.shao@gmail.com,,RULES_HIT:41:69:355:379:541:800:960:966:973:981:988:989:1260:1345:1359:1437:1535:1544:1605:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2553:2559:2562:2693:2731:2901:2903:3138:3139:3140:3141:3142:3165:3865:3866:3867:3868:3870:3871:3872:3873:3874:4118:4250:4321:4385:4605:5007:6119:6261:6653:7514:7576:9413:9592:10004:11026:11473:11658:11914:12043:12291:12296:12297:12438:12517:12519:12555:12895:12986:13869:14130:14181:14394:14687:14721:21080:21324:21433:21444:21451:21627:21666:21740:21990,0,RBL:209.85.167.194:@gmail.com:.lbl8.mailshell.net-66.100.201.100 62.18.0.100;04ygdqmks6yjagtp8bfwshoub3amfocwbat9ppnn36bn8kbww1yqxgz1be9potj.4ecx9coi6ixkgup89ya9b3f7t69yxdmgqyrkgi63dnq9yroqjpy1qs31kqgimyt.o-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:70,LUA_SUMMARY:none X-HE-Tag: crib34_3c16017272b8 X-Filterd-Recvd-Size: 7208 Received: from mail-oi1-f194.google.com (mail-oi1-f194.google.com [209.85.167.194]) by imf30.hostedemail.com (Postfix) with ESMTP for ; Tue, 3 Nov 2020 13:18:26 +0000 (UTC) Received: by mail-oi1-f194.google.com with SMTP id c80so6808500oib.2 for ; Tue, 03 Nov 2020 05:18:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=wcewLsncu9ZYLAdf8Dj+b3i3xIrhlEzCNXPbs1EqjSc=; b=U4Sq0WIhR9xhMN9zPWhvg/zls9REX7I5XS/eUk2Dmrg1PTGMlH1hqkaQhkJsZW4aFs HrzLLXJxQx4kGNkUK94cTwvSumfEUW+OSbdZYyoZx3Qqee+1b7NjQowPXNXKtQlMcqqs iIwgdq+MNQBS+u0Kn4Wn/U+aAR5ZdkGh1ICGYhD3fNGn8gb9XE2GKYGBV8WcpbRtaw/c htJYPkbAEBUYmJK0SGX2FdKMPK33YSZtRS1HdZ836qNuvDSdrIw2S1FOvdcd053EIuuK 6A+2buP86nqP0qO5t2ucS15ZhxZM9yku26PMyApS3/c48JLNnV9oo4FYG8a1AhbrcZwm UGtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wcewLsncu9ZYLAdf8Dj+b3i3xIrhlEzCNXPbs1EqjSc=; b=DQtcxVBrVaKR1WP7ofGRgyBtL61i269kVWpsgk4vPZ5VoKFbKOvDgE1jcMFl915geW uVHoSgBEetUeZ3xKPc1VfjoREHqpecOkFt9diJbuIIVzO3uqva5v3TSSmtCnPMEs1E+q PAOrBMeIaD9E0cdopHsLjCMga7rSvKsjUyG5A2f/MGFs9LYjDSVNPd5tVaxnMgMI5/az ovGywwja0DptmkSin952atNJV2ALh4S/yS8zXs6FfwZwC2nrk7PKo12YafTNK/dK9x9Q FcCKffmqKYZzFZ30QkianvWsFXDUYagwpUcsApZ2xQIU8ygAlkldGQaGwNcbQn8ITBfe EanQ== X-Gm-Message-State: AOAM530lrcFulHzR6TlNGdHqU17gfU4dqWsVb60dTKe2CvNRT6z72LAh HOTs1RdFnPKylmYcXd7ra9w= X-Google-Smtp-Source: ABdhPJxxY1oARG/kzprtfN6LQpvSv/dkZkBdU/f6ENapc5lPESBoit+o9PJKuZzR/35lv0PKgZQ8Aw== X-Received: by 2002:aca:1e08:: with SMTP id m8mr1763659oic.168.1604409506161; Tue, 03 Nov 2020 05:18:26 -0800 (PST) Received: from localhost.localdomain ([50.236.19.102]) by smtp.gmail.com with ESMTPSA id f18sm4396138otf.55.2020.11.03.05.18.22 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 Nov 2020 05:18:25 -0800 (PST) From: Yafang Shao To: akpm@linux-foundation.org Cc: david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com, willy@infradead.org, mhocko@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, Yafang Shao Subject: [PATCH v8 resend 1/2] mm: Add become_kswapd and restore_kswapd Date: Tue, 3 Nov 2020 21:17:53 +0800 Message-Id: <20201103131754.94949-2-laoar.shao@gmail.com> X-Mailer: git-send-email 2.17.2 (Apple Git-113) In-Reply-To: <20201103131754.94949-1-laoar.shao@gmail.com> References: <20201103131754.94949-1-laoar.shao@gmail.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Matthew Wilcox (Oracle)" Since XFS needs to pretend to be kswapd in some of its worker threads, create methods to save & restore kswapd state. 
Don't bother restoring kswapd state in kswapd -- the only time we reach
this code is when we're exiting and the task_struct is about to be
destroyed anyway.

Cc: Dave Chinner
Acked-by: Michal Hocko
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Yafang Shao
---
 fs/xfs/libxfs/xfs_btree.c | 14 ++++++++------
 include/linux/sched/mm.h  | 23 +++++++++++++++++++++++
 mm/vmscan.c               | 16 +---------------
 3 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2d25bab..a04a442 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -2813,8 +2813,9 @@ struct xfs_btree_split_args {
 {
 	struct xfs_btree_split_args *args = container_of(work,
 			struct xfs_btree_split_args, work);
+	bool is_kswapd = args->kswapd;
 	unsigned long pflags;
-	unsigned long new_pflags = PF_MEMALLOC_NOFS;
+	int memalloc_nofs;

 	/*
 	 * we are in a transaction context here, but may also be doing work
@@ -2822,16 +2823,17 @@ struct xfs_btree_split_args {
 	 * temporarily to ensure that we don't block waiting for memory reclaim
 	 * in any way.
 	 */
-	if (args->kswapd)
-		new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
-
-	current_set_flags_nested(&pflags, new_pflags);
+	if (is_kswapd)
+		pflags = become_kswapd();
+	memalloc_nofs = memalloc_nofs_save();

 	args->result = __xfs_btree_split(args->cur, args->level, args->ptrp,
 			args->key, args->curp, args->stat);
 	complete(args->done);

-	current_restore_flags_nested(&pflags, new_pflags);
+	memalloc_nofs_restore(memalloc_nofs);
+	if (is_kswapd)
+		restore_kswapd(pflags);
 }

 /*
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index d5ece7a..2faf03e 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -278,6 +278,29 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 }
 #endif

+/*
+ * Tell the memory management code that this thread is working on behalf
+ * of background memory reclaim (like kswapd). That means that it will
+ * get access to memory reserves should it need to allocate memory in
+ * order to make forward progress. With this great power comes great
+ * responsibility to not exhaust those reserves.
+ */
+#define KSWAPD_PF_FLAGS (PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD)
+
+static inline unsigned long become_kswapd(void)
+{
+	unsigned long flags = current->flags & KSWAPD_PF_FLAGS;
+
+	current->flags |= KSWAPD_PF_FLAGS;
+
+	return flags;
+}
+
+static inline void restore_kswapd(unsigned long flags)
+{
+	current->flags &= ~(flags ^ KSWAPD_PF_FLAGS);
+}
+
 #ifdef CONFIG_MEMCG
 DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
 /**
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1b8f0e0..77bc1dd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3869,19 +3869,7 @@ static int kswapd(void *p)
 	if (!cpumask_empty(cpumask))
 		set_cpus_allowed_ptr(tsk, cpumask);

-	/*
-	 * Tell the memory management that we're a "memory allocator",
-	 * and that if we need more memory we should get access to it
-	 * regardless (see "__alloc_pages()"). "kswapd" should
-	 * never get caught in the normal page freeing logic.
-	 *
-	 * (Kswapd normally doesn't need memory anyway, but sometimes
-	 * you need a small amount of memory in order to be able to
-	 * page out something else, and this flag essentially protects
-	 * us from recursively trying to free more memory as we're
-	 * trying to free the first piece of memory in the first place).
-	 */
-	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
+	become_kswapd();
 	set_freezable();

 	WRITE_ONCE(pgdat->kswapd_order, 0);
@@ -3931,8 +3919,6 @@ static int kswapd(void *p)
 		goto kswapd_try_sleep;
 	}

-	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
-
 	return 0;
 }

From patchwork Tue Nov 3 13:17:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 11877465
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC08A14B2 for ; Tue, 3 Nov 2020 13:18:35 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 48BCB21534 for ; Tue, 3 Nov 2020 13:18:35 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="nLL1uzXC"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 48BCB21534
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix) id 5BF766B0072; Tue, 3 Nov 2020 08:18:34 -0500 (EST)
Delivered-To: linux-mm-outgoing@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 40) id 528AA6B0073; Tue, 3 Nov 2020 08:18:34 -0500 (EST)
X-Original-To: int-list-linux-mm@kvack.org
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 3E8946B0074; Tue, 3 Nov 2020 08:18:34 -0500 (EST)
X-Original-To:
linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0195.hostedemail.com [216.40.44.195]) by kanga.kvack.org (Postfix) with ESMTP id F0D626B0072 for ; Tue, 3 Nov 2020 08:18:33 -0500 (EST) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 93C63180AD807 for ; Tue, 3 Nov 2020 13:18:33 +0000 (UTC) X-FDA: 77443161306.04.table24_3215061272b8 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin04.hostedemail.com (Postfix) with ESMTP id 69FA88007FE5 for ; Tue, 3 Nov 2020 13:18:33 +0000 (UTC) X-Spam-Summary: 1,0,0,8df3ab495ecc9fd8,d41d8cd98f00b204,laoar.shao@gmail.com,,RULES_HIT:1:2:41:69:355:379:541:800:960:966:973:981:988:989:1260:1345:1359:1437:1605:1730:1747:1777:1792:2196:2198:2199:2200:2393:2559:2562:2892:2895:2897:3138:3139:3140:3141:3142:3664:3865:3866:3867:3868:3870:3871:3872:3874:4051:4250:4321:4385:4605:5007:6117:6119:6261:6653:7514:7903:8660:9010:9413:9592:10004:11026:11232:11233:11473:11658:11914:12043:12291:12295:12296:12297:12438:12517:12519:12555:12683:12895:13148:13230:14096:14394:14687:21080:21433:21444:21451:21611:21627:21666:21740:21809:21939:21990:30034:30054:30064:30069:30070:30075,0,RBL:209.85.210.66:@gmail.com:.lbl8.mailshell.net-62.50.0.100 66.100.201.100;04yg9q9w1wbwccsxnxdsyx7azzbxjoc5ttexoomy5ntjscae5w1cmbutzjmprhe.a4mkq88ats44gx1hgdnepa6m7qmjqtxs1tschwkodb6i1ezigwbrkcbjhkzhwaf.4-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtim e:69,LUA X-HE-Tag: table24_3215061272b8 X-Filterd-Recvd-Size: 11697 Received: from mail-ot1-f66.google.com (mail-ot1-f66.google.com [209.85.210.66]) by imf13.hostedemail.com (Postfix) with ESMTP for ; Tue, 3 Nov 2020 13:18:32 +0000 (UTC) Received: by mail-ot1-f66.google.com with SMTP id n15so15871123otl.8 for ; Tue, 03 Nov 2020 05:18:32 -0800 (PST) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Qv1wGdsij+9w5z85zBACLm0WGUMqCz1dz7LNa3UTDtk=; b=nLL1uzXCYA1FaoNOMbGrHfODfoC31v1Tm0VUBWwPlAjqM5N7X4FEuuaqNMkVlzqdV6 +ICaG0vi2MvwTlbA2/ErhlMIAuD3CDu0IELcctFzBMhETIiKqwTVyVNDQqHgPsn2cDi+ 7/w+v42iu1bGbtA+1vHMETVIiPA0vJMqxtd5R8QdOBxpBKfnhIrFBj4X/PmMyGoxBMou rbzy/2m+ag5FWc4IV2LbPnYeneq2dBZzXvUgzFkA7Q78PhPoP49zXPJ29/co+ASfwt99 TESDvAhHJWWxe7GlUU9J/Qegb23xy4RUltF/4ow00JzI9495kDNre72fqsdbVKF6Iqqr C7Mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Qv1wGdsij+9w5z85zBACLm0WGUMqCz1dz7LNa3UTDtk=; b=X7DtNQxjRZc3rmxWsInEweYdQBMsDHqvVCcZdta7Fv7BJ6ezAWhDz8HWKwmo7NDurF SsGvXFV3Y6bgfsW5X40BOYuKpByPnnvCZiEmNwvbdjGpgsf2zs5oxptSb7H5xSyA2xKa Bvng3+tZtb70Jw1zt05/KKAJKSyJYoYWp/aPhChU0moBo4eb3qPe7QMw7HqOAoP5e55H y7LeDfbdV6xLKd3koNSTrUL2fFgaOdBrVLUGypaxbbsqWxEmarG/DDzH1gcUvu4xltP6 vMpjA/piK85pGW0fYxhYBTJYktBGKsh9FbbNL4XoHhEqKwIpCdo4B2wpV+D/HC2hZFCK N4BQ== X-Gm-Message-State: AOAM531twGaB3+Td7xz5gbv9Fj5CZ8D5c0QI5DXG8fXY2JVP4oDyDFst 2ceXo2y94bzemH4ekUEmAuroiDgatPw= X-Google-Smtp-Source: ABdhPJyXf4V3twjJit4yagiweFA1fx+ddVWP5CNLhVwvuGxduXs0ohrKBigLkgLPtmMkz8WCmOBXuw== X-Received: by 2002:a9d:7419:: with SMTP id n25mr15724612otk.183.1604409512309; Tue, 03 Nov 2020 05:18:32 -0800 (PST) Received: from localhost.localdomain ([50.236.19.102]) by smtp.gmail.com with ESMTPSA id f18sm4396138otf.55.2020.11.03.05.18.26 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 03 Nov 2020 05:18:31 -0800 (PST) From: Yafang Shao To: akpm@linux-foundation.org Cc: david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com, willy@infradead.org, mhocko@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, Yafang Shao Subject: [PATCH v8 resend 2/2] xfs: avoid transaction reservation recursion 
Date: Tue, 3 Nov 2020 21:17:54 +0800
Message-Id: <20201103131754.94949-3-laoar.shao@gmail.com>
X-Mailer: git-send-email 2.17.2 (Apple Git-113)
In-Reply-To: <20201103131754.94949-1-laoar.shao@gmail.com>
References: <20201103131754.94949-1-laoar.shao@gmail.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

PF_FSTRANS, which was used to avoid transaction reservation recursion,
was dropped by commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
memalloc_nofs_{save,restore} API") and replaced by PF_MEMALLOC_NOFS,
which is meant to avoid filesystem reclaim recursion.

That change is subtle. Take the check
WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS) as an example of why
abstracting PF_FSTRANS into PF_MEMALLOC_NOFS is not appropriate. The
comment below is quoted from Dave:

> It wasn't for memory allocation recursion protection in XFS - it was for
> transaction reservation recursion protection by something trying to flush
> data pages while holding a transaction reservation. Doing
> this could deadlock the journal because the existing reservation
> could prevent the nested reservation for being able to reserve space
> in the journal and that is a self-deadlock vector.
> IOWs, this check is not protecting against memory reclaim recursion
> bugs at all (that's the previous check [1]). This check is
> protecting against the filesystem calling writepages directly from a
> context where it can self-deadlock.
> So what we are seeing here is that the PF_FSTRANS ->
> PF_MEMALLOC_NOFS abstraction lost all the actual useful information
> about what type of error this check was protecting against.

As a result, we should reintroduce PF_FSTRANS. As current->journal_info
isn't used in XFS, we can reuse it to indicate whether the task is in
fstrans or not, per Willy.
To achieve that, four new helpers are introduced in this patch, per Dave:
- xfs_trans_context_set()
  Used in xfs_trans_alloc()
- xfs_trans_context_clear()
  Used in xfs_trans_commit() and xfs_trans_cancel()
- xfs_trans_context_update()
  Used in xfs_trans_roll()
- xfs_trans_context_active()
  To check whether current is in an fs transaction or not

[1] The check below avoids memory reclaim recursion:
	if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
			PF_MEMALLOC))
		goto redirty;

Signed-off-by: Yafang Shao
Reviewed-by: Matthew Wilcox (Oracle)
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Cc: Dave Chinner
Cc: Michal Hocko
---
 fs/iomap/buffered-io.c |  7 -------
 fs/xfs/xfs_aops.c      | 23 +++++++++++++++++++++--
 fs/xfs/xfs_linux.h     |  4 ----
 fs/xfs/xfs_trans.c     | 19 +++++++++----------
 fs/xfs/xfs_trans.h     | 30 ++++++++++++++++++++++++++++++
 5 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 8180061..2f090b6 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1469,13 +1469,6 @@ static void iomap_writepage_end_bio(struct bio *bio)
 		goto redirty;

 	/*
-	 * Given that we do not allow direct reclaim to call us, we should
-	 * never be called in a recursive filesystem reclaim context.
-	 */
-	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
-		goto redirty;
-
-	/*
 	 * Is this page beyond the end of the file?
 	 *
 	 * The page index is less than the end_index, adjust the end_offset
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 55d126d..b25196a 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -62,7 +62,8 @@ static inline bool xfs_ioend_is_append(struct iomap_ioend *ioend)
 	 * We hand off the transaction to the completion thread now, so
 	 * clear the flag here.
 	 */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);
+
 	return 0;
 }

@@ -125,7 +126,7 @@ static inline bool xfs_ioend_is_append(struct iomap_ioend *ioend)
 	 * thus we need to mark ourselves as being in a transaction manually.
 	 * Similarly for freeze protection.
 	 */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_set(tp);
 	__sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);

 	/* we abort the update if there was an IO error */
@@ -564,6 +565,16 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
 {
 	struct xfs_writepage_ctx wpc = { };

+	/*
+	 * Given that we do not allow direct reclaim to call us, we should
+	 * never be called while in a filesystem transaction.
+	 */
+	if (xfs_trans_context_active()) {
+		redirty_page_for_writepage(wbc, page);
+		unlock_page(page);
+		return 0;
+	}
+
 	return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
 }

@@ -575,6 +586,14 @@ static inline bool xfs_ioend_needs_workqueue(struct iomap_ioend *ioend)
 	struct xfs_writepage_ctx wpc = { };

 	xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
+
+	/*
+	 * Given that we do not allow direct reclaim to call us, we should
+	 * never be called while in a filesystem transaction.
+	 */
+	if (xfs_trans_context_active())
+		return 0;
+
 	return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
 }

diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 5b7a1e2..6ab0f80 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -102,10 +102,6 @@
 #define xfs_cowb_secs xfs_params.cowb_timer.val

 #define current_cpu() (raw_smp_processor_id())
-#define current_set_flags_nested(sp, f) \
-	(*(sp) = current->flags, current->flags |= (f))
-#define current_restore_flags_nested(sp, f) \
-	(current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))

 #define NBBY 8 /* number of bits per byte */

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index c94e71f..b272d07 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -153,8 +153,6 @@
 	int error = 0;
 	bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;

-	/* Mark this thread as being in a transaction */
-	current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);

 	/*
 	 * Attempt to reserve the needed disk blocks by decrementing
@@ -163,10 +161,8 @@
 	 */
 	if (blocks > 0) {
 		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
-		if (error != 0) {
-			current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+		if (error != 0)
 			return -ENOSPC;
-		}
 		tp->t_blk_res += blocks;
 	}

@@ -241,8 +237,6 @@
 		tp->t_blk_res = 0;
 	}

-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
-
 	return error;
 }

@@ -284,6 +278,8 @@
 	INIT_LIST_HEAD(&tp->t_dfops);
 	tp->t_firstblock = NULLFSBLOCK;

+	/* Mark this thread as being in a transaction */
+	xfs_trans_context_set(tp);
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
 	if (error) {
 		xfs_trans_cancel(tp);
@@ -878,7 +874,8 @@

 	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);

-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	if (!regrant)
+		xfs_trans_context_clear(tp);
 	xfs_trans_free(tp);

 	/*
@@ -910,7 +907,8 @@
 			xfs_log_ticket_ungrant(mp->m_log, tp->t_ticket);
 		tp->t_ticket = NULL;
 	}
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+
+	xfs_trans_context_clear(tp);
 	xfs_trans_free_items(tp, !!error);
 	xfs_trans_free(tp);

@@ -971,7 +969,7 @@
 	}

 	/* mark this thread as no longer being in a transaction */
-	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+	xfs_trans_context_clear(tp);

 	xfs_trans_free_items(tp, dirty);
 	xfs_trans_free(tp);
@@ -1013,6 +1011,7 @@
 	if (error)
 		return error;

+	xfs_trans_context_update(trans, *tpp);
 	/*
 	 * Reserve space in the log for the next transaction.
 	 * This also pushes items in the "AIL", the list of logged items,
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 0846589..c4877afc 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -268,4 +268,34 @@ void xfs_trans_buf_copy_type(struct xfs_buf *dst_bp,
 	return lip->li_ops->iop_relog(lip, tp);
 }

+static inline void
+xfs_trans_context_set(struct xfs_trans *tp)
+{
+	ASSERT(!current->journal_info);
+	current->journal_info = tp;
+	tp->t_pflags = memalloc_nofs_save();
+}
+
+static inline void
+xfs_trans_context_update(struct xfs_trans *old, struct xfs_trans *new)
+{
+	ASSERT(current->journal_info == old);
+	current->journal_info = new;
+}
+
+static inline void
+xfs_trans_context_clear(struct xfs_trans *tp)
+{
+	ASSERT(current->journal_info == tp);
+	current->journal_info = NULL;
+	memalloc_nofs_restore(tp->t_pflags);
+}
+
+static inline bool
+xfs_trans_context_active(void)
+{
+	/* Use journal_info to indicate current is in a transaction */
+	return current->journal_info != NULL;
+}
+
 #endif /* __XFS_TRANS_H__ */
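
Illustration only, not part of the patch: the lifecycle of the four
xfs_trans_context_*() helpers above can be modelled in user space. This
is a minimal sketch under stated assumptions. model_task, model_trans,
MODEL_PF_MEMALLOC_NOFS, model_nofs_save(), model_nofs_restore() and
model_demo() are hypothetical stand-ins for task_struct, struct
xfs_trans, PF_MEMALLOC_NOFS and memalloc_nofs_{save,restore}(); they are
simplified and are not the kernel implementations.

```c
/* User-space model of the xfs_trans_context_*() helpers; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PF_MEMALLOC_NOFS; the value is arbitrary. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

/* Hypothetical stand-ins for struct xfs_trans and the current task. */
struct model_trans {
	unsigned int t_pflags;
};

struct model_task {
	unsigned int flags;
	void *journal_info;
};

static struct model_task model_current;

/* Simplified model of memalloc_nofs_save(): set the flag, return old state. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_current.flags & MODEL_PF_MEMALLOC_NOFS;

	model_current.flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Simplified model of memalloc_nofs_restore(): put the saved state back. */
static void model_nofs_restore(unsigned int old)
{
	model_current.flags =
		(model_current.flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Mirrors xfs_trans_context_set(): record the transaction, enter NOFS. */
static void model_trans_context_set(struct model_trans *tp)
{
	assert(model_current.journal_info == NULL);
	model_current.journal_info = tp;
	tp->t_pflags = model_nofs_save();
}

/* Mirrors xfs_trans_context_update(): hand the context to a new transaction. */
static void model_trans_context_update(struct model_trans *old_tp,
				       struct model_trans *new_tp)
{
	assert(model_current.journal_info == old_tp);
	model_current.journal_info = new_tp;
}

/* Mirrors xfs_trans_context_clear(): forget the transaction, leave NOFS. */
static void model_trans_context_clear(struct model_trans *tp)
{
	assert(model_current.journal_info == tp);
	model_current.journal_info = NULL;
	model_nofs_restore(tp->t_pflags);
}

/* Mirrors xfs_trans_context_active(): is a transaction recorded? */
static bool model_trans_context_active(void)
{
	return model_current.journal_info != NULL;
}

/* Walk one set -> update -> clear cycle; returns 0 when every step holds. */
static int model_demo(void)
{
	struct model_trans first = { 0 }, second = { 0 };

	if (model_trans_context_active())
		return 1;
	model_trans_context_set(&first);		/* as in xfs_trans_alloc() */
	if (!model_trans_context_active())
		return 2;
	model_trans_context_update(&first, &second);	/* as in xfs_trans_roll() */
	model_trans_context_clear(&second);		/* as in commit or cancel */
	return model_trans_context_active() ? 3 : 0;
}
```

In this model the "in transaction" state is carried by the journal_info
pointer rather than by a flag bit, so a writepage-style caller can test
model_trans_context_active() without confusing it with the NOFS
allocation scope, which is the distinction the commit message argues for.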