@@ -1193,7 +1193,7 @@ xfs_reclaim_inode(
*
* Return the number of inodes freed.
*/
-STATIC int
+int
xfs_reclaim_inodes_ag(
struct xfs_mount *mp,
int flags,
@@ -1297,40 +1297,196 @@ xfs_reclaim_inodes_ag(
return freed;
}
-void
-xfs_reclaim_inodes(
- struct xfs_mount *mp)
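+/*
+ * LRU isolation callback for inode reclaim. We are called under the LRU
+ * list lock and only ever use trylocks here, so we never block. Clean
+ * inodes are marked XFS_IRECLAIM and moved to the caller's dispose list;
+ * dirty or pinned inodes are rotated to the tail of the LRU; inodes we
+ * cannot lock are skipped. The lowest dirty LSN seen is recorded so the
+ * caller knows how far to push the AIL to make progress.
+ */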
+enum lru_status
+xfs_inode_reclaim_isolate(
+ struct list_head *item,
+ struct list_lru_one *lru,
+ spinlock_t *lru_lock,
+ void *arg)
{
- xfs_reclaim_inodes_ag(mp, SYNC_WAIT, INT_MAX);
+ struct xfs_ireclaim_args *ra = arg;
+ struct inode *inode = container_of(item, struct inode, i_lru);
+ struct xfs_inode *ip = XFS_I(inode);
+ enum lru_status ret;
+ xfs_lsn_t lsn = 0;
+
+ /*
+ * Careful: the lock order here is the inverse of everywhere else. We
+ * already hold the LRU lock and take i_flags_lock before the ILOCK
+ * and the flush lock, so everything below must be a trylock.
+ */
+ if (!spin_trylock(&ip->i_flags_lock))
+ return LRU_SKIP;
+
+ /* if we are in shutdown, we'll reclaim it even if dirty */
+ ret = LRU_ROTATE;
+ if (!xfs_inode_clean(ip) && !__xfs_iflags_test(ip, XFS_ISTALE) &&
+ !XFS_FORCED_SHUTDOWN(ip->i_mount)) {
+ lsn = ip->i_itemp->ili_item.li_lsn;
+ ra->dirty_skipped++;
+ goto out_unlock_flags;
+ }
+
+ ret = LRU_SKIP;
+ if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
+ goto out_unlock_flags;
+
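+ /*
+ * If the flush lock cannot be taken without blocking, the inode is
+ * currently being flushed. Skip it, but record its LSN so the caller
+ * knows how far to push the AIL before it becomes reclaimable.
+ */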
+ if (!__xfs_iflock_nowait(ip)) {
+ lsn = ip->i_itemp->ili_item.li_lsn;
+ ra->dirty_skipped++;
+ goto out_unlock_inode;
+ }
+
+ if (XFS_FORCED_SHUTDOWN(ip->i_mount))
+ goto reclaim;
+
+ /*
+ * Now that the inode is locked, we can determine whether it is dirty
+ * without racing against anything else.
+ */
+ ret = LRU_ROTATE;
+ if (xfs_ipincount(ip)) {
+ ra->dirty_skipped++;
+ goto out_ifunlock;
+ }
+ if (!xfs_inode_clean(ip) && !__xfs_iflags_test(ip, XFS_ISTALE)) {
+ lsn = ip->i_itemp->ili_item.li_lsn;
+ ra->dirty_skipped++;
+ goto out_ifunlock;
+ }
+
+reclaim:
+ /*
+ * Once we mark the inode with XFS_IRECLAIM, no-one will grab it again.
+ * RCU lookups will still find the inode, but they will back off when they
+ * see the XFS_IRECLAIM flag. Hence we can leave the inode locked as we move it
+ * to the dispose list so we can deal with shutdown cleanup there
+ * outside the LRU lock context.
+ */
+ __xfs_iflags_set(ip, XFS_IRECLAIM);
+ list_lru_isolate_move(lru, &inode->i_lru, &ra->freeable);
+ spin_unlock(&ip->i_flags_lock);
+ return LRU_REMOVED;
+
+out_ifunlock:
+ xfs_ifunlock(ip);
+out_unlock_inode:
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+out_unlock_flags:
+ spin_unlock(&ip->i_flags_lock);
+
+ if (lsn && XFS_LSN_CMP(lsn, ra->lowest_lsn) < 0)
+ ra->lowest_lsn = lsn;
+ return ret;
}
-/*
- * Scan a certain number of inodes for reclaim.
- *
- * When called we make sure that there is a background (fast) inode reclaim in
- * progress, while we will throttle the speed of reclaim via doing synchronous
- * reclaim of inodes. That means if we come across dirty inodes, we wait for
- * them to be cleaned, which we hope will not be very long due to the
- * background walker having already kicked the IO off on those dirty inodes.
- */
-long
-xfs_reclaim_inodes_nr(
- struct xfs_mount *mp,
- int nr_to_scan)
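+/*
+ * Free an inode that has been isolated from the LRU. It arrives here
+ * ILOCKed, flush locked and marked XFS_IRECLAIM. Any shutdown cleanup
+ * deferred by the isolation callback is done first, then the inode is
+ * removed from the per-AG radix tree and handed to RCU for freeing.
+ */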
+static void
+xfs_dispose_inode(
+ struct xfs_inode *ip)
{
- int sync_mode = 0;
+ struct xfs_mount *mp = ip->i_mount;
+ struct xfs_perag *pag;
+ xfs_ino_t ino;
+
+ ASSERT(xfs_isiflocked(ip));
+ ASSERT(xfs_inode_clean(ip) || xfs_iflags_test(ip, XFS_ISTALE) ||
+ XFS_FORCED_SHUTDOWN(mp));
+ ASSERT(ip->i_ino != 0);
/*
- * For kswapd, we kick background inode writeback. For direct
- * reclaim, we issue and wait on inode writeback to throttle
- * reclaim rates and avoid shouty OOM-death.
+ * Process the shutdown reclaim work we deferred from the LRU isolation
+ * callback before we go any further.
*/
- if (current_is_kswapd())
- xfs_ail_push_all(mp->m_ail);
- else
- sync_mode |= SYNC_WAIT;
+ if (XFS_FORCED_SHUTDOWN(mp)) {
+ xfs_iunpin_wait(ip);
+ xfs_iflush_abort(ip, false);
+ } else {
+ xfs_ifunlock(ip);
+ }
- return xfs_reclaim_inodes_ag(mp, sync_mode, nr_to_scan);
+ /*
+ * Because we use RCU freeing we need to ensure the inode always appears
+ * to be reclaimed with an invalid inode number when in the free state.
+ * We do this as early as possible under the ILOCK so that
+ * xfs_iflush_cluster() and xfs_ifree_cluster() can be guaranteed to
+ * detect races with us here. By doing this, we guarantee that once
+ * xfs_iflush_cluster() or xfs_ifree_cluster() has locked XFS_ILOCK that
+ * it will see either a valid inode that will serialise correctly, or it
+ * will see an invalid inode that it can skip.
+ */
+ spin_lock(&ip->i_flags_lock);
+ ino = ip->i_ino; /* for radix_tree_delete */
+ ip->i_flags = XFS_IRECLAIM;
+ ip->i_ino = 0;
+ spin_unlock(&ip->i_flags_lock);
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+ XFS_STATS_INC(mp, xs_ig_reclaims);
+ /*
+ * Remove the inode from the per-AG radix tree.
+ *
+ * Because radix_tree_delete won't complain even if the item was never
+ * added to the tree, assert that it was there beforehand to catch
+ * problems with the inode lifetime early on.
+ */
+ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ino));
+ spin_lock(&pag->pag_ici_lock);
+ if (!radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ino)))
+ ASSERT(0);
+ spin_unlock(&pag->pag_ici_lock);
+ xfs_perag_put(pag);
+
+ /*
+ * Here we do an (almost) spurious inode lock in order to coordinate
+ * with inode cache radix tree lookups. This is because the lookup
+ * can reference the inodes in the cache without taking references.
+ *
+ * We make that OK here by ensuring that we wait until the inode is
+ * unlocked after the lookup before we go ahead and free it.
+ *
+ * XXX: need to check this is still true. Not sure it is.
+ */
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+ xfs_qm_dqdetach(ip);
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+ __xfs_inode_free(ip);
+}
+
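+/*
+ * Free all the inodes on the dispose list built up by the LRU walk,
+ * rescheduling between inodes so we don't hog the CPU on long lists.
+ */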
+void
+xfs_dispose_inodes(
+ struct list_head *freeable)
+{
+ while (!list_empty(freeable)) {
+ struct inode *inode;
+
+ inode = list_first_entry(freeable, struct inode, i_lru);
+ list_del_init(&inode->i_lru);
+
+ xfs_dispose_inode(XFS_I(inode));
+ cond_resched();
+ }
+}
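+
+/*
+ * Reclaim all the inodes on this mount (e.g. at unmount), walking the
+ * LRU repeatedly until it is empty.
+ */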
+void
+xfs_reclaim_inodes(
+ struct xfs_mount *mp)
+{
+ while (list_lru_count(&mp->m_inode_lru)) {
+ struct xfs_ireclaim_args ra;
+ long freed, to_free;
+
+ INIT_LIST_HEAD(&ra.freeable);
+ ra.lowest_lsn = NULLCOMMITLSN;
+ ra.dirty_skipped = 0;
+ to_free = list_lru_count(&mp->m_inode_lru);
+
+ freed = list_lru_walk(&mp->m_inode_lru, xfs_inode_reclaim_isolate,
+ &ra, to_free);
+ xfs_dispose_inodes(&ra.freeable);
+
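+ /*
+ * If we made no progress then everything left on the LRU is dirty
+ * or pinned, so force the log and push the whole AIL to get inode
+ * writeback moving. Otherwise do a synchronous push to the lowest
+ * LSN we skipped so those inodes are clean on the next pass.
+ */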
+ if (freed == 0) {
+ xfs_log_force(mp, XFS_LOG_SYNC);
+ xfs_ail_push_all(mp->m_ail);
+ } else if (ra.lowest_lsn != NULLCOMMITLSN) {
+ xfs_ail_push_sync(mp->m_ail, ra.lowest_lsn);
+ }
+ cond_resched();
+ }
}
STATIC int
@@ -49,8 +49,16 @@ int xfs_iget(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t ino,
struct xfs_inode * xfs_inode_alloc(struct xfs_mount *mp, xfs_ino_t ino);
void xfs_inode_free(struct xfs_inode *ip);
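+/*
+ * Arguments and results for an inode reclaim LRU walk: the dispose list
+ * of isolated inodes, the lowest LSN of any dirty inode we skipped, and
+ * a count of the dirty/pinned inodes that were skipped.
+ */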
+struct xfs_ireclaim_args {
+ struct list_head freeable;
+ xfs_lsn_t lowest_lsn;
+ unsigned long dirty_skipped;
+};
+
+enum lru_status xfs_inode_reclaim_isolate(struct list_head *item,
+ struct list_lru_one *lru, spinlock_t *lru_lock, void *arg);
+void xfs_dispose_inodes(struct list_head *freeable);
void xfs_reclaim_inodes(struct xfs_mount *mp);
-long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
@@ -263,6 +263,14 @@ static inline int xfs_isiflocked(struct xfs_inode *ip)
extern void __xfs_iflock(struct xfs_inode *ip);
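+/*
+ * Non-blocking flush lock. The caller must hold ip->i_flags_lock as the
+ * flag is checked and set here non-atomically.
+ */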
+static inline int __xfs_iflock_nowait(struct xfs_inode *ip)
+{
+ if (ip->i_flags & XFS_IFLOCK)
+ return false;
+ ip->i_flags |= XFS_IFLOCK;
+ return true;
+}
+
static inline int xfs_iflock_nowait(struct xfs_inode *ip)
{
return !xfs_iflags_test_and_set(ip, XFS_IFLOCK);
@@ -17,6 +17,7 @@
#include "xfs_alloc.h"
#include "xfs_fsops.h"
#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
#include "xfs_buf_item.h"
#include "xfs_log.h"
#include "xfs_log_priv.h"
@@ -1811,23 +1812,56 @@ xfs_fs_mount(
}
static long
-xfs_fs_nr_cached_objects(
+xfs_fs_free_cached_objects(
struct super_block *sb,
struct shrink_control *sc)
{
- /* Paranoia: catch incorrect calls during mount setup or teardown */
- if (WARN_ON_ONCE(!sb->s_fs_info))
- return 0;
+ struct xfs_mount *mp = XFS_M(sb);
+ struct xfs_ireclaim_args ra;
+ long freed;
- return list_lru_shrink_count(&XFS_M(sb)->m_inode_lru, sc);
+ INIT_LIST_HEAD(&ra.freeable);
+ ra.lowest_lsn = NULLCOMMITLSN;
+ ra.dirty_skipped = 0;
+
+ freed = list_lru_shrink_walk(&mp->m_inode_lru, sc,
+ xfs_inode_reclaim_isolate, &ra);
+ xfs_dispose_inodes(&ra.freeable);
+
+ /*
+ * Deal with dirty inodes. We will have the LSN of
+ * the oldest dirty inode in our reclaim args if we skipped any.
+ *
+ * For kswapd, if we skipped too many dirty inodes (i.e. more dirty than
+ * we freed) then we need kswapd to back off once its scan has been
+ * completed. That way there will be some clean inodes when it comes back
+ * and it can make progress, but we still make sure that inode cleaning
+ * is in progress.
+ *
+ * Direct reclaim will be throttled by the caller as it winds the
+ * priority up. All we need to do is keep pushing on dirty inodes
+ * in the background so when we come back progress will be made.
+ */
+ if (current_is_kswapd() && ra.dirty_skipped >= freed) {
+ if (current->reclaim_state)
+ current->reclaim_state->need_backoff = true;
+ }
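+ /*
+ * Kick the AIL along to at least the lowest LSN we skipped so the
+ * dirty inodes get written back and become reclaimable by the time
+ * we scan again.
+ */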
+ if (ra.lowest_lsn != NULLCOMMITLSN)
+ xfs_ail_push(mp->m_ail, ra.lowest_lsn);
+
+ return freed;
}
static long
-xfs_fs_free_cached_objects(
+xfs_fs_nr_cached_objects(
struct super_block *sb,
struct shrink_control *sc)
{
- return xfs_reclaim_inodes_nr(XFS_M(sb), sc->nr_to_scan);
+ /* Paranoia: catch incorrect calls during mount setup or teardown */
+ if (WARN_ON_ONCE(!sb->s_fs_info))
+ return 0;
+
+ return list_lru_shrink_count(&XFS_M(sb)->m_inode_lru, sc);
}
static const struct super_operations xfs_super_operations = {