Message ID | 1578499437-1664-1-git-send-email-laoar.shao@gmail.com (mailing list archive) |
---|---|
Series | protect page cache from freeing inode |
On Thu, Jan 9, 2020 at 12:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On my server there are some running MEMCGs protected by memory.{min, low},
> but I found that the usage of these MEMCGs abruptly became very small,
> far less than the protection limit. It confused me, and finally I
> found that it was because of inode stealing.
> Once an inode is freed, all the page cache belonging to it is dropped as
> well, no matter how many pages it has. So if we intend to protect the
> page cache in a memcg, we must protect its host (the inode) first.
> Otherwise the memcg protection can easily be bypassed by freeing the inode,
> especially if there are big files in this memcg.
> The inherent mismatch between memcg and inode is a problem. One inode can
> be shared by different MEMCGs, but that is a very rare case. If an inode is
> shared, the page cache belonging to it may be charged to different MEMCGs.
> Currently there is no perfect solution to this kind of issue, but the
> inode majority-writer ownership switching can help more or less.
>
> - Changes against v2:
>   1. Separates memcg patches from this patchset, suggested by Roman.
>      A separate patch is already ACKed by Roman; MEMCG maintainers,
>      please help take a look at it [1].
>   2. Improves code around the usage of for_each_mem_cgroup(), suggested
>      by Dave.
>   3. Use memcg_low_reclaim passed from scan_control, instead of
>      introducing a new member in struct mem_cgroup.
>   4. Some other code improvements suggested by Dave.
>
> - Changes against v1:
>   Use the memcg passed from the shrink_control, instead of getting it from
>   the inode itself, suggested by Dave. That could make the layering better.
>
> [1]
> https://lore.kernel.org/linux-mm/CALOAHbBhPgh3WEuLu2B6e2vj1J8K=gGOyCKzb8tKWmDqFs-rfQ@mail.gmail.com/
>
> Yafang Shao (3):
>   mm, list_lru: make memcg visible to lru walker isolation function
>   mm, shrinker: make memcg low reclaim visible to lru walker isolation
>     function
>   memcg, inode: protect page cache from freeing inode
>
>  fs/inode.c                 | 78 ++++++++++++++++++++++++++++++++++++++++++++--
>  include/linux/memcontrol.h | 21 +++++++++++++
>  include/linux/shrinker.h   |  3 ++
>  mm/list_lru.c              | 47 +++++++++++++++++-----------
>  mm/memcontrol.c            | 15 ---------
>  mm/vmscan.c                | 27 +++++++++-------
>  6 files changed, 143 insertions(+), 48 deletions(-)
>

Dave, Johannes,

Any comments on this new version?

Thanks
Yafang

On Wed, Jan 22, 2020 at 09:46:57PM +0800, Yafang Shao wrote:
> On Thu, Jan 9, 2020 at 12:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > [...]
>
> Dave, Johannes,
>
> Any comments on this new version?

Sorry, I lost track of this amongst travel and conferences in
mid-January. Can you update and post it again once -rc1 is out?

Cheers,

Dave.

On Wed, Feb 5, 2020 at 5:20 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Jan 22, 2020 at 09:46:57PM +0800, Yafang Shao wrote:
> > On Thu, Jan 9, 2020 at 12:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > [...]
> >
> > Dave, Johannes,
> >
> > Any comments on this new version?
>
> Sorry, I lost track of this amongst travel and conferences in
> mid-January. Can you update and post it again once -rc1 is out?
>

Sure, I will do it.
Thanks for your reply.

Thanks
Yafang