Message ID | 20200908142456.89626-1-zangchunxin@bytedance.com (mailing list archive)
---|---
State | New, archived
Series | mm/vmscan: fix infinite loop in drop_slab_node
drop_caches by its very nature can be extremely performance intensive -- if
someone wants to abort after trying too long, they can just send a
TASK_KILLABLE signal, no? If exiting the loop and returning to usermode
doesn't reliably work when doing that, then _that's_ something to improve,
but this looks premature to me until that's demonstrated not to work.

zangchunxin@bytedance.com writes:
>In one drop caches action, only traverse memcg once maybe is better.
>If user need more memory, they can do drop caches again.

Can you please provide some measurements of the difference in reclamation
in practice?
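For context, the userspace side of the abort Chris describes would look
roughly like the sketch below: trigger the slab drop in a child process and
SIGKILL it if it runs too long. This is an illustrative example, not code
from the thread, and (as the rest of the discussion shows) it only helps
once drop_slab_node() actually reacts to a pending fatal signal. The
30-second timeout is an arbitrary example value.

    /*
     * Illustrative userspace sketch (not from the thread): drop reclaimable
     * slab objects via /proc/sys/vm/drop_caches in a child process and send
     * SIGKILL if it runs longer than a timeout.
     */
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
            pid_t pid = fork();

            if (pid < 0) {
                    perror("fork");
                    return 1;
            }

            if (pid == 0) {
                    /* Child: "2" asks the kernel to drop slab objects only. */
                    int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);

                    if (fd < 0 || write(fd, "2", 1) != 1) {
                            perror("drop_caches");
                            _exit(1);
                    }
                    close(fd);
                    _exit(0);
            }

            /* Parent: poll for up to 30 seconds, then give up and kill. */
            int waited = 0;

            while (waited < 30 && waitpid(pid, NULL, WNOHANG) == 0) {
                    sleep(1);
                    waited++;
            }
            if (waited == 30) {
                    kill(pid, SIGKILL);
                    waitpid(pid, NULL, 0);
            }
            return 0;
    }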
On 9/8/20 5:09 PM, Chris Down wrote:
> drop_caches by its very nature can be extremely performance intensive -- if
> someone wants to abort after trying too long, they can just send a
> TASK_KILLABLE signal, no? If exiting the loop and returning to usermode
> doesn't reliably work when doing that, then _that's_ something to improve,
> but this looks premature to me until that's demonstrated not to work.

Hm, there might be existing scripts (even though I dislike those) running
drop_caches periodically, and they are currently not set up to be killed,
so one day it might surprise someone. Dropping should be a one-time event,
not a continual reclaim. Maybe we could be a bit smarter and e.g. double
the threshold currently hardcoded as "10" with each iteration?

> zangchunxin@bytedance.com writes:
>> In one drop caches action, only traverse memcg once maybe is better.
>> If user need more memory, they can do drop caches again.
>
> Can you please provide some measurements of the difference in reclamation
> in practice?
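To make the doubling idea concrete, a rough kernel-side sketch of Vlastimil's
suggestion (an illustration, not a patch anyone posted in the thread) could
look like this, starting from the existing drop_slab_node():

    /*
     * Sketch of the "double the threshold each iteration" idea: the loop
     * still rescans all memcgs while it keeps freeing objects, but the exit
     * bar rises on every pass, so it terminates even if other activity
     * keeps regenerating slab. Illustration only, not a submitted patch.
     */
    void drop_slab_node(int nid)
    {
            unsigned long freed;
            unsigned long threshold = 10;

            do {
                    struct mem_cgroup *memcg = NULL;

                    freed = 0;
                    memcg = mem_cgroup_iter(NULL, NULL, NULL);
                    do {
                            freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
                    } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);

                    /* Raise the exit threshold with each pass. */
                    threshold *= 2;
            } while (freed > threshold);
    }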
Vlastimil Babka writes:
>On 9/8/20 5:09 PM, Chris Down wrote:
>> drop_caches by its very nature can be extremely performance intensive -- if
>> someone wants to abort after trying too long, they can just send a
>> TASK_KILLABLE signal, no? If exiting the loop and returning to usermode
>> doesn't reliably work when doing that, then _that's_ something to improve,
>> but this looks premature to me until that's demonstrated not to work.
>
>Hm, there might be existing scripts (even though I dislike those) running
>drop_caches periodically, and they are currently not set up to be killed,
>so one day it might surprise someone. Dropping should be a one-time event,
>not a continual reclaim.

Sure, but these scripts can send it to their child themselves instead of
using WCE, no? As far as I know, that's the already supported way to abort
drop_caches-induced reclaim.
Hi Chris,

On Tue, Sep 8, 2020 at 11:09 PM Chris Down <chris@chrisdown.name> wrote:
>
> drop_caches by its very nature can be extremely performance intensive -- if
> someone wants to abort after trying too long, they can just send a
> TASK_KILLABLE signal, no? If exiting the loop and returning to usermode
> doesn't reliably work when doing that, then _that's_ something to improve,
> but this looks premature to me until that's demonstrated not to work.

Sending a TASK_KILLABLE signal? That doesn't work right now, because the
current task has no chance to handle the signal. So I think we may need to
do one of the following things to avoid this case:

1. Double the threshold currently hardcoded as "10" with each iteration, as
   suggested by Vlastimil. It is also a good idea.

2. In the while loop, check whether a fatal signal is pending and, if so,
   break out of the loop, like the following code snippet. Thanks.

@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
         do {
                 struct mem_cgroup *memcg = NULL;

+                if (fatal_signal_pending(current))
+                        return;
+
                 freed = 0;
                 memcg = mem_cgroup_iter(NULL, NULL, NULL);
                 do {

>
> zangchunxin@bytedance.com writes:
> >In one drop caches action, only traverse memcg once maybe is better.
> >If user need more memory, they can do drop caches again.
>
> Can you please provide some measurements of the difference in reclamation
> in practice?

--
Yours,
Muchun
Muchun Song writes:
>1. Double the threshold currently hardcoded as "10" with each iteration, as
>   suggested by Vlastimil. It is also a good idea.

I think this sounds reasonable, although I'd like to see what the difference
in reclaim looks like in practice.

>2. In the while loop, check whether a fatal signal is pending and, if so,
>   break out of the loop, like the following code snippet. Thanks.
>
>@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
>         do {
>                 struct mem_cgroup *memcg = NULL;
>
>+                if (fatal_signal_pending(current))
>+                        return;
>+
>                 freed = 0;
>                 memcg = mem_cgroup_iter(NULL, NULL, NULL);
>                 do {

Regardless of anything, I think this is probably a good idea. Could you send
it as a patch? :-)

Thanks,

Chris
On Wed, Sep 9, 2020 at 5:59 PM Chris Down <chris@chrisdown.name> wrote:
>
> Muchun Song writes:
> >1. Double the threshold currently hardcoded as "10" with each iteration, as
> >   suggested by Vlastimil. It is also a good idea.
>
> I think this sounds reasonable, although I'd like to see what the difference
> in reclaim looks like in practice.
>
> >2. In the while loop, check whether a fatal signal is pending and, if so,
> >   break out of the loop, like the following code snippet. Thanks.
> >
> >@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
> >         do {
> >                 struct mem_cgroup *memcg = NULL;
> >
> >+                if (fatal_signal_pending(current))
> >+                        return;
> >+
> >                 freed = 0;
> >                 memcg = mem_cgroup_iter(NULL, NULL, NULL);
> >                 do {
>
> Regardless of anything, I think this is probably a good idea. Could you send
> it as a patch? :-)

OK, will do that. Thanks.

>
> Thanks,
>
> Chris
Chris Down <chris@chrisdown.name> writes:

> Muchun Song writes:
> >1. Double the threshold currently hardcoded as "10" with each iteration, as
> >   suggested by Vlastimil. It is also a good idea.
>
> I think this sounds reasonable, although I'd like to see what the difference
> in reclaim looks like in practice.
>
> >2. In the while loop, check whether a fatal signal is pending and, if so,
> >   break out of the loop, like the following code snippet. Thanks.
> >
> >@@ -704,6 +704,9 @@ void drop_slab_node(int nid)
> >         do {
> >                 struct mem_cgroup *memcg = NULL;
> >
> >+                if (fatal_signal_pending(current))
> >+                        return;
> >+
> >                 freed = 0;
> >                 memcg = mem_cgroup_iter(NULL, NULL, NULL);
> >                 do {
>
> Regardless of anything, I think this is probably a good idea. Could you send
> it as a patch? :-)
>
> Thanks,
>
> Chris

We have sent the v2 version, please check.

Best Wishes
ChunXin
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d84326bdf2..9d8ee2ae5824 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -699,17 +699,12 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,

 void drop_slab_node(int nid)
 {
-        unsigned long freed;
+        struct mem_cgroup *memcg = NULL;

+        memcg = mem_cgroup_iter(NULL, NULL, NULL);
         do {
-                struct mem_cgroup *memcg = NULL;
-
-                freed = 0;
-                memcg = mem_cgroup_iter(NULL, NULL, NULL);
-                do {
-                        freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
-                } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
-        } while (freed > 10);
+                shrink_slab(GFP_KERNEL, nid, memcg, 0);
+        } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
 }

 void drop_slab(void)
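For reference, the direction the thread converges on -- a single pass per
memcg combined with bailing out on a fatal signal -- could be sketched
roughly as below. This merges the patch above with the check Muchun
proposed; it is an illustration only, not the v2 submission mentioned in
the thread.

    /*
     * Sketch only (not the v2 patch): single pass over the memcgs as in
     * the diff above, plus an early bail-out when the caller has been sent
     * a fatal signal. mem_cgroup_iter_break() releases the reference the
     * iterator still holds when the walk is abandoned early.
     */
    void drop_slab_node(int nid)
    {
            struct mem_cgroup *memcg;

            memcg = mem_cgroup_iter(NULL, NULL, NULL);
            do {
                    if (fatal_signal_pending(current)) {
                            mem_cgroup_iter_break(NULL, memcg);
                            return;
                    }
                    shrink_slab(GFP_KERNEL, nid, memcg, 0);
            } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
    }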