Message ID | 20210729125755.16871-5-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Series | Cleanups and fixup for memcontrol |

On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
> rtpn might be NULL in very rare case. We have better to check it before
> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
> soft_limit_tree, warn this case and continue.

Why would we need to warn? the GFP flags don't contain NOWARN, so
we already know an allocation failed.

On 2021/7/29 21:52, Matthew Wilcox wrote:
> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
>> rtpn might be NULL in very rare case. We have better to check it before
>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
>> soft_limit_tree, warn this case and continue.
>
> Why would we need to warn? the GFP flags don't contain NOWARN, so
> we already know an allocation failed.

I see. Will remove it. Many thanks!

On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
> rtpn might be NULL in very rare case. We have better to check it before
> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
> soft_limit_tree, warn this case and continue.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5b4592d1e0f2..70a32174e7c4 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>  				    node_online(node) ? node : NUMA_NO_NODE);
>
> +		if (WARN_ON_ONCE(!rtpn))
> +			continue;

I also really doubt that it makes any sense to continue in this case.
If this allocations fails (at the very beginning of the system's life, it's an __init function),
something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
a perfect choice.

Is this a real world problem? Do I miss something?

On 2021/7/30 11:12, Roman Gushchin wrote:
> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
>> rtpn might be NULL in very rare case. We have better to check it before
>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
>> soft_limit_tree, warn this case and continue.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/memcontrol.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 5b4592d1e0f2..70a32174e7c4 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>>  				    node_online(node) ? node : NUMA_NO_NODE);
>>
>> +		if (WARN_ON_ONCE(!rtpn))
>> +			continue;
>
> I also really doubt that it makes any sense to continue in this case.
> If this allocations fails (at the very beginning of the system's life, it's an __init function),
> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
> a perfect choice.
>
> Is this a real world problem? Do I miss something?

No, this is a theoretical bug, a very rare case but not impossible IMO. Since we can't live with NULL
rb_tree_per_node in soft_limit_tree, I think simply continue or break here without panic is also
acceptable. Or is it more proper to choose panic here?

Thanks.

On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
> > rtpn might be NULL in very rare case. We have better to check it before
> > dereferencing it. Since memcg can live with NULL rb_tree_per_node in
> > soft_limit_tree, warn this case and continue.
> >
> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> > ---
> >  mm/memcontrol.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 5b4592d1e0f2..70a32174e7c4 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
> >  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
> >  				    node_online(node) ? node : NUMA_NO_NODE);
> >
> > +		if (WARN_ON_ONCE(!rtpn))
> > +			continue;
>
> I also really doubt that it makes any sense to continue in this case.
> If this allocations fails (at the very beginning of the system's life, it's an __init function),
> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
> a perfect choice.

Moreover this is 24B allocation during early boot. Kernel will OOM and
panic when not being able to find any victim. I do not think we need to
do any special handling here.

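
For readers wondering where the "24B" figure above comes from, here is a minimal userspace sketch. The three field names (`rb_root`, `rb_rightmost`, `lock`) are taken from the hunk quoted in this thread; everything else is an assumption made for illustration: the `_sketch` type names are hypothetical, the rb-tree root is modelled as a single pointer, and the spinlock is modelled as one `unsigned int`, which only approximates a kernel built without lock debugging. On a typical LP64 target that layout would be two 8-byte pointers plus a 4-byte lock word, padded up to pointer alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; these layouts are assumptions. */
struct rb_node_sketch;                          /* opaque, only used via pointer */
struct rb_root_sketch {
	struct rb_node_sketch *rb_node;         /* a root modelled as one pointer */
};

/* Hypothetical mirror of the object that the quoted hunk initialises. */
struct tree_per_node_sketch {
	struct rb_root_sketch rb_root;          /* one pointer */
	struct rb_node_sketch *rb_rightmost;    /* one pointer */
	unsigned int lock;                      /* stand-in for a non-debug spinlock */
};

/* The object holds at least two pointers plus the lock word. */
static_assert(sizeof(struct tree_per_node_sketch) >=
	      2 * sizeof(void *) + sizeof(unsigned int),
	      "unexpected layout for the sketch struct");

/* Size of the sketch object on the platform this is compiled for. */
size_t tree_per_node_size(void)
{
	return sizeof(struct tree_per_node_sketch);
}
```

The exact number depends on the target and, in the real kernel, on the configuration, so the sketch only checks a lower bound rather than asserting 24.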
On 2021/7/30 14:44, Michal Hocko wrote:
> On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
>>> rtpn might be NULL in very rare case. We have better to check it before
>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
>>> soft_limit_tree, warn this case and continue.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>  mm/memcontrol.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>> index 5b4592d1e0f2..70a32174e7c4 100644
>>> --- a/mm/memcontrol.c
>>> +++ b/mm/memcontrol.c
>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
>>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>>>  				    node_online(node) ? node : NUMA_NO_NODE);
>>>
>>> +		if (WARN_ON_ONCE(!rtpn))
>>> +			continue;
>>
>> I also really doubt that it makes any sense to continue in this case.
>> If this allocations fails (at the very beginning of the system's life, it's an __init function),
>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
>> a perfect choice.
>
> Moreover this is 24B allocation during early boot. Kernel will OOM and
> panic when not being able to find any victim. I do not think we need to

Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend
it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist.

Thanks both of you.

> do any special handling here.

On Sat 31-07-21 10:05:51, Miaohe Lin wrote:
> On 2021/7/30 14:44, Michal Hocko wrote:
> > On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
> >> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
> >>> rtpn might be NULL in very rare case. We have better to check it before
> >>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
> >>> soft_limit_tree, warn this case and continue.
> >>>
> >>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> >>> ---
> >>>  mm/memcontrol.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >>> index 5b4592d1e0f2..70a32174e7c4 100644
> >>> --- a/mm/memcontrol.c
> >>> +++ b/mm/memcontrol.c
> >>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
> >>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
> >>>  				    node_online(node) ? node : NUMA_NO_NODE);
> >>>
> >>> +		if (WARN_ON_ONCE(!rtpn))
> >>> +			continue;
> >>
> >> I also really doubt that it makes any sense to continue in this case.
> >> If this allocations fails (at the very beginning of the system's life, it's an __init function),
> >> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
> >> a perfect choice.
> >
> > Moreover this is 24B allocation during early boot. Kernel will OOM and
> > panic when not being able to find any victim. I do not think we need to
>
> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend
> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist.

It is not that I would insist. I just do not see any point in the code
churn. This check is not going to ever trigger and there is nothing you
can do to recover anyway so crashing the kernel is likely the only
choice left.

On 2021/8/2 14:43, Michal Hocko wrote:
> On Sat 31-07-21 10:05:51, Miaohe Lin wrote:
>> On 2021/7/30 14:44, Michal Hocko wrote:
>>> On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
>>>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
>>>>> rtpn might be NULL in very rare case. We have better to check it before
>>>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
>>>>> soft_limit_tree, warn this case and continue.
>>>>>
>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>>> ---
>>>>>  mm/memcontrol.c | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>>>> index 5b4592d1e0f2..70a32174e7c4 100644
>>>>> --- a/mm/memcontrol.c
>>>>> +++ b/mm/memcontrol.c
>>>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
>>>>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>>>>>  				    node_online(node) ? node : NUMA_NO_NODE);
>>>>>
>>>>> +		if (WARN_ON_ONCE(!rtpn))
>>>>> +			continue;
>>>>
>>>> I also really doubt that it makes any sense to continue in this case.
>>>> If this allocations fails (at the very beginning of the system's life, it's an __init function),
>>>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
>>>> a perfect choice.
>>>
>>> Moreover this is 24B allocation during early boot. Kernel will OOM and
>>> panic when not being able to find any victim. I do not think we need to
>>
>> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend
>> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist.
>
> It is not that I would insist. I just do not see any point in the code
> churn. This check is not going to ever trigger and there is nothing you
> can do to recover anyway so crashing the kernel is likely the only
> choice left.
>

I hope I get the point now. What you mean is nothing we can do to recover and panic'ing on a
NULL-pointer dereference is a perfect choice ? Should we declare that we leave the rtpn without
NULL check on purpose like below ?

Many thanks.

@@ -7109,8 +7109,12 @@ static int __init mem_cgroup_init(void)
 		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
 				    node_online(node) ? node : NUMA_NO_NODE);

-		if (WARN_ON_ONCE(!rtpn))
-			continue;
+		/*
+		 * If this allocation fails (at the very beginning of the
+		 * system's life, it's an __init function), something is
+		 * terribly wrong and panic'ing on a NULL-pointer
+		 * dereference sounds like a perfect choice.
+		 */
 		rtpn->rb_root = RB_ROOT;
 		rtpn->rb_rightmost = NULL;
 		spin_lock_init(&rtpn->lock);

On Mon 02-08-21 18:00:10, Miaohe Lin wrote:
> On 2021/8/2 14:43, Michal Hocko wrote:
> > On Sat 31-07-21 10:05:51, Miaohe Lin wrote:
> >> On 2021/7/30 14:44, Michal Hocko wrote:
> >>> On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
> >>>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
> >>>>> rtpn might be NULL in very rare case. We have better to check it before
> >>>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
> >>>>> soft_limit_tree, warn this case and continue.
> >>>>>
> >>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> >>>>> ---
> >>>>>  mm/memcontrol.c | 2 ++
> >>>>>  1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >>>>> index 5b4592d1e0f2..70a32174e7c4 100644
> >>>>> --- a/mm/memcontrol.c
> >>>>> +++ b/mm/memcontrol.c
> >>>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
> >>>>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
> >>>>>  				    node_online(node) ? node : NUMA_NO_NODE);
> >>>>>
> >>>>> +		if (WARN_ON_ONCE(!rtpn))
> >>>>> +			continue;
> >>>>
> >>>> I also really doubt that it makes any sense to continue in this case.
> >>>> If this allocations fails (at the very beginning of the system's life, it's an __init function),
> >>>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
> >>>> a perfect choice.
> >>>
> >>> Moreover this is 24B allocation during early boot. Kernel will OOM and
> >>> panic when not being able to find any victim. I do not think we need to
> >>
> >> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend
> >> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist.
> >
> > It is not that I would insist. I just do not see any point in the code
> > churn. This check is not going to ever trigger and there is nothing you
> > can do to recover anyway so crashing the kernel is likely the only
> > choice left.
> >
>
> I hope I get the point now. What you mean is nothing we can do to recover and panic'ing on a
> NULL-pointer dereference is a perfect choice ? Should we declare that we leave the rtpn without
> NULL check on purpose like below ?
>
> Many thanks.
>
> @@ -7109,8 +7109,12 @@ static int __init mem_cgroup_init(void)
>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>  				    node_online(node) ? node : NUMA_NO_NODE);
>
> -		if (WARN_ON_ONCE(!rtpn))
> -			continue;
> +		/*
> +		 * If this allocation fails (at the very beginning of the
> +		 * system's life, it's an __init function), something is
> +		 * terribly wrong and panic'ing on a NULL-pointer
> +		 * dereference sounds like a perfect choice.
> +		 */

I am not really sure this is really worth it. Really we do not really
want to have similar comments all over the early init code, do we?

>  		rtpn->rb_root = RB_ROOT;
>  		rtpn->rb_rightmost = NULL;
>  		spin_lock_init(&rtpn->lock);

On 2021/8/2 18:42, Michal Hocko wrote:
> On Mon 02-08-21 18:00:10, Miaohe Lin wrote:
>> On 2021/8/2 14:43, Michal Hocko wrote:
>>> On Sat 31-07-21 10:05:51, Miaohe Lin wrote:
>>>> On 2021/7/30 14:44, Michal Hocko wrote:
>>>>> On Thu 29-07-21 20:12:43, Roman Gushchin wrote:
>>>>>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote:
>>>>>>> rtpn might be NULL in very rare case. We have better to check it before
>>>>>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in
>>>>>>> soft_limit_tree, warn this case and continue.
>>>>>>>
>>>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>>>>> ---
>>>>>>>  mm/memcontrol.c | 2 ++
>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>>>>>> index 5b4592d1e0f2..70a32174e7c4 100644
>>>>>>> --- a/mm/memcontrol.c
>>>>>>> +++ b/mm/memcontrol.c
>>>>>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
>>>>>>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>>>>>>>  				    node_online(node) ? node : NUMA_NO_NODE);
>>>>>>>
>>>>>>> +		if (WARN_ON_ONCE(!rtpn))
>>>>>>> +			continue;
>>>>>>
>>>>>> I also really doubt that it makes any sense to continue in this case.
>>>>>> If this allocations fails (at the very beginning of the system's life, it's an __init function),
>>>>>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like
>>>>>> a perfect choice.
>>>>>
>>>>> Moreover this is 24B allocation during early boot. Kernel will OOM and
>>>>> panic when not being able to find any victim. I do not think we need to
>>>>
>>>> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend
>>>> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist.
>>>
>>> It is not that I would insist. I just do not see any point in the code
>>> churn. This check is not going to ever trigger and there is nothing you
>>> can do to recover anyway so crashing the kernel is likely the only
>>> choice left.
>>>
>>
>> I hope I get the point now. What you mean is nothing we can do to recover and panic'ing on a
>> NULL-pointer dereference is a perfect choice ? Should we declare that we leave the rtpn without
>> NULL check on purpose like below ?
>>
>> Many thanks.
>>
>> @@ -7109,8 +7109,12 @@ static int __init mem_cgroup_init(void)
>>  		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
>>  				    node_online(node) ? node : NUMA_NO_NODE);
>>
>> -		if (WARN_ON_ONCE(!rtpn))
>> -			continue;
>> +		/*
>> +		 * If this allocation fails (at the very beginning of the
>> +		 * system's life, it's an __init function), something is
>> +		 * terribly wrong and panic'ing on a NULL-pointer
>> +		 * dereference sounds like a perfect choice.
>> +		 */
>
> I am not really sure this is really worth it. Really we do not really
> want to have similar comments all over the early init code, do we?

Maybe not. Will drop this patch. Thanks.

>
>>  		rtpn->rb_root = RB_ROOT;
>>  		rtpn->rb_rightmost = NULL;
>>  		spin_lock_init(&rtpn->lock);

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5b4592d1e0f2..70a32174e7c4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void)
 		rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL,
 				    node_online(node) ? node : NUMA_NO_NODE);

+		if (WARN_ON_ONCE(!rtpn))
+			continue;
 		rtpn->rb_root = RB_ROOT;
 		rtpn->rb_rightmost = NULL;
 		spin_lock_init(&rtpn->lock);

rtpn might be NULL in very rare case. We have better to check it before
dereferencing it. Since memcg can live with NULL rb_tree_per_node in
soft_limit_tree, warn this case and continue.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memcontrol.c | 2 ++
 1 file changed, 2 insertions(+)
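
To make the behaviour of the proposed "warn and continue" path concrete, the following is a rough userspace sketch, not kernel code. It assumes a per-node table of pointers filled in by an init loop shaped like the hunk above; the table, the `_sketch` names, the fixed node count, and the `fail_node` knob that forces one allocation to fail are all hypothetical, and the warning itself is left out. It only shows that skipping the node leaves its slot NULL while the other nodes are set up normally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 4	/* hypothetical node count */

/* Hypothetical stand-in for the per-node object from the patch. */
struct tree_per_node_sketch {
	void *rb_root;
	void *rb_rightmost;
	unsigned int lock;
};

/* Hypothetical per-node table; a NULL slot means "no tree for this node". */
static struct tree_per_node_sketch *tree_sketch[SKETCH_MAX_NODES];

/* Test allocator: pretends the allocation for fail_node returned NULL. */
static struct tree_per_node_sketch *alloc_node_sketch(int node, int fail_node)
{
	if (node == fail_node)
		return NULL;
	return calloc(1, sizeof(struct tree_per_node_sketch));
}

/* Init loop with the proposed check; returns how many nodes were skipped. */
int init_trees_sketch(int fail_node)
{
	int skipped = 0;

	for (int node = 0; node < SKETCH_MAX_NODES; node++) {
		struct tree_per_node_sketch *rtpn;

		free(tree_sketch[node]);	/* drop state from a previous run */
		tree_sketch[node] = NULL;

		rtpn = alloc_node_sketch(node, fail_node);
		if (!rtpn) {			/* the "continue" path from the patch */
			skipped++;
			continue;
		}
		rtpn->rb_root = NULL;
		rtpn->rb_rightmost = NULL;
		rtpn->lock = 0;
		tree_sketch[node] = rtpn;
	}
	return skipped;
}

/* True if the given node ended up with a tree after init. */
bool node_has_tree_sketch(int node)
{
	assert(node >= 0 && node < SKETCH_MAX_NODES);
	return tree_sketch[node] != NULL;
}
```

Calling `init_trees_sketch(1)` in this sketch leaves node 1 without a tree and the remaining nodes initialised. Whether a real system could get that far after such a failure during early boot is exactly what the replies in this thread call into question.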