Message ID | 20240507091858.36ff767f@imladris.surriel.com |
---|---|
State | New, archived |
Series | fs/proc: fix softlockup in __read_vmcore |
Hi,

On 05/07/24 at 09:18am, Rik van Riel wrote:
> While taking a kernel core dump with makedumpfile on a larger system,
> softlockup messages often appear.
>
> While softlockup warnings can be harmless, they can also interfere
> with things like RCU freeing memory, which can be problematic when
> the kdump kexec image is configured with as little memory as possible.
>
> Avoid the softlockup, and give things like work items and RCU a
> chance to do their thing during __read_vmcore by adding a cond_resched.

Thanks for fixing this.

By the way, is it easy to reproduce? And should we add some trace of the
softlockup into log so that people can search for it and confirm when
encountering it?

Thanks
Baoquan

> ---
>  fs/proc/vmcore.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
> index 1fb213f379a5..d06607a1f137 100644
> --- a/fs/proc/vmcore.c
> +++ b/fs/proc/vmcore.c
> @@ -383,6 +383,8 @@ static ssize_t __read_vmcore(struct iov_iter *iter, loff_t *fpos)
>  		/* leave now if filled buffer already */
>  		if (!iov_iter_count(iter))
>  			return acc;
> +
> +		cond_resched();
>  	}
>
>  	list_for_each_entry(m, &vmcore_list, list) {
> --
> 2.42.0
On Thu, 2024-05-09 at 11:52 +0800, Baoquan He wrote:
> Hi,
>
> On 05/07/24 at 09:18am, Rik van Riel wrote:
> > While taking a kernel core dump with makedumpfile on a larger
> > system, softlockup messages often appear.
> >
> > While softlockup warnings can be harmless, they can also interfere
> > with things like RCU freeing memory, which can be problematic when
> > the kdump kexec image is configured with as little memory as
> > possible.
> >
> > Avoid the softlockup, and give things like work items and RCU a
> > chance to do their thing during __read_vmcore by adding a
> > cond_resched.
>
> Thanks for fixing this.
>
> By the way, is it easy to reproduce? And should we add some trace of
> the softlockup into log so that people can search for it and confirm
> when encountering it?

It is pretty easy to reproduce, but it does not happen all the time.
With millions of systems, even rare errors are common :)

However, we have been running with this fix for long enough (we
deployed it in order to test it) that I don't think we have the
warning stored any more. Those logs were rotated out long ago.

kind regards,

Rik
On 05/09/24 at 09:41am, Rik van Riel wrote:
> On Thu, 2024-05-09 at 11:52 +0800, Baoquan He wrote:
> > Hi,
> >
> > On 05/07/24 at 09:18am, Rik van Riel wrote:
> > > While taking a kernel core dump with makedumpfile on a larger
> > > system, softlockup messages often appear.
> > >
> > > While softlockup warnings can be harmless, they can also interfere
> > > with things like RCU freeing memory, which can be problematic when
> > > the kdump kexec image is configured with as little memory as
> > > possible.
> > >
> > > Avoid the softlockup, and give things like work items and RCU a
> > > chance to do their thing during __read_vmcore by adding a
> > > cond_resched.
> >
> > Thanks for fixing this.
> >
> > By the way, is it easy to reproduce? And should we add some trace of
> > the softlockup into log so that people can search for it and confirm
> > when encountering it?
>
> It is pretty easy to reproduce, but it does not happen all the time.
> With millions of systems, even rare errors are common :)
>
> However, we have been running with this fix for long enough (we
> deployed it in order to test it) that I don't think we have the
> warning stored any more. Those logs were rotated out long ago.

OK, thanks for the explanation.

Acked-by: Baoquan He <bhe@redhat.com>
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 1fb213f379a5..d06607a1f137 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -383,6 +383,8 @@ static ssize_t __read_vmcore(struct iov_iter *iter, loff_t *fpos)
 		/* leave now if filled buffer already */
 		if (!iov_iter_count(iter))
 			return acc;
+
+		cond_resched();
 	}
 
 	list_for_each_entry(m, &vmcore_list, list) {
While taking a kernel core dump with makedumpfile on a larger system,
softlockup messages often appear.

While softlockup warnings can be harmless, they can also interfere
with things like RCU freeing memory, which can be problematic when
the kdump kexec image is configured with as little memory as possible.

Avoid the softlockup, and give things like work items and RCU a
chance to do their thing during __read_vmcore by adding a cond_resched.

Signed-off-by: Rik van Riel <riel@surriel.com>
---
 fs/proc/vmcore.c | 2 ++
 1 file changed, 2 insertions(+)
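
For readers unfamiliar with the pattern, here is a rough userspace analogue of what the commit message describes: a long copy loop that gives up the CPU voluntarily at a safe point. This is only a sketch under stated assumptions, not the kernel code. `struct segment` and `read_segments()` are hypothetical names invented for illustration, `sched_yield()` stands in for `cond_resched()` (the two are not equivalent), and the yield placement is illustrative rather than a copy of the hunk in the patch. Segments are assumed to be contiguous, sorted by offset, and to start at offset 0.

```c
#include <assert.h>
#include <sched.h>   /* sched_yield(): userspace stand-in for cond_resched() */
#include <stddef.h>
#include <string.h>

/* Hypothetical segment descriptor, loosely modelled on a vmcore list entry. */
struct segment {
	const char *data;      /* start of this segment's bytes */
	size_t size;           /* length of the segment in bytes */
	size_t offset;         /* logical file offset of the segment */
	struct segment *next;  /* next segment, or NULL at the end */
};

/*
 * Copy up to len bytes starting at *fpos into buf, walking the list.
 * Assumes contiguous segments sorted by offset, the first one at offset 0.
 * Returns the number of bytes copied and advances *fpos by that amount.
 */
static size_t read_segments(const struct segment *head, char *buf,
			    size_t len, size_t *fpos)
{
	const struct segment *m;
	size_t acc = 0;

	assert(buf != NULL && fpos != NULL);

	for (m = head; m != NULL; m = m->next) {
		if (*fpos < m->offset + m->size) {
			size_t avail = m->offset + m->size - *fpos;
			size_t tsz = avail < len - acc ? avail : len - acc;

			memcpy(buf + acc, m->data + (*fpos - m->offset), tsz);
			*fpos += tsz;
			acc += tsz;

			/* leave now if the buffer is already full */
			if (acc == len)
				return acc;
		}

		/* Yield point: let other runnable work make progress. */
		sched_yield();
	}

	return acc;
}
```

The yield sits outside the early-return path, so a call that fills its buffer within one segment returns without yielding, while a call that has to walk many segments yields once per segment. In the kernel the trade-off is similar: `cond_resched()` is cheap when nothing else needs to run, so adding it to a loop that can iterate for a long time costs little.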