Message ID | 20190222014625.107640-1-chenzhou10@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: kdump: fix warning message about failed to stop secondary CPUs | expand |
On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote: > The local variable mask indicates non-boot cpus. Just print the num > of failing to stop secondary CPUs using varaible waiting_for_crash_ipi. Why? Also s/varaible/variable/ > Also replace two pr_warning with the shorter pr_warn. > > Signed-off-by: Chen Zhou <chenzhou10@huawei.com> > --- > arch/arm64/kernel/smp.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > index 824de70..a4a952e 100644 > --- a/arch/arm64/kernel/smp.c > +++ b/arch/arm64/kernel/smp.c > @@ -989,7 +989,7 @@ void smp_send_stop(void) > udelay(1); > > if (num_online_cpus() > 1) > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > + pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", > cpumask_pr_args(cpu_online_mask)); > > sdei_mask_local_cpu(); > @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void) > udelay(1); > > if (atomic_read(&waiting_for_crash_ipi) > 0) > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > - cpumask_pr_args(&mask)); > + pr_warn("SMP: failed to stop secondary CPUs %d\n", > + atomic_read(&waiting_for_crash_ipi)); But now this message doesn't make any sense. Will
On Fri, Feb 22, 2019 at 10:03:05AM +0000, Will Deacon wrote: > On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote: > > The local variable mask indicates non-boot cpus. Just print the num > > of failing to stop secondary CPUs using varaible waiting_for_crash_ipi. > > Why? Also s/varaible/variable/ I think the point is that the mask is an over-estimate, since it's all of the CPUs which were online when we sent the stop IPI, not only the ones we failed to stop. If we had 255 secondary CPUs, and one failed to offline, we log the IDs of all 255 CPUs which we tried to offline. > > Also replace two pr_warning with the shorter pr_warn. > > > > Signed-off-by: Chen Zhou <chenzhou10@huawei.com> > > --- > > arch/arm64/kernel/smp.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > > index 824de70..a4a952e 100644 > > --- a/arch/arm64/kernel/smp.c > > +++ b/arch/arm64/kernel/smp.c > > @@ -989,7 +989,7 @@ void smp_send_stop(void) > > udelay(1); > > > > if (num_online_cpus() > 1) > > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > > + pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", > > cpumask_pr_args(cpu_online_mask)); > > > > sdei_mask_local_cpu(); > > @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void) > > udelay(1); > > > > if (atomic_read(&waiting_for_crash_ipi) > 0) > > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > > - cpumask_pr_args(&mask)); > > + pr_warn("SMP: failed to stop secondary CPUs %d\n", > > + atomic_read(&waiting_for_crash_ipi)); I think this should be: pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); ... matching smp_send_stop(). In either case that includes the current processor, but that's not a new bug, and not a big deal. Thanks, Mark.
Hi Mark, > -----Original Message----- > From: Mark Rutland [mailto:mark.rutland@arm.com] > Sent: Friday, February 22, 2019 7:38 PM > To: Will Deacon > Cc: chenzhou; catalin.marinas@arm.com; chengjian (D); > linux-arm-kernel@lists.infradead.org; takahiro.akashi@linaro.org > Subject: Re: [PATCH] arm64: kdump: fix warning message about failed to > stop secondary CPUs > > On Fri, Feb 22, 2019 at 10:03:05AM +0000, Will Deacon wrote: > > On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote: > > > The local variable mask indicates non-boot cpus. Just print the num > > > of failing to stop secondary CPUs using varaible waiting_for_crash_ipi. > > > > Why? Also s/varaible/variable/ > > I think the point is that the mask is an over-estimate, since it's all > of the CPUs which were online when we sent the stop IPI, not only the > ones we failed to stop. > > If we had 255 secondary CPUs, and one failed to offline, we log the IDs > of all 255 CPUs which we tried to offline. > > > > Also replace two pr_warning with the shorter pr_warn. > > > > > > Signed-off-by: Chen Zhou <chenzhou10@huawei.com> > > > --- > > > arch/arm64/kernel/smp.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c > > > index 824de70..a4a952e 100644 > > > --- a/arch/arm64/kernel/smp.c > > > +++ b/arch/arm64/kernel/smp.c > > > @@ -989,7 +989,7 @@ void smp_send_stop(void) > > > udelay(1); > > > > > > if (num_online_cpus() > 1) > > > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > > > + pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", > > > cpumask_pr_args(cpu_online_mask)); > > > > > > sdei_mask_local_cpu(); > > > @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void) > > > udelay(1); > > > > > > if (atomic_read(&waiting_for_crash_ipi) > 0) > > > - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > > > - cpumask_pr_args(&mask)); > > > + pr_warn("SMP: failed to stop secondary CPUs %d\n", > > > + atomic_read(&waiting_for_crash_ipi)); > > I think this should be: > > pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", > cpumask_pr_args(cpu_online_mask)); > > ... matching smp_send_stop(). > > In either case that includes the current processor, but that's not a new > bug, and not a big deal. > You are right. The mask and cpu_online_mask in kdump path are both the online CPUs before sending IPI stop. But the cpu_online_mask of the smp_send_stop() you mentioned is the online CPUs after sending the IPI stop, which is the CPUs failed to stop, not equal to the online CPUs before sending IPI stop. The root cause is that the ipi_cpu_stop() will set related cpu_online_mask false, but the ipi_cpu_crash_stop() won't do this. Therefore, we can't get the correct online CPUs at the end of crash_smp_send_stop(). In kdump path, we introduce waiting_for_crash_ipi to record the number of CPUs failed to stop, and we only have the information about the number of CPUs failed to stop. Sorry, I don't know if I express clear. Thanks Chen Zhou > Thanks, > Mark.
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 824de70..a4a952e 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -989,7 +989,7 @@ void smp_send_stop(void) udelay(1); if (num_online_cpus() > 1) - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", + pr_warn("SMP: failed to stop secondary CPUs %*pbl\n", cpumask_pr_args(cpu_online_mask)); sdei_mask_local_cpu(); @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void) udelay(1); if (atomic_read(&waiting_for_crash_ipi) > 0) - pr_warning("SMP: failed to stop secondary CPUs %*pbl\n", - cpumask_pr_args(&mask)); + pr_warn("SMP: failed to stop secondary CPUs %d\n", + atomic_read(&waiting_for_crash_ipi)); sdei_mask_local_cpu(); }
The local variable mask indicates non-boot cpus. Just print the num of failing to stop secondary CPUs using varaible waiting_for_crash_ipi. Also replace two pr_warning with the shorter pr_warn. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> --- arch/arm64/kernel/smp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)