Message ID | 20200504170518.2959478-1-daniel.thompson@linaro.org (mailing list archive) |
---|---|
State | Mainlined |
Commit | ab8ad279ceac4fc78ae4dcf1a26326e05695e537 |
Headers | show |
Series | [v2] arm64: cacheflush: Fix KGDB trap detection | expand |
On Mon, May 04, 2020 at 06:05:18PM +0100, Daniel Thompson wrote: > flush_icache_range() contains a bodge to avoid issuing IPIs when the kgdb > trap handler is running because issuing IPIs is unsafe (and not needed) > in this execution context. However the current test, based on > kgdb_connected is flawed: it both over-matches and under-matches. > > The over match occurs because kgdb_connected is set when gdb attaches > to the stub and remains set during normal running. This is relatively > harmelss because in almost all cases irq_disabled() will be false. > > The under match is more serious. When kdb is used instead of kgdb to access > the debugger then kgdb_connected is not set in all the places that the > debug core updates sw breakpoints (and hence flushes the icache). This > can lead to deadlock. > > Fix by replacing the ad-hoc check with the proper kgdb macro. This also > allows us to drop the #ifdef wrapper. > > Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > Reviewed-by: Douglas Anderson <dianders@chromium.org> > --- > > Notes: > v2: Improve the commit message based based on feedback from Doug > Anderson > > arch/arm64/include/asm/cacheflush.h | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h > index e6cca3d4acf7..ce50c1f1f1ea 100644 > --- a/arch/arm64/include/asm/cacheflush.h > +++ b/arch/arm64/include/asm/cacheflush.h > @@ -79,7 +79,7 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > * IPI all online CPUs so that they undergo a context synchronization > * event and are forced to refetch the new instructions. > */ > -#ifdef CONFIG_KGDB > + > /* > * KGDB performs cache maintenance with interrupts disabled, so we > * will deadlock trying to IPI the secondary CPUs. In theory, we can > @@ -89,9 +89,9 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > * the patching operation, so we don't need extra IPIs here anyway. > * In which case, add a KGDB-specific bodge and return early. > */ > - if (kgdb_connected && irqs_disabled()) > + if (in_dbg_master()) Does this imply that irqs are disabled? Will
On Mon, May 04, 2020 at 09:48:04PM +0100, Will Deacon wrote: > On Mon, May 04, 2020 at 06:05:18PM +0100, Daniel Thompson wrote: > > flush_icache_range() contains a bodge to avoid issuing IPIs when the kgdb > > trap handler is running because issuing IPIs is unsafe (and not needed) > > in this execution context. However the current test, based on > > kgdb_connected is flawed: it both over-matches and under-matches. > > > > The over match occurs because kgdb_connected is set when gdb attaches > > to the stub and remains set during normal running. This is relatively > > harmelss because in almost all cases irq_disabled() will be false. > > > > The under match is more serious. When kdb is used instead of kgdb to access > > the debugger then kgdb_connected is not set in all the places that the > > debug core updates sw breakpoints (and hence flushes the icache). This > > can lead to deadlock. > > > > Fix by replacing the ad-hoc check with the proper kgdb macro. This also > > allows us to drop the #ifdef wrapper. > > > > Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") > > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > > Reviewed-by: Douglas Anderson <dianders@chromium.org> > > --- > > > > Notes: > > v2: Improve the commit message based based on feedback from Doug > > Anderson > > > > arch/arm64/include/asm/cacheflush.h | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h > > index e6cca3d4acf7..ce50c1f1f1ea 100644 > > --- a/arch/arm64/include/asm/cacheflush.h > > +++ b/arch/arm64/include/asm/cacheflush.h > > @@ -79,7 +79,7 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > * IPI all online CPUs so that they undergo a context synchronization > > * event and are forced to refetch the new instructions. > > */ > > -#ifdef CONFIG_KGDB > > + > > /* > > * KGDB performs cache maintenance with interrupts disabled, so we > > * will deadlock trying to IPI the secondary CPUs. In theory, we can > > @@ -89,9 +89,9 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > * the patching operation, so we don't need extra IPIs here anyway. > > * In which case, add a KGDB-specific bodge and return early. > > */ > > - if (kgdb_connected && irqs_disabled()) > > + if (in_dbg_master()) > > Does this imply that irqs are disabled? Yes. Assuming CONFIG_KGDB is enabled then in_dbg_master() expands to: (raw_smp_processor_id() == atomic_read(&kgdb_active)) kgdb_active is written to from exactly four locations in the kernel and all are within a single function, albeit a very big function with control flow the that could politely be called "quirky". I try not to think about what it might be impolitely called. kgdb_active is only ever set to a value other than -1 when we are executing the kgdb exception handler and interrupts have been explicitly disabled. Daniel.
On Tue, May 05, 2020 at 03:15:29PM +0100, Daniel Thompson wrote: > On Mon, May 04, 2020 at 09:48:04PM +0100, Will Deacon wrote: > > On Mon, May 04, 2020 at 06:05:18PM +0100, Daniel Thompson wrote: > > > diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h > > > index e6cca3d4acf7..ce50c1f1f1ea 100644 > > > --- a/arch/arm64/include/asm/cacheflush.h > > > +++ b/arch/arm64/include/asm/cacheflush.h > > > @@ -79,7 +79,7 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > > * IPI all online CPUs so that they undergo a context synchronization > > > * event and are forced to refetch the new instructions. > > > */ > > > -#ifdef CONFIG_KGDB > > > + > > > /* > > > * KGDB performs cache maintenance with interrupts disabled, so we > > > * will deadlock trying to IPI the secondary CPUs. In theory, we can > > > @@ -89,9 +89,9 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > > * the patching operation, so we don't need extra IPIs here anyway. > > > * In which case, add a KGDB-specific bodge and return early. > > > */ > > > - if (kgdb_connected && irqs_disabled()) > > > + if (in_dbg_master()) > > > > Does this imply that irqs are disabled? > > Yes. > > Assuming CONFIG_KGDB is enabled then in_dbg_master() expands to: > > (raw_smp_processor_id() == atomic_read(&kgdb_active)) Aha, so this can drop the raw_ prefix and call smp_processor_id() instead? I can queue the arm64 patch regardless. Cheers, Will
On Mon, 4 May 2020 18:05:18 +0100, Daniel Thompson wrote: > flush_icache_range() contains a bodge to avoid issuing IPIs when the kgdb > trap handler is running because issuing IPIs is unsafe (and not needed) > in this execution context. However the current test, based on > kgdb_connected is flawed: it both over-matches and under-matches. > > The over match occurs because kgdb_connected is set when gdb attaches > to the stub and remains set during normal running. This is relatively > harmelss because in almost all cases irq_disabled() will be false. > > [...] Applied to arm64 (for-next/misc), thanks! [1/1] arm64: cacheflush: Fix KGDB trap detection https://git.kernel.org/arm64/c/ab8ad279ceac Cheers,
On Tue, May 05, 2020 at 04:09:16PM +0100, Will Deacon wrote: > On Tue, May 05, 2020 at 03:15:29PM +0100, Daniel Thompson wrote: > > On Mon, May 04, 2020 at 09:48:04PM +0100, Will Deacon wrote: > > > On Mon, May 04, 2020 at 06:05:18PM +0100, Daniel Thompson wrote: > > > > diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h > > > > index e6cca3d4acf7..ce50c1f1f1ea 100644 > > > > --- a/arch/arm64/include/asm/cacheflush.h > > > > +++ b/arch/arm64/include/asm/cacheflush.h > > > > @@ -79,7 +79,7 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > > > * IPI all online CPUs so that they undergo a context synchronization > > > > * event and are forced to refetch the new instructions. > > > > */ > > > > -#ifdef CONFIG_KGDB > > > > + > > > > /* > > > > * KGDB performs cache maintenance with interrupts disabled, so we > > > > * will deadlock trying to IPI the secondary CPUs. In theory, we can > > > > @@ -89,9 +89,9 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) > > > > * the patching operation, so we don't need extra IPIs here anyway. > > > > * In which case, add a KGDB-specific bodge and return early. > > > > */ > > > > - if (kgdb_connected && irqs_disabled()) > > > > + if (in_dbg_master()) > > > > > > Does this imply that irqs are disabled? > > > > Yes. Except for bugs... > > > > Assuming CONFIG_KGDB is enabled then in_dbg_master() expands to: > > > > (raw_smp_processor_id() == atomic_read(&kgdb_active)) > > Aha, so this can drop the raw_ prefix and call smp_processor_id() instead? We need to allow in_dbg_master() to be called from preemptible contexts (because its job it to disclose information about our executions context) but given irqs are always disabled when we in_dbg_master() then I think we can make this and rely on short-circuit eval to avoid PREEMPT_DEBUG errors: (irqs_disabled() && (smp_processor_id() == atomic_read(&kgdb_active))) > I can queue the arm64 patch regardless. I don't want to hide anything... when I looked closer I realized that the above change also eliminates a small window where the original macro can spuriously evaluate to true. Specifically if we migrate to a new core after reading the processor id and the previous core takes a breakpoint then we would evaluate true if we read kgdb_active before we get the IPI to bring us to halt. Sorry for overlooking this in my reply yesterday! I'll have a patch out for this shortly. Daniel.
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h index e6cca3d4acf7..ce50c1f1f1ea 100644 --- a/arch/arm64/include/asm/cacheflush.h +++ b/arch/arm64/include/asm/cacheflush.h @@ -79,7 +79,7 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) * IPI all online CPUs so that they undergo a context synchronization * event and are forced to refetch the new instructions. */ -#ifdef CONFIG_KGDB + /* * KGDB performs cache maintenance with interrupts disabled, so we * will deadlock trying to IPI the secondary CPUs. In theory, we can @@ -89,9 +89,9 @@ static inline void flush_icache_range(unsigned long start, unsigned long end) * the patching operation, so we don't need extra IPIs here anyway. * In which case, add a KGDB-specific bodge and return early. */ - if (kgdb_connected && irqs_disabled()) + if (in_dbg_master()) return; -#endif + kick_all_cpus_sync(); }