Message ID: 20180713174937.5ddaqpylalcmc3jq@linutronix.de (mailing list archive)
State: New, archived
On Fri, 2018-07-13 at 19:49 +0200, Sebastian Andrzej Siewior wrote:
> In v4.16-RT I noticed a number of warnings from task_fpsimd_load(). The
> code disables BH and expects that it is not preemptible. On -RT the
> task remains preemptible but remains the same CPU. This may corrupt the
> content of the SIMD registers if the task is preempted during
> saving/restoring those registers.
> Add a locallock around this process. This avoids that the any function
> within the locallock block is invoked more than once on the same CPU.
>
> The kernel_neon_begin() can't be kept preemptible. If the task-switch notices
> TIF_FOREIGN_FPSTATE then it would restore task's SIMD state and we lose the
> state of registers used for in-kernel-work. We would require additional storage
> for the in-kernel copy of the registers. But then the NEON-crypto checks for
> the need-resched flag so it shouldn't that bad.
> The preempt_disable() avoids the context switch while the kernel uses the SIMD
> registers. Unfortunately we have to balance out the migrate_disable() counter
> because local_lock_bh() is invoked in different context compared to its unlock
> counterpart.
>
> __efi_fpsimd_begin() should not use kernel_fpu_begin() due to its
> preempt_disable() context and instead save the registers always in its
> extra spot on RT.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>
> This seems to make work (crypto chacha20-neon + cyclictest). I have no
> EFI so I have no clue if saving SIMD while calling to EFI works.

All is not well on cavium test box.  I'm seeing random errors ala...

./include/linux/fs.h:3137:11: internal compiler error: Segmentation fault
./include/linux/bio.h:175:1: internal compiler error: in grokdeclarator, at c/c-decl.c:7023

...during make -j96 (2*cpus) kbuild.  Turns out 4.14-rt has this issue
as well, which is unsurprising if it's related to fpsimd woes.  Box does
not exhibit the issue with NONRT kernels, PREEMPT or NOPREEMPT.
To file under FWIW, arm64 configured SLE15-RT, 4.12 based kernel
containing virgin @stable arch/arm64/kernel/fpsimd.c, does not exhibit
the problem.  (relevant? dunno, it may be unrelated to fpsimd.c).

	-Mike
On Fri, Jul 13, 2018 at 07:49:38PM +0200, Sebastian Andrzej Siewior wrote:
> In v4.16-RT I noticed a number of warnings from task_fpsimd_load(). The
> code disables BH and expects that it is not preemptible. On -RT the
> task remains preemptible but remains the same CPU. This may corrupt the
> content of the SIMD registers if the task is preempted during
> saving/restoring those registers.
> Add a locallock around this process. This avoids that the any function
> within the locallock block is invoked more than once on the same CPU.

The overall shape of this looks reasonable -- I have some comments and
concerns though, below.  I'm to blame for much of the current shape of
this code, so please Cc me on reposts.

> The kernel_neon_begin() can't be kept preemptible. If the task-switch notices
> TIF_FOREIGN_FPSTATE then it would restore task's SIMD state and we lose the
> state of registers used for in-kernel-work. We would require additional storage

Are you trying to say that a kernel_neon_begin()...kernel_neon_end()
critical section can't be preemptible?  Whether kernel_neon_begin()
itself is preemptible is more of an implementation detail.

> for the in-kernel copy of the registers. But then the NEON-crypto checks for
> the need-resched flag so it shouldn't that bad.

This sounds like an (understandably) confused description of how the
existing code is intended to work, rather than what the patch does.
Can we say something along the lines of:

"kernel_neon_begin() ... kernel_neon_end() are intended to delimit a
critical section where the executing context has exclusive access to
the cpu's FPSIMD/SVE registers.  This means that we must not allow
preemption by another context or migration to another cpu during this
region [...]"

(Not trying to be pedantic here: this -rt patch should help to clarify
the intended semantics of the code here beyond what is currently
explicit.  What I truly intended is less clear than I'd like... even
to me.)
> The preempt_disable() avoids the context switch while the kernel uses the SIMD
> registers. Unfortunately we have to balance out the migrate_disable() counter
> because local_lock_bh() is invoked in different context compared to its unlock
> counterpart.
>
> __efi_fpsimd_begin() should not use kernel_fpu_begin() due to its

Confused -- is this a typo for "kernel_neon_begin()"?

> preempt_disable() context and instead save the registers always in its
> extra spot on RT.

I'm not sure this works.  Some EFI runtime service calls are
interruptible, so we arrange here to stash off the state as normal if
an EFI call is made from an interruptible context; otherwise we stash
in a dedicated percpu side buffer and restore eagerly in
__efi_fpsimd_end().

If we now unconditionally use the side-buffer with
IS_ENABLED(CONFIG_PREEMPT_RT_BASE), I think a nested EFI call will save
state over the outer context's saved state, leading to corruption of
the vector registers.

TBH, I think EFI calls need to be non-preemptible and non-migratable.
I doubt anything else will conform to the EFI spec.  IIRC EFI runtime
services are not specified to be in any way real-time, but I don't
think there's a lot we can do about that.

> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>
> This seems to make work (crypto chacha20-neon + cyclictest). I have no
> EFI so I have no clue if saving SIMD while calling to EFI works.
>
> arch/arm64/kernel/fpsimd.c | 47 ++++++++++++++++++++++++++-------------------
> 1 file changed, 28 insertions(+), 19 deletions(-)
>
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -38,6 +38,7 @@
>  #include <linux/signal.h>
>  #include <linux/slab.h>
>  #include <linux/sysctl.h>
> +#include <linux/locallock.h>
>
>  #include <asm/fpsimd.h>
>  #include <asm/cputype.h>
> @@ -235,7 +236,7 @@ static void sve_user_enable(void)
>  * whether TIF_SVE is clear or set, since these are not vector length
>  * dependent.
>  */
> -
> +static DEFINE_LOCAL_IRQ_LOCK(fpsimd_lock);

Does this really disable IRQs, and if so, why?

If so, this creates an IRQ blackout around task_fpsimd_save(), which is
something we explicitly tried to avoid in the original design.  It can
be costly.

Also, this belongs alongside the declaration for fpsimd_last_state,
since that's (part of) what the lock protects.

>  /*
>   * Update current's FPSIMD/SVE registers from thread_struct.
>   *
> @@ -594,7 +595,7 @@ int sve_set_vector_length(struct task_st
>          * non-SVE thread.
>          */
>         if (task == current) {
> -               local_bh_disable();
> +               local_lock_bh(fpsimd_lock);
>
>                 task_fpsimd_save();
>                 set_thread_flag(TIF_FOREIGN_FPSTATE);
> @@ -605,7 +606,7 @@ int sve_set_vector_length(struct task_st
>                 sve_to_fpsimd(task);
>
>         if (task == current)
> -               local_bh_enable();
> +               local_unlock_bh(fpsimd_lock);
>
>         /*
>          * Force reallocation of task SVE state to the correct size
> @@ -837,7 +838,7 @@ asmlinkage void do_sve_acc(unsigned int
>
>         sve_alloc(current);
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>
>         task_fpsimd_save();
>         fpsimd_to_sve(current);
> @@ -849,7 +850,7 @@ asmlinkage void do_sve_acc(unsigned int
>         if (test_and_set_thread_flag(TIF_SVE))
>                 WARN_ON(1); /* SVE access shouldn't have trapped */
>
> -       local_bh_enable();
> +       local_unlock_bh(fpsimd_lock);
>  }
>
>  /*
> @@ -925,7 +926,7 @@ void fpsimd_flush_thread(void)
>         if (!system_supports_fpsimd())
>                 return;
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>
>         memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
>         fpsimd_flush_task_state(current);
> @@ -967,7 +968,7 @@ void fpsimd_flush_thread(void)
>
>         set_thread_flag(TIF_FOREIGN_FPSTATE);
>
> -       local_bh_enable();
> +       local_unlock_bh(fpsimd_lock);
>  }
>
>  /*
> @@ -979,9 +980,9 @@ void fpsimd_preserve_current_state(void)
>         if (!system_supports_fpsimd())
>                 return;
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>         task_fpsimd_save();
> -       local_bh_enable();
> +       local_unlock_bh(fpsimd_lock);
>  }
>
>  /*
> @@ -1021,14 +1022,14 @@ void fpsimd_restore_current_state(void)
>         if (!system_supports_fpsimd())
>                 return;
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>
>         if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
>                 task_fpsimd_load();
>                 fpsimd_bind_to_cpu();
>         }
>
> -       local_bh_enable();
> +       local_unlock_bh(fpsimd_lock);
>  }
>
>  /*
> @@ -1041,7 +1042,7 @@ void fpsimd_update_current_state(struct
>         if (!system_supports_fpsimd())
>                 return;
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>
>         current->thread.fpsimd_state.user_fpsimd = *state;
>         if (system_supports_sve() && test_thread_flag(TIF_SVE))
> @@ -1052,7 +1053,7 @@ void fpsimd_update_current_state(struct
>         if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE))
>                 fpsimd_bind_to_cpu();
>
> -       local_bh_enable();
> +       local_unlock_bh(fpsimd_lock);
>  }
>
>  /*
> @@ -1115,7 +1116,7 @@ void kernel_neon_begin(void)
>
>         BUG_ON(!may_use_simd());
>
> -       local_bh_disable();
> +       local_lock_bh(fpsimd_lock);
>
>         __this_cpu_write(kernel_neon_busy, true);
>
> @@ -1129,8 +1130,14 @@ void kernel_neon_begin(void)
>         fpsimd_flush_cpu_state();
>
>         preempt_disable();
> -
> -       local_bh_enable();
> +       /*
> +        * ballance atomic vs !atomic context of migrate_disable().
> +        * local_lock_bh = get_local_var() + spin_lock_bh (2x migrate_disable)
> +        */
> +       migrate_disable();
> +       migrate_disable();
> +       migrate_disable();
> +       local_unlock_bh(fpsimd_lock);

This looks fragile.

Do we really need to do it 3 times?

Showing my ignorance here, but I'd naively expect that the migrate-
disableness changes as follows:

	/* 0 */
	local_lock_bh();      /* +3 */

	...

	preempt_disable();    /* +3 */

	migrate_disable();    /* +4 */
	migrate_disable();    /* +5 */
	migrate_disable();    /* +6 */

	local_unlock_bh();    /* +3 */

If so, I don't understand why it's not OK to call migrate_disable()
just once, leaving the count at +1 (?)

I'm making assumptions about the relationship between preempt_disable()
and migrate_disable() here.

>  }
>  EXPORT_SYMBOL(kernel_neon_begin);
>
> @@ -1154,6 +1161,10 @@ void kernel_neon_end(void)
>         WARN_ON(!busy); /* No matching kernel_neon_begin()? */
>
>         preempt_enable();
> +       /* balance migrate_disable(). See kernel_neon_begin() */
> +       migrate_enable();
> +       migrate_enable();
> +       migrate_enable();
>  }
>  EXPORT_SYMBOL(kernel_neon_end);
>
> @@ -1185,9 +1196,7 @@ void __efi_fpsimd_begin(void)
>         if (!system_supports_fpsimd())
>                 return;
>
> -       WARN_ON(preemptible());
> -
> -       if (may_use_simd()) {
> +       if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {

I suspect this is wrong -- see comments on the commit message.

>                 kernel_neon_begin();
>         } else {

[...]

Cheers
---Dave
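[Editorial aside: the begin/end contract described in the review above can be sketched as a small userspace model. The names mirror the kernel's (kernel_neon_busy, may_use_simd()), but this is only an illustration of the intended exclusive-access semantics, not kernel code.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace sketch of the kernel_neon_begin()/kernel_neon_end()
 * contract: the section owns the CPU's FPSIMD/SVE registers
 * exclusively, so nested or racing entry must be refused.
 */
static bool kernel_neon_busy;   /* models the percpu kernel_neon_busy flag */

static bool may_use_simd(void)
{
    /* In-kernel SIMD is only allowed when nobody else holds the unit. */
    return !kernel_neon_busy;
}

static void model_kernel_neon_begin(void)
{
    assert(may_use_simd());     /* models BUG_ON(!may_use_simd()) */
    kernel_neon_busy = true;    /* exclusive access starts here... */
}

static void model_kernel_neon_end(void)
{
    assert(kernel_neon_busy);   /* models WARN_ON(!busy) */
    kernel_neon_busy = false;   /* ...and ends here */
}
```

On RT the sticking point is that local_lock_bh() by itself no longer provides the "no preemption, no migration" half of this contract, which is what the extra preempt_disable()/migrate_disable() juggling in the patch tries to restore.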
On 2018-07-18 11:12:10 [+0200], To Dave Martin wrote:
> > > -       if (may_use_simd()) {
> > > +       if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {
> >
> > I suspect this is wrong -- see comments on the commit message.

I'm sorry, I pressed send too early, I was aiming for the draft folder.
So yes, based on that EFI that might be interruptible, let me try to
look at the initial issue again and maybe I get another idea how to
deal with this.
One question: If EFI is interruptible that means, we call into EFI -
how do we get out? Does EFI enable interrupts and the kernel receives
an interrupt and treats this EFI call like a regular context switch?

> > >                 kernel_neon_begin();
> > >         } else {
> >
> > [...]
> >
> > Cheers
> > ---Dave

Sebastian
On 2018-07-14 00:03:44 [+0200], Mike Galbraith wrote:
> > This seems to make work (crypto chacha20-neon + cyclictest). I have no
> > EFI so I have no clue if saving SIMD while calling to EFI works.
>
> All is not well on cavium test box.  I'm seeing random errors ala...
>
> ./include/linux/fs.h:3137:11: internal compiler error: Segmentation fault
> ./include/linux/bio.h:175:1: internal compiler error: in grokdeclarator, at c/c-decl.c:7023
>
> ...during make -j96 (2*cpus) kbuild.  Turns out 4.14-rt has this issue
> as well, which is unsurprising if it's related to fpsimd woes.  Box
> does not exhibit the issue with NONRT kernels, PREEMPT or NOPREEMPT.
>
> To file under FWIW, arm64 configured SLE15-RT, 4.12 based kernel
> containing virgin @stable arch/arm64/kernel/fpsimd.c, does not exhibit
> the problem.  (relevant? dunno, it may be unrelated to fpsimd.c).

Okay, so you did not test this because you can't compile. But the
"internal compiler error" is usually something the gcc folks are
interested in :)

> -Mike

Sebastian
On Wed, 2018-07-18 at 11:27 +0200, Sebastian Andrzej Siewior wrote:
> On 2018-07-14 00:03:44 [+0200], Mike Galbraith wrote:
> > > This seems to make work (crypto chacha20-neon + cyclictest). I have no
> > > EFI so I have no clue if saving SIMD while calling to EFI works.
> >
> > All is not well on cavium test box.  I'm seeing random errors ala...
> >
> > ./include/linux/fs.h:3137:11: internal compiler error: Segmentation fault
> > ./include/linux/bio.h:175:1: internal compiler error: in grokdeclarator, at c/c-decl.c:7023
> >
> > ...during make -j96 (2*cpus) kbuild.  Turns out 4.14-rt has this issue
> > as well, which is unsurprising if it's related to fpsimd woes.  Box
> > does not exhibit the issue with NONRT kernels, PREEMPT or NOPREEMPT.
> >
> > To file under FWIW, arm64 configured SLE15-RT, 4.12 based kernel
> > containing virgin @stable arch/arm64/kernel/fpsimd.c, does not exhibit
> > the problem.  (relevant? dunno, it may be unrelated to fpsimd.c).
>
> Okay, so you did not test this because you can't compile.

Nope, the running kernel, the one that is doing the segfaulting etc,
has the patches applied.

It is exhibiting that symptom because those patches do not cure this
symptom, one which I verified to be present in virgin 4.14-rt as well.
The pseudo-patch I sent, disabling preemption where it is assumed to be
disabled instead, does cure it.  With preemption so disabled, I can
beat on affected kernels (>=4.14-rt) as long as I like.

This particular 48 core Cavium is very slow, maybe that makes it easier
to reproduce, dunno.  According to pipe-test, the thing is essentially
a dozen RPi super-glued together.  pipe-test pinned to a single core
can only context switch at ~40KHz with PREEMPT_RT, or ~90 with
NOPREEMPT, comparable to measurements done in a real deal RPi.

	-Mike
On 2018-07-18 12:28:48 [+0200], Mike Galbraith wrote:
> > Okay, so you did not test this because you can't compile.
>
> Nope, the running kernel, the one that is doing the segfaulting etc,
> has the patches applied.
>
> It is exhibiting that symptom because those patches do not cure this
> symptom, one which I verified to be present in virgin 4.14-rt as well.
> The pseudo-patch I sent, disabling preemption where it is assumed to be
> disabled instead, does cure it.  With preemption so disabled, I can
> beat on affected kernels (>=4.14-rt) as long as I like.

ah. So gcc shows the problem, i.e. gcc explodes with the patch applied.
Okay. Let me stare at this a little more…

> This particular 48 core Cavium is very slow, maybe that makes it easier
> to reproduce, dunno.  According to pipe-test, the thing is essentially
> a dozen RPi super-glued together.  pipe-test pinned to a single core
> can only context switch at ~40KHz with PREEMPT_RT, or ~90 with
> NOPREEMPT, comparable to measurement done in real deal RPi.
>
> -Mike

Sebastian
On Wed, 18 Jul 2018 11:12:10 +0200
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> > > @@ -1115,7 +1116,7 @@ void kernel_neon_begin(void)
> > >
> > >         BUG_ON(!may_use_simd());
> > >
> > > -       local_bh_disable();
> > > +       local_lock_bh(fpsimd_lock);
> > >
> > >         __this_cpu_write(kernel_neon_busy, true);
> > >
> > > @@ -1129,8 +1130,14 @@ void kernel_neon_begin(void)
> > >         fpsimd_flush_cpu_state();
> > >
> > >         preempt_disable();
> > > -
> > > -       local_bh_enable();
> > > +       /*
> > > +        * ballance atomic vs !atomic context of migrate_disable().
> > > +        * local_lock_bh = get_local_var() + spin_lock_bh (2x migrate_disable)
> > > +        */
> > > +       migrate_disable();
> > > +       migrate_disable();
> > > +       migrate_disable();
> > > +       local_unlock_bh(fpsimd_lock);
> >
> > This looks fragile.
> >
> > Do we really need to do it 3 times?
>
> Unfortunately yes.

Then we need to find another solution, because this is way too ugly and
as Dave said, fragile to keep.

> > Showing my ignorance here, but I'd naively expect that the migrate-
> > disableness changes as follows:
> >
> > 	/* 0 */
> > 	local_lock_bh();      /* +3 */
> >
> > 	...
> >
> > 	preempt_disable();    /* +3 */
> >
> > 	migrate_disable();    /* +4 */
> > 	migrate_disable();    /* +5 */
> > 	migrate_disable();    /* +6 */
> >
> > 	local_unlock_bh();    /* +3 */
> >
> > If so, I don't understand why it's not OK to call migrate_disable()
> > just once, leaving the count at +1 (?)
> >
> > I'm making assumptions about the relationship between preempt_disable()
> > and migrate_disable() here.
>
> Exactly. So local_lock_bh() does +3 which I try to explain why it is 3.

How does local_lock_bh() do a +3 (not seeing it in the code). And
leaking internal implementation of local_lock_bh() into other parts of
the kernel is a no no.

> Now migrate_disable() has two counters: One for atomic context (irq or
> preemption disabled) and on for non-atomic context. The difference is
> that in atomic context migrate_disable() does nothing and in !atomic
> context it does other things which can't be done in atomic context
> anyway (I can explain this in full detail but you may lose interest so I
> keep it short). To update your example:
>
> 	/* ATOMIC COUNTER | non-ATOMIC COUNTER | change */
> 	/*       0        |         0          |   -    */
> 	local_lock_bh();      /* 0 | 3 | N+3 */
>
> 	...
>
> 	preempt_disable();    /* atomic context */
>
> 	migrate_disable();    /* 1 | 3 | A+1 */
> 	migrate_disable();    /* 2 | 3 | A+1 */
> 	migrate_disable();    /* 3 | 3 | A+1 */
>
> 	local_unlock_bh();    /* 0 | 3 | A-3 */
>
> 	/* later */
> 	preempt_enable();     /* non-atomic context */
> 	migrate_enable();     /* 0 | 2 | N-1 */
> 	migrate_enable();     /* 0 | 1 | N-1 */
> 	migrate_enable();     /* 0 | 0 | N-1 */

If anything, this needs a wrapper.  local_lock_preempt_fixup() ?

-- Steve

> > >  }
> > >  EXPORT_SYMBOL(kernel_neon_begin);
> > >
> > > @@ -1154,6 +1161,10 @@ void kernel_neon_end(void)
> > >         WARN_ON(!busy); /* No matching kernel_neon_begin()? */
> > >
> > >         preempt_enable();
> > > +       /* balance migrate_disable(). See kernel_neon_begin() */
> > > +       migrate_enable();
> > > +       migrate_enable();
> > > +       migrate_enable();
> > >  }
> > >  EXPORT_SYMBOL(kernel_neon_end);
> > >
> > > @@ -1185,9 +1196,7 @@ void __efi_fpsimd_begin(void)
> > >         if (!system_supports_fpsimd())
> > >                 return;
> > >
> > > -       WARN_ON(preemptible());
> > > -
> > > -       if (may_use_simd()) {
> > > +       if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {
> >
> > I suspect this is wrong -- see comments on the commit message.
> >
> > >                 kernel_neon_begin();
> > >         } else {
> >
> > [...]
> >
> > Cheers
> > ---Dave
On 2018-07-24 09:46:23 [-0400], Steven Rostedt wrote:
> > Unfortunately yes.
>
> Then we need to find another solution, because this is way too ugly and
> as Dave said, fragile to keep.

Yes. I have something new where Mike said it works (while this causes
Mike's gcc to segfault). Need to test this myself…

> How does local_lock_bh() do a +3 (not seeing it in the code). And

get_local_var()              +1
spin_lock_bh()               +2, because
    local_bh_disable()       +1
    spin_lock()              +1

Sebastian
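[Editorial aside: the counter bookkeeping Sebastian lays out above can be replayed in a toy model. The assumptions come from this thread, not from the -RT sources: two counters, one bumped when migrate_disable() runs in atomic context and one otherwise, with local_lock_bh()/local_unlock_bh() contributing +3/-3 (get_local_var() +1, spin_lock_bh() +2). All function names are illustrative.]

```c
#include <assert.h>

static int atomic_cnt;      /* migrate-disables taken in atomic context */
static int nonatomic_cnt;   /* migrate-disables taken while preemptible */
static int preempt_cnt;     /* models preempt_disable() nesting depth */

static void model_migrate_disable(void)
{
    if (preempt_cnt)
        atomic_cnt++;       /* atomic context: separate counter */
    else
        nonatomic_cnt++;
}

static void model_migrate_enable(void)
{
    if (preempt_cnt)
        atomic_cnt--;
    else
        nonatomic_cnt--;
}

/* local_lock_bh() == get_local_var() (+1) plus spin_lock_bh() (+2) */
static void model_local_lock_bh(void)
{
    for (int i = 0; i < 3; i++)
        model_migrate_disable();
}

static void model_local_unlock_bh(void)
{
    for (int i = 0; i < 3; i++)
        model_migrate_enable();
}

/* Replays the kernel_neon_begin()/kernel_neon_end() sequence of the patch. */
static void model_neon_begin_end(void)
{
    model_local_lock_bh();      /* non-atomic: 0 -> 3 */
    preempt_cnt++;              /* preempt_disable(): atomic from here on */
    model_migrate_disable();    /* atomic: 0 -> 1 */
    model_migrate_disable();    /* atomic: 1 -> 2 */
    model_migrate_disable();    /* atomic: 2 -> 3 */
    model_local_unlock_bh();    /* atomic: 3 -> 0, non-atomic still 3 */
    /* ... in-kernel SIMD use happens here ... */
    preempt_cnt--;              /* preempt_enable(): preemptible again */
    model_migrate_enable();     /* non-atomic: 3 -> 2 */
    model_migrate_enable();     /* non-atomic: 2 -> 1 */
    model_migrate_enable();     /* non-atomic: 1 -> 0 */
}
```

On this model, a single migrate_disable() before local_unlock_bh() would leave the atomic counter at -2 after the unlock, which is why three calls are needed to mirror the +3 taken by local_lock_bh() in the other context.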
On Wed, Jul 18, 2018 at 11:24:48AM +0200, Sebastian Andrzej Siewior wrote:
> On 2018-07-18 11:12:10 [+0200], To Dave Martin wrote:
> > > > -       if (may_use_simd()) {
> > > > +       if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {
> > >
> > > I suspect this is wrong -- see comments on the commit message.
>
> I'm sorry, I pressed send too early, I was aiming for the draft folder.
> So yes, based on that EFI that might be interruptible, let me try to
> look at the initial issue again and maybe I get another idea how to
> deal with this.
> One question: If EFI is interruptible that means, we call into EFI -
> how do we get out? Does EFI enable interrupts and the kernel receives
> an interrupt and treats this EFI call like a regular context switch?

AFAIK the only safe way to get out permanently is for the call to
return.  Note, I've not gone through the spec in fine detail myself.

The OS may handle interrupts occurring during the EFI call, but we
still have to return to EFI afterwards to finish off the call.  From
the Linux perspective, I think this means that EFI calls are non-
preemptible.

Under RT, I'm pretty sure that we can't safely resume the interrupted
EFI call on a different cpu from the one it was interrupted on.  Even
if it doesn't say this explicitly in the UEFI spec, I think it will be
assumed in implementations.

Certain EFI calls are not long-running and may need to be called from
interrupt context in Linux, which means that there may be live kernel-
mode NEON state.  This is why there are separate FPSIMD/SVE percpu
stash buffers for EFI specifically.

Does this make sense?  It's probably not very clear, but I'm trying to
hide the fact that I haven't looked at the UEFI spec for ages...

Cheers
---Dave
On 24 July 2018 at 16:45, Dave Martin <Dave.Martin@arm.com> wrote:
> On Wed, Jul 18, 2018 at 11:24:48AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2018-07-18 11:12:10 [+0200], To Dave Martin wrote:
>> > > > -       if (may_use_simd()) {
>> > > > +       if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {
>> > >
>> > > I suspect this is wrong -- see comments on the commit message.
>>
>> I'm sorry, I pressed send too early, I was aiming for the draft folder.
>> So yes, based on that EFI that might be interruptible, let me try to
>> look at the initial issue again and maybe I get another idea how to deal
>> with this.
>> One question: If EFI is interruptible that means, we call into EFI - how
>> do we get out? Does EFI enable interrupts and the kernel receives an
>> interrupt and treats this EFI call like a regular context switch?
>
> AFAIK the only safe way to get out permanently is for the call to
> return.  Note, I've not gone through the spec in fine detail myself.
>
> The OS may handle interrupts occurring during the EFI call, but we
> still have to return to EFI afterwards to finish off the call.  From
> the Linux perspective, I think this means that EFI calls are non-
> preemptible.
>
> Under RT, I'm pretty sure that we can't safely resume the interrupted
> EFI call on a different cpu from the one it was interrupted on.  Even
> if it doesn't say this explicitly in the UEFI spec, I think it will be
> assumed in implementations.
>

This is a can of worms I would rather not open, although I don't think
the UEFI spec makes it explicit that you cannot migrate runtime service
calls while in progress.

Also, I don't think EFI calls are worth obsessing about, given that
they shouldn't be that common under normal operation.  I know that RT
is not about the common case but about the worst case, though.

What problem is migrating a non-preemptible task intended to solve?
>
> Certain EFI calls are not long-running and may need to be called from
> interrupt context in Linux,

This suggests that the need to be called in interrupt context is a
property of the firmware implementation, but it is not.  In the kernel,
we have efi-pstore, which may attempt to invoke runtime services to
record the kernel's dying gasp into an EFI variable.  Other than that,
I don't think there are any reasons to call EFI services from
non-process context.

> which means that there may be live kernel-
> mode NEON state.  This is why there are separate FPSIMD/SVE percpu
> stash buffers for EFI specifically.
>

So given the above, and the fact that you can panic() in an interrupt
handler, we needed those buffers, although I wonder if we'd ever reach
the point where we end up resuming execution of the code that uses the
restored contents of those buffers.

> Does this make sense?  It's probably not very clear, but I'm trying
> to hide the fact that I haven't looked at the UEFI spec for ages...
>
> Cheers
> ---Dave
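[Editorial aside: Dave's nesting concern can be illustrated with a small model. One percpu stash buffer, used unconditionally as the RT patch would, loses the outer context's registers when an EFI call nests inside another. The names (efi_stash, efi_fpsimd_begin()) are illustrative stand-ins, not the kernel's actual buffers.]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NREGS 4

static int live_regs[NREGS];    /* models the cpu's FPSIMD registers */
static int efi_stash[NREGS];    /* models the single percpu side buffer */

/* Stash the live registers into the one-and-only side buffer. */
static void efi_fpsimd_begin(void)
{
    memcpy(efi_stash, live_regs, sizeof(live_regs));
}

/* Restore from the side buffer. */
static void efi_fpsimd_end(void)
{
    memcpy(live_regs, efi_stash, sizeof(live_regs));
}

/* Returns true iff the outer context's registers survive a nested call. */
static bool nested_efi_call_is_safe(void)
{
    int outer[NREGS] = { 1, 2, 3, 4 };

    memcpy(live_regs, outer, sizeof(outer));
    efi_fpsimd_begin();                         /* outer call stashes 1,2,3,4 */
    memset(live_regs, 0, sizeof(live_regs));    /* EFI clobbers the registers */

    efi_fpsimd_begin();                         /* nested call overwrites the stash */
    efi_fpsimd_end();                           /* inner restore: still clobbered */

    efi_fpsimd_end();                           /* outer restore: stash was lost */
    return memcmp(live_regs, outer, sizeof(outer)) == 0;
}
```

This is why the non-RT code stashes to the task as normal when the EFI call is made from an interruptible context and reserves the side buffer for the non-interruptible case, where nesting cannot occur.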
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -38,6 +38,7 @@
 #include <linux/signal.h>
 #include <linux/slab.h>
 #include <linux/sysctl.h>
+#include <linux/locallock.h>
 
 #include <asm/fpsimd.h>
 #include <asm/cputype.h>
@@ -235,7 +236,7 @@ static void sve_user_enable(void)
  * whether TIF_SVE is clear or set, since these are not vector length
  * dependent.
  */
-
+static DEFINE_LOCAL_IRQ_LOCK(fpsimd_lock);
 /*
  * Update current's FPSIMD/SVE registers from thread_struct.
  *
@@ -594,7 +595,7 @@ int sve_set_vector_length(struct task_st
 	 * non-SVE thread.
 	 */
 	if (task == current) {
-		local_bh_disable();
+		local_lock_bh(fpsimd_lock);
 
 		task_fpsimd_save();
 		set_thread_flag(TIF_FOREIGN_FPSTATE);
@@ -605,7 +606,7 @@ int sve_set_vector_length(struct task_st
 		sve_to_fpsimd(task);
 
 	if (task == current)
-		local_bh_enable();
+		local_unlock_bh(fpsimd_lock);
 
 	/*
 	 * Force reallocation of task SVE state to the correct size
@@ -837,7 +838,7 @@ asmlinkage void do_sve_acc(unsigned int
 
 	sve_alloc(current);
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 
 	task_fpsimd_save();
 	fpsimd_to_sve(current);
@@ -849,7 +850,7 @@ asmlinkage void do_sve_acc(unsigned int
 	if (test_and_set_thread_flag(TIF_SVE))
 		WARN_ON(1); /* SVE access shouldn't have trapped */
 
-	local_bh_enable();
+	local_unlock_bh(fpsimd_lock);
 }
 
 /*
@@ -925,7 +926,7 @@ void fpsimd_flush_thread(void)
 	if (!system_supports_fpsimd())
 		return;
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 
 	memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
 	fpsimd_flush_task_state(current);
@@ -967,7 +968,7 @@ void fpsimd_flush_thread(void)
 
 	set_thread_flag(TIF_FOREIGN_FPSTATE);
 
-	local_bh_enable();
+	local_unlock_bh(fpsimd_lock);
 }
 
 /*
@@ -979,9 +980,9 @@ void fpsimd_preserve_current_state(void)
 	if (!system_supports_fpsimd())
 		return;
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 	task_fpsimd_save();
-	local_bh_enable();
+	local_unlock_bh(fpsimd_lock);
 }
 
 /*
@@ -1021,14 +1022,14 @@ void fpsimd_restore_current_state(void)
 	if (!system_supports_fpsimd())
 		return;
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		task_fpsimd_load();
 		fpsimd_bind_to_cpu();
 	}
 
-	local_bh_enable();
+	local_unlock_bh(fpsimd_lock);
 }
 
 /*
@@ -1041,7 +1042,7 @@ void fpsimd_update_current_state(struct
 	if (!system_supports_fpsimd())
 		return;
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 
 	current->thread.fpsimd_state.user_fpsimd = *state;
 	if (system_supports_sve() && test_thread_flag(TIF_SVE))
@@ -1052,7 +1053,7 @@ void fpsimd_update_current_state(struct
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE))
 		fpsimd_bind_to_cpu();
 
-	local_bh_enable();
+	local_unlock_bh(fpsimd_lock);
 }
 
 /*
@@ -1115,7 +1116,7 @@ void kernel_neon_begin(void)
 
 	BUG_ON(!may_use_simd());
 
-	local_bh_disable();
+	local_lock_bh(fpsimd_lock);
 
 	__this_cpu_write(kernel_neon_busy, true);
 
@@ -1129,8 +1130,14 @@ void kernel_neon_begin(void)
 	fpsimd_flush_cpu_state();
 
 	preempt_disable();
-
-	local_bh_enable();
+	/*
+	 * ballance atomic vs !atomic context of migrate_disable().
+	 * local_lock_bh = get_local_var() + spin_lock_bh (2x migrate_disable)
+	 */
+	migrate_disable();
+	migrate_disable();
+	migrate_disable();
+	local_unlock_bh(fpsimd_lock);
 }
 EXPORT_SYMBOL(kernel_neon_begin);
 
@@ -1154,6 +1161,10 @@ void kernel_neon_end(void)
 	WARN_ON(!busy);	/* No matching kernel_neon_begin()? */
 
 	preempt_enable();
+	/* balance migrate_disable(). See kernel_neon_begin() */
+	migrate_enable();
+	migrate_enable();
+	migrate_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
@@ -1185,9 +1196,7 @@ void __efi_fpsimd_begin(void)
 	if (!system_supports_fpsimd())
 		return;
 
-	WARN_ON(preemptible());
-
-	if (may_use_simd()) {
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT_BASE) && may_use_simd()) {
 		kernel_neon_begin();
 	} else {
 		/*
In v4.16-RT I noticed a number of warnings from task_fpsimd_load(). The
code disables BH and expects that it is not preemptible. On -RT the
task remains preemptible but remains on the same CPU. This may corrupt
the content of the SIMD registers if the task is preempted during
saving/restoring those registers.

Add a locallock around this process. This avoids that any function
within the locallock block is invoked more than once on the same CPU.

kernel_neon_begin() can't be kept preemptible. If the task switch
notices TIF_FOREIGN_FPSTATE then it would restore the task's SIMD state
and we would lose the state of the registers used for in-kernel work.
We would require additional storage for the in-kernel copy of the
registers. But then the NEON crypto code checks for the need-resched
flag, so it shouldn't be that bad.

The preempt_disable() avoids the context switch while the kernel uses
the SIMD registers. Unfortunately we have to balance out the
migrate_disable() counter because local_lock_bh() is invoked in a
different context compared to its unlock counterpart.

__efi_fpsimd_begin() should not use kernel_neon_begin() due to its
preempt_disable() context and instead save the registers always in its
extra spot on RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

This seems to work (crypto chacha20-neon + cyclictest). I have no EFI
so I have no clue if saving SIMD while calling to EFI works.

 arch/arm64/kernel/fpsimd.c | 47 ++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)