diff mbox series

arm64: Trap WFI executed in userspace

Message ID 20180807093326.5090-1-marc.zyngier@arm.com (mailing list archive)
State New, archived
Headers show
Series arm64: Trap WFI executed in userspace | expand

Commit Message

Marc Zyngier Aug. 7, 2018, 9:33 a.m. UTC
It recently came to light that userspace can execute WFI, and that
the arm64 kernel doesn trap this event. This sounds rather benign,
but the kernel should decide when it wants to wait for an interrupt,
and not userspace.

Let's trap WFI and treat it as a way to yield the CPU to another
process.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/esr.h    |  4 ++++
 arch/arm64/include/asm/sysreg.h |  4 ++--
 arch/arm64/kernel/entry.S       |  1 +
 arch/arm64/kernel/traps.c       | 12 ++++++++++++
 4 files changed, 19 insertions(+), 2 deletions(-)

Comments

Dave Martin Aug. 7, 2018, 10:05 a.m. UTC | #1
On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> It recently came to light that userspace can execute WFI, and that
> the arm64 kernel doesn trap this event. This sounds rather benign,
> but the kernel should decide when it wants to wait for an interrupt,
> and not userspace.
> 
> Let's trap WFI and treat it as a way to yield the CPU to another
> process.

This doesn't amount to a justification.

If the power controller is unexpectedly left in a bad state so that
WFI will do something nasty to a cpu that may enter userspace, then we
probably have bigger problems.

So, maybe it really is pretty harmless to let userspace execute this.

I can't think of a legitimate reason for userspace to execute WFI
however.  Userspace doesn't have interrupts under Linux, so it makes
no sense to wait for one.

Have we seen anybody using WFI in userspace?  It may be cleaner to
map this to SIGILL rather than be permissive and regret it later.

[...]

Cheers
---Dave
Marc Zyngier Aug. 7, 2018, 10:24 a.m. UTC | #2
On 07/08/18 11:05, Dave Martin wrote:
> On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
>> It recently came to light that userspace can execute WFI, and that
>> the arm64 kernel doesn trap this event. This sounds rather benign,
>> but the kernel should decide when it wants to wait for an interrupt,
>> and not userspace.
>>
>> Let's trap WFI and treat it as a way to yield the CPU to another
>> process.
> 
> This doesn't amount to a justification.
> 
> If the power controller is unexpectedly left in a bad state so that
> WFI will do something nasty to a cpu that may enter userspace, then we
> probably have bigger problems.
> 
> So, maybe it really is pretty harmless to let userspace execute this.

Or not. It is also a very good way for userspace to find out when an
interrupt gets delivered and start doing all kind of probing on the
kernel. The least the userspace knows about that, the better I feel.

> I can't think of a legitimate reason for userspace to execute WFI
> however.  Userspace doesn't have interrupts under Linux, so it makes
> no sense to wait for one.
> 
> Have we seen anybody using WFI in userspace?  It may be cleaner to
> map this to SIGILL rather than be permissive and regret it later.

I couldn't find any user, and I'm happy to just send userspace to hell
in that case. But it could also been said that since it was never
prevented, it is a de-facto ABI.

	M.
Dave Martin Aug. 7, 2018, 10:30 a.m. UTC | #3
On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> On 07/08/18 11:05, Dave Martin wrote:
> > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> >> It recently came to light that userspace can execute WFI, and that
> >> the arm64 kernel doesn trap this event. This sounds rather benign,
> >> but the kernel should decide when it wants to wait for an interrupt,
> >> and not userspace.
> >>
> >> Let's trap WFI and treat it as a way to yield the CPU to another
> >> process.
> > 
> > This doesn't amount to a justification.
> > 
> > If the power controller is unexpectedly left in a bad state so that
> > WFI will do something nasty to a cpu that may enter userspace, then we
> > probably have bigger problems.
> > 
> > So, maybe it really is pretty harmless to let userspace execute this.
> 
> Or not. It is also a very good way for userspace to find out when an
> interrupt gets delivered and start doing all kind of probing on the
> kernel. The least the userspace knows about that, the better I feel.

Possibly.  I suspect there are other ways to guess pretty accurately
when an interrupt occurs, but WFI allows greater precision.

> > I can't think of a legitimate reason for userspace to execute WFI
> > however.  Userspace doesn't have interrupts under Linux, so it makes
> > no sense to wait for one.
> > 
> > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > map this to SIGILL rather than be permissive and regret it later.
> 
> I couldn't find any user, and I'm happy to just send userspace to hell
> in that case. But it could also been said that since it was never
> prevented, it is a de-facto ABI.

Agreed.  I wonder whether it's sufficient to have this mapping to SIGILL
in -next for a while and see whether anybody complains.

Cheers
---Dave
Robin Murphy Aug. 7, 2018, 12:12 p.m. UTC | #4
On 07/08/18 11:30, Dave Martin wrote:
> On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
>> On 07/08/18 11:05, Dave Martin wrote:
>>> On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
>>>> It recently came to light that userspace can execute WFI, and that
>>>> the arm64 kernel doesn trap this event. This sounds rather benign,
>>>> but the kernel should decide when it wants to wait for an interrupt,
>>>> and not userspace.
>>>>
>>>> Let's trap WFI and treat it as a way to yield the CPU to another
>>>> process.
>>>
>>> This doesn't amount to a justification.
>>>
>>> If the power controller is unexpectedly left in a bad state so that
>>> WFI will do something nasty to a cpu that may enter userspace, then we
>>> probably have bigger problems.
>>>
>>> So, maybe it really is pretty harmless to let userspace execute this.
>>
>> Or not. It is also a very good way for userspace to find out when an
>> interrupt gets delivered and start doing all kind of probing on the
>> kernel. The least the userspace knows about that, the better I feel.
> 
> Possibly.  I suspect there are other ways to guess pretty accurately
> when an interrupt occurs, but WFI allows greater precision.

...unless you're running in a VM and it traps to KVM anyway ;)

>>> I can't think of a legitimate reason for userspace to execute WFI
>>> however.  Userspace doesn't have interrupts under Linux, so it makes
>>> no sense to wait for one.
>>>
>>> Have we seen anybody using WFI in userspace?  It may be cleaner to
>>> map this to SIGILL rather than be permissive and regret it later.
>>
>> I couldn't find any user, and I'm happy to just send userspace to hell
>> in that case. But it could also been said that since it was never
>> prevented, it is a de-facto ABI.
> 
> Agreed.  I wonder whether it's sufficient to have this mapping to SIGILL
> in -next for a while and see whether anybody complains.

I think we'd have to avoid that for compat, though, since v7 code would 
have the expectation that WFI can't be trapped by the kernel at all. 
Personally I'm in favour of this patch as-is, since the architectural 
intent of the instruction is essentially "I've got nothing better to do 
right now than wait for something to happen", so treating it as a poor 
man's sched_vield() stays close to that while mitigating the potential 
nefarious and/or minor DoS implications of letting it architecturally 
execute.

Robin.
Dave Martin Aug. 7, 2018, 1:02 p.m. UTC | #5
On Tue, Aug 07, 2018 at 01:12:03PM +0100, Robin Murphy wrote:
> On 07/08/18 11:30, Dave Martin wrote:
> >On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> >>On 07/08/18 11:05, Dave Martin wrote:
> >>>On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> >>>>It recently came to light that userspace can execute WFI, and that
> >>>>the arm64 kernel doesn trap this event. This sounds rather benign,
> >>>>but the kernel should decide when it wants to wait for an interrupt,
> >>>>and not userspace.
> >>>>
> >>>>Let's trap WFI and treat it as a way to yield the CPU to another
> >>>>process.
> >>>
> >>>This doesn't amount to a justification.
> >>>
> >>>If the power controller is unexpectedly left in a bad state so that
> >>>WFI will do something nasty to a cpu that may enter userspace, then we
> >>>probably have bigger problems.
> >>>
> >>>So, maybe it really is pretty harmless to let userspace execute this.
> >>
> >>Or not. It is also a very good way for userspace to find out when an
> >>interrupt gets delivered and start doing all kind of probing on the
> >>kernel. The least the userspace knows about that, the better I feel.
> >
> >Possibly.  I suspect there are other ways to guess pretty accurately
> >when an interrupt occurs, but WFI allows greater precision.
> 
> ...unless you're running in a VM and it traps to KVM anyway ;)
> 
> >>>I can't think of a legitimate reason for userspace to execute WFI
> >>>however.  Userspace doesn't have interrupts under Linux, so it makes
> >>>no sense to wait for one.
> >>>
> >>>Have we seen anybody using WFI in userspace?  It may be cleaner to
> >>>map this to SIGILL rather than be permissive and regret it later.
> >>
> >>I couldn't find any user, and I'm happy to just send userspace to hell
> >>in that case. But it could also been said that since it was never
> >>prevented, it is a de-facto ABI.
> >
> >Agreed.  I wonder whether it's sufficient to have this mapping to SIGILL
> >in -next for a while and see whether anybody complains.
> 
> I think we'd have to avoid that for compat, though, since v7 code would have
> the expectation that WFI can't be trapped by the kernel at all. Personally
> I'm in favour of this patch as-is, since the architectural intent of the
> instruction is essentially "I've got nothing better to do right now than
> wait for something to happen", so treating it as a poor man's sched_vield()

Congratulations for knowing what the architects intended ;)

That said, sched_yield() does seem a reasonable approximation if
we don't go for SIGILL.


Note, the patch currently maps WFI to schedule(), which is not quite
the same, though I don't understand the details.

I suspect that sched_yield() is more appropriate, since it tells the
scheduler that the task doesn't expect to run in the very near future,
whereas schedule() only tells the scheduler not to run the task right
now.

> stays close to that while mitigating the potential nefarious and/or minor
> DoS implications of letting it architecturally execute.

Agreed.


For cleanliness I still prefer SIGILL, since WFI just provides code
with a grubby, nonportable means of yielding that won't work the
same on all kernel versions anyway, whereas there is a perfectly good
standard API for yielding already.

This argument works for compat too (if a little more controversial,
due to the historial lack of a trap).

But sched_yield() is minimally invasive and highly unlikely to break
anything (even where deserved).


I wonder whether we can get the Debian folks to act as guinea-pigs
for the SIGILL approach.  Their package builders are likely to be good
at finding compat regressions...  May not be worth it for such a small
change though.

Cheers
---Dave
Catalin Marinas Aug. 8, 2018, 12:34 p.m. UTC | #6
On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> On 07/08/18 11:05, Dave Martin wrote:
> > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> >> It recently came to light that userspace can execute WFI, and that
> >> the arm64 kernel doesn trap this event. This sounds rather benign,

Nitpick: "doesn't".

> >> but the kernel should decide when it wants to wait for an interrupt,
> >> and not userspace.
> >>
> >> Let's trap WFI and treat it as a way to yield the CPU to another
> >> process.
[...]
> > I can't think of a legitimate reason for userspace to execute WFI
> > however.  Userspace doesn't have interrupts under Linux, so it makes
> > no sense to wait for one.
> > 
> > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > map this to SIGILL rather than be permissive and regret it later.
> 
> I couldn't find any user, and I'm happy to just send userspace to hell
> in that case. But it could also been said that since it was never
> prevented, it is a de-facto ABI.

I wouldn't really go as far as SIGILL on WFI. I think the patch is fine
as it is. In case Will plans to merge it:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Dave Martin Aug. 9, 2018, 12:34 p.m. UTC | #7
On Wed, Aug 08, 2018 at 01:34:09PM +0100, Catalin Marinas wrote:
> On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> > On 07/08/18 11:05, Dave Martin wrote:
> > > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> > >> It recently came to light that userspace can execute WFI, and that
> > >> the arm64 kernel doesn trap this event. This sounds rather benign,
> 
> Nitpick: "doesn't".
> 
> > >> but the kernel should decide when it wants to wait for an interrupt,
> > >> and not userspace.
> > >>
> > >> Let's trap WFI and treat it as a way to yield the CPU to another
> > >> process.
> [...]
> > > I can't think of a legitimate reason for userspace to execute WFI
> > > however.  Userspace doesn't have interrupts under Linux, so it makes
> > > no sense to wait for one.
> > > 
> > > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > > map this to SIGILL rather than be permissive and regret it later.
> > 
> > I couldn't find any user, and I'm happy to just send userspace to hell
> > in that case. But it could also been said that since it was never
> > prevented, it is a de-facto ABI.
> 
> I wouldn't really go as far as SIGILL on WFI. I think the patch is fine
> as it is. In case Will plans to merge it:

For practical purposes I agree, because we can't control the binary
blobs out there: I just wanted to bang the drum because we are creating
semantics here and there is not an obvious correct answer to what they
should be.

I'd still like to see rationale for why this should map to schedule()
(which userspace currently has no direct way to trigger) as opposed to
sched_yield() or something like that.

[...]

Cheers
---Dave
Will Deacon Aug. 9, 2018, 12:38 p.m. UTC | #8
On Thu, Aug 09, 2018 at 01:34:57PM +0100, Dave Martin wrote:
> On Wed, Aug 08, 2018 at 01:34:09PM +0100, Catalin Marinas wrote:
> > On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> > > On 07/08/18 11:05, Dave Martin wrote:
> > > > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> > > >> It recently came to light that userspace can execute WFI, and that
> > > >> the arm64 kernel doesn trap this event. This sounds rather benign,
> > 
> > Nitpick: "doesn't".
> > 
> > > >> but the kernel should decide when it wants to wait for an interrupt,
> > > >> and not userspace.
> > > >>
> > > >> Let's trap WFI and treat it as a way to yield the CPU to another
> > > >> process.
> > [...]
> > > > I can't think of a legitimate reason for userspace to execute WFI
> > > > however.  Userspace doesn't have interrupts under Linux, so it makes
> > > > no sense to wait for one.
> > > > 
> > > > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > > > map this to SIGILL rather than be permissive and regret it later.
> > > 
> > > I couldn't find any user, and I'm happy to just send userspace to hell
> > > in that case. But it could also been said that since it was never
> > > prevented, it is a de-facto ABI.
> > 
> > I wouldn't really go as far as SIGILL on WFI. I think the patch is fine
> > as it is. In case Will plans to merge it:
> 
> For practical purposes I agree, because we can't control the binary
> blobs out there: I just wanted to bang the drum because we are creating
> semantics here and there is not an obvious correct answer to what they
> should be.
> 
> I'd still like to see rationale for why this should map to schedule()
> (which userspace currently has no direct way to trigger) as opposed to
> sched_yield() or something like that.

A better idea might just be to do pc +=4 and return. If there's work
pending, we'll hit it on the return path (just like any other ret_to_user
call).

I initially thought about sched_yield(), but it's not clear whether that
creates a problem if, e.g. seccomp has been used to restrict that syscall.

Will
Dave Martin Aug. 9, 2018, 12:47 p.m. UTC | #9
On Thu, Aug 09, 2018 at 01:38:12PM +0100, Will Deacon wrote:
> On Thu, Aug 09, 2018 at 01:34:57PM +0100, Dave Martin wrote:
> > On Wed, Aug 08, 2018 at 01:34:09PM +0100, Catalin Marinas wrote:
> > > On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> > > > On 07/08/18 11:05, Dave Martin wrote:
> > > > > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> > > > >> It recently came to light that userspace can execute WFI, and that
> > > > >> the arm64 kernel doesn trap this event. This sounds rather benign,
> > > 
> > > Nitpick: "doesn't".
> > > 
> > > > >> but the kernel should decide when it wants to wait for an interrupt,
> > > > >> and not userspace.
> > > > >>
> > > > >> Let's trap WFI and treat it as a way to yield the CPU to another
> > > > >> process.
> > > [...]
> > > > > I can't think of a legitimate reason for userspace to execute WFI
> > > > > however.  Userspace doesn't have interrupts under Linux, so it makes
> > > > > no sense to wait for one.
> > > > > 
> > > > > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > > > > map this to SIGILL rather than be permissive and regret it later.
> > > > 
> > > > I couldn't find any user, and I'm happy to just send userspace to hell
> > > > in that case. But it could also been said that since it was never
> > > > prevented, it is a de-facto ABI.
> > > 
> > > I wouldn't really go as far as SIGILL on WFI. I think the patch is fine
> > > as it is. In case Will plans to merge it:
> > 
> > For practical purposes I agree, because we can't control the binary
> > blobs out there: I just wanted to bang the drum because we are creating
> > semantics here and there is not an obvious correct answer to what they
> > should be.
> > 
> > I'd still like to see rationale for why this should map to schedule()
> > (which userspace currently has no direct way to trigger) as opposed to
> > sched_yield() or something like that.
> 
> A better idea might just be to do pc +=4 and return. If there's work
> pending, we'll hit it on the return path (just like any other ret_to_user
> call).
> 
> I initially thought about sched_yield(), but it's not clear whether that
> creates a problem if, e.g. seccomp has been used to restrict that syscall.

Indeed.  I can't see why that might be restricted, but there's presumably
nothing to stop people doing that today.

Other than putting the task to sleep for 1ms or something, I don't know
what to suggest ;)

Perhaps we can patch a NOP into .text, like Marc's BX trick :P

Cheers
---Dave
Marc Zyngier Aug. 9, 2018, 1:25 p.m. UTC | #10
On Thu, 09 Aug 2018 13:47:01 +0100,
Dave Martin <Dave.Martin@arm.com> wrote:
> 
> On Thu, Aug 09, 2018 at 01:38:12PM +0100, Will Deacon wrote:
> > On Thu, Aug 09, 2018 at 01:34:57PM +0100, Dave Martin wrote:
> > > On Wed, Aug 08, 2018 at 01:34:09PM +0100, Catalin Marinas wrote:
> > > > On Tue, Aug 07, 2018 at 11:24:34AM +0100, Marc Zyngier wrote:
> > > > > On 07/08/18 11:05, Dave Martin wrote:
> > > > > > On Tue, Aug 07, 2018 at 10:33:26AM +0100, Marc Zyngier wrote:
> > > > > >> It recently came to light that userspace can execute WFI, and that
> > > > > >> the arm64 kernel doesn trap this event. This sounds rather benign,
> > > > 
> > > > Nitpick: "doesn't".
> > > > 
> > > > > >> but the kernel should decide when it wants to wait for an interrupt,
> > > > > >> and not userspace.
> > > > > >>
> > > > > >> Let's trap WFI and treat it as a way to yield the CPU to another
> > > > > >> process.
> > > > [...]
> > > > > > I can't think of a legitimate reason for userspace to execute WFI
> > > > > > however.  Userspace doesn't have interrupts under Linux, so it makes
> > > > > > no sense to wait for one.
> > > > > > 
> > > > > > Have we seen anybody using WFI in userspace?  It may be cleaner to
> > > > > > map this to SIGILL rather than be permissive and regret it later.
> > > > > 
> > > > > I couldn't find any user, and I'm happy to just send userspace to hell
> > > > > in that case. But it could also been said that since it was never
> > > > > prevented, it is a de-facto ABI.
> > > > 
> > > > I wouldn't really go as far as SIGILL on WFI. I think the patch is fine
> > > > as it is. In case Will plans to merge it:
> > > 
> > > For practical purposes I agree, because we can't control the binary
> > > blobs out there: I just wanted to bang the drum because we are creating
> > > semantics here and there is not an obvious correct answer to what they
> > > should be.
> > > 
> > > I'd still like to see rationale for why this should map to schedule()
> > > (which userspace currently has no direct way to trigger) as opposed to
> > > sched_yield() or something like that.
> > 
> > A better idea might just be to do pc +=4 and return. If there's work
> > pending, we'll hit it on the return path (just like any other ret_to_user
> > call).
> > 
> > I initially thought about sched_yield(), but it's not clear whether that
> > creates a problem if, e.g. seccomp has been used to restrict that syscall.
> 
> Indeed.  I can't see why that might be restricted, but there's presumably
> nothing to stop people doing that today.
> 
> Other than putting the task to sleep for 1ms or something, I don't know
> what to suggest ;)
> 
> Perhaps we can patch a NOP into .text, like Marc's BX trick :P

Do not give me any additional bad idea... ;-)

I'll respin the patch without the schedule() call.

Thanks,

	M.
Pavel Machek Sept. 20, 2018, 10:04 p.m. UTC | #11
On Tue 2018-08-07 10:33:26, Marc Zyngier wrote:
> It recently came to light that userspace can execute WFI, and that
> the arm64 kernel doesn trap this event. This sounds rather benign,
> but the kernel should decide when it wants to wait for an interrupt,
> and not userspace.
> 
> Let's trap WFI and treat it as a way to yield the CPU to another
> process.

I don't think that's reasonable. Userspace should not be doing that, and we do not want
to encourage it. SIGILL would be reasonable, I'd  say.

(Very old mail, was forgotton in the queue)
Marc Zyngier Sept. 20, 2018, 10:33 p.m. UTC | #12
On Thu, 20 Sep 2018 23:04:30 +0100,
Pavel Machek <pavel@ucw.cz> wrote:
> 
> On Tue 2018-08-07 10:33:26, Marc Zyngier wrote:
> > It recently came to light that userspace can execute WFI, and that
> > the arm64 kernel doesn trap this event. This sounds rather benign,
> > but the kernel should decide when it wants to wait for an interrupt,
> > and not userspace.
> > 
> > Let's trap WFI and treat it as a way to yield the CPU to another
> > process.
> 
> I don't think that's reasonable. Userspace should not be doing that,
> and we do not want to encourage it. SIGILL would be reasonable, I'd
> say.

This ship has already sailed. Sending SIGILL changes the ABI in an
incompatible way. Immediately returning to userspace without doing
anything else (see v2 [1]) is the best we can do without changing the
behaviour of userspace.

Thanks,

	M.

[1] https://patchwork.kernel.org/patch/10562517/
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index ce70c3ffb993..9a2b6cee4e2b 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -137,6 +137,7 @@ 
 #define ESR_ELx_CV		(UL(1) << 24)
 #define ESR_ELx_COND_SHIFT	(20)
 #define ESR_ELx_COND_MASK	(UL(0xF) << ESR_ELx_COND_SHIFT)
+#define ESR_ELx_WFx_ISS_WFI	(UL(0) << 0)
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
@@ -148,6 +149,9 @@ 
 #define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
 
 /* ESR value templates for specific events */
+#define ESR_ELx_WFx_MASK	(ESR_ELx_EC_MASK | 1)
+#define ESR_ELx_WFx_WFI_VAL	((ESR_ELx_EC_WFx << ESR_ELx_EC_SHIFT) |	\
+				 ESR_ELx_WFx_ISS_WFI)
 
 /* BRK instruction trap from AArch64 state */
 #define ESR_ELx_VAL_BRK64(imm)					\
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 98af0b37fb31..56bcf0408dbb 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -485,12 +485,12 @@ 
 
 #define SCTLR_EL1_SET	(SCTLR_ELx_M    | SCTLR_ELx_C    | SCTLR_ELx_SA   |\
 			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
-			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT  | SCTLR_EL1_NTWI |\
+			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT  |		   \
 			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
 			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
 #define SCTLR_EL1_CLEAR	(SCTLR_ELx_A   | SCTLR_EL1_CP15BEN | SCTLR_EL1_ITD    |\
 			 SCTLR_EL1_UMA | SCTLR_ELx_WXN     | ENDIAN_CLEAR_EL1 |\
-			 SCTLR_EL1_RES0)
+			 SCTLR_EL1_NTWI | SCTLR_EL1_RES0)
 
 /* Check all the bits are accounted for */
 #define SCTLR_EL1_BUILD_BUG_ON_MISSING_BITS	BUILD_BUG_ON((SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != ~0)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 28ad8799406f..4c01c0ba81aa 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -675,6 +675,7 @@  el0_sync:
 	cmp	x24, #ESR_ELx_EC_FP_EXC64	// FP/ASIMD exception
 	b.eq	el0_fpsimd_exc
 	cmp	x24, #ESR_ELx_EC_SYS64		// configurable trap
+	ccmp	x24, #ESR_ELx_EC_WFx, #4, ne
 	b.eq	el0_sys
 	cmp	x24, #ESR_ELx_EC_SP_ALIGN	// stack alignment exception
 	b.eq	el0_sp_pc
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index d399d459397b..2e4b389a6c8f 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -496,6 +496,12 @@  static void cntfrq_read_handler(unsigned int esr, struct pt_regs *regs)
 	arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
 }
 
+static void wfi_handler(unsigned int esr, struct pt_regs *regs)
+{
+	arm64_skip_faulting_instruction(regs, AARCH64_INSN_SIZE);
+	schedule();
+}
+
 struct sys64_hook {
 	unsigned int esr_mask;
 	unsigned int esr_val;
@@ -526,6 +532,12 @@  static struct sys64_hook sys64_hooks[] = {
 		.esr_val = ESR_ELx_SYS64_ISS_SYS_CNTFRQ,
 		.handler = cntfrq_read_handler,
 	},
+	{
+		/* Trap WFI instructions executed in userspace */
+		.esr_mask = ESR_ELx_WFx_MASK,
+		.esr_val = ESR_ELx_WFx_WFI_VAL,
+		.handler = wfi_handler,
+	},
 	{},
 };