diff mbox series

[2/3] arm64: signal: sigreturn() and rt_sigreturn() sometime returns the wrong signals

Message ID 20210420165001.3790670-2-Liam.Howlett@Oracle.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series [1/3] arm64: armv8_deprecated: Fix swp_handler() signal generation | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Liam R. Howlett April 20, 2021, 4:50 p.m. UTC
arm64_notify_segfault() was used to force a SIGSEGV in all error cases
in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
There is now a better sig handler to use which does not search the VMA
address space and return a slightly incorrect error code.  Restore the
older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
the case of !access_ok(), simply return SIGSEGV with si_code
SEGV_ACCERR.

This change requires exporting arm64_notfiy_die() to the arm64 traps.h

Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
failing to deliver signal)
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 arch/arm64/include/asm/traps.h |  2 ++
 arch/arm64/kernel/signal.c     |  8 ++++++--
 arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
 3 files changed, 22 insertions(+), 6 deletions(-)

Comments

Will Deacon April 22, 2021, 12:48 p.m. UTC | #1
[+Eric as he actually understands how this is supposed to work]

On Tue, Apr 20, 2021 at 04:50:13PM +0000, Liam Howlett wrote:
> arm64_notify_segfault() was used to force a SIGSEGV in all error cases
> in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
> There is now a better sig handler to use which does not search the VMA
> address space and return a slightly incorrect error code.  Restore the
> older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
> the case of !access_ok(), simply return SIGSEGV with si_code
> SEGV_ACCERR.
> 
> This change requires exporting arm64_notfiy_die() to the arm64 traps.h
> 
> Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
> failing to deliver signal)
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> ---
>  arch/arm64/include/asm/traps.h |  2 ++
>  arch/arm64/kernel/signal.c     |  8 ++++++--
>  arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
>  3 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> index 54f32a0675df..9b76144fcba6 100644
> --- a/arch/arm64/include/asm/traps.h
> +++ b/arch/arm64/include/asm/traps.h
> @@ -29,6 +29,8 @@ void arm64_notify_segfault(unsigned long addr);
>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> +void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
> +		      int sicode, unsigned long far, int err);
>  
>  /*
>   * Move regs->pc to next instruction and do necessary setup before it
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 6237486ff6bb..9fde6dc760c3 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -544,7 +544,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	frame = (struct rt_sigframe __user *)regs->sp;
>  
>  	if (!access_ok(frame, sizeof (*frame)))
> -		goto badframe;
> +		goto e_access;
>  
>  	if (restore_sigframe(regs, frame))
>  		goto badframe;
> @@ -555,7 +555,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	return regs->regs[0];
>  
>  badframe:
> -	arm64_notify_segfault(regs->sp);
> +	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
> +	return 0;
> +
> +e_access:
> +	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
>  	return 0;

This seems really error-prone to me, but maybe I'm just missing some
context. What's the rule for reporting an si_code of SI_KERNEL vs
SEGV_ACCERR, and is the former actually valid for SIGSEGV?

With this change, pointing the (signal) stack to a kernel address will
result in SEGV_ACCERR but pointing it to something like a PROT_NONE user
address will give SI_KERNEL (well, assuming that we manage to deliver
the SEGV somehow). I'm having a hard time seeing why that's a useful
distinction to make..

If it's important to get this a particular way around, please can you
add some selftests?

Will
Eric W. Biederman April 22, 2021, 6:22 p.m. UTC | #2
Will Deacon <will@kernel.org> writes:

> [+Eric as he actually understands how this is supposed to work]

I try.

> On Tue, Apr 20, 2021 at 04:50:13PM +0000, Liam Howlett wrote:
>> arm64_notify_segfault() was used to force a SIGSEGV in all error cases
>> in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
>> There is now a better sig handler to use which does not search the VMA
>> address space and return a slightly incorrect error code.  Restore the
>> older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
>> the case of !access_ok(), simply return SIGSEGV with si_code
>> SEGV_ACCERR.

What is userspace cares?  Why does it care?

This is changing userspace visible semantics so understanding userspace
and understanding how it might break, and what the risk of regression
seems the most important detail here.

>> This change requires exporting arm64_notfiy_die() to the arm64 traps.h
>> 
>> Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
>> failing to deliver signal)
>> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
>> ---
>>  arch/arm64/include/asm/traps.h |  2 ++
>>  arch/arm64/kernel/signal.c     |  8 ++++++--
>>  arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
>>  3 files changed, 22 insertions(+), 6 deletions(-)
>> 
>> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
>> index 54f32a0675df..9b76144fcba6 100644
>> --- a/arch/arm64/include/asm/traps.h
>> +++ b/arch/arm64/include/asm/traps.h
>> @@ -29,6 +29,8 @@ void arm64_notify_segfault(unsigned long addr);
>>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
>>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
>>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
>> +void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
>> +		      int sicode, unsigned long far, int err);
>>  
>>  /*
>>   * Move regs->pc to next instruction and do necessary setup before it
>> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
>> index 6237486ff6bb..9fde6dc760c3 100644
>> --- a/arch/arm64/kernel/signal.c
>> +++ b/arch/arm64/kernel/signal.c
>> @@ -544,7 +544,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>>  	frame = (struct rt_sigframe __user *)regs->sp;
>>  
>>  	if (!access_ok(frame, sizeof (*frame)))
>> -		goto badframe;
>> +		goto e_access;
>>  
>>  	if (restore_sigframe(regs, frame))
>>  		goto badframe;
>> @@ -555,7 +555,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
>>  	return regs->regs[0];
>>  
>>  badframe:
>> -	arm64_notify_segfault(regs->sp);
>> +	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
>> +	return 0;
>> +
>> +e_access:
>> +	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
>>  	return 0;
>
> This seems really error-prone to me, but maybe I'm just missing some
> context. What's the rule for reporting an si_code of SI_KERNEL vs
> SEGV_ACCERR, and is the former actually valid for SIGSEGV?

The si_codes SI_USER == 0 and SI_KERNEL == 0x80 are valid for all
signals.  SI_KERNEL means I don't have any information for you other
than signal number.

In general something better than SI_KERNEL is desirable.

> With this change, pointing the (signal) stack to a kernel address will
> result in SEGV_ACCERR but pointing it to something like a PROT_NONE user
> address will give SI_KERNEL (well, assuming that we manage to deliver
> the SEGV somehow). I'm having a hard time seeing why that's a useful
> distinction to make..
>
> If it's important to get this a particular way around, please can you
> add some selftests?

Going down the current path I see 3 possible cases:

copy_from_user returns -EFAULT which becomes SEGV_MAPERR or SEGV_ACCERR.

A signal frame parse error.  For which SI_KERNEL seems as good an error
code as any.


On x86 there is no attempt to figure out the cause of the -EFAULT, and
always uses SI_KERNEL.  This is because x86 just does:
"force_sig(SIGSEGV);" As arm64 did until f71016a8a8c5 ("arm64: signal:
Call arm64_notify_segfault when failing to deliver signal")



I think the big question is what does it make sense to do here.

The big picture.  Upon return from a signal the kernel arranges
for rt_sigreturn to be called to return to a pre-signal state.
As such rt_sigreturn can not return an error code.

In general the kernel will write the signal frame and that will
guarantee that the signal from can be processes by rt_sigreturn.

For error handling we are dealing with the case that userspace
has modified the signal frame.  So that it either does not
parse or that it is unmapped.


So who cares?  The only two cases I can think of are debuggers, and
programs playing fancy memory management games like garbage collections.
I have heard of applications (like garbage collectors)
unmapping/mprotecting memory to create a barrier.

Liam Howlett is that the issue here?  Is not seeing SI_KERNEL confusing
the JVM?

For debuggers I expect the stack backtrace from SIGSEGV is enough to see
that something is wrong.

For applications performing fancy processing of the signal frame I think
that tends to be very architecture specific.  In part because even
knowing the faulting address the size of the access is not known so the
instruction must be interpreted.  Getting a system call instead of a
load or store instruction might be enough to completely confuse
applications processing SEGV_MAPERR or SEGV_ACCERR.  Such applications
may also struggle with the fact that the address in siginfo is less
precise than it would be for an ordinary page fault.


So my sense is if you known you are helping userspace returning either
SEGV_MAPERR or SEGV_ACCERR go for it.  Otherwise there are enough
variables that returning less information when rt_sigreturn fails
would be more reliable.


Or in short what is userspace doing?  What does userspace care about?

Eric
Liam R. Howlett April 22, 2021, 7:24 p.m. UTC | #3
* Eric W. Biederman <ebiederm@xmission.com> [210422 14:23]:
> Will Deacon <will@kernel.org> writes:
> 
> > [+Eric as he actually understands how this is supposed to work]
> 
> I try.
> 

Thanks to both of you for looking at this.

> > On Tue, Apr 20, 2021 at 04:50:13PM +0000, Liam Howlett wrote:
> >> arm64_notify_segfault() was used to force a SIGSEGV in all error cases
> >> in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
> >> There is now a better sig handler to use which does not search the VMA
> >> address space and return a slightly incorrect error code.  Restore the
> >> older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
> >> the case of !access_ok(), simply return SIGSEGV with si_code
> >> SEGV_ACCERR.
> 
> What is userspace cares?  Why does it care?


Calling arm64_notify_segfault() as it is written is unreliable.
Searching for the address with find_vma() will return SEG_ACCERR almost
always, unless the address is larger than anything within *any* VMA.
I'm trying to fix this issue by cleaning up the callers to that function
and fix the function itself.

I don't have an example of why userspace cares about SI_KERNEL vs
SEGV_ACCERR/SEGV_MAPERR, but the git log on f71016a8a8c5 specifies that
this function was used to avoid having specific code to print an error
code.  I am restoring the old return code as it seems to makes more
sense and avoids the bug in the calling path.  If you'd rather, I can
change the notify_die line to use SIG_ACCERR as this is *almost* always
what is returend - except when the above mentioned bug is hit.  Upon
examining the code here, it seems unnecessary to walk the VMA tree to
find where the address lands in either of the error scenarios to know
what should be returned.

> 
> This is changing userspace visible semantics so understanding userspace
> and understanding how it might break, and what the risk of regression
> seems the most important detail here.
> 
> >> This change requires exporting arm64_notfiy_die() to the arm64 traps.h
> >> 
> >> Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
> >> failing to deliver signal)
> >> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> >> ---
> >>  arch/arm64/include/asm/traps.h |  2 ++
> >>  arch/arm64/kernel/signal.c     |  8 ++++++--
> >>  arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
> >>  3 files changed, 22 insertions(+), 6 deletions(-)
> >> 
> >> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> >> index 54f32a0675df..9b76144fcba6 100644
> >> --- a/arch/arm64/include/asm/traps.h
> >> +++ b/arch/arm64/include/asm/traps.h
> >> @@ -29,6 +29,8 @@ void arm64_notify_segfault(unsigned long addr);
> >>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> >>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
> >>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> >> +void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
> >> +		      int sicode, unsigned long far, int err);
> >>  
> >>  /*
> >>   * Move regs->pc to next instruction and do necessary setup before it
> >> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> >> index 6237486ff6bb..9fde6dc760c3 100644
> >> --- a/arch/arm64/kernel/signal.c
> >> +++ b/arch/arm64/kernel/signal.c
> >> @@ -544,7 +544,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >>  	frame = (struct rt_sigframe __user *)regs->sp;
> >>  
> >>  	if (!access_ok(frame, sizeof (*frame)))
> >> -		goto badframe;
> >> +		goto e_access;
> >>  
> >>  	if (restore_sigframe(regs, frame))
> >>  		goto badframe;
> >> @@ -555,7 +555,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >>  	return regs->regs[0];
> >>  
> >>  badframe:
> >> -	arm64_notify_segfault(regs->sp);
> >> +	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
> >> +	return 0;
> >> +
> >> +e_access:
> >> +	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
> >>  	return 0;
> >
> > This seems really error-prone to me, but maybe I'm just missing some
> > context. What's the rule for reporting an si_code of SI_KERNEL vs
> > SEGV_ACCERR, and is the former actually valid for SIGSEGV?
> 
> The si_codes SI_USER == 0 and SI_KERNEL == 0x80 are valid for all
> signals.  SI_KERNEL means I don't have any information for you other
> than signal number.
> 
> In general something better than SI_KERNEL is desirable.

I went with SI_KERNEL as that's what was there before.  I have no strong
opinion on what should be returned; I do favour SIGBUS with si_code of
BUS_ADRALN but I didn't want to change user visable code too much -
especially to fix a bug in another function.

> 
> > With this change, pointing the (signal) stack to a kernel address will
> > result in SEGV_ACCERR but pointing it to something like a PROT_NONE user
> > address will give SI_KERNEL (well, assuming that we manage to deliver
> > the SEGV somehow). I'm having a hard time seeing why that's a useful
> > distinction to make..
> >
> > If it's important to get this a particular way around, please can you
> > add some selftests?
> 
> Going down the current path I see 3 possible cases:
> 
> copy_from_user returns -EFAULT which becomes SEGV_MAPERR or SEGV_ACCERR.

Almost always SEGV_ACCERR with the current bug, as mentioned above.
find_vma() searches from addr until the end of the address space, it
isn't just a simple lookup of the address.

> 
> A signal frame parse error.  For which SI_KERNEL seems as good an error
> code as any.
> 
> 
> On x86 there is no attempt to figure out the cause of the -EFAULT, and
> always uses SI_KERNEL.  This is because x86 just does:
> "force_sig(SIGSEGV);" As arm64 did until f71016a8a8c5 ("arm64: signal:
> Call arm64_notify_segfault when failing to deliver signal")
> 
> 
> 
> I think the big question is what does it make sense to do here.
> 
> The big picture.  Upon return from a signal the kernel arranges
> for rt_sigreturn to be called to return to a pre-signal state.
> As such rt_sigreturn can not return an error code.
> 
> In general the kernel will write the signal frame and that will
> guarantee that the signal from can be processes by rt_sigreturn.
> 
> For error handling we are dealing with the case that userspace
> has modified the signal frame.  So that it either does not
> parse or that it is unmapped.
> 
> 
> So who cares?  The only two cases I can think of are debuggers, and
> programs playing fancy memory management games like garbage collections.
> I have heard of applications (like garbage collectors)
> unmapping/mprotecting memory to create a barrier.
> 
> Liam Howlett is that the issue here?  Is not seeing SI_KERNEL confusing
> the JVM?

No, the issue here is that arm64_notify_segfault() has a bug which sent
me down a rabbit hole of issues and I'm really trying my best to help
out as best I can.  The bug certainly affects this function as it is
written today, but my patch will generate a consistent signal.

> 
> For debuggers I expect the stack backtrace from SIGSEGV is enough to see
> that something is wrong.
> 
> For applications performing fancy processing of the signal frame I think
> that tends to be very architecture specific.  In part because even
> knowing the faulting address the size of the access is not known so the
> instruction must be interpreted.  Getting a system call instead of a
> load or store instruction might be enough to completely confuse
> applications processing SEGV_MAPERR or SEGV_ACCERR.  Such applications
> may also struggle with the fact that the address in siginfo is less
> precise than it would be for an ordinary page fault.
> 
> 
> So my sense is if you known you are helping userspace returning either
> SEGV_MAPERR or SEGV_ACCERR go for it.  Otherwise there are enough
> variables that returning less information when rt_sigreturn fails
> would be more reliable.
> 
> 
> Or in short what is userspace doing?  What does userspace care about?

I think I've answered this, but it's more of trying to fix a bug which
causes an *almost* reliable return code to be a reliable return code.
Am I correct in stating that in both of these scnarios - !access_ok()
and badframe, it is unnecessary to search the VMAs for where the address
falls to know what error code to return?

Thanks,
Liam
Eric W. Biederman April 23, 2021, 6:17 p.m. UTC | #4
Liam Howlett <liam.howlett@oracle.com> writes:

> * Eric W. Biederman <ebiederm@xmission.com> [210422 14:23]:
>> Will Deacon <will@kernel.org> writes:
>> 
>> > [+Eric as he actually understands how this is supposed to work]
>> 
>> I try.
>> 
>
> Thanks to both of you for looking at this.
>
>> > On Tue, Apr 20, 2021 at 04:50:13PM +0000, Liam Howlett wrote:
>> >> arm64_notify_segfault() was used to force a SIGSEGV in all error cases
>> >> in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
>> >> There is now a better sig handler to use which does not search the VMA
>> >> address space and return a slightly incorrect error code.  Restore the
>> >> older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
>> >> the case of !access_ok(), simply return SIGSEGV with si_code
>> >> SEGV_ACCERR.
>> 
>> What is userspace cares?  Why does it care?
>
>
> Calling arm64_notify_segfault() as it is written is unreliable.
> Searching for the address with find_vma() will return SEG_ACCERR almost
> always, unless the address is larger than anything within *any* VMA.
> I'm trying to fix this issue by cleaning up the callers to that function
> and fix the function itself.

I can't see the rest of the patches in your series so I am not quite
certain what you are doing.

Looking at the places that arm64_notify_segfault is called I do
think you are right to be suspicious of that function.

swp_handler seems a legitimate user as it emulates an instruction.
For that case at least you should probably add the necessary
check to see if the address is contained in the returned vma.

user_cache_maint_handler calls it with the tagged address, when
it should use an untagged address with find_vma.  Otherwise
user_cache_maint_handler is also performing instruction emulation
and should work the same as swp_handler.

The only other users are in sigreturn and rt_sigreturn, where the
address is iffy.  But assuming the proper address is passed sigreturn
and rt_sigreturn are also performing instruction emulation.

> I don't have an example of why userspace cares about SI_KERNEL vs
> SEGV_ACCERR/SEGV_MAPERR, but the git log on f71016a8a8c5 specifies that
> this function was used to avoid having specific code to print an error
> code.  I am restoring the old return code as it seems to makes more
> sense and avoids the bug in the calling path.  If you'd rather, I can
> change the notify_die line to use SIG_ACCERR as this is *almost* always
> what is returend - except when the above mentioned bug is hit.  Upon
> examining the code here, it seems unnecessary to walk the VMA tree to
> find where the address lands in either of the error scenarios to know
> what should be returned.

Ignoring the possibility of a parse error what the code has is -EFAULT.
If we want to distinguish between SEGV_ACCERR and SEGV_MAPERR we need
the find_vma (and confirmation that the found vma contains the specified
address) to see if there is a mapping at the specified address.

Or as you point out later this could be a SIGBUS situation.

>> This is changing userspace visible semantics so understanding userspace
>> and understanding how it might break, and what the risk of regression
>> seems the most important detail here.
>> 
>> >> This change requires exporting arm64_notfiy_die() to the arm64 traps.h
>> >> 
>> >> Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
>> >> failing to deliver signal)
>> >> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
>> >> ---
>> >>  arch/arm64/include/asm/traps.h |  2 ++
>> >>  arch/arm64/kernel/signal.c     |  8 ++++++--
>> >>  arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
>> >>  3 files changed, 22 insertions(+), 6 deletions(-)
>> >> 
>> >> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
>> >> index 54f32a0675df..9b76144fcba6 100644
>> >> --- a/arch/arm64/include/asm/traps.h
>> >> +++ b/arch/arm64/include/asm/traps.h
>> >> @@ -29,6 +29,8 @@ void arm64_notify_segfault(unsigned long addr);
>> >>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
>> >>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
>> >>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
>> >> +void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
>> >> +		      int sicode, unsigned long far, int err);
>> >>  
>> >>  /*
>> >>   * Move regs->pc to next instruction and do necessary setup before it
>> >> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
>> >> index 6237486ff6bb..9fde6dc760c3 100644
>> >> --- a/arch/arm64/kernel/signal.c
>> >> +++ b/arch/arm64/kernel/signal.c
>> >> @@ -544,7 +544,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>> >>  	frame = (struct rt_sigframe __user *)regs->sp;
>> >>  
>> >>  	if (!access_ok(frame, sizeof (*frame)))
>> >> -		goto badframe;
>> >> +		goto e_access;
>> >>  
>> >>  	if (restore_sigframe(regs, frame))
>> >>  		goto badframe;
>> >> @@ -555,7 +555,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
>> >>  	return regs->regs[0];
>> >>  
>> >>  badframe:
>> >> -	arm64_notify_segfault(regs->sp);
>> >> +	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
>> >> +	return 0;
>> >> +
>> >> +e_access:
>> >> +	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
>> >>  	return 0;
>> >
>> > This seems really error-prone to me, but maybe I'm just missing some
>> > context. What's the rule for reporting an si_code of SI_KERNEL vs
>> > SEGV_ACCERR, and is the former actually valid for SIGSEGV?
>> 
>> The si_codes SI_USER == 0 and SI_KERNEL == 0x80 are valid for all
>> signals.  SI_KERNEL means I don't have any information for you other
>> than signal number.
>> 
>> In general something better than SI_KERNEL is desirable.
>
> I went with SI_KERNEL as that's what was there before.  I have no strong
> opinion on what should be returned; I do favour SIGBUS with si_code of
> BUS_ADRALN but I didn't want to change user visable code too much -
> especially to fix a bug in another function.

I hadn't noticed earlier, but I agree that what the code is doing is a
little strange.  I get SIGBUS and SIGEGV confused when I have not looked
at the recently.  I think SIGBUS is for mapped access that fail, and
SIGSEGV is for everything else.  

>> > With this change, pointing the (signal) stack to a kernel address will
>> > result in SEGV_ACCERR but pointing it to something like a PROT_NONE user
>> > address will give SI_KERNEL (well, assuming that we manage to deliver
>> > the SEGV somehow). I'm having a hard time seeing why that's a useful
>> > distinction to make..
>> >
>> > If it's important to get this a particular way around, please can you
>> > add some selftests?
>> 
>> Going down the current path I see 3 possible cases:
>> 
>> copy_from_user returns -EFAULT which becomes SEGV_MAPERR or SEGV_ACCERR.
>
> Almost always SEGV_ACCERR with the current bug, as mentioned above.
> find_vma() searches from addr until the end of the address space, it
> isn't just a simple lookup of the address.

Which is clearly a logic error.

In practice I don't know if anyone cares how sigreturn fails so we are
in a grey area.

>> A signal frame parse error.  For which SI_KERNEL seems as good an error
>> code as any.
>> 
>> 
>> On x86 there is no attempt to figure out the cause of the -EFAULT, and
>> always uses SI_KERNEL.  This is because x86 just does:
>> "force_sig(SIGSEGV);" As arm64 did until f71016a8a8c5 ("arm64: signal:
>> Call arm64_notify_segfault when failing to deliver signal")
>> 
>> 
>> 
>> I think the big question is what does it make sense to do here.
>> 
>> The big picture.  Upon return from a signal the kernel arranges
>> for rt_sigreturn to be called to return to a pre-signal state.
>> As such rt_sigreturn can not return an error code.
>> 
>> In general the kernel will write the signal frame and that will
>> guarantee that the signal from can be processes by rt_sigreturn.
>> 
>> For error handling we are dealing with the case that userspace
>> has modified the signal frame.  So that it either does not
>> parse or that it is unmapped.
>> 
>> 
>> So who cares?  The only two cases I can think of are debuggers, and
>> programs playing fancy memory management games like garbage collections.
>> I have heard of applications (like garbage collectors)
>> unmapping/mprotecting memory to create a barrier.
>> 
>> Liam Howlett is that the issue here?  Is not seeing SI_KERNEL confusing
>> the JVM?
>
> No, the issue here is that arm64_notify_segfault() has a bug which sent
> me down a rabbit hole of issues and I'm really trying my best to help
> out as best I can.  The bug certainly affects this function as it is
> written today, but my patch will generate a consistent signal.

Not so much.  There are other cases where -EFAULT causing a failing
return in that function.  So I think you have in practice made the
matter worse.  As after this patch it becomes less clear what the
return code is.

>> For debuggers I expect the stack backtrace from SIGSEGV is enough to see
>> that something is wrong.
>> 
>> For applications performing fancy processing of the signal frame I think
>> that tends to be very architecture specific.  In part because even
>> knowing the faulting address the size of the access is not known so the
>> instruction must be interpreted.  Getting a system call instead of a
>> load or store instruction might be enough to completely confuse
>> applications processing SEGV_MAPERR or SEGV_ACCERR.  Such applications
>> may also struggle with the fact that the address in siginfo is less
>> precise than it would be for an ordinary page fault.
>> 
>> 
>> So my sense is if you known you are helping userspace returning either
>> SEGV_MAPERR or SEGV_ACCERR go for it.  Otherwise there are enough
>> variables that returning less information when rt_sigreturn fails
>> would be more reliable.
>> 
>> 
>> Or in short what is userspace doing?  What does userspace care about?
>
> I think I've answered this, but it's more of trying to fix a bug which
> causes an *almost* reliable return code to be a reliable return code.
> Am I correct in stating that in both of these scnarios - !access_ok()
> and badframe, it is unnecessary to search the VMAs for where the address
> falls to know what error code to return?

You have answered you don't know what piece of userspace cares.

For access_ok failure you are correct we don't need a find_vma call
to tell what kind of failure we have.

For anything return -EFAULT we don't know as -EFAULT has less precision
than the instruction itself.

My sense is that you should concentrate on the userspace instruction
emulation case from swp_handler or user_cache_maint_handler.  Because
those instructions can run unemulated we know exactly what the semantics
should be in those cases.

I suspect arm64_notify_segfault should be renamed arm64_notify_fault
and generate SIGBUS or SIGSEGV as appropriate.

In fact I suspect that proper handling is to call __do_page_fault or
handle_mm_fault as do_page_fault does and then parse the VM_FAULT code
for which signal to generate.   Possibly factoring out a helper
converting VM_FAULT codes to signals.

Eric
Liam R. Howlett April 23, 2021, 8:03 p.m. UTC | #5
* Eric W. Biederman <ebiederm@xmission.com> [210423 14:23]:
> Liam Howlett <liam.howlett@oracle.com> writes:
> 
> > * Eric W. Biederman <ebiederm@xmission.com> [210422 14:23]:
> >> Will Deacon <will@kernel.org> writes:
> >> 
> >> > [+Eric as he actually understands how this is supposed to work]
> >> 
> >> I try.
> >> 
> >
> > Thanks to both of you for looking at this.
> >
> >> > On Tue, Apr 20, 2021 at 04:50:13PM +0000, Liam Howlett wrote:
> >> >> arm64_notify_segfault() was used to force a SIGSEGV in all error cases
> >> >> in sigreturn() and rt_sigreturn() to avoid writing a new sig handler.
> >> >> There is now a better sig handler to use which does not search the VMA
> >> >> address space and return a slightly incorrect error code.  Restore the
> >> >> older and correct si_code of SI_KERNEL by using arm64_notify_die().  In
> >> >> the case of !access_ok(), simply return SIGSEGV with si_code
> >> >> SEGV_ACCERR.
> >> 
> >> What is userspace cares?  Why does it care?
> >
> >
> > Calling arm64_notify_segfault() as it is written is unreliable.
> > Searching for the address with find_vma() will return SEG_ACCERR almost
> > always, unless the address is larger than anything within *any* VMA.
> > I'm trying to fix this issue by cleaning up the callers to that function
> > and fix the function itself.
> 
> I can't see the rest of the patches in your series so I am not quite
> certain what you are doing.
> 
> Looking at the places that arm64_notify_segfault is called I do
> think you are right to be suspicious of that function.
> 
> swp_handler seems a legitimate user as it emulates an instruction.
> For that case at least you should probably add the necessary
> check to see if the address is contained in the returned vma.

The initial patch was to do just that, but the discussion resulted in
other patches to try and clean up the signal handling around this area.
I was switching to find_vma_intersection() to find the vma.  There was
concern on if SEGV_ACCERR should be returned if the next vma
VM_GROWSDOWN as that would potentially indicate a stack expansion. [1]

> 
> user_cache_maint_handler calls it with the tagged address, when
> it should use an untagged address with find_vma.  Otherwise
> user_cache_maint_handler is also performing instruction emulation
> and should work the same as swp_handler.

arm64_notify_segfault() calls untagged_addr() on the address passed to
find_vma().  Is there anything missing/wrong with this?

> 
> The only other users are in sigreturn and rt_sigreturn, where the
> address is iffy.  But assuming the proper address is passed sigreturn
> and rt_sigreturn are also performing instruction emulation.
> 
> > I don't have an example of why userspace cares about SI_KERNEL vs
> > SEGV_ACCERR/SEGV_MAPERR, but the git log on f71016a8a8c5 specifies that
> > this function was used to avoid having specific code to print an error
> > code.  I am restoring the old return code as it seems to makes more
> > sense and avoids the bug in the calling path.  If you'd rather, I can
> > change the notify_die line to use SIG_ACCERR as this is *almost* always
> > what is returend - except when the above mentioned bug is hit.  Upon
> > examining the code here, it seems unnecessary to walk the VMA tree to
> > find where the address lands in either of the error scenarios to know
> > what should be returned.
> 
> Ignoring the possibility of a parse error what the code has is -EFAULT.
> If we want to distinguish between SEGV_ACCERR and SEGV_MAPERR we need
> the find_vma (and confirmation that the found vma contains the specified
> address) to see if there is a mapping at the specified address.
> 
> Or as you point out later this could be a SIGBUS situation.
> 
> >> This is changing userspace visible semantics so understanding userspace
> >> and understanding how it might break, and what the risk of regression
> >> seems the most important detail here.
> >> 
> >> >> This change requires exporting arm64_notfiy_die() to the arm64 traps.h
> >> >> 
> >> >> Fixes: f71016a8a8c5 (arm64: signal: Call arm64_notify_segfault when
> >> >> failing to deliver signal)
> >> >> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> >> >> ---
> >> >>  arch/arm64/include/asm/traps.h |  2 ++
> >> >>  arch/arm64/kernel/signal.c     |  8 ++++++--
> >> >>  arch/arm64/kernel/signal32.c   | 18 ++++++++++++++----
> >> >>  3 files changed, 22 insertions(+), 6 deletions(-)
> >> >> 
> >> >> diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
> >> >> index 54f32a0675df..9b76144fcba6 100644
> >> >> --- a/arch/arm64/include/asm/traps.h
> >> >> +++ b/arch/arm64/include/asm/traps.h
> >> >> @@ -29,6 +29,8 @@ void arm64_notify_segfault(unsigned long addr);
> >> >>  void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
> >> >>  void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
> >> >>  void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
> >> >> +void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
> >> >> +		      int sicode, unsigned long far, int err);
> >> >>  
> >> >>  /*
> >> >>   * Move regs->pc to next instruction and do necessary setup before it
> >> >> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> >> >> index 6237486ff6bb..9fde6dc760c3 100644
> >> >> --- a/arch/arm64/kernel/signal.c
> >> >> +++ b/arch/arm64/kernel/signal.c
> >> >> @@ -544,7 +544,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >> >>  	frame = (struct rt_sigframe __user *)regs->sp;
> >> >>  
> >> >>  	if (!access_ok(frame, sizeof (*frame)))
> >> >> -		goto badframe;
> >> >> +		goto e_access;
> >> >>  
> >> >>  	if (restore_sigframe(regs, frame))
> >> >>  		goto badframe;
> >> >> @@ -555,7 +555,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >> >>  	return regs->regs[0];
> >> >>  
> >> >>  badframe:
> >> >> -	arm64_notify_segfault(regs->sp);
> >> >> +	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
> >> >> +	return 0;
> >> >> +
> >> >> +e_access:
> >> >> +	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
> >> >>  	return 0;
> >> >
> >> > This seems really error-prone to me, but maybe I'm just missing some
> >> > context. What's the rule for reporting an si_code of SI_KERNEL vs
> >> > SEGV_ACCERR, and is the former actually valid for SIGSEGV?
> >> 
> >> The si_codes SI_USER == 0 and SI_KERNEL == 0x80 are valid for all
> >> signals.  SI_KERNEL means I don't have any information for you other
> >> than signal number.
> >> 
> >> In general something better than SI_KERNEL is desirable.
> >
> > I went with SI_KERNEL as that's what was there before.  I have no strong
> > opinion on what should be returned; I do favour SIGBUS with si_code of
> > BUS_ADRALN but I didn't want to change user visable code too much -
> > especially to fix a bug in another function.
> 
> I hadn't noticed earlier, but I agree that what the code is doing is a
> little strange.  I get SIGBUS and SIGEGV confused when I have not looked
> at the recently.  I think SIGBUS is for mapped access that fail, and
> SIGSEGV is for everything else.  
> 
> >> > With this change, pointing the (signal) stack to a kernel address will
> >> > result in SEGV_ACCERR but pointing it to something like a PROT_NONE user
> >> > address will give SI_KERNEL (well, assuming that we manage to deliver
> >> > the SEGV somehow). I'm having a hard time seeing why that's a useful
> >> > distinction to make..
> >> >
> >> > If it's important to get this a particular way around, please can you
> >> > add some selftests?
> >> 
> >> Going down the current path I see 3 possible cases:
> >> 
> >> copy_from_user returns -EFAULT which becomes SEGV_MAPERR or SEGV_ACCERR.
> >
> > Almost always SEGV_ACCERR with the current bug, as mentioned above.
> > find_vma() searches from addr until the end of the address space, it
> > isn't just a simple lookup of the address.
> 
> Which is clearly a logic error.
> 
> In practice I don't know if anyone cares how sigreturn fails so we are
> in a grey area.
> 
> >> A signal frame parse error.  For which SI_KERNEL seems as good an error
> >> code as any.
> >> 
> >> 
> >> On x86 there is no attempt to figure out the cause of the -EFAULT, and
> >> always uses SI_KERNEL.  This is because x86 just does:
> >> "force_sig(SIGSEGV);" As arm64 did until f71016a8a8c5 ("arm64: signal:
> >> Call arm64_notify_segfault when failing to deliver signal")
> >> 
> >> 
> >> 
> >> I think the big question is what does it make sense to do here.
> >> 
> >> The big picture.  Upon return from a signal the kernel arranges
> >> for rt_sigreturn to be called to return to a pre-signal state.
> >> As such rt_sigreturn can not return an error code.
> >> 
> >> In general the kernel will write the signal frame and that will
> >> guarantee that the signal from can be processes by rt_sigreturn.
> >> 
> >> For error handling we are dealing with the case that userspace
> >> has modified the signal frame.  So that it either does not
> >> parse or that it is unmapped.
> >> 
> >> 
> >> So who cares?  The only two cases I can think of are debuggers, and
> >> programs playing fancy memory management games like garbage collections.
> >> I have heard of applications (like garbage collectors)
> >> unmapping/mprotecting memory to create a barrier.
> >> 
> >> Liam Howlett is that the issue here?  Is not seeing SI_KERNEL confusing
> >> the JVM?
> >
> > No, the issue here is that arm64_notify_segfault() has a bug which sent
> > me down a rabbit hole of issues and I'm really trying my best to help
> > out as best I can.  The bug certainly affects this function as it is
> > written today, but my patch will generate a consistent signal.
> 
> Not so much.  There are other cases where -EFAULT causing a failing
> return in that function.  So I think you have in practice made the
> matter worse.  As after this patch it becomes less clear what the
> return code is.

I do not see my error.  The rt_sigreturn() checks for unaligned access,
the frame is access_ok(), then tries to restore the sigframe/altstack.
If it's unaligned the patch would retrun SI_KERNEL, if the access is not
okay then it returns SEGV_ACCERR, if the sigframe/altstack fail to be
restored then it retuns SI_KERNEL.

> 
> >> For debuggers I expect the stack backtrace from SIGSEGV is enough to see
> >> that something is wrong.
> >> 
> >> For applications performing fancy processing of the signal frame I think
> >> that tends to be very architecture specific.  In part because even
> >> knowing the faulting address the size of the access is not known so the
> >> instruction must be interpreted.  Getting a system call instead of a
> >> load or store instruction might be enough to completely confuse
> >> applications processing SEGV_MAPERR or SEGV_ACCERR.  Such applications
> >> may also struggle with the fact that the address in siginfo is less
> >> precise than it would be for an ordinary page fault.
> >> 
> >> 
> >> So my sense is if you known you are helping userspace returning either
> >> SEGV_MAPERR or SEGV_ACCERR go for it.  Otherwise there are enough
> >> variables that returning less information when rt_sigreturn fails
> >> would be more reliable.
> >> 
> >> 
> >> Or in short what is userspace doing?  What does userspace care about?
> >
> > I think I've answered this, but it's more of trying to fix a bug which
> > causes an *almost* reliable return code to be a reliable return code.
> > Am I correct in stating that in both of these scnarios - !access_ok()
> > and badframe, it is unnecessary to search the VMAs for where the address
> > falls to know what error code to return?
> 
> You have answered you don't know what piece of userspace cares.
> 
> For access_ok failure you are correct we don't need a find_vma call
> to tell what kind of failure we have.
> 
> For anything return -EFAULT we don't know as -EFAULT has less precision
> than the instruction itself.
> 
> My sense is that you should concentrate on the userspace instruction
> emulation case from swp_handler or user_cache_maint_handler.  Because
> those instructions can run unemulated we know exactly what the semantics
> should be in those cases.
> 
> I suspect arm64_notify_segfault should be renamed arm64_notify_fault
> and generate SIGBUS or SIGSEGV as appropriate.
> 
> In fact I suspect that proper handling is to call __do_page_fault or
> handle_mm_fault as do_page_fault does and then parse the VM_FAULT code
> for which signal to generate.   Possibly factoring out a helper
> converting VM_FAULT codes to signals.
> 

This seems a bit too much for something that is in a gray area, as you
said above.

My goal was to tidy up the signal processing in these two functions and
alter the behaviour to be reliable.  If you believe that
arm64_notify_segfault() should be called anyways, then is there harm in
fixing the logic in that function and leaving rt_sigreturn() and
sigreturn() as they are? Please see my 3rd patch in the series at [2].

Thanks,
Liam

Link:
[1] https://lore.kernel.org/lkml/20210422185611.ccdf3rm4zr3xtuzl@revolver/
[2]
https://lore.kernel.org/lkml/20210420165001.3790670-3-Liam.Howlett@Oracle.com/
Eric W. Biederman April 29, 2021, 5:52 p.m. UTC | #6
This entire discussion seems to come down to what are the expected
semantics of arm64_notify_segfault.  The use of this helper in
swp_handler and user_cache_main_handler is clearly for the purposes of
instruction emulation.  With instruction emulation it is a bug if the
emulated instruction behaves differently than a real instruction in
the same circumstances.

To properly fix the instruction emulation in arm64_notify_segfault it
looks to me that the proper solution is to call __do_page_fault or
handle_mm_fault the way do_page_fault does and them parse the VM_FAULT
code for which signal to generate.

I would probably rename arm64_notify_segfault to arm64_emulate_fault, or
possibly arm64_notify_fault while fixing the emulation so that it
can return different signals and so that people don't have to guess
what the function is supposed to do.

For the specific case of sigreturn and rt_sigreturn it looks sufficient
to use the fixed arm64_notify_segfault.  As it appears the that the code
is attempting to act like it is emulating an instruction that does not
exist.


There is an argument that sigreturn and rt_sigreturn do a poor enough
job of acting like the fault was caused by an instruction, as well
as failing for other reasons it might make more sense to just have
sigreturn and rt_sigreturn call "force_sig(SIGSEGV);"  But that seems
out of scope of what you are trying to fix right now so I would not
worry about it.

Eric
Liam R. Howlett April 30, 2021, 6:48 p.m. UTC | #7
This is way out of scope for what I'm doing.  I'm trying to fix a call
to the wrong mm API.  I was trying to clean up any obvious errors in
calling functions which were exposed by fixing that error.  If you want
this fixed differently, then please go ahead and tackle the problems you
see.

Thank you,
Liam


* Eric W. Biederman <ebiederm@xmission.com> [210429 13:52]:
> 
> This entire discussion seems to come down to what are the expected
> semantics of arm64_notify_segfault.  The use of this helper in
> swp_handler and user_cache_main_handler is clearly for the purposes of
> instruction emulation.  With instruction emulation it is a bug if the
> emulated instruction behaves differently than a real instruction in
> the same circumstances.
> 
> To properly fix the instruction emulation in arm64_notify_segfault it
> looks to me that the proper solution is to call __do_page_fault or
> handle_mm_fault the way do_page_fault does and them parse the VM_FAULT
> code for which signal to generate.
> 
> I would probably rename arm64_notify_segfault to arm64_emulate_fault, or
> possibly arm64_notify_fault while fixing the emulation so that it
> can return different signals and so that people don't have to guess
> what the function is supposed to do.
> 
> For the specific case of sigreturn and rt_sigreturn it looks sufficient
> to use the fixed arm64_notify_segfault.  As it appears the that the code
> is attempting to act like it is emulating an instruction that does not
> exist.
> 
> 
> There is an argument that sigreturn and rt_sigreturn do a poor enough
> job of acting like the fault was caused by an instruction, as well
> as failing for other reasons it might make more sense to just have
> sigreturn and rt_sigreturn call "force_sig(SIGSEGV);"  But that seems
> out of scope of what you are trying to fix right now so I would not
> worry about it.
> 
> Eric
Eric W. Biederman April 30, 2021, 7:57 p.m. UTC | #8
Liam Howlett <liam.howlett@oracle.com> writes:

> This is way out of scope for what I'm doing.  I'm trying to fix a call
> to the wrong mm API.  I was trying to clean up any obvious errors in
> calling functions which were exposed by fixing that error.  If you want
> this fixed differently, then please go ahead and tackle the problems you
> see.

I was asked by the arm maintainers to describe what the code should be
doing here.  I hope I have done that.

What is very interesting is that the code in __do_page_fault does not
use find_vma_intersection it uses find_vma.  Which suggests that
find_vma_intersection may not be the proper mm api.

The logic is:

From __do_page_fault:
	struct vm_area_struct *vma = find_vma(mm, addr);

	if (unlikely(!vma))
		return VM_FAULT_BADMAP;

	/*
	 * Ok, we have a good vm_area for this memory access, so we can handle
	 * it.
	 */
	if (unlikely(vma->vm_start > addr)) {
		if (!(vma->vm_flags & VM_GROWSDOWN))
			return VM_FAULT_BADMAP;
		if (expand_stack(vma, addr))
			return VM_FAULT_BADMAP;
	}

	/*
	 * Check that the permissions on the VMA allow for the fault which
	 * occurred.
	 */
	if (!(vma->vm_flags & vm_flags))
		return VM_FAULT_BADACCESS;

From do_page_fault:

	arm64_force_sig_fault(SIGSEGV,
			      fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
			      far, inf->name);


Hmm.  If the expand_stack step is skipped. Does is the logic equivalent
to find_vma_intersection?

	static inline struct vm_area_struct *find_vma_intersection(
        	struct mm_struct * mm,
                unsigned long start_addr,
                unsigned long end_addr)
	{
		struct vm_area_struct * vma = find_vma(mm,start_addr);
	
		if (vma && end_addr <= vma->vm_start)
			vma = NULL;
		return vma;
	}

Yes. It does look that way.  VM_FAULT_BADMAP is returned when a vma
covering the specified address is not found.  And VM_FAULT_BADACCESS is
returned when there is a vma and there is a permission problem.

There are also two SIGBUS cases that arm64_notify_segfault does not
handle.

So it appears changing arm64_notify_segfault to use
find_vma_intersection instead of find_vma would be a correct but
incomplete fix.

I don't see a point in changing sigerturn or rt_sigreturn.

Eric
Liam R. Howlett April 30, 2021, 8:31 p.m. UTC | #9
* Eric W. Biederman <ebiederm@xmission.com> [210430 15:57]:
> Liam Howlett <liam.howlett@oracle.com> writes:
> 
> > This is way out of scope for what I'm doing.  I'm trying to fix a call
> > to the wrong mm API.  I was trying to clean up any obvious errors in
> > calling functions which were exposed by fixing that error.  If you want
> > this fixed differently, then please go ahead and tackle the problems you
> > see.
> 
> I was asked by the arm maintainers to describe what the code should be
> doing here.  I hope I have done that.
> 
> What is very interesting is that the code in __do_page_fault does not
> use find_vma_intersection it uses find_vma.  Which suggests that
> find_vma_intersection may not be the proper mm api.
> 
> The logic is:
> 
> From __do_page_fault:
> 	struct vm_area_struct *vma = find_vma(mm, addr);
> 
> 	if (unlikely(!vma))
> 		return VM_FAULT_BADMAP;
> 
> 	/*
> 	 * Ok, we have a good vm_area for this memory access, so we can handle
> 	 * it.
> 	 */
> 	if (unlikely(vma->vm_start > addr)) {
> 		if (!(vma->vm_flags & VM_GROWSDOWN))
> 			return VM_FAULT_BADMAP;
> 		if (expand_stack(vma, addr))
> 			return VM_FAULT_BADMAP;
> 	}
> 
> 	/*
> 	 * Check that the permissions on the VMA allow for the fault which
> 	 * occurred.
> 	 */
> 	if (!(vma->vm_flags & vm_flags))
> 		return VM_FAULT_BADACCESS;
> 
> From do_page_fault:
> 
> 	arm64_force_sig_fault(SIGSEGV,
> 			      fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
> 			      far, inf->name);
> 
> 
> Hmm.  If the expand_stack step is skipped. Does is the logic equivalent
> to find_vma_intersection?
> 
> 	static inline struct vm_area_struct *find_vma_intersection(
>         	struct mm_struct * mm,
>                 unsigned long start_addr,
>                 unsigned long end_addr)
> 	{
> 		struct vm_area_struct * vma = find_vma(mm,start_addr);
> 	
> 		if (vma && end_addr <= vma->vm_start)
> 			vma = NULL;
> 		return vma;
> 	}
> 
> Yes. It does look that way.  VM_FAULT_BADMAP is returned when a vma
> covering the specified address is not found.  And VM_FAULT_BADACCESS is
> returned when there is a vma and there is a permission problem.
> 
> There are also two SIGBUS cases that arm64_notify_segfault does not
> handle.
> 
> So it appears changing arm64_notify_segfault to use
> find_vma_intersection instead of find_vma would be a correct but
> incomplete fix.

The reason VM_GROWSDOWN is checked here is to see if the stack should be
attempted to be expanded, which happens above.  If VM_GROWSDOWN is true,
then wouldn't the do_page_fault() and __do_page_fault() calls already be
done and be handling this case?

> 
> I don't see a point in changing sigerturn or rt_sigreturn.

Both functions call arm64_notify_segfault() on !access_ok() which I've
increased the potential for returning SEGV_MAPERR.  It is also not
necessary to walk the VMAs when !access_ok(), so it seemed a decent
thing to do.

Thanks,
Liam
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index 54f32a0675df..9b76144fcba6 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -29,6 +29,8 @@  void arm64_notify_segfault(unsigned long addr);
 void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
 void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
+void arm64_notify_die(const char *str, struct pt_regs *regs, int signo,
+		      int sicode, unsigned long far, int err);
 
 /*
  * Move regs->pc to next instruction and do necessary setup before it
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 6237486ff6bb..9fde6dc760c3 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -544,7 +544,7 @@  SYSCALL_DEFINE0(rt_sigreturn)
 	frame = (struct rt_sigframe __user *)regs->sp;
 
 	if (!access_ok(frame, sizeof (*frame)))
-		goto badframe;
+		goto e_access;
 
 	if (restore_sigframe(regs, frame))
 		goto badframe;
@@ -555,7 +555,11 @@  SYSCALL_DEFINE0(rt_sigreturn)
 	return regs->regs[0];
 
 badframe:
-	arm64_notify_segfault(regs->sp);
+	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL, regs->sp, 0);
+	return 0;
+
+e_access:
+	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->sp, 0);
 	return 0;
 }
 
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index 2f507f565c48..af8b6c0eb8aa 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -248,7 +248,7 @@  COMPAT_SYSCALL_DEFINE0(sigreturn)
 	frame = (struct compat_sigframe __user *)regs->compat_sp;
 
 	if (!access_ok(frame, sizeof (*frame)))
-		goto badframe;
+		goto e_access;
 
 	if (compat_restore_sigframe(regs, frame))
 		goto badframe;
@@ -256,7 +256,12 @@  COMPAT_SYSCALL_DEFINE0(sigreturn)
 	return regs->regs[0];
 
 badframe:
-	arm64_notify_segfault(regs->compat_sp);
+	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL,
+			 regs->compat_sp, 0);
+	return 0;
+
+e_access:
+	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->compat_sp, 0);
 	return 0;
 }
 
@@ -279,7 +284,7 @@  COMPAT_SYSCALL_DEFINE0(rt_sigreturn)
 	frame = (struct compat_rt_sigframe __user *)regs->compat_sp;
 
 	if (!access_ok(frame, sizeof (*frame)))
-		goto badframe;
+		goto e_access;
 
 	if (compat_restore_sigframe(regs, &frame->sig))
 		goto badframe;
@@ -290,7 +295,12 @@  COMPAT_SYSCALL_DEFINE0(rt_sigreturn)
 	return regs->regs[0];
 
 badframe:
-	arm64_notify_segfault(regs->compat_sp);
+	arm64_notify_die("Bad frame", regs, SIGSEGV, SI_KERNEL,
+			 regs->compat_sp, 0);
+	return 0;
+
+e_access:
+	force_signal_inject(SIGSEGV, SEGV_ACCERR, regs->compat_sp, 0);
 	return 0;
 }