Message ID | 0eb601a5d1906fadd7099149eb605181911cfc04.1604523707.git.pcc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: expose FAR_EL1 tag bits in siginfo | expand |
Peter you are patching buggy code for your siginfo extension can you please ignore vas-fault.c. The code in vas-fault.c should be fixed separately. Futher it uses clear_siginfo so you should get well defined behavior even if your new field is not initialized. I have copied the powerpc folks so hopefully this buggy code can be fixed. > diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c > index 3d21fce254b7..877e7d5fb4a2 100644 > --- a/arch/powerpc/platforms/powernv/vas-fault.c > +++ b/arch/powerpc/platforms/powernv/vas-fault.c > @@ -154,6 +154,7 @@ static void update_csb(struct vas_window *window, > info.si_errno = EFAULT; > info.si_code = SEGV_MAPERR; > info.si_addr = csb_addr; > + info.si_faultflags = 0; Powerpc folks. This code was introduced in c96c4436aba4 ("powerpc/vas: Update CSB and notify process for fault CRBs") and is badly buggy. Let me count the bugs: a) Using kill_pid_info. That performs a permission check that does not make sense from a kernel thread. b) Manually filling in struct siginfo. Everyone gets it wrong and the powerpc code is no exception setting si_errno when that is something Linux as a rule does not do. Technically we have send_sig_fault to handle sending a fault from a non-sychrnous context but I am not convinced it make sense in this case. c) Sending an asynchronous SIGSEGV with the si_code set to SEGV_MAPERR. How can userspace detect it is an asynchronous signal? What can userspace do if it detects an asynchronous signal? If userspace is so buggered as to give your kernel thread a bogus address I suspect uncerimonious sending SIGKILL is probably the best you can do. There are some additional questionable things in that code like taking a task_struct reference simply to be able to test tsk->flags but no locks are held to ensure that tsk->flags are meaningful. Nor are any tests performed to see if the task being tested still uses the designated mm. I suspect exec could have been called. In which case the code needs to check the mm, or at least play with exec_id to ensure you are not improperly signaling a process after exec. None of this is to say that update_csb is fundmentally bad or hard to correct just that it has some significant defects in it's implementation right now that need to be corrected. I am hoping a detailed accounting and pointing out those defects will allow the bug to be fixed. Thank you, Eric
Peter Collingbourne <pcc@google.com> writes: > This field will contain flags that may be used by signal handlers to > determine whether other fields in the _sigfault portion of siginfo are > valid. An example use case is the following patch, which introduces > the si_addr_tag_bits{,_mask} fields. > > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow > a signal handler to require the kernel to set the field (but note > that the field will be set anyway if the kernel supports the flag, > regardless of its value). In combination with the previous patches, > this allows a userspace program to determine whether the kernel will > set the field. > > It is possible for an si_faultflags-unaware program to cause a signal > handler in an si_faultflags-aware program to be called with a provided > siginfo data structure by using one of the following syscalls: > > - ptrace(PTRACE_SETSIGINFO) > - pidfd_send_signal > - rt_sigqueueinfo > - rt_tgsigqueueinfo > > So we need to prevent the si_faultflags-unaware program from causing an > uninitialized read of si_faultflags in the si_faultflags-aware program when > it uses one of these syscalls. > > The last three cases can be handled by observing that each of these > syscalls fails if si_code >= 0. We also observe that kill(2) and > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), > so we define si_faultflags to only be valid if si_code > 0. > > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it > detects that the signal would use the _sigfault layout, and introduce > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware > program may use to opt out of this behavior. So I think while well intentioned this is misguided. gdb and the like may use this but I expect the primary user is CRIU which simply reads the signal out of one process saves it on disk and then restores the signal as read into the new process (possibly on a different machine). At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw pass through kind of operation. Eric
On Mon, 2020-11-09 at 19:54 -0600, Eric W. Biederman wrote: > Peter you are patching buggy code for your siginfo extension can > you please ignore vas-fault.c. The code in vas-fault.c should > be fixed separately. Futher it uses clear_siginfo so you should > get well defined behavior even if your new field is not initialized. > > I have copied the powerpc folks so hopefully this buggy code > can be fixed. > > > diff --git a/arch/powerpc/platforms/powernv/vas-fault.c > > b/arch/powerpc/platforms/powernv/vas-fault.c > > index 3d21fce254b7..877e7d5fb4a2 100644 > > --- a/arch/powerpc/platforms/powernv/vas-fault.c > > +++ b/arch/powerpc/platforms/powernv/vas-fault.c > > @@ -154,6 +154,7 @@ static void update_csb(struct vas_window > > *window, > > info.si_errno = EFAULT; > > info.si_code = SEGV_MAPERR; > > info.si_addr = csb_addr; > > + info.si_faultflags = 0; > Thanks Eric for your comments and pointing possible issues. Here is the NX coprocessor interaction with user space and kernel: - Process opens NX window / channel. The user space sends requests to NX without kernel involvement. This request contains data buffer and status block called coprocessor status block (CSB). - if NX sees fault on the request buffer or on CSB address, issue an interrupt to kernel - kernel updates the CSB with the fault information and then the process can reissue request. - If the fault is on CSB address and is not a valid address, sending SEGV signal so that the process assign proper CSB and reissue new request - We are not seeing the invalid CSB address in the process context, but during handling the fault later. So thought about sending SEGV signal instead of killing the process since it is not a standard segfault. - All these windows will be closed upon process exit, but waits until all pending requests are completed. So process will not exit with pending requests, means after all faults handled if any. - In the case of multithread applications, NX windows will be closed with the last thread. Means other threads can still issue requests with these windows. So to support in these applications, take PID and MM references during window open and release them later in close. > Powerpc folks. This code was introduced in c96c4436aba4 > ("powerpc/vas: > Update CSB and notify process for fault CRBs") and is badly buggy. > > Let me count the bugs: > > a) Using kill_pid_info. That performs a permission check that > does not make sense from a kernel thread. > > b) Manually filling in struct siginfo. Everyone gets it wrong > and the powerpc code is no exception setting si_errno when > that is something Linux as a rule does not do. > > Technically we have send_sig_fault to handle sending > a fault from a non-sychrnous context but I am not convinced > it make sense in this case. Yes, kill_pid_info() -> group_send_sig_info() checks permissions which is an extra step. I think send_sig_fault may be used to replace the above steps. > > c) Sending an asynchronous SIGSEGV with the si_code set to > SEGV_MAPERR. > How can userspace detect it is an asynchronous signal? What can > userspace do if it detects an asynchronous signal? If userspace > is > so buggered as to give your kernel thread a bogus address I > suspect > uncerimonious sending SIGKILL is probably the best you can do. Application can assign new CSB and send new request when it catches the signal. For example it can use csb_addr passed in si_addr amd decide whether this SEGV is due to to CSB fault. Since it is an async signal, was thinking for the application to recover instead of killing theprocess. > > There are some additional questionable things in that code like > taking a > task_struct reference simply to be able to test tsk->flags but no > locks are held to ensure that tsk->flags are meaningful. Nor are > any tests performed to see if the task being tested still uses > the designated mm. I suspect exec could have been called. tsk->flags is used to make sure not to send a signal if the task is in exiting. We access this task under get/put_task_struct(). Also kill_pid_info() sends signal if pid_task() is available. Since we are taken mm reference, it can not be freed. So the task has to be present until all NX windows are closed. > > In which case the code needs to check the mm, or at least play with > exec_id to ensure you are not improperly signaling a process after > exec. > > None of this is to say that update_csb is fundmentally bad or hard to > correct just that it has some significant defects in it's > implementation > right now that need to be corrected. I am hoping a detailed > accounting > and pointing out those defects will allow the bug to be fixed. > > Thank you, > Eric
On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: > Peter Collingbourne <pcc@google.com> writes: > > > This field will contain flags that may be used by signal handlers to > > determine whether other fields in the _sigfault portion of siginfo are > > valid. An example use case is the following patch, which introduces > > the si_addr_tag_bits{,_mask} fields. > > > > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow > > a signal handler to require the kernel to set the field (but note > > that the field will be set anyway if the kernel supports the flag, > > regardless of its value). In combination with the previous patches, > > this allows a userspace program to determine whether the kernel will > > set the field. > > > > It is possible for an si_faultflags-unaware program to cause a signal > > handler in an si_faultflags-aware program to be called with a provided > > siginfo data structure by using one of the following syscalls: > > > > - ptrace(PTRACE_SETSIGINFO) > > - pidfd_send_signal > > - rt_sigqueueinfo > > - rt_tgsigqueueinfo > > > > So we need to prevent the si_faultflags-unaware program from causing an > > uninitialized read of si_faultflags in the si_faultflags-aware program when > > it uses one of these syscalls. > > > > The last three cases can be handled by observing that each of these > > syscalls fails if si_code >= 0. We also observe that kill(2) and > > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), > > so we define si_faultflags to only be valid if si_code > 0. > > > > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so > > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it > > detects that the signal would use the _sigfault layout, and introduce > > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware > > program may use to opt out of this behavior. > > So I think while well intentioned this is misguided. > > gdb and the like may use this but I expect the primary user is CRIU > which simply reads the signal out of one process saves it on disk > and then restores the signal as read into the new process (possibly > on a different machine). > > At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw > pass through kind of operation. This is a problem, though. How can we tell the difference between a siginfo that was generated by the kernel and a siginfo that was generated (or altered) by a non-xflags aware userspace? Short of revving the whole API, I don't see a simple solution to this. Although a bit of a hack, could we include some kind of checksum in the siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could accept the whole thing; xflags included. Otherwise, we could silently drop non-self-describing extensions. If we only need to generate the checksum when PTRACE_GETSIGINFO is called then it might be feasible to use a strong hash; otherwise, this mechanism will be far from bulletproof. A hash has the advantage that we don't need any other information to validate it beyond a salt: if the hash matches, it's self- validating. We could also package other data with it to describe the presence of extensions, but relying on this for regular sigaction()/ signal delivery use feels too high-overhead. For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: userspace callers that want to write an extension field that they knowingly generated themselves should have a way to express that. Thoughts? Cheers ---Dave
Dave Martin <Dave.Martin@arm.com> writes: > On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: >> Peter Collingbourne <pcc@google.com> writes: >> >> > This field will contain flags that may be used by signal handlers to >> > determine whether other fields in the _sigfault portion of siginfo are >> > valid. An example use case is the following patch, which introduces >> > the si_addr_tag_bits{,_mask} fields. >> > >> > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow >> > a signal handler to require the kernel to set the field (but note >> > that the field will be set anyway if the kernel supports the flag, >> > regardless of its value). In combination with the previous patches, >> > this allows a userspace program to determine whether the kernel will >> > set the field. >> > >> > It is possible for an si_faultflags-unaware program to cause a signal >> > handler in an si_faultflags-aware program to be called with a provided >> > siginfo data structure by using one of the following syscalls: >> > >> > - ptrace(PTRACE_SETSIGINFO) >> > - pidfd_send_signal >> > - rt_sigqueueinfo >> > - rt_tgsigqueueinfo >> > >> > So we need to prevent the si_faultflags-unaware program from causing an >> > uninitialized read of si_faultflags in the si_faultflags-aware program when >> > it uses one of these syscalls. >> > >> > The last three cases can be handled by observing that each of these >> > syscalls fails if si_code >= 0. We also observe that kill(2) and >> > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), >> > so we define si_faultflags to only be valid if si_code > 0. >> > >> > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so >> > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it >> > detects that the signal would use the _sigfault layout, and introduce >> > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware >> > program may use to opt out of this behavior. >> >> So I think while well intentioned this is misguided. >> >> gdb and the like may use this but I expect the primary user is CRIU >> which simply reads the signal out of one process saves it on disk >> and then restores the signal as read into the new process (possibly >> on a different machine). >> >> At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw >> pass through kind of operation. > > This is a problem, though. > > How can we tell the difference between a siginfo that was generated by > the kernel and a siginfo that was generated (or altered) by a non-xflags > aware userspace? > > Short of revving the whole API, I don't see a simple solution to this. Unlike receiving a signal. We do know that userspace old and new always sends unused fields as zero into PTRACE_SETSIGINFO. The split into kernel_siginfo verifies this and fails userspace if it does something different. No problems have been reported. So in the case of xflags a non-xflags aware userspace would either pass the siginfo from through from somewhere else (such as PTRACE_GETSIGINFO), or it would simply generate a signal with all of the xflags bits clear. So everything should work regardless. > Although a bit of a hack, could we include some kind of checksum in the > siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could > accept the whole thing; xflags included. Otherwise, we could silently > drop non-self-describing extensions. > > If we only need to generate the checksum when PTRACE_GETSIGINFO is > called then it might be feasible to use a strong hash; otherwise, this > mechanism will be far from bulletproof. > > A hash has the advantage that we don't need any other information > to validate it beyond a salt: if the hash matches, it's self- > validating. We could also package other data with it to describe the > presence of extensions, but relying on this for regular sigaction()/ > signal delivery use feels too high-overhead. > > For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: > userspace callers that want to write an extension field that they > knowingly generated themselves should have a way to express that. > > Thoughts? I think there are two cases: 1) CRIU -- It is just a passthrough of PTRACE_GETSIGINFO 2) Creating a signal from nowhere -- Code that does not know about xflags would leave xflags at 0 so no problem. Does anyone see any other cases I am missing? Eric
ebiederm@xmission.com (Eric W. Biederman) writes: > Dave Martin <Dave.Martin@arm.com> writes: > >> On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: >>> Peter Collingbourne <pcc@google.com> writes: >>> >>> > This field will contain flags that may be used by signal handlers to >>> > determine whether other fields in the _sigfault portion of siginfo are >>> > valid. An example use case is the following patch, which introduces >>> > the si_addr_tag_bits{,_mask} fields. >>> > >>> > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow >>> > a signal handler to require the kernel to set the field (but note >>> > that the field will be set anyway if the kernel supports the flag, >>> > regardless of its value). In combination with the previous patches, >>> > this allows a userspace program to determine whether the kernel will >>> > set the field. >>> > >>> > It is possible for an si_faultflags-unaware program to cause a signal >>> > handler in an si_faultflags-aware program to be called with a provided >>> > siginfo data structure by using one of the following syscalls: >>> > >>> > - ptrace(PTRACE_SETSIGINFO) >>> > - pidfd_send_signal >>> > - rt_sigqueueinfo >>> > - rt_tgsigqueueinfo >>> > >>> > So we need to prevent the si_faultflags-unaware program from causing an >>> > uninitialized read of si_faultflags in the si_faultflags-aware program when >>> > it uses one of these syscalls. >>> > >>> > The last three cases can be handled by observing that each of these >>> > syscalls fails if si_code >= 0. We also observe that kill(2) and >>> > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), >>> > so we define si_faultflags to only be valid if si_code > 0. >>> > >>> > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so >>> > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it >>> > detects that the signal would use the _sigfault layout, and introduce >>> > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware >>> > program may use to opt out of this behavior. >>> >>> So I think while well intentioned this is misguided. >>> >>> gdb and the like may use this but I expect the primary user is CRIU >>> which simply reads the signal out of one process saves it on disk >>> and then restores the signal as read into the new process (possibly >>> on a different machine). >>> >>> At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw >>> pass through kind of operation. >> >> This is a problem, though. >> >> How can we tell the difference between a siginfo that was generated by >> the kernel and a siginfo that was generated (or altered) by a non-xflags >> aware userspace? >> >> Short of revving the whole API, I don't see a simple solution to this. > > Unlike receiving a signal. We do know that userspace old and new > always sends unused fields as zero into PTRACE_SETSIGINFO. > > The split into kernel_siginfo verifies this and fails userspace if it > does something different. No problems have been reported. > > So in the case of xflags a non-xflags aware userspace would either pass > the siginfo from through from somewhere else (such as > PTRACE_GETSIGINFO), or it would simply generate a signal with all of > the xflags bits clear. So everything should work regardless. > >> Although a bit of a hack, could we include some kind of checksum in the >> siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could >> accept the whole thing; xflags included. Otherwise, we could silently >> drop non-self-describing extensions. >> >> If we only need to generate the checksum when PTRACE_GETSIGINFO is >> called then it might be feasible to use a strong hash; otherwise, this >> mechanism will be far from bulletproof. >> >> A hash has the advantage that we don't need any other information >> to validate it beyond a salt: if the hash matches, it's self- >> validating. We could also package other data with it to describe the >> presence of extensions, but relying on this for regular sigaction()/ >> signal delivery use feels too high-overhead. >> >> For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: >> userspace callers that want to write an extension field that they >> knowingly generated themselves should have a way to express that. >> >> Thoughts? > > I think there are two cases: > 1) CRIU -- It is just a passthrough of PTRACE_GETSIGINFO > 2) Creating a signal from nowhere -- Code that does not know about > xflags would leave xflags at 0 so no problem. > > Does anyone see any other cases I am missing? > Zoinks. I forgot to read and double check the code I wrote. copy_siginfo_from_user only verifies against 0 when we don't know the layout. So I don't know if we can count on userspace providing the extra data as 0 or not. So if we do indeed continue to need xflags we might care. It is currently an undefined non-sense case to provide non-zero fields there. So I think it is reasonable to expect even debuggers generating signals to set those fields to know values such as 0. Further I expect it is rare for debuggers to generate pretend faults. So I would say perform whatever testing we can, so that there are no obvious problem users of PTRACE_SETSIGINFO and then to simply not worry about the PTRACE_SETSIGINFO unless someone reports a regression. But fist let's see if we really need xflags at all. Eric
Haren Myneni <haren@linux.ibm.com> writes: > On Mon, 2020-11-09 at 19:54 -0600, Eric W. Biederman wrote: >> Peter you are patching buggy code for your siginfo extension can >> you please ignore vas-fault.c. The code in vas-fault.c should >> be fixed separately. Futher it uses clear_siginfo so you should >> get well defined behavior even if your new field is not initialized. >> >> I have copied the powerpc folks so hopefully this buggy code >> can be fixed. >> >> > diff --git a/arch/powerpc/platforms/powernv/vas-fault.c >> > b/arch/powerpc/platforms/powernv/vas-fault.c >> > index 3d21fce254b7..877e7d5fb4a2 100644 >> > --- a/arch/powerpc/platforms/powernv/vas-fault.c >> > +++ b/arch/powerpc/platforms/powernv/vas-fault.c >> > @@ -154,6 +154,7 @@ static void update_csb(struct vas_window >> > *window, >> > info.si_errno = EFAULT; >> > info.si_code = SEGV_MAPERR; >> > info.si_addr = csb_addr; >> > + info.si_faultflags = 0; >> > > Thanks Eric for your comments and pointing possible issues. > > Here is the NX coprocessor interaction with user space and kernel: > > - Process opens NX window / channel. The user space sends requests to > NX without kernel involvement. This request contains data buffer and > status block called coprocessor status block (CSB). > - if NX sees fault on the request buffer or on CSB address, issue an > interrupt to kernel > - kernel updates the CSB with the fault information and then the > process can reissue request. > - If the fault is on CSB address and is not a valid address, sending > SEGV signal so that the process assign proper CSB and reissue new > request > - We are not seeing the invalid CSB address in the process context, but > during handling the fault later. So thought about sending SEGV signal > instead of killing the process since it is not a standard segfault. > - All these windows will be closed upon process exit, but waits > until all pending requests are completed. So process will not exit with > pending requests, means after all faults handled if any. I see a lot that doesn't seem to match that description. Consider forking a child and having that child exit, but still have the file descriptor open. The release method will be called. If you consider file descriptor passing there are much more interesting cases involved. > - In the case of multithread applications, NX windows will be closed > with the last thread. Means other threads can still issue requests with > these windows. So to support in these applications, take PID and MM > references during window open and release them later in close. > >> Powerpc folks. This code was introduced in c96c4436aba4 >> ("powerpc/vas: >> Update CSB and notify process for fault CRBs") and is badly buggy. >> >> Let me count the bugs: >> >> a) Using kill_pid_info. That performs a permission check that >> does not make sense from a kernel thread. >> >> b) Manually filling in struct siginfo. Everyone gets it wrong >> and the powerpc code is no exception setting si_errno when >> that is something Linux as a rule does not do. >> >> Technically we have send_sig_fault to handle sending >> a fault from a non-sychrnous context but I am not convinced >> it make sense in this case. > > Yes, kill_pid_info() -> group_send_sig_info() checks permissions which > is an extra step. I think send_sig_fault may be used to replace the > above steps. Make that change at a minimum, please. >> c) Sending an asynchronous SIGSEGV with the si_code set to >> SEGV_MAPERR. >> How can userspace detect it is an asynchronous signal? What can >> userspace do if it detects an asynchronous signal? If userspace >> is >> so buggered as to give your kernel thread a bogus address I >> suspect >> uncerimonious sending SIGKILL is probably the best you can do. > > Application can assign new CSB and send new request when it catches the > signal. For example it can use csb_addr passed in si_addr amd decide > whether this SEGV is due to to CSB fault. Since it is an async signal, > was thinking for the application to recover instead of killing theprocess. An async signal the process can recover from is fine (caveats about exec, exit, and file descripotr passing aside). You want something to mark this as an asynchronous event so that userspace can tell that signal from other signals. If only a single coprocessor is possible it could be as simple as using a special si_code for this case. >> There are some additional questionable things in that code like >> taking a >> task_struct reference simply to be able to test tsk->flags but no >> locks are held to ensure that tsk->flags are meaningful. Nor are >> any tests performed to see if the task being tested still uses >> the designated mm. I suspect exec could have been called. > > tsk->flags is used to make sure not to send a signal if the task is in > exiting. We access this task under get/put_task_struct(). The thing ins get/put_task_struct is about a reference count. It is not a lock. So the task can still set PF_EXITING after you have tested it and before you call copy_to_user. So testing PF_EXITING gets you practically nothing. Further the signal delivery code can already cop with attempting to send a signal to a task that is in PF_EXITING so I don't see the point of trying to avoid that situation. > Also > kill_pid_info() sends signal if pid_task() is available. Since we are > taken mm reference, it can not be freed. But task->mm can be different if the task has called exec. > So the task has to be present until all NX windows are closed. I don't see how that follows from anything the code does or you have said. The lifetime of an mm and the lifetime of a pid are independent of the lifetime of a task. >> In which case the code needs to check the mm, or at least play with >> exec_id to ensure you are not improperly signaling a process after >> exec. >> >> None of this is to say that update_csb is fundmentally bad or hard to >> correct just that it has some significant defects in it's >> implementation >> right now that need to be corrected. I am hoping a detailed >> accounting >> and pointing out those defects will allow the bug to be fixed. >> >> Thank you, >> Eric Eric
On Wed, Nov 11, 2020 at 02:28:38PM -0600, Eric W. Biederman wrote: > ebiederm@xmission.com (Eric W. Biederman) writes: > > > Dave Martin <Dave.Martin@arm.com> writes: > > > >> On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: > >>> Peter Collingbourne <pcc@google.com> writes: > >>> > >>> > This field will contain flags that may be used by signal handlers to > >>> > determine whether other fields in the _sigfault portion of siginfo are > >>> > valid. An example use case is the following patch, which introduces > >>> > the si_addr_tag_bits{,_mask} fields. > >>> > > >>> > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow > >>> > a signal handler to require the kernel to set the field (but note > >>> > that the field will be set anyway if the kernel supports the flag, > >>> > regardless of its value). In combination with the previous patches, > >>> > this allows a userspace program to determine whether the kernel will > >>> > set the field. > >>> > > >>> > It is possible for an si_faultflags-unaware program to cause a signal > >>> > handler in an si_faultflags-aware program to be called with a provided > >>> > siginfo data structure by using one of the following syscalls: > >>> > > >>> > - ptrace(PTRACE_SETSIGINFO) > >>> > - pidfd_send_signal > >>> > - rt_sigqueueinfo > >>> > - rt_tgsigqueueinfo > >>> > > >>> > So we need to prevent the si_faultflags-unaware program from causing an > >>> > uninitialized read of si_faultflags in the si_faultflags-aware program when > >>> > it uses one of these syscalls. > >>> > > >>> > The last three cases can be handled by observing that each of these > >>> > syscalls fails if si_code >= 0. We also observe that kill(2) and > >>> > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), > >>> > so we define si_faultflags to only be valid if si_code > 0. > >>> > > >>> > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so > >>> > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it > >>> > detects that the signal would use the _sigfault layout, and introduce > >>> > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware > >>> > program may use to opt out of this behavior. > >>> > >>> So I think while well intentioned this is misguided. > >>> > >>> gdb and the like may use this but I expect the primary user is CRIU > >>> which simply reads the signal out of one process saves it on disk > >>> and then restores the signal as read into the new process (possibly > >>> on a different machine). > >>> > >>> At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw > >>> pass through kind of operation. > >> > >> This is a problem, though. > >> > >> How can we tell the difference between a siginfo that was generated by > >> the kernel and a siginfo that was generated (or altered) by a non-xflags > >> aware userspace? > >> > >> Short of revving the whole API, I don't see a simple solution to this. > > > > Unlike receiving a signal. We do know that userspace old and new > > always sends unused fields as zero into PTRACE_SETSIGINFO. > > > > The split into kernel_siginfo verifies this and fails userspace if it > > does something different. No problems have been reported. > > > > So in the case of xflags a non-xflags aware userspace would either pass > > the siginfo from through from somewhere else (such as > > PTRACE_GETSIGINFO), or it would simply generate a signal with all of > > the xflags bits clear. So everything should work regardless. > > > >> Although a bit of a hack, could we include some kind of checksum in the > >> siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could > >> accept the whole thing; xflags included. Otherwise, we could silently > >> drop non-self-describing extensions. > >> > >> If we only need to generate the checksum when PTRACE_GETSIGINFO is > >> called then it might be feasible to use a strong hash; otherwise, this > >> mechanism will be far from bulletproof. > >> > >> A hash has the advantage that we don't need any other information > >> to validate it beyond a salt: if the hash matches, it's self- > >> validating. We could also package other data with it to describe the > >> presence of extensions, but relying on this for regular sigaction()/ > >> signal delivery use feels too high-overhead. > >> > >> For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: > >> userspace callers that want to write an extension field that they > >> knowingly generated themselves should have a way to express that. > >> > >> Thoughts? > > > > I think there are two cases: > > 1) CRIU -- It is just a passthrough of PTRACE_GETSIGINFO > > 2) Creating a signal from nowhere -- Code that does not know about > > xflags would leave xflags at 0 so no problem. > > > > Does anyone see any other cases I am missing? > > > > Zoinks. I forgot to read and double check the code I wrote. > > copy_siginfo_from_user only verifies against 0 when we don't know the > layout. So I don't know if we can count on userspace providing the > extra data as 0 or not. > > So if we do indeed continue to need xflags we might care. > > It is currently an undefined non-sense case to provide non-zero fields > there. So I think it is reasonable to expect even debuggers generating > signals to set those fields to know values such as 0. You may well be right that the only time extra fields coming from userspace should be deliberately nonzero is in the passthrough case. I'm still concerned about padding or type-punning issues causing unallocated fields to be initialised with junk, even in the presence of a memset(0). I haven't actually seen this happen, but the language standards seem to allow it and compilers may get more aggressive in the future. This is why I think we still want an explicit way for userspace to indicate what extension fields initialised. Am I missing something that eliminates the danger today? > Further I expect it is rare for debuggers to generate pretend faults. That seems a reasonable assumption. Hand-rolling a fault siginfo seems a pretty fragile thing to do, if it works reliably at all. > So I would say perform whatever testing we can, so that there are no > obvious problem users of PTRACE_SETSIGINFO and then to simply not worry > about the PTRACE_SETSIGINFO unless someone reports a regression. > > But fist let's see if we really need xflags at all. > > Eric Ack Cheers ---Dave
On Wed, Nov 11, 2020 at 02:15:15PM -0600, Eric W. Biederman wrote: > Dave Martin <Dave.Martin@arm.com> writes: > > > On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: > >> Peter Collingbourne <pcc@google.com> writes: > >> > >> > This field will contain flags that may be used by signal handlers to > >> > determine whether other fields in the _sigfault portion of siginfo are > >> > valid. An example use case is the following patch, which introduces > >> > the si_addr_tag_bits{,_mask} fields. > >> > > >> > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow > >> > a signal handler to require the kernel to set the field (but note > >> > that the field will be set anyway if the kernel supports the flag, > >> > regardless of its value). In combination with the previous patches, > >> > this allows a userspace program to determine whether the kernel will > >> > set the field. > >> > > >> > It is possible for an si_faultflags-unaware program to cause a signal > >> > handler in an si_faultflags-aware program to be called with a provided > >> > siginfo data structure by using one of the following syscalls: > >> > > >> > - ptrace(PTRACE_SETSIGINFO) > >> > - pidfd_send_signal > >> > - rt_sigqueueinfo > >> > - rt_tgsigqueueinfo > >> > > >> > So we need to prevent the si_faultflags-unaware program from causing an > >> > uninitialized read of si_faultflags in the si_faultflags-aware program when > >> > it uses one of these syscalls. > >> > > >> > The last three cases can be handled by observing that each of these > >> > syscalls fails if si_code >= 0. We also observe that kill(2) and > >> > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), > >> > so we define si_faultflags to only be valid if si_code > 0. > >> > > >> > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so > >> > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it > >> > detects that the signal would use the _sigfault layout, and introduce > >> > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware > >> > program may use to opt out of this behavior. > >> > >> So I think while well intentioned this is misguided. > >> > >> gdb and the like may use this but I expect the primary user is CRIU > >> which simply reads the signal out of one process saves it on disk > >> and then restores the signal as read into the new process (possibly > >> on a different machine). > >> > >> At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw > >> pass through kind of operation. > > > > This is a problem, though. > > > > How can we tell the difference between a siginfo that was generated by > > the kernel and a siginfo that was generated (or altered) by a non-xflags > > aware userspace? > > > > Short of revving the whole API, I don't see a simple solution to this. > > Unlike receiving a signal. We do know that userspace old and new > always sends unused fields as zero into PTRACE_SETSIGINFO. > > The split into kernel_siginfo verifies this and fails userspace if it > does something different. No problems have been reported. > > So in the case of xflags a non-xflags aware userspace would either pass > the siginfo from through from somewhere else (such as > PTRACE_GETSIGINFO), or it would simply generate a signal with all of > the xflags bits clear. So everything should work regardless. > > > Although a bit of a hack, could we include some kind of checksum in the > > siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could > > accept the whole thing; xflags included. Otherwise, we could silently > > drop non-self-describing extensions. > > > > If we only need to generate the checksum when PTRACE_GETSIGINFO is > > called then it might be feasible to use a strong hash; otherwise, this > > mechanism will be far from bulletproof. > > > > A hash has the advantage that we don't need any other information > > to validate it beyond a salt: if the hash matches, it's self- > > validating. We could also package other data with it to describe the > > presence of extensions, but relying on this for regular sigaction()/ > > signal delivery use feels too high-overhead. > > > > For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: > > userspace callers that want to write an extension field that they > > knowingly generated themselves should have a way to express that. > > > > Thoughts? Eric, did you have any view on the hash idea here? Cheers ---Dave
Dave Martin <Dave.Martin@arm.com> writes: > On Wed, Nov 11, 2020 at 02:15:15PM -0600, Eric W. Biederman wrote: >> Dave Martin <Dave.Martin@arm.com> writes: >> >> > On Mon, Nov 09, 2020 at 07:57:33PM -0600, Eric W. Biederman wrote: >> >> Peter Collingbourne <pcc@google.com> writes: >> >> >> >> > This field will contain flags that may be used by signal handlers to >> >> > determine whether other fields in the _sigfault portion of siginfo are >> >> > valid. An example use case is the following patch, which introduces >> >> > the si_addr_tag_bits{,_mask} fields. >> >> > >> >> > A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow >> >> > a signal handler to require the kernel to set the field (but note >> >> > that the field will be set anyway if the kernel supports the flag, >> >> > regardless of its value). In combination with the previous patches, >> >> > this allows a userspace program to determine whether the kernel will >> >> > set the field. >> >> > >> >> > It is possible for an si_faultflags-unaware program to cause a signal >> >> > handler in an si_faultflags-aware program to be called with a provided >> >> > siginfo data structure by using one of the following syscalls: >> >> > >> >> > - ptrace(PTRACE_SETSIGINFO) >> >> > - pidfd_send_signal >> >> > - rt_sigqueueinfo >> >> > - rt_tgsigqueueinfo >> >> > >> >> > So we need to prevent the si_faultflags-unaware program from causing an >> >> > uninitialized read of si_faultflags in the si_faultflags-aware program when >> >> > it uses one of these syscalls. >> >> > >> >> > The last three cases can be handled by observing that each of these >> >> > syscalls fails if si_code >= 0. We also observe that kill(2) and >> >> > tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), >> >> > so we define si_faultflags to only be valid if si_code > 0. >> >> > >> >> > There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so >> >> > we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it >> >> > detects that the signal would use the _sigfault layout, and introduce >> >> > a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware >> >> > program may use to opt out of this behavior. >> >> >> >> So I think while well intentioned this is misguided. >> >> >> >> gdb and the like may use this but I expect the primary user is CRIU >> >> which simply reads the signal out of one process saves it on disk >> >> and then restores the signal as read into the new process (possibly >> >> on a different machine). >> >> >> >> At least for the CRIU usage PTRACE_SETSIGINFO need to remain a raw >> >> pass through kind of operation. >> > >> > This is a problem, though. >> > >> > How can we tell the difference between a siginfo that was generated by >> > the kernel and a siginfo that was generated (or altered) by a non-xflags >> > aware userspace? >> > >> > Short of revving the whole API, I don't see a simple solution to this. >> >> Unlike receiving a signal. We do know that userspace old and new >> always sends unused fields as zero into PTRACE_SETSIGINFO. >> >> The split into kernel_siginfo verifies this and fails userspace if it >> does something different. No problems have been reported. >> >> So in the case of xflags a non-xflags aware userspace would either pass >> the siginfo from through from somewhere else (such as >> PTRACE_GETSIGINFO), or it would simply generate a signal with all of >> the xflags bits clear. So everything should work regardless. >> >> > Although a bit of a hack, could we include some kind of checksum in the >> > siginfo? If the checksum matches during PTRACE_SETSIGINFO, we could >> > accept the whole thing; xflags included. Otherwise, we could silently >> > drop non-self-describing extensions. >> > >> > If we only need to generate the checksum when PTRACE_GETSIGINFO is >> > called then it might be feasible to use a strong hash; otherwise, this >> > mechanism will be far from bulletproof. >> > >> > A hash has the advantage that we don't need any other information >> > to validate it beyond a salt: if the hash matches, it's self- >> > validating. We could also package other data with it to describe the >> > presence of extensions, but relying on this for regular sigaction()/ >> > signal delivery use feels too high-overhead. >> > >> > For debuggers, I suspect that PTRACE_SETSIGINFO2 is still useful: >> > userspace callers that want to write an extension field that they >> > knowingly generated themselves should have a way to express that. >> > >> > Thoughts? > > Eric, did you have any view on the hash idea here? I am not quite certain what you meant by salt. A per kernel instance secret I suspect. Such a secret would break creating siginfo by hand and checkpointing and restoring on a different machine. If you don't go with full crypto security it sounds like it would work. If we really need to deploy xflags I think it bears looking at, but right now it feels like one thing too many. Eric
diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c index 3d21fce254b7..877e7d5fb4a2 100644 --- a/arch/powerpc/platforms/powernv/vas-fault.c +++ b/arch/powerpc/platforms/powernv/vas-fault.c @@ -154,6 +154,7 @@ static void update_csb(struct vas_window *window, info.si_errno = EFAULT; info.si_code = SEGV_MAPERR; info.si_addr = csb_addr; + info.si_faultflags = 0; /* * process will be polling on csb.flags after request is sent to diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c index ddfd919be46c..222ff6178571 100644 --- a/arch/x86/kernel/signal_compat.c +++ b/arch/x86/kernel/signal_compat.c @@ -121,8 +121,8 @@ static inline void signal_compat_build_tests(void) #endif CHECK_CSI_OFFSET(_sigfault); - CHECK_CSI_SIZE (_sigfault, 4*sizeof(int)); - CHECK_SI_SIZE (_sigfault, 8*sizeof(int)); + CHECK_CSI_SIZE (_sigfault, 8*sizeof(int)); + CHECK_SI_SIZE (_sigfault, 16*sizeof(int)); BUILD_BUG_ON(offsetof(siginfo_t, si_addr) != 0x10); BUILD_BUG_ON(offsetof(compat_siginfo_t, si_addr) != 0x0C); @@ -138,6 +138,9 @@ static inline void signal_compat_build_tests(void) BUILD_BUG_ON(offsetof(siginfo_t, si_pkey) != 0x20); BUILD_BUG_ON(offsetof(compat_siginfo_t, si_pkey) != 0x14); + BUILD_BUG_ON(offsetof(siginfo_t, si_faultflags) != 0x48); + BUILD_BUG_ON(offsetof(compat_siginfo_t, si_faultflags) != 0x28); + CHECK_CSI_OFFSET(_sigpoll); CHECK_CSI_SIZE (_sigpoll, 2*sizeof(int)); CHECK_SI_SIZE (_sigpoll, 4*sizeof(int)); diff --git a/include/linux/compat.h b/include/linux/compat.h index 14d514233e1d..84d3b72be701 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -236,7 +236,9 @@ typedef struct compat_siginfo { char _dummy_pkey[__COMPAT_ADDR_BND_PKEY_PAD]; u32 _pkey; } _addr_pkey; + compat_uptr_t _pad[6]; }; + u32 _faultflags; } _sigfault; /* SIGPOLL */ diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h index a7887ad84d36..7501209eae33 100644 --- a/include/linux/signal_types.h +++ b/include/linux/signal_types.h @@ -78,6 +78,6 @@ struct ksignal { #define UAPI_SA_FLAGS \ (SA_NOCLDSTOP | SA_NOCLDWAIT | SA_SIGINFO | SA_ONSTACK | SA_RESTART | \ - SA_NODEFER | SA_RESETHAND | __ARCH_UAPI_SA_FLAGS) + SA_NODEFER | SA_RESETHAND | SA_FAULTFLAGS | __ARCH_UAPI_SA_FLAGS) #endif /* _LINUX_SIGNAL_TYPES_H */ diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index 7aacf9389010..10a55aed9ede 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -91,7 +91,12 @@ union __sifields { char _dummy_pkey[__ADDR_BND_PKEY_PAD]; __u32 _pkey; } _addr_pkey; + void *_pad[6]; }; + __u32 _faultflags; +#ifdef __LP64__ + __u32 _pad2; /* to be used if we add another 32-bit field */ +#endif } _sigfault; /* SIGPOLL */ @@ -152,6 +157,8 @@ typedef struct siginfo { #define si_trapno _sifields._sigfault._trapno #endif #define si_addr_lsb _sifields._sigfault._addr_lsb +/* si_faultflags is only valid if 0 < si_code < SI_KERNEL */ +#define si_faultflags _sifields._sigfault._faultflags #define si_lower _sifields._sigfault._addr_bnd._lower #define si_upper _sifields._sigfault._addr_bnd._upper #define si_pkey _sifields._sigfault._addr_pkey._pkey diff --git a/include/uapi/asm-generic/signal-defs.h b/include/uapi/asm-generic/signal-defs.h index 0126ebda4d31..e27bf959d4c4 100644 --- a/include/uapi/asm-generic/signal-defs.h +++ b/include/uapi/asm-generic/signal-defs.h @@ -20,6 +20,9 @@ * so this bit allows flag bit support to be detected from userspace while * allowing an old kernel to be distinguished from a kernel that supports every * flag bit. + * SA_FAULTFLAGS indicates that the signal handler requires the siginfo.si_faultflags + * field to be valid. Note that if the kernel supports SA_FAULTFLAGS, the field will + * be valid regardless of the value of this flag. * * SA_ONESHOT and SA_NOMASK are the historical Linux names for the Single * Unix names RESETHAND and NODEFER respectively. @@ -49,6 +52,7 @@ #define SA_RESETHAND 0x80000000 #endif #define SA_UNSUPPORTED 0x00000400 +#define SA_FAULTFLAGS 0x00000800 #define SA_NOMASK SA_NODEFER #define SA_ONESHOT SA_RESETHAND diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h index a71b6e3b03eb..722a2c8a4d3d 100644 --- a/include/uapi/linux/ptrace.h +++ b/include/uapi/linux/ptrace.h @@ -101,6 +101,18 @@ struct ptrace_syscall_info { }; }; +#define PTRACE_SETSIGINFO2 0x420f +/* + * These flags are passed as the addr argument to ptrace. + */ + +/* + * Asserts that the caller is aware of the field siginfo.si_faultflags. Prevents + * the kernel from automatically setting the field to 0 when the signal uses + * a sigfault layout. + */ +#define PTRACE_SIGINFO_FAULTFLAGS 0x1 + /* * These values are stored in task->ptrace_message * by tracehook_report_syscall_* to describe the current syscall-stop. diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 43d6179508d6..ab0618b4602c 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -687,18 +687,31 @@ static int ptrace_getsiginfo(struct task_struct *child, kernel_siginfo_t *info) return error; } -static int ptrace_setsiginfo(struct task_struct *child, const kernel_siginfo_t *info) +static int ptrace_setsiginfo(struct task_struct *child, unsigned long flags, + kernel_siginfo_t *info) { - unsigned long flags; + unsigned long lock_flags; int error = -ESRCH; - if (lock_task_sighand(child, &flags)) { + if (flags & ~PTRACE_SIGINFO_FAULTFLAGS) + return -EINVAL; + + /* + * If the caller is unaware of si_faultflags and we're using a layout that + * requires it, set it to 0 which means "no fields are available". + */ + if (!(flags & PTRACE_SIGINFO_FAULTFLAGS) && + siginfo_layout_is_fault( + siginfo_layout(info->si_signo, info->si_code))) + info->si_faultflags = 0; + + if (lock_task_sighand(child, &lock_flags)) { error = -EINVAL; if (likely(child->last_siginfo != NULL)) { copy_siginfo(child->last_siginfo, info); error = 0; } - unlock_task_sighand(child, &flags); + unlock_task_sighand(child, &lock_flags); } return error; } @@ -1038,9 +1051,13 @@ int ptrace_request(struct task_struct *child, long request, break; case PTRACE_SETSIGINFO: + addr = 0; + fallthrough; + + case PTRACE_SETSIGINFO2: ret = copy_siginfo_from_user(&siginfo, datavp); if (!ret) - ret = ptrace_setsiginfo(child, &siginfo); + ret = ptrace_setsiginfo(child, addr, &siginfo); break; case PTRACE_GETSIGMASK: { @@ -1347,10 +1364,14 @@ int compat_ptrace_request(struct task_struct *child, compat_long_t request, break; case PTRACE_SETSIGINFO: + addr = 0; + fallthrough; + + case PTRACE_SETSIGINFO2: ret = copy_siginfo_from_user32( &siginfo, (struct compat_siginfo __user *) datap); if (!ret) - ret = ptrace_setsiginfo(child, &siginfo); + ret = ptrace_setsiginfo(child, addr, &siginfo); break; #ifdef CONFIG_HAVE_ARCH_TRACEHOOK case PTRACE_GETREGSET: diff --git a/kernel/signal.c b/kernel/signal.c index d18930aafbf4..1fd1f0d12174 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1657,6 +1657,7 @@ static void set_sigfault_common_fields(struct kernel_siginfo *info, int sig, info->si_errno = 0; info->si_code = code; info->si_addr = addr; + info->si_faultflags = 0; } int force_sig_fault_to_task(int sig, int code, void __user *addr @@ -3270,6 +3271,7 @@ void copy_siginfo_to_external32(struct compat_siginfo *to, #ifdef __ARCH_SI_TRAPNO to->si_trapno = from->si_trapno; #endif + to->si_faultflags = from->si_faultflags; } switch (layout) { @@ -3345,6 +3347,7 @@ static int post_copy_siginfo_from_user32(kernel_siginfo_t *to, #ifdef __ARCH_SI_TRAPNO to->si_trapno = from->si_trapno; #endif + to->si_faultflags = from->si_faultflags; } switch (layout) {
This field will contain flags that may be used by signal handlers to determine whether other fields in the _sigfault portion of siginfo are valid. An example use case is the following patch, which introduces the si_addr_tag_bits{,_mask} fields. A new sigcontext flag, SA_FAULTFLAGS, is introduced in order to allow a signal handler to require the kernel to set the field (but note that the field will be set anyway if the kernel supports the flag, regardless of its value). In combination with the previous patches, this allows a userspace program to determine whether the kernel will set the field. It is possible for an si_faultflags-unaware program to cause a signal handler in an si_faultflags-aware program to be called with a provided siginfo data structure by using one of the following syscalls: - ptrace(PTRACE_SETSIGINFO) - pidfd_send_signal - rt_sigqueueinfo - rt_tgsigqueueinfo So we need to prevent the si_faultflags-unaware program from causing an uninitialized read of si_faultflags in the si_faultflags-aware program when it uses one of these syscalls. The last three cases can be handled by observing that each of these syscalls fails if si_code >= 0. We also observe that kill(2) and tgkill(2) may be used to send a signal where si_code == 0 (SI_USER), so we define si_faultflags to only be valid if si_code > 0. There is no such check on si_code in ptrace(PTRACE_SETSIGINFO), so we make ptrace(PTRACE_SETSIGINFO) clear the si_faultflags field if it detects that the signal would use the _sigfault layout, and introduce a new ptrace request type, PTRACE_SETSIGINFO2, that a si_faultflags-aware program may use to opt out of this behavior. It is also possible for the kernel to inject a signal specified to use _sigfault by calling force_sig (e.g. there are numerous calls to force_sig(SIGSEGV)). In this case si_code is set to SI_KERNEL and the _kill union member is used, so document that si_code must be < SI_KERNEL. Ideally this field could have just been named si_flags, but that name was already taken by ia64, so a different name was chosen. I considered making ia64's si_flags a generic field and having it appear at the end of _sigfault (in the same place as this patch has si_faultflags) on non-ia64, keeping it in the same place on ia64. ia64's si_flags is a 32-bit field with only one flag bit allocated, so we would have 31 bits to use if we do this. However, it seems simplest to avoid entangling these fields. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ide155ce29366c3eab2a944ae4c51205982e5b8b2 --- v14: - make the padding explicit so we can easily use it later v13: - renamed si_xflags to si_faultflags - use fallthrough macros in kernel/ptrace.c - fixed a style warning pointed out by checkpatch.pl v12: - Change type of si_xflags to u32 to avoid increasing alignment - Add si_xflags to signal_compat.c test cases v11: - update comment to say that si_code must > 0 - change ptrace(PTRACE_SETSIGINFO2) to take a flags argument v10: - make the new field compatible with the various ways that a siginfo can be injected from another process - eliminate some duplication by adding a refactoring patch before this one arch/powerpc/platforms/powernv/vas-fault.c | 1 + arch/x86/kernel/signal_compat.c | 7 +++-- include/linux/compat.h | 2 ++ include/linux/signal_types.h | 2 +- include/uapi/asm-generic/siginfo.h | 7 +++++ include/uapi/asm-generic/signal-defs.h | 4 +++ include/uapi/linux/ptrace.h | 12 ++++++++ kernel/ptrace.c | 33 ++++++++++++++++++---- kernel/signal.c | 3 ++ 9 files changed, 62 insertions(+), 9 deletions(-)