Message ID | 20210407150940.542103-1-Liam.Howlett@Oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arch/arm64/kernel/traps: Use find_vma_intersection() in traps for setting si_code | expand |
On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote: > find_vma() will continue to search upwards until the end of the virtual > memory space. This means the si_code would almost never be set to > SEGV_MAPERR even when the address falls outside of any VMA. The result > is that the si_code is not reliable as it may or may not be set to the > correct result, depending on where the address falls in the address > space. > > Using find_vma_intersection() allows for what is intended by only > returning a VMA if it falls within the range provided, in this case a > window of 1. > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > --- > arch/arm64/kernel/traps.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > index a05d34f0e82a..a44007904a64 100644 > --- a/arch/arm64/kernel/traps.c > +++ b/arch/arm64/kernel/traps.c > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i > void arm64_notify_segfault(unsigned long addr) > { > int code; > + unsigned long ut_addr = untagged_addr(addr); > > mmap_read_lock(current->mm); > - if (find_vma(current->mm, untagged_addr(addr)) == NULL) > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL) > code = SEGV_MAPERR; > else > code = SEGV_ACCERR; I don't think your change is entirely correct either. We can have a fault below the vma of a stack (with VM_GROWSDOWN) and find_vma_intersection() would return NULL but it should be a SEGV_ACCERR instead. Maybe this should employ similar checks as __do_page_fault() (with expand_stack() and VM_GROWSDOWN).
* Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]: > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote: > > find_vma() will continue to search upwards until the end of the virtual > > memory space. This means the si_code would almost never be set to > > SEGV_MAPERR even when the address falls outside of any VMA. The result > > is that the si_code is not reliable as it may or may not be set to the > > correct result, depending on where the address falls in the address > > space. > > > > Using find_vma_intersection() allows for what is intended by only > > returning a VMA if it falls within the range provided, in this case a > > window of 1. > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > --- > > arch/arm64/kernel/traps.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > > index a05d34f0e82a..a44007904a64 100644 > > --- a/arch/arm64/kernel/traps.c > > +++ b/arch/arm64/kernel/traps.c > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i > > void arm64_notify_segfault(unsigned long addr) > > { > > int code; > > + unsigned long ut_addr = untagged_addr(addr); > > > > mmap_read_lock(current->mm); > > - if (find_vma(current->mm, untagged_addr(addr)) == NULL) > > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL) > > code = SEGV_MAPERR; > > else > > code = SEGV_ACCERR; Thank you for taking the time to thoroughly review this patch. > > I don't think your change is entirely correct either. We can have a > fault below the vma of a stack (with VM_GROWSDOWN) and > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR > instead. I'm pretty sure I am missing something. From what you said above, I think this means that there can be a user cache fault below the stack which should notify the user application that they are not allowed to expand the stack by sending a SIGV_ACCERR in the si_code? Is this expected behaviour or am I missing a code path to this function? > > Maybe this should employ similar checks as __do_page_fault() (with > expand_stack() and VM_GROWSDOWN). You mean the code needs to detect endianness and to check if this is an attempt to expand the stack for both cases? Thanks, Liam
On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote: > * Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]: > > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote: > > > find_vma() will continue to search upwards until the end of the virtual > > > memory space. This means the si_code would almost never be set to > > > SEGV_MAPERR even when the address falls outside of any VMA. The result > > > is that the si_code is not reliable as it may or may not be set to the > > > correct result, depending on where the address falls in the address > > > space. > > > > > > Using find_vma_intersection() allows for what is intended by only > > > returning a VMA if it falls within the range provided, in this case a > > > window of 1. > > > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > > --- > > > arch/arm64/kernel/traps.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > > > index a05d34f0e82a..a44007904a64 100644 > > > --- a/arch/arm64/kernel/traps.c > > > +++ b/arch/arm64/kernel/traps.c > > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i > > > void arm64_notify_segfault(unsigned long addr) > > > { > > > int code; > > > + unsigned long ut_addr = untagged_addr(addr); > > > > > > mmap_read_lock(current->mm); > > > - if (find_vma(current->mm, untagged_addr(addr)) == NULL) > > > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL) > > > code = SEGV_MAPERR; > > > else > > > code = SEGV_ACCERR; [...] > > I don't think your change is entirely correct either. We can have a > > fault below the vma of a stack (with VM_GROWSDOWN) and > > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR > > instead. > > I'm pretty sure I am missing something. From what you said above, I > think this means that there can be a user cache fault below the stack > which should notify the user application that they are not allowed to > expand the stack by sending a SIGV_ACCERR in the si_code? Is this > expected behaviour or am I missing a code path to this function? My point was that find_vma() may return a valid vma where addr < vm_end but also addr < vm_addr. It's the responsibility of the caller to check that that vma can be expanded (VM_GROWSDOWN) and we do something like this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr >= vm_start. If we hit this case (addr < vm_start), normally we'd first need to check whether it's expandable and, if not, return MAPERR. If it's expandable, it should be ACCERR since something else caused the fault. Now, I think at least for user_cache_maint_handler(), we can assume that __do_page_fault() handled any expansion already, so we don't need to check it here. In this case, your find_vma_intersection() check should work. Are there other cases where we invoke arm64_notify_segfault() without a prior fault? I think in swp_handler() we can bail out early before we even attempted the access so we may report MAPERR but ACCERR is a better indication. Also in sys_rt_sigreturn() we always call it as arm64_notify_segfault(regs->sp). I'm not sure that's correct in all cases, see restore_altstack(). I guess this code needs some tidying up. > > Maybe this should employ similar checks as __do_page_fault() (with > > expand_stack() and VM_GROWSDOWN). > > You mean the code needs to detect endianness and to check if this is an > attempt to expand the stack for both cases? Nothing to do with endianness, just the relation between the address and the vma->vm_start and whether the vma can be expanded down.
* Catalin Marinas <catalin.marinas@arm.com> [210413 14:00]: > On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote: > > * Catalin Marinas <catalin.marinas@arm.com> [210412 13:44]: > > > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote: > > > > find_vma() will continue to search upwards until the end of the virtual > > > > memory space. This means the si_code would almost never be set to > > > > SEGV_MAPERR even when the address falls outside of any VMA. The result > > > > is that the si_code is not reliable as it may or may not be set to the > > > > correct result, depending on where the address falls in the address > > > > space. > > > > > > > > Using find_vma_intersection() allows for what is intended by only > > > > returning a VMA if it falls within the range provided, in this case a > > > > window of 1. > > > > > > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > > > --- > > > > arch/arm64/kernel/traps.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > > > > index a05d34f0e82a..a44007904a64 100644 > > > > --- a/arch/arm64/kernel/traps.c > > > > +++ b/arch/arm64/kernel/traps.c > > > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i > > > > void arm64_notify_segfault(unsigned long addr) > > > > { > > > > int code; > > > > + unsigned long ut_addr = untagged_addr(addr); > > > > > > > > mmap_read_lock(current->mm); > > > > - if (find_vma(current->mm, untagged_addr(addr)) == NULL) > > > > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL) > > > > code = SEGV_MAPERR; > > > > else > > > > code = SEGV_ACCERR; > [...] > > > I don't think your change is entirely correct either. We can have a > > > fault below the vma of a stack (with VM_GROWSDOWN) and > > > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR > > > instead. > > > > I'm pretty sure I am missing something. From what you said above, I > > think this means that there can be a user cache fault below the stack > > which should notify the user application that they are not allowed to > > expand the stack by sending a SIGV_ACCERR in the si_code? Is this > > expected behaviour or am I missing a code path to this function? > > My point was that find_vma() may return a valid vma where addr < vm_end > but also addr < vm_addr. It's the responsibility of the caller to check > that that vma can be expanded (VM_GROWSDOWN) and we do something like > this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr > >= vm_start. Right. The find_vma() interface is not clear by the function name; returning a VMA that doesn't include the address of interest is unclear. I think this is why we ended up with the bug in the first place. > > If we hit this case (addr < vm_start), normally we'd first need to check > whether it's expandable and, if not, return MAPERR. If it's expandable, > it should be ACCERR since something else caused the fault. > > Now, I think at least for user_cache_maint_handler(), we can assume that > __do_page_fault() handled any expansion already, so we don't need to > check it here. In this case, your find_vma_intersection() check should > work. > > Are there other cases where we invoke arm64_notify_segfault() without a > prior fault? I think in swp_handler() we can bail out early before we > even attempted the access so we may report MAPERR but ACCERR is a better > indication. swp_handler() is also buggy. It is currently getting the ACCERR as long as the address being checked is > mm->highest_vm_end. If access_ok() fails, it should return ACCERR and not search VMAs for the address at all. ... >Also in sys_rt_sigreturn() we always call it as > arm64_notify_segfault(regs->sp). I'm not sure that's correct in all > cases, see restore_altstack(). Ditto for sys_rt_sigreturn() and sys_sigreturn(), they both suffer the same bug as swp_handler() outlined above. In the case of restore_sigframe() or restore_altstack() failing, it seems that the signal shouldn't be dependent on where the address falls within the VMA at all. Should the signal still be SIGSEGV or something else? By the comments, I would have thought SIGBUS, si_code of BUS_ADRALN? > > I guess this code needs some tidying up. Indeed, there seems to be a few code paths that need to skip this function all together and just set the code to ACCERR - especially since the access_ok() just validates the number itself. I don't think the right answer is to rewrite the function to check VM_GROWSDOWN, as I cannot see a way to reach this function when trying to expand the stack which should report back ACCERR. Do you agree? I also see that my fix would expose other bugs which need to be addressed at the same time. > > > > Maybe this should employ similar checks as __do_page_fault() (with > > > expand_stack() and VM_GROWSDOWN). > > > > You mean the code needs to detect endianness and to check if this is an > > attempt to expand the stack for both cases? > > Nothing to do with endianness, just the relation between the address and > the vma->vm_start and whether the vma can be expanded down. Okay, thanks for clarifying. Cheers, Liam
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index a05d34f0e82a..a44007904a64 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i void arm64_notify_segfault(unsigned long addr) { int code; + unsigned long ut_addr = untagged_addr(addr); mmap_read_lock(current->mm); - if (find_vma(current->mm, untagged_addr(addr)) == NULL) + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL) code = SEGV_MAPERR; else code = SEGV_ACCERR;
find_vma() will continue to search upwards until the end of the virtual memory space. This means the si_code would almost never be set to SEGV_MAPERR even when the address falls outside of any VMA. The result is that the si_code is not reliable as it may or may not be set to the correct result, depending on where the address falls in the address space. Using find_vma_intersection() allows for what is intended by only returning a VMA if it falls within the range provided, in this case a window of 1. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> --- arch/arm64/kernel/traps.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)