Message ID | 20200327191915.257116-1-pcc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] arm64: Expose original FAR_EL1 value in sigcontext | expand |
On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > the tag bits may be needed by tools in order to accurately diagnose > memory errors, such as HWASan [1] or future tools based on the Memory > Tagging Extension (MTE). > > We should not stop clearing these bits in the existing fault address > fields, because there may be existing userspace applications that are > expecting the tag bits to be cleared. Instead, create a far_context in > sigcontext (similar to the existing esr_context), and store the original > value of FAR_EL1 (including the tag bits) there. > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > the tag bits may be needed by tools in order to accurately diagnose > memory errors, such as HWASan [1] or future tools based on the Memory > Tagging Extension (MTE). > > We should not stop clearing these bits in the existing fault address > fields, because there may be existing userspace applications that are > expecting the tag bits to be cleared. Instead, create a far_context in > sigcontext (similar to the existing esr_context), and store the original > value of FAR_EL1 (including the tag bits) there. > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > Signed-off-by: Peter Collingbourne <pcc@google.com> > --- > v3: > - add documentation to tagged-pointers.rst > - update comments in sigcontext.h Hmm, although the code looks fine, why don't we just expose the tag in the new field, rather than duplicate the address information? I'm nervous about exposing privileged registers directly to userspace. Also, Catalin, could you elaborate on the MTE use-case please? The architecture says that FAR_EL1[63:60] are UNKNOWN on a synchronous tag check fault, so we'd have to *avoid* exposing them in that case! Will
On Wed, Apr 29, 2020 at 2:08 PM Will Deacon <will@kernel.org> wrote: > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > the tag bits may be needed by tools in order to accurately diagnose > > memory errors, such as HWASan [1] or future tools based on the Memory > > Tagging Extension (MTE). > > > > We should not stop clearing these bits in the existing fault address > > fields, because there may be existing userspace applications that are > > expecting the tag bits to be cleared. Instead, create a far_context in > > sigcontext (similar to the existing esr_context), and store the original > > value of FAR_EL1 (including the tag bits) there. > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > --- > > v3: > > - add documentation to tagged-pointers.rst > > - update comments in sigcontext.h > > Hmm, although the code looks fine, why don't we just expose the tag in the > new field, rather than duplicate the address information? I'm nervous about > exposing privileged registers directly to userspace. I have no strong opinion on whether this should just contain the tag or not. > Also, Catalin, could you elaborate on the MTE use-case please? The > architecture says that FAR_EL1[63:60] are UNKNOWN on a synchronous tag > check fault, so we'd have to *avoid* exposing them in that case! The basic use case is to allow a signal handler to identify which allocation was accessed improperly in order to provide better diagnostics. For example, if you have granules tagged 1,2,3 consecutively and see an access with pointer tag 1 on the granule tagged 2, you can tell that it was probably a buffer overflow from the 1 granule and you can report that to the user. It seems unfortunate that bits 63:60 are now UNKNOWN on synchronous tag check faults. It seems to be a recent change to the specification. I can think of a number of use cases for bits 63:60 after a synchronous tag check fault -- for example, an allocator could store an additional set of random bits there, and later use them to determine with more accuracy which allocation was used after free or out of bounds. That being said, we aren't planning to do this in the initial version of the MTE-aware scudo allocator. If there is no chance of changing the architecture at this point, I will send a v4 with the bits masked out when handling a tag check fault. Peter
On Wed, Apr 29, 2020 at 10:08:26PM +0100, Will Deacon wrote: > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > the tag bits may be needed by tools in order to accurately diagnose > > memory errors, such as HWASan [1] or future tools based on the Memory > > Tagging Extension (MTE). > > > > We should not stop clearing these bits in the existing fault address > > fields, because there may be existing userspace applications that are > > expecting the tag bits to be cleared. Instead, create a far_context in > > sigcontext (similar to the existing esr_context), and store the original > > value of FAR_EL1 (including the tag bits) there. > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > --- > > v3: > > - add documentation to tagged-pointers.rst > > - update comments in sigcontext.h > > Hmm, although the code looks fine, why don't we just expose the tag in the > new field, rather than duplicate the address information? I'm nervous about > exposing privileged registers directly to userspace. That's for consistency with ESR_EL1 which we expose in a similar way, though with bits of it not relevant to user masked out. For FAR_EL1, all the bits are relevant, even if some of them are duplicated in the si_addr field. > Also, Catalin, could you elaborate on the MTE use-case please? The > architecture says that FAR_EL1[63:60] are UNKNOWN on a synchronous tag > check fault, so we'd have to *avoid* exposing them in that case! With MTE, FAR_EL1[63:60] will be cleared on sync tag check faults (not currently done as I don't have this patch in my MTE series).
On Thu, Apr 30, 2020 at 10:50:01AM +0100, Catalin Marinas wrote: > On Wed, Apr 29, 2020 at 10:08:26PM +0100, Will Deacon wrote: > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > > the tag bits may be needed by tools in order to accurately diagnose > > > memory errors, such as HWASan [1] or future tools based on the Memory > > > Tagging Extension (MTE). > > > > > > We should not stop clearing these bits in the existing fault address > > > fields, because there may be existing userspace applications that are > > > expecting the tag bits to be cleared. Instead, create a far_context in > > > sigcontext (similar to the existing esr_context), and store the original > > > value of FAR_EL1 (including the tag bits) there. > > > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > --- > > > v3: > > > - add documentation to tagged-pointers.rst > > > - update comments in sigcontext.h > > > > Hmm, although the code looks fine, why don't we just expose the tag in the > > new field, rather than duplicate the address information? I'm nervous about > > exposing privileged registers directly to userspace. > > That's for consistency with ESR_EL1 which we expose in a similar way, > though with bits of it not relevant to user masked out. For FAR_EL1, all > the bits are relevant, even if some of them are duplicated in the > si_addr field. It may be consistent, but I would argue that exposing ESR_EL1 was a mistake, as illustrated by cc19846079a7 ("arm64: fault: Don't leak data in ESR context for user fault on kernel VA"). We have to live with that, but we should try to do better for new fields in the sigcontext. > > Also, Catalin, could you elaborate on the MTE use-case please? The > > architecture says that FAR_EL1[63:60] are UNKNOWN on a synchronous tag > > check fault, so we'd have to *avoid* exposing them in that case! > > With MTE, FAR_EL1[63:60] will be cleared on sync tag check faults (not > currently done as I don't have this patch in my MTE series). Ok, but in [1] you said "I'm fine with this change for consistency and may help with the fault information printed by the kernel with khwasan or (later) MTE." But I don't think consistency is necessarily a good thing here and I don't see how it helps with MTE if we zap the bits to 0! We'd be better off not exposing the information at all in this situation. Will [1] https://lore.kernel.org/r/20200325131023.GN3901@mbp
On Thu, Apr 30, 2020 at 10:59:19AM +0100, Will Deacon wrote: > On Thu, Apr 30, 2020 at 10:50:01AM +0100, Catalin Marinas wrote: > > On Wed, Apr 29, 2020 at 10:08:26PM +0100, Will Deacon wrote: > > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > > > the tag bits may be needed by tools in order to accurately diagnose > > > > memory errors, such as HWASan [1] or future tools based on the Memory > > > > Tagging Extension (MTE). > > > > > > > > We should not stop clearing these bits in the existing fault address > > > > fields, because there may be existing userspace applications that are > > > > expecting the tag bits to be cleared. Instead, create a far_context in > > > > sigcontext (similar to the existing esr_context), and store the original > > > > value of FAR_EL1 (including the tag bits) there. > > > > > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > > --- > > > > v3: > > > > - add documentation to tagged-pointers.rst > > > > - update comments in sigcontext.h > > > > > > Hmm, although the code looks fine, why don't we just expose the tag in the > > > new field, rather than duplicate the address information? I'm nervous about > > > exposing privileged registers directly to userspace. > > > > That's for consistency with ESR_EL1 which we expose in a similar way, > > though with bits of it not relevant to user masked out. For FAR_EL1, all > > the bits are relevant, even if some of them are duplicated in the > > si_addr field. > > It may be consistent, but I would argue that exposing ESR_EL1 was a mistake, > as illustrated by cc19846079a7 ("arm64: fault: Don't leak data in ESR > context for user fault on kernel VA"). We have to live with that, but we > should try to do better for new fields in the sigcontext. The alternative would be to only expose the tag of the faulting address if you are worried of leaking some kernel address from FAR_EL1 (and I agree, there is a risk). > > > Also, Catalin, could you elaborate on the MTE use-case please? The > > > architecture says that FAR_EL1[63:60] are UNKNOWN on a synchronous tag > > > check fault, so we'd have to *avoid* exposing them in that case! > > > > With MTE, FAR_EL1[63:60] will be cleared on sync tag check faults (not > > currently done as I don't have this patch in my MTE series). > > Ok, but in [1] you said "I'm fine with this change for consistency and > may help with the fault information printed by the kernel with khwasan > or (later) MTE." > > But I don't think consistency is necessarily a good thing here and I don't > see how it helps with MTE if we zap the bits to 0! We'd be better off not > exposing the information at all in this situation. The plan for MTE is to only zap bits 63:60 which are unknown on tag check fault. The actual MTE tag in 59:56 would be preserved. There is an inconsistency with the TBI feature but that only happens for tag check faults which is a new thing. The behaviour on any other fault will be consistent with the non-MTE case (page faults etc.).
On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > the tag bits may be needed by tools in order to accurately diagnose > memory errors, such as HWASan [1] or future tools based on the Memory > Tagging Extension (MTE). > > We should not stop clearing these bits in the existing fault address > fields, because there may be existing userspace applications that are > expecting the tag bits to be cleared. Instead, create a far_context in > sigcontext (similar to the existing esr_context), and store the original > value of FAR_EL1 (including the tag bits) there. > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > Signed-off-by: Peter Collingbourne <pcc@google.com> > --- > v3: > - add documentation to tagged-pointers.rst > - update comments in sigcontext.h > > v2: > - revert changes to hw_breakpoint.c > - rename set_thread_esr to set_thread_far_esr > > Documentation/arm64/tagged-pointers.rst | 17 +++++---- > arch/arm64/include/asm/exception.h | 2 +- > arch/arm64/include/asm/processor.h | 2 +- > arch/arm64/include/uapi/asm/sigcontext.h | 21 +++++++---- > arch/arm64/kernel/entry-common.c | 2 -- > arch/arm64/kernel/signal.c | 20 ++++++++++- > arch/arm64/mm/fault.c | 45 ++++++++++++++---------- > 7 files changed, 74 insertions(+), 35 deletions(-) [...] > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h > index 8b0ebce92427..6782394633cb 100644 > --- a/arch/arm64/include/uapi/asm/sigcontext.h > +++ b/arch/arm64/include/uapi/asm/sigcontext.h > @@ -44,11 +44,12 @@ struct sigcontext { > * > * 0x210 fpsimd_context > * 0x10 esr_context > + * 0x10 far_context > * 0x8a0 sve_context (vl <= 64) (optional) > * 0x20 extra_context (optional) > * 0x10 terminator (null _aarch64_ctx) > * > - * 0x510 (reserved for future allocation) > + * 0x500 (reserved for future allocation) > * > * New records that can exceed this space need to be opt-in for userspace, so > * that an expanded signal frame is not generated unexpectedly. The mechanism > @@ -94,17 +95,25 @@ struct esr_context { > __u64 esr; > }; > > +/* FAR_EL1 context */ > +#define FAR_MAGIC 0x46415201 > + > +struct far_context { > + struct _aarch64_ctx head; > + __u64 far; > +}; > + > /* > * extra_context: describes extra space in the signal frame for > * additional structures that don't fit in sigcontext.__reserved[]. > * > * Note: > * > - * 1) fpsimd_context, esr_context and extra_context must be placed in > - * sigcontext.__reserved[] if present. They cannot be placed in the > - * extra space. Any other record can be placed either in the extra > - * space or in sigcontext.__reserved[], unless otherwise specified in > - * this file. > + * 1) fpsimd_context, esr_context, far_context and extra_context must be > + * placed in sigcontext.__reserved[] if present. They cannot be placed > + * in the extra space. Any other record can be placed either in the > + * extra space or in sigcontext.__reserved[], unless otherwise specified > + * in this file. This is for backwards compatibility only. We don't need this constraint for any new field, so you can probably leave the paragraph as-is. Removing this would mean constraint would mean that userspace must be prepared to traverse extra_context when looking for far_context. But really we want modern userspace to do this anyway, since it reduces backwards compatibilty worries when adding more new records in the future. The nasty loop in parse_user_sigframe() allows some flexibility regarding the order of records, but prior to this patch there is no record that can be actually be moved, due to other backwards compatibility constraints -- so the flexibility isn't used today. I'd like to avoid reorderability creeping in, so that we can get rid of the loop. So, mandating that records must be in a consistent order to sigcontext.h could be helpful. inserting new records in the middle should be fine, so long as there is no shuffling. I'm not sure this patch needs to do anything extra for that: perhaps we can leave this no-shuffling rule implicit for now (?) People already get shouted at for needslessly noisy diffs, so there is a strong disincentive to shuffle existing headers in any case... [...] Cheers ---Dave
Hi Peter, On Wed, Apr 29, 2020 at 02:42:01PM -0700, Peter Collingbourne wrote: > On Wed, Apr 29, 2020 at 2:08 PM Will Deacon <will@kernel.org> wrote: > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > > the tag bits may be needed by tools in order to accurately diagnose > > > memory errors, such as HWASan [1] or future tools based on the Memory > > > Tagging Extension (MTE). > > > > > > We should not stop clearing these bits in the existing fault address > > > fields, because there may be existing userspace applications that are > > > expecting the tag bits to be cleared. Instead, create a far_context in > > > sigcontext (similar to the existing esr_context), and store the original > > > value of FAR_EL1 (including the tag bits) there. > > > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > --- > > > v3: > > > - add documentation to tagged-pointers.rst > > > - update comments in sigcontext.h > > > > Hmm, although the code looks fine, why don't we just expose the tag in the > > new field, rather than duplicate the address information? I'm nervous about > > exposing privileged registers directly to userspace. > > I have no strong opinion on whether this should just contain the tag or not. A few of us chatted about this today. Please could you spin a v4 where only the top byte is exposed in the new sigcontext record as a __u8? You'll need to think of a better name than "FAR"; perhaps something like 'si_addr_top_byte', 'si_addr_63_56' or whatever you fancy. Naming is hard. For MTE we can add a separate record later on, so as not to overload this (e.g. si_addr_mte_tag). Ta, Will
On Mon, May 4, 2020 at 3:19 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > the tag bits may be needed by tools in order to accurately diagnose > > memory errors, such as HWASan [1] or future tools based on the Memory > > Tagging Extension (MTE). > > > > We should not stop clearing these bits in the existing fault address > > fields, because there may be existing userspace applications that are > > expecting the tag bits to be cleared. Instead, create a far_context in > > sigcontext (similar to the existing esr_context), and store the original > > value of FAR_EL1 (including the tag bits) there. > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > --- > > v3: > > - add documentation to tagged-pointers.rst > > - update comments in sigcontext.h > > > > v2: > > - revert changes to hw_breakpoint.c > > - rename set_thread_esr to set_thread_far_esr > > > > Documentation/arm64/tagged-pointers.rst | 17 +++++---- > > arch/arm64/include/asm/exception.h | 2 +- > > arch/arm64/include/asm/processor.h | 2 +- > > arch/arm64/include/uapi/asm/sigcontext.h | 21 +++++++---- > > arch/arm64/kernel/entry-common.c | 2 -- > > arch/arm64/kernel/signal.c | 20 ++++++++++- > > arch/arm64/mm/fault.c | 45 ++++++++++++++---------- > > 7 files changed, 74 insertions(+), 35 deletions(-) > > [...] > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h > > index 8b0ebce92427..6782394633cb 100644 > > --- a/arch/arm64/include/uapi/asm/sigcontext.h > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h > > @@ -44,11 +44,12 @@ struct sigcontext { > > * > > * 0x210 fpsimd_context > > * 0x10 esr_context > > + * 0x10 far_context > > * 0x8a0 sve_context (vl <= 64) (optional) > > * 0x20 extra_context (optional) > > * 0x10 terminator (null _aarch64_ctx) > > * > > - * 0x510 (reserved for future allocation) > > + * 0x500 (reserved for future allocation) > > * > > * New records that can exceed this space need to be opt-in for userspace, so > > * that an expanded signal frame is not generated unexpectedly. The mechanism > > @@ -94,17 +95,25 @@ struct esr_context { > > __u64 esr; > > }; > > > > +/* FAR_EL1 context */ > > +#define FAR_MAGIC 0x46415201 > > + > > +struct far_context { > > + struct _aarch64_ctx head; > > + __u64 far; > > +}; > > + > > /* > > * extra_context: describes extra space in the signal frame for > > * additional structures that don't fit in sigcontext.__reserved[]. > > * > > * Note: > > * > > - * 1) fpsimd_context, esr_context and extra_context must be placed in > > - * sigcontext.__reserved[] if present. They cannot be placed in the > > - * extra space. Any other record can be placed either in the extra > > - * space or in sigcontext.__reserved[], unless otherwise specified in > > - * this file. > > + * 1) fpsimd_context, esr_context, far_context and extra_context must be > > + * placed in sigcontext.__reserved[] if present. They cannot be placed > > + * in the extra space. Any other record can be placed either in the > > + * extra space or in sigcontext.__reserved[], unless otherwise specified > > + * in this file. > > This is for backwards compatibility only. We don't need this constraint > for any new field, so you can probably leave the paragraph as-is. > > Removing this would mean constraint would mean that userspace must be > prepared to traverse extra_context when looking for far_context. But > really we want modern userspace to do this anyway, since it reduces > backwards compatibilty worries when adding more new records in the > future. My original reason for updating this comment was that I figured that this record was small enough that we could just always include it in __reserved. But thinking about this a bit more, it doesn't seem that just wanting userspace to read extra_context will guarantee that it will do so. In practice, it would be easy to write userspace code that works right now but doesn't read extra_context correctly (either because extra_context wasn't considered at all, or because the code purporting to read the record from extra_context contains a latent bug because it wasn't exercised). Since we may be practically constrained from moving the record anyway, we might as well document it and allow the userspace code to be a little simpler. I guess one alternative is that we always place this record in extra_context, which would force userspace to read it correctly. That has something of the opposite problem (userspace code could be written to only expect the record in extra_context), but at least we're less constrained there, and it's more likely that the code would be parsing __reserved correctly since it would need to do so in order to find extra_context. Anyway, I've reverted the comment change for now in v4, but let me know what you think. Peter > > > The nasty loop in parse_user_sigframe() allows some flexibility > regarding the order of records, but prior to this patch there is no > record that can be actually be moved, due to other backwards > compatibility constraints -- so the flexibility isn't used today. I'd > like to avoid reorderability creeping in, so that we can get rid of the > loop. > > So, mandating that records must be in a consistent order to sigcontext.h > could be helpful. inserting new records in the middle should be fine, > so long as there is no shuffling. > > I'm not sure this patch needs to do anything extra for that: perhaps we > can leave this no-shuffling rule implicit for now (?) > > People already get shouted at for needslessly noisy diffs, so there is a > strong disincentive to shuffle existing headers in any case... > > [...] > > Cheers > ---Dave
On Thu, May 07, 2020 at 10:55:02AM -0700, Peter Collingbourne wrote: > On Mon, May 4, 2020 at 3:19 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > > the tag bits may be needed by tools in order to accurately diagnose > > > memory errors, such as HWASan [1] or future tools based on the Memory > > > Tagging Extension (MTE). > > > > > > We should not stop clearing these bits in the existing fault address > > > fields, because there may be existing userspace applications that are > > > expecting the tag bits to be cleared. Instead, create a far_context in > > > sigcontext (similar to the existing esr_context), and store the original > > > value of FAR_EL1 (including the tag bits) there. > > > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > --- > > > v3: > > > - add documentation to tagged-pointers.rst > > > - update comments in sigcontext.h > > > > > > v2: > > > - revert changes to hw_breakpoint.c > > > - rename set_thread_esr to set_thread_far_esr > > > > > > Documentation/arm64/tagged-pointers.rst | 17 +++++---- > > > arch/arm64/include/asm/exception.h | 2 +- > > > arch/arm64/include/asm/processor.h | 2 +- > > > arch/arm64/include/uapi/asm/sigcontext.h | 21 +++++++---- > > > arch/arm64/kernel/entry-common.c | 2 -- > > > arch/arm64/kernel/signal.c | 20 ++++++++++- > > > arch/arm64/mm/fault.c | 45 ++++++++++++++---------- > > > 7 files changed, 74 insertions(+), 35 deletions(-) > > > > [...] > > > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h > > > index 8b0ebce92427..6782394633cb 100644 > > > --- a/arch/arm64/include/uapi/asm/sigcontext.h > > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h > > > @@ -44,11 +44,12 @@ struct sigcontext { > > > * > > > * 0x210 fpsimd_context > > > * 0x10 esr_context > > > + * 0x10 far_context > > > * 0x8a0 sve_context (vl <= 64) (optional) > > > * 0x20 extra_context (optional) > > > * 0x10 terminator (null _aarch64_ctx) > > > * > > > - * 0x510 (reserved for future allocation) > > > + * 0x500 (reserved for future allocation) > > > * > > > * New records that can exceed this space need to be opt-in for userspace, so > > > * that an expanded signal frame is not generated unexpectedly. The mechanism > > > @@ -94,17 +95,25 @@ struct esr_context { > > > __u64 esr; > > > }; > > > > > > +/* FAR_EL1 context */ > > > +#define FAR_MAGIC 0x46415201 > > > + > > > +struct far_context { > > > + struct _aarch64_ctx head; > > > + __u64 far; > > > +}; > > > + > > > /* > > > * extra_context: describes extra space in the signal frame for > > > * additional structures that don't fit in sigcontext.__reserved[]. > > > * > > > * Note: > > > * > > > - * 1) fpsimd_context, esr_context and extra_context must be placed in > > > - * sigcontext.__reserved[] if present. They cannot be placed in the > > > - * extra space. Any other record can be placed either in the extra > > > - * space or in sigcontext.__reserved[], unless otherwise specified in > > > - * this file. > > > + * 1) fpsimd_context, esr_context, far_context and extra_context must be > > > + * placed in sigcontext.__reserved[] if present. They cannot be placed > > > + * in the extra space. Any other record can be placed either in the > > > + * extra space or in sigcontext.__reserved[], unless otherwise specified > > > + * in this file. > > > > This is for backwards compatibility only. We don't need this constraint > > for any new field, so you can probably leave the paragraph as-is. > > > > Removing this would mean constraint would mean that userspace must be > > prepared to traverse extra_context when looking for far_context. But > > really we want modern userspace to do this anyway, since it reduces > > backwards compatibilty worries when adding more new records in the > > future. > > My original reason for updating this comment was that I figured that > this record was small enough that we could just always include it in > __reserved. > > But thinking about this a bit more, it doesn't seem that just wanting > userspace to read extra_context will guarantee that it will do so. In > practice, it would be easy to write userspace code that works right > now but doesn't read extra_context correctly (either because > extra_context wasn't considered at all, or because the code purporting > to read the record from extra_context contains a latent bug because it > wasn't exercised). Since we may be practically constrained from moving > the record anyway, we might as well document it and allow the > userspace code to be a little simpler. > > I guess one alternative is that we always place this record in > extra_context, which would force userspace to read it correctly. That > has something of the opposite problem (userspace code could be written > to only expect the record in extra_context), but at least we're less > constrained there, and it's more likely that the code would be parsing > __reserved correctly since it would need to do so in order to find > extra_context. > > Anyway, I've reverted the comment change for now in v4, but let me > know what you think. Apologies for the delay in responding -- I think it does make sense to reserve space in __reserved[] for the new record, the the location you suggested for it is sensible. __reserved[] is a scarce resource, and should only be burned on "small" records, but far_context is small. here's another reason too, which is that we don't want to needlessly block new software from using this field without allocating larger stacks -- not least because they just won't, and the problem won't bite them until much later. Hope that helps clarify things. Cheers ---Dave
On Wed, May 13, 2020 at 10:27 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Thu, May 07, 2020 at 10:55:02AM -0700, Peter Collingbourne wrote: > > On Mon, May 4, 2020 at 3:19 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Fri, Mar 27, 2020 at 12:19:15PM -0700, Peter Collingbourne wrote: > > > > The kernel currently clears the tag bits (i.e. bits 56-63) in the fault > > > > address exposed via siginfo.si_addr and sigcontext.fault_address. However, > > > > the tag bits may be needed by tools in order to accurately diagnose > > > > memory errors, such as HWASan [1] or future tools based on the Memory > > > > Tagging Extension (MTE). > > > > > > > > We should not stop clearing these bits in the existing fault address > > > > fields, because there may be existing userspace applications that are > > > > expecting the tag bits to be cleared. Instead, create a far_context in > > > > sigcontext (similar to the existing esr_context), and store the original > > > > value of FAR_EL1 (including the tag bits) there. > > > > > > > > [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html > > > > > > > > Signed-off-by: Peter Collingbourne <pcc@google.com> > > > > --- > > > > v3: > > > > - add documentation to tagged-pointers.rst > > > > - update comments in sigcontext.h > > > > > > > > v2: > > > > - revert changes to hw_breakpoint.c > > > > - rename set_thread_esr to set_thread_far_esr > > > > > > > > Documentation/arm64/tagged-pointers.rst | 17 +++++---- > > > > arch/arm64/include/asm/exception.h | 2 +- > > > > arch/arm64/include/asm/processor.h | 2 +- > > > > arch/arm64/include/uapi/asm/sigcontext.h | 21 +++++++---- > > > > arch/arm64/kernel/entry-common.c | 2 -- > > > > arch/arm64/kernel/signal.c | 20 ++++++++++- > > > > arch/arm64/mm/fault.c | 45 ++++++++++++++---------- > > > > 7 files changed, 74 insertions(+), 35 deletions(-) > > > > > > [...] > > > > > > > diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h > > > > index 8b0ebce92427..6782394633cb 100644 > > > > --- a/arch/arm64/include/uapi/asm/sigcontext.h > > > > +++ b/arch/arm64/include/uapi/asm/sigcontext.h > > > > @@ -44,11 +44,12 @@ struct sigcontext { > > > > * > > > > * 0x210 fpsimd_context > > > > * 0x10 esr_context > > > > + * 0x10 far_context > > > > * 0x8a0 sve_context (vl <= 64) (optional) > > > > * 0x20 extra_context (optional) > > > > * 0x10 terminator (null _aarch64_ctx) > > > > * > > > > - * 0x510 (reserved for future allocation) > > > > + * 0x500 (reserved for future allocation) > > > > * > > > > * New records that can exceed this space need to be opt-in for userspace, so > > > > * that an expanded signal frame is not generated unexpectedly. The mechanism > > > > @@ -94,17 +95,25 @@ struct esr_context { > > > > __u64 esr; > > > > }; > > > > > > > > +/* FAR_EL1 context */ > > > > +#define FAR_MAGIC 0x46415201 > > > > + > > > > +struct far_context { > > > > + struct _aarch64_ctx head; > > > > + __u64 far; > > > > +}; > > > > + > > > > /* > > > > * extra_context: describes extra space in the signal frame for > > > > * additional structures that don't fit in sigcontext.__reserved[]. > > > > * > > > > * Note: > > > > * > > > > - * 1) fpsimd_context, esr_context and extra_context must be placed in > > > > - * sigcontext.__reserved[] if present. They cannot be placed in the > > > > - * extra space. Any other record can be placed either in the extra > > > > - * space or in sigcontext.__reserved[], unless otherwise specified in > > > > - * this file. > > > > + * 1) fpsimd_context, esr_context, far_context and extra_context must be > > > > + * placed in sigcontext.__reserved[] if present. They cannot be placed > > > > + * in the extra space. Any other record can be placed either in the > > > > + * extra space or in sigcontext.__reserved[], unless otherwise specified > > > > + * in this file. > > > > > > This is for backwards compatibility only. We don't need this constraint > > > for any new field, so you can probably leave the paragraph as-is. > > > > > > Removing this would mean constraint would mean that userspace must be > > > prepared to traverse extra_context when looking for far_context. But > > > really we want modern userspace to do this anyway, since it reduces > > > backwards compatibilty worries when adding more new records in the > > > future. > > > > My original reason for updating this comment was that I figured that > > this record was small enough that we could just always include it in > > __reserved. > > > > But thinking about this a bit more, it doesn't seem that just wanting > > userspace to read extra_context will guarantee that it will do so. In > > practice, it would be easy to write userspace code that works right > > now but doesn't read extra_context correctly (either because > > extra_context wasn't considered at all, or because the code purporting > > to read the record from extra_context contains a latent bug because it > > wasn't exercised). Since we may be practically constrained from moving > > the record anyway, we might as well document it and allow the > > userspace code to be a little simpler. > > > > I guess one alternative is that we always place this record in > > extra_context, which would force userspace to read it correctly. That > > has something of the opposite problem (userspace code could be written > > to only expect the record in extra_context), but at least we're less > > constrained there, and it's more likely that the code would be parsing > > __reserved correctly since it would need to do so in order to find > > extra_context. > > > > Anyway, I've reverted the comment change for now in v4, but let me > > know what you think. > > Apologies for the delay in responding -- I think it does make sense to > reserve space in __reserved[] for the new record, the the location you > suggested for it is sensible. > > __reserved[] is a scarce resource, and should only be burned on "small" > records, but far_context is small. > > > here's another reason too, which is that we don't want to needlessly > block new software from using this field without allocating larger > stacks -- not least because they just won't, and the problem won't > bite them until much later. > > > Hope that helps clarify things. Thanks, that makes sense. I will send a v6 with the comment brought back. Peter
diff --git a/Documentation/arm64/tagged-pointers.rst b/Documentation/arm64/tagged-pointers.rst index eab4323609b9..9da7f6262fad 100644 --- a/Documentation/arm64/tagged-pointers.rst +++ b/Documentation/arm64/tagged-pointers.rst @@ -53,12 +53,17 @@ visibility. Preserving tags --------------- -Non-zero tags are not preserved when delivering signals. This means that -signal handlers in applications making use of tags cannot rely on the -tag information for user virtual addresses being maintained for fields -inside siginfo_t. One exception to this rule is for signals raised in -response to watchpoint debug exceptions, where the tag information will -be preserved. +Non-zero tags are not preserved in the fault address fields +siginfo.si_addr or sigcontext.fault_address when delivering +signals. This means that signal handlers in applications making use +of tags cannot rely on the tag information for user virtual addresses +being maintained in these fields. One exception to this rule is for +signals raised in response to watchpoint debug exceptions, where the +tag information will be preserved. + +The fault address tag is preserved in the far field of the signal +frame record far_context, which is present for signals raised in +response to data aborts and instruction aborts. The architecture prevents the use of a tagged PC, so the upper byte will be set to a sign-extension of bit 55 on exception return. diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h index 7a6e81ca23a8..90e772d9b2cd 100644 --- a/arch/arm64/include/asm/exception.h +++ b/arch/arm64/include/asm/exception.h @@ -32,7 +32,7 @@ static inline u32 disr_to_esr(u64 disr) } asmlinkage void enter_from_user_mode(void); -void do_mem_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs); +void do_mem_abort(unsigned long far, unsigned int esr, struct pt_regs *regs); void do_undefinstr(struct pt_regs *regs); asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr); void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr, diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h index 5ba63204d078..77d916c07531 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -142,7 +142,7 @@ struct thread_struct { void *sve_state; /* SVE registers, if any */ unsigned int sve_vl; /* SVE vector length */ unsigned int sve_vl_onexec; /* SVE vl after next exec */ - unsigned long fault_address; /* fault info */ + unsigned long fault_address; /* FAR_EL1 value */ unsigned long fault_code; /* ESR_EL1 value */ struct debug_info debug; /* debugging */ #ifdef CONFIG_ARM64_PTR_AUTH diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h index 8b0ebce92427..6782394633cb 100644 --- a/arch/arm64/include/uapi/asm/sigcontext.h +++ b/arch/arm64/include/uapi/asm/sigcontext.h @@ -44,11 +44,12 @@ struct sigcontext { * * 0x210 fpsimd_context * 0x10 esr_context + * 0x10 far_context * 0x8a0 sve_context (vl <= 64) (optional) * 0x20 extra_context (optional) * 0x10 terminator (null _aarch64_ctx) * - * 0x510 (reserved for future allocation) + * 0x500 (reserved for future allocation) * * New records that can exceed this space need to be opt-in for userspace, so * that an expanded signal frame is not generated unexpectedly. The mechanism @@ -94,17 +95,25 @@ struct esr_context { __u64 esr; }; +/* FAR_EL1 context */ +#define FAR_MAGIC 0x46415201 + +struct far_context { + struct _aarch64_ctx head; + __u64 far; +}; + /* * extra_context: describes extra space in the signal frame for * additional structures that don't fit in sigcontext.__reserved[]. * * Note: * - * 1) fpsimd_context, esr_context and extra_context must be placed in - * sigcontext.__reserved[] if present. They cannot be placed in the - * extra space. Any other record can be placed either in the extra - * space or in sigcontext.__reserved[], unless otherwise specified in - * this file. + * 1) fpsimd_context, esr_context, far_context and extra_context must be + * placed in sigcontext.__reserved[] if present. They cannot be placed + * in the extra space. Any other record can be placed either in the + * extra space or in sigcontext.__reserved[], unless otherwise specified + * in this file. * * 2) There must not be more than one extra_context. * diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c index fde59981445c..290ea59c68b8 100644 --- a/arch/arm64/kernel/entry-common.c +++ b/arch/arm64/kernel/entry-common.c @@ -22,7 +22,6 @@ static void notrace el1_abort(struct pt_regs *regs, unsigned long esr) unsigned long far = read_sysreg(far_el1); local_daif_inherit(regs); - far = untagged_addr(far); do_mem_abort(far, esr, regs); } NOKPROBE_SYMBOL(el1_abort); @@ -104,7 +103,6 @@ static void notrace el0_da(struct pt_regs *regs, unsigned long esr) user_exit_irqoff(); local_daif_restore(DAIF_PROCCTX); - far = untagged_addr(far); do_mem_abort(far, esr, regs); } NOKPROBE_SYMBOL(el0_da); diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index 339882db5a91..48e8b6c7b536 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -55,6 +55,7 @@ struct rt_sigframe_user_layout { unsigned long fpsimd_offset; unsigned long esr_offset; + unsigned long far_offset; unsigned long sve_offset; unsigned long extra_offset; unsigned long end_offset; @@ -383,6 +384,7 @@ static int parse_user_sigframe(struct user_ctxs *user, break; case ESR_MAGIC: + case FAR_MAGIC: /* ignore */ break; @@ -581,6 +583,11 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user, sizeof(struct esr_context)); if (err) return err; + + err = sigframe_alloc(user, &user->far_offset, + sizeof(struct far_context)); + if (err) + return err; } if (system_supports_sve()) { @@ -621,7 +628,8 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user, __put_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err); __put_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err); - __put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err); + __put_user_error(untagged_addr(current->thread.fault_address), + &sf->uc.uc_mcontext.fault_address, err); err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set)); @@ -641,6 +649,16 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user, __put_user_error(current->thread.fault_code, &esr_ctx->esr, err); } + if (err == 0 && user->far_offset) { + struct far_context __user *far_ctx = + apply_user_offset(user, user->far_offset); + + __put_user_error(FAR_MAGIC, &far_ctx->head.magic, err); + __put_user_error(sizeof(*far_ctx), &far_ctx->head.size, err); + __put_user_error(current->thread.fault_address, &far_ctx->far, + err); + } + /* Scalable Vector Extension state, if present */ if (system_supports_sve() && err == 0 && user->sve_offset) { struct sve_context __user *sve_ctx = diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 85566d32958f..738adc950012 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -41,7 +41,7 @@ #include <asm/traps.h> struct fault_info { - int (*fn)(unsigned long addr, unsigned int esr, + int (*fn)(unsigned long far, unsigned int esr, struct pt_regs *regs); int sig; int code; @@ -320,9 +320,11 @@ static void __do_kernel_fault(unsigned long addr, unsigned int esr, die_kernel_fault(msg, addr, esr, regs); } -static void set_thread_esr(unsigned long address, unsigned int esr) +static void set_thread_far_esr(unsigned long far, unsigned int esr) { - current->thread.fault_address = address; + unsigned long addr = untagged_addr(far); + + current->thread.fault_address = far; /* * If the faulting address is in the kernel, we must sanitize the ESR. @@ -336,7 +338,7 @@ static void set_thread_esr(unsigned long address, unsigned int esr) * type", so we ignore this wrinkle and just return the translation * fault.) */ - if (!is_ttbr0_addr(current->thread.fault_address)) { + if (!is_ttbr0_addr(addr)) { switch (ESR_ELx_EC(esr)) { case ESR_ELx_EC_DABT_LOW: /* @@ -377,8 +379,11 @@ static void set_thread_esr(unsigned long address, unsigned int esr) current->thread.fault_code = esr; } -static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *regs) +static void do_bad_area(unsigned long far, unsigned int esr, + struct pt_regs *regs) { + unsigned long addr = untagged_addr(far); + /* * If we are in kernel mode at this point, we have no context to * handle this fault with. @@ -386,7 +391,7 @@ static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *re if (user_mode(regs)) { const struct fault_info *inf = esr_to_fault_info(esr); - set_thread_esr(addr, esr); + set_thread_far_esr(far, esr); arm64_force_sig_fault(inf->sig, inf->code, (void __user *)addr, inf->name); } else { @@ -439,7 +444,7 @@ static bool is_write_abort(unsigned int esr) return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM); } -static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, +static int __kprobes do_page_fault(unsigned long far, unsigned int esr, struct pt_regs *regs) { const struct fault_info *inf; @@ -447,6 +452,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, vm_fault_t fault, major = 0; unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC; unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; + unsigned long addr = untagged_addr(far); if (kprobe_page_fault(regs, esr)) return 0; @@ -580,7 +586,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, } inf = esr_to_fault_info(esr); - set_thread_esr(addr, esr); + set_thread_far_esr(far, esr); if (fault & VM_FAULT_SIGBUS) { /* * We had some memory, but were unable to successfully fix up @@ -615,30 +621,32 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, return 0; } -static int __kprobes do_translation_fault(unsigned long addr, +static int __kprobes do_translation_fault(unsigned long far, unsigned int esr, struct pt_regs *regs) { + unsigned long addr = untagged_addr(far); + if (is_ttbr0_addr(addr)) - return do_page_fault(addr, esr, regs); + return do_page_fault(far, esr, regs); - do_bad_area(addr, esr, regs); + do_bad_area(far, esr, regs); return 0; } -static int do_alignment_fault(unsigned long addr, unsigned int esr, +static int do_alignment_fault(unsigned long far, unsigned int esr, struct pt_regs *regs) { - do_bad_area(addr, esr, regs); + do_bad_area(far, esr, regs); return 0; } -static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs) +static int do_bad(unsigned long far, unsigned int esr, struct pt_regs *regs) { return 1; /* "fault" */ } -static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) +static int do_sea(unsigned long far, unsigned int esr, struct pt_regs *regs) { const struct fault_info *inf; void __user *siaddr; @@ -654,7 +662,7 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs) if (esr & ESR_ELx_FnV) siaddr = NULL; else - siaddr = (void __user *)addr; + siaddr = (void __user *)untagged_addr(far); arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr); return 0; @@ -727,11 +735,12 @@ static const struct fault_info fault_info[] = { { do_bad, SIGKILL, SI_KERNEL, "unknown 63" }, }; -void do_mem_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs) +void do_mem_abort(unsigned long far, unsigned int esr, struct pt_regs *regs) { const struct fault_info *inf = esr_to_fault_info(esr); + unsigned long addr = untagged_addr(far); - if (!inf->fn(addr, esr, regs)) + if (!inf->fn(far, esr, regs)) return; if (!user_mode(regs)) {
The kernel currently clears the tag bits (i.e. bits 56-63) in the fault address exposed via siginfo.si_addr and sigcontext.fault_address. However, the tag bits may be needed by tools in order to accurately diagnose memory errors, such as HWASan [1] or future tools based on the Memory Tagging Extension (MTE). We should not stop clearing these bits in the existing fault address fields, because there may be existing userspace applications that are expecting the tag bits to be cleared. Instead, create a far_context in sigcontext (similar to the existing esr_context), and store the original value of FAR_EL1 (including the tag bits) there. [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html Signed-off-by: Peter Collingbourne <pcc@google.com> --- v3: - add documentation to tagged-pointers.rst - update comments in sigcontext.h v2: - revert changes to hw_breakpoint.c - rename set_thread_esr to set_thread_far_esr Documentation/arm64/tagged-pointers.rst | 17 +++++---- arch/arm64/include/asm/exception.h | 2 +- arch/arm64/include/asm/processor.h | 2 +- arch/arm64/include/uapi/asm/sigcontext.h | 21 +++++++---- arch/arm64/kernel/entry-common.c | 2 -- arch/arm64/kernel/signal.c | 20 ++++++++++- arch/arm64/mm/fault.c | 45 ++++++++++++++---------- 7 files changed, 74 insertions(+), 35 deletions(-)