Message ID: 20220815041803.17954-7-kirill.shutemov@linux.intel.com (mailing list archive)
State: New
Series: Linear Address Masking enabling
On Mon, Aug 15, 2022 at 07:17:58AM +0300, Kirill A. Shutemov wrote:

> +static void enable_lam_func(void *mm)
> +{
> +	struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
> +	unsigned long lam_mask;
> +	unsigned long cr3;
> +
> +	if (loaded_mm != mm)
> +		return;
> +
> +	lam_mask = READ_ONCE(loaded_mm->context.lam_cr3_mask);
> +
> +	/* Update CR3 to get LAM active on the CPU */
> +	cr3 = __read_cr3();
> +	cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> +	cr3 |= lam_mask;
> +	write_cr3(cr3);
> +	set_tlbstate_cr3_lam_mask(lam_mask);
> +}
> +
> +static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
> +{
> +	int ret = 0;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_LAM))
> +		return -ENODEV;
> +
> +	mutex_lock(&mm->context.lock);
> +
> +	/* Already enabled? */
> +	if (mm->context.lam_cr3_mask) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	if (!nr_bits) {
> +		ret = -EINVAL;
> +		goto out;
> +	} else if (nr_bits <= 6) {
> +		mm->context.lam_cr3_mask = X86_CR3_LAM_U57;
> +		mm->context.untag_mask = ~GENMASK(62, 57);
> +	} else {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	/* Make lam_cr3_mask and untag_mask visible on other CPUs */
> +	smp_mb();

smp_mb() doesn't make visible -- it is about ordering, what does it
order against that the below on_each_cpu_mask() doesn't already take
care of?

> +
> +	on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true);
> +out:
> +	mutex_unlock(&mm->context.lock);
> +	return ret;
> +}
On Mon, Aug 15, 2022 at 03:37:16PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 15, 2022 at 07:17:58AM +0300, Kirill A. Shutemov wrote:
> > +	/* Make lam_cr3_mask and untag_mask visible on other CPUs */
> > +	smp_mb();
>
> smp_mb() doesn't make visible -- it is about ordering, what does it
> order against that the below on_each_cpu_mask() doesn't already take
> care of?

You are right, it is redundant. I will drop it.
On Mon, Aug 15, 2022 at 6:15 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> Add a couple of arch_prctl() handles:
>
> - ARCH_ENABLE_TAGGED_ADDR enabled LAM. The argument is required number
>   of tag bits. It is rounded up to the nearest LAM mode that can
>   provide it. For now only LAM_U57 is supported, with 6 tag bits.
>
> - ARCH_GET_UNTAG_MASK returns untag mask. It can indicates where tag
>   bits located in the address.
>
> - ARCH_GET_MAX_TAG_BITS returns the maximum tag bits user can request.
>   Zero if LAM is not supported.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Reviewed-by: Alexander Potapenko <glider@google.com>
(with a nit, see below)

> +static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
> +{
> +	int ret = 0;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_LAM))
> +		return -ENODEV;
> +
> +	mutex_lock(&mm->context.lock);
> +
> +	/* Already enabled? */
> +	if (mm->context.lam_cr3_mask) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	if (!nr_bits) {
> +		ret = -EINVAL;
> +		goto out;
> +	} else if (nr_bits <= 6) {

Can you please make this 6 a #define?

> +		return put_user(6, (unsigned long __user *)arg2);

... and use it at least here (could also express masks in terms of this
number, but maybe it's enough to just declare them in the same header
next to each other).
On Mon, Aug 22, 2022 at 11:32 AM Alexander Potapenko <glider@google.com> wrote:
>
> On Mon, Aug 15, 2022 at 6:15 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > Add a couple of arch_prctl() handles:
> >
> > - ARCH_ENABLE_TAGGED_ADDR enabled LAM. The argument is required number
> >   of tag bits. It is rounded up to the nearest LAM mode that can
> >   provide it. For now only LAM_U57 is supported, with 6 tag bits.
> >
> > - ARCH_GET_UNTAG_MASK returns untag mask. It can indicates where tag
> >   bits located in the address.
> >
> > - ARCH_GET_MAX_TAG_BITS returns the maximum tag bits user can request.
> >   Zero if LAM is not supported.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> Reviewed-by: Alexander Potapenko <glider@google.com>

And also:

Tested-by: Alexander Potapenko <glider@google.com>
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 500b96e71f18..a31e27b95b19 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -20,4 +20,8 @@
 #define ARCH_MAP_VDSO_32		0x2002
 #define ARCH_MAP_VDSO_64		0x2003
 
+#define ARCH_GET_UNTAG_MASK		0x4001
+#define ARCH_ENABLE_TAGGED_ADDR		0x4002
+#define ARCH_GET_MAX_TAG_BITS		0x4003
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 1962008fe743..4f9f0f8ccd26 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -742,6 +742,60 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
 }
 #endif
 
+static void enable_lam_func(void *mm)
+{
+	struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+	unsigned long lam_mask;
+	unsigned long cr3;
+
+	if (loaded_mm != mm)
+		return;
+
+	lam_mask = READ_ONCE(loaded_mm->context.lam_cr3_mask);
+
+	/* Update CR3 to get LAM active on the CPU */
+	cr3 = __read_cr3();
+	cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
+	cr3 |= lam_mask;
+	write_cr3(cr3);
+	set_tlbstate_cr3_lam_mask(lam_mask);
+}
+
+static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits)
+{
+	int ret = 0;
+
+	if (!cpu_feature_enabled(X86_FEATURE_LAM))
+		return -ENODEV;
+
+	mutex_lock(&mm->context.lock);
+
+	/* Already enabled? */
+	if (mm->context.lam_cr3_mask) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	if (!nr_bits) {
+		ret = -EINVAL;
+		goto out;
+	} else if (nr_bits <= 6) {
+		mm->context.lam_cr3_mask = X86_CR3_LAM_U57;
+		mm->context.untag_mask = ~GENMASK(62, 57);
+	} else {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Make lam_cr3_mask and untag_mask visible on other CPUs */
+	smp_mb();
+
+	on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true);
+out:
+	mutex_unlock(&mm->context.lock);
+	return ret;
+}
+
 long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 {
 	int ret = 0;
@@ -829,7 +883,16 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 	case ARCH_MAP_VDSO_64:
 		return prctl_map_vdso(&vdso_image_64, arg2);
 #endif
-
+	case ARCH_GET_UNTAG_MASK:
+		return put_user(task->mm->context.untag_mask,
+				(unsigned long __user *)arg2);
+	case ARCH_ENABLE_TAGGED_ADDR:
+		return prctl_enable_tagged_addr(task->mm, arg2);
+	case ARCH_GET_MAX_TAG_BITS:
+		if (!cpu_feature_enabled(X86_FEATURE_LAM))
+			return put_user(0, (unsigned long __user *)arg2);
+		else
+			return put_user(6, (unsigned long __user *)arg2);
 	default:
 		ret = -EINVAL;
 		break;
Add a couple of arch_prctl() handles:

- ARCH_ENABLE_TAGGED_ADDR enables LAM. The argument is the required
  number of tag bits. It is rounded up to the nearest LAM mode that can
  provide it. For now only LAM_U57 is supported, with 6 tag bits.

- ARCH_GET_UNTAG_MASK returns the untag mask. It indicates where the
  tag bits are located in the address.

- ARCH_GET_MAX_TAG_BITS returns the maximum number of tag bits a user
  can request. Zero if LAM is not supported.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/uapi/asm/prctl.h |  4 ++
 arch/x86/kernel/process_64.c      | 65 ++++++++++++++++++++++++++++++-
 2 files changed, 68 insertions(+), 1 deletion(-)