[v8,0/4] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis

Message ID 20210630231509.3773172-1-pcc@google.com (mailing list archive)

Message

Peter Collingbourne June 30, 2021, 11:15 p.m. UTC
On some CPUs the performance of MTE in synchronous mode is similar
to that of asynchronous mode. This makes it worthwhile to enable
synchronous mode on those CPUs when asynchronous mode is requested,
in order to gain the error detection benefits of synchronous mode
without the performance downsides. Therefore, make it possible for
user programs to opt into upgrading to synchronous mode on those CPUs.
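As a rough illustration of the opt-in from user space, the sketch below requests a mode set containing both checking modes. It assumes kernel headers that define PR_MTE_TCF_SYNC and PR_MTE_TCF_ASYNC; the function names are made up for this example, and no tag inclusion mask is set, to keep it short.

```c
#include <sys/prctl.h>

/*
 * Build the prctl() argument for a mode set that accepts either
 * synchronous or asynchronous tag checking.
 * (mte_mode_set_flags is a name invented for this sketch.)
 */
unsigned long mte_mode_set_flags(void)
{
	return PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC;
}

/*
 * Ask the kernel to apply the mode set to the calling thread.
 * Returns 0 on success, or -1 with errno set by prctl().
 */
int request_sync_or_async(void)
{
	return prctl(PR_SET_TAGGED_ADDR_CTRL, mte_mode_set_flags(), 0, 0, 0);
}
```

On hardware without MTE, or on a kernel that does not accept both bits together, the call is expected to fail, so callers should check the return value and fall back to a single mode.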

This is done by introducing a notion of a preferred TCF mode, which is
controlled on a per-CPU basis by a sysfs node. The existing SYNC and
ASYNC TCF settings are repurposed as bitfields that specify a set of
possible modes. If the preferred TCF mode for a particular CPU is in
the user-provided mode set, then that mode is used when running on that
CPU. This will always be the case for mode sets containing more than one
mode, because the kernel currently supports only two tag checking modes,
although future kernels may support more. Otherwise, one of the modes in
the task's mode set is selected in a currently unspecified manner.
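The selection rule described above can be sketched as a small pure function. This is an illustration only, not the code from the patches: the bit values and the name resolve_tcf are hypothetical, and since the fallback order is explicitly unspecified, the sketch arbitrarily tries ASYNC before SYNC.

```c
#include <assert.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define TCF_SYNC  (1u << 0)
#define TCF_ASYNC (1u << 1)

/*
 * task_modes:    set of modes the task accepts (zero, one or both bits).
 * cpu_preferred: the single mode preferred by the current CPU.
 * Returns the mode to use on this CPU, or 0 if the task's set is empty.
 */
unsigned int resolve_tcf(unsigned int task_modes, unsigned int cpu_preferred)
{
	/* A CPU preference names exactly one mode. */
	assert(cpu_preferred == TCF_SYNC || cpu_preferred == TCF_ASYNC);

	/* The preferred mode wins whenever the task allows it. */
	if (task_modes & cpu_preferred)
		return cpu_preferred;

	/* Otherwise pick from the task's set; this order is an assumption. */
	if (task_modes & TCF_ASYNC)
		return TCF_ASYNC;
	if (task_modes & TCF_SYNC)
		return TCF_SYNC;

	return 0;
}
```

With only two modes, a task that sets both bits always gets the CPU's preferred mode, while a task that sets a single bit keeps that mode regardless of the CPU's preference.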

v8:
- split into multiple patches
- remove MTE_CTRL_TCF_NONE
- improve documentation
- disable preemption and add comment to mte_update_sctlr_user
- bring back PR_MTE_TCF_SHIFT for source compatibility
- address formatting nit

v7:
- switch to new API proposed on list

v6:
- switch to strings in sysfs nodes instead of TCF values

v5:
- updated documentation
- address some nits in mte.c

v4:
- switch to new mte_ctrl field
- make register_mte_upgrade_async_sysctl return an int
- change the sysctl to take 0 or 1 instead of raw TCF values
- "same as" -> "similar to"

v3:
- drop the device tree support
- add documentation
- add static_assert to ensure no overlap with real HW bits
- move per-CPU variable initialization to mte.c
- use smp_call_function_single instead of stop_machine

v2:
- make it an opt-in behavior
- change the format of the device tree node
- also allow controlling the feature via sysfs

Peter Collingbourne (4):
  arm64: mte: rename gcr_user_excl to mte_ctrl
  arm64: mte: change ASYNC and SYNC TCF settings into bitfields
  arm64: mte: introduce a per-CPU tag checking mode preference
  Documentation: document the preferred tag checking mode feature

 .../arm64/memory-tagging-extension.rst        |  48 +++++-
 arch/arm64/include/asm/mte.h                  |   4 +
 arch/arm64/include/asm/processor.h            |   8 +-
 arch/arm64/kernel/asm-offsets.c               |   2 +-
 arch/arm64/kernel/entry.S                     |   4 +-
 arch/arm64/kernel/mte.c                       | 159 ++++++++++++------
 include/uapi/linux/prctl.h                    |  11 +-
 7 files changed, 171 insertions(+), 65 deletions(-)

Comments

Catalin Marinas July 1, 2021, 5:10 p.m. UTC | #1
On Wed, Jun 30, 2021 at 04:15:05PM -0700, Peter Collingbourne wrote:
> On some CPUs the performance of MTE in synchronous mode is similar
> to that of asynchronous mode. This makes it worthwhile to enable
> synchronous mode on those CPUs when asynchronous mode is requested,
> in order to gain the error detection benefits of synchronous mode
> without the performance downsides. Therefore, make it possible for
> user programs to opt into upgrading to synchronous mode on those CPUs.
> 
> This is done by introducing a notion of a preferred TCF mode, which is
> controlled on a per-CPU basis by a sysfs node. The existing SYNC and
> ASYNC TCF settings are repurposed as bitfields that specify a set of
> possible modes. If the preferred TCF mode for a particular CPU is in
> the user-provided mode set (this will always be the case for mode sets
> containing more than one mode because the kernel only supports two tag
> checking modes, but future kernels may support more modes) then that
> mode is used when running on that CPU, otherwise one of the modes in
> the task's mode set will be selected in a currently unspecified manner.

The series looks good to me but please post it again after -rc1 if it
doesn't apply cleanly.

Thanks.

Peter Collingbourne July 2, 2021, 7:42 p.m. UTC | #2
On Thu, Jul 1, 2021 at 10:10 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Wed, Jun 30, 2021 at 04:15:05PM -0700, Peter Collingbourne wrote:
> > On some CPUs the performance of MTE in synchronous mode is similar
> > to that of asynchronous mode. This makes it worthwhile to enable
> > synchronous mode on those CPUs when asynchronous mode is requested,
> > in order to gain the error detection benefits of synchronous mode
> > without the performance downsides. Therefore, make it possible for
> > user programs to opt into upgrading to synchronous mode on those CPUs.
> >
> > This is done by introducing a notion of a preferred TCF mode, which is
> > controlled on a per-CPU basis by a sysfs node. The existing SYNC and
> > ASYNC TCF settings are repurposed as bitfields that specify a set of
> > possible modes. If the preferred TCF mode for a particular CPU is in
> > the user-provided mode set (this will always be the case for mode sets
> > containing more than one mode because the kernel only supports two tag
> > checking modes, but future kernels may support more modes) then that
> > mode is used when running on that CPU, otherwise one of the modes in
> > the task's mode set will be selected in a currently unspecified manner.
>
> The series looks good to me but please post it again after -rc1 if it
> doesn't apply cleanly.

Thanks. I tried applying this series to linux-next and it applied
cleanly, so it seems likely that it will apply cleanly to rc1. I will
let you know if that is not the case though.

I received feedback elsewhere that we should be adding documentation
under Documentation/ABI for the new sysfs node. Also while developing
my GCR on task switch patch I noticed a small cleanup that could be
made to patch 2 of this series. I went ahead and made both of those
improvements in v9.

Peter