[0/5] arm64: Use memory copy instructions in kernel routines

Message ID 20240930161051.3777828-1-kristina.martsenko@arm.com

Message

Kristina Martšenko Sept. 30, 2024, 4:10 p.m. UTC
Hi,

Here is a small series to make memcpy() and related functions use the
memory copy/set instructions (Armv8.8 FEAT_MOPS).

The kernel uses several library routines for copying or initializing
memory, for example copy_to_user() and memset(). These routines have
been optimized to make their load/store sequence perform well across a
range of CPUs. However the chosen sequence can't be the fastest possible
for every CPU microarchitecture nor for heterogeneous systems, and needs
to be rewritten periodically as hardware changes.

Future arm64 CPUs will have CPY* and SET* instructions that can copy (or
set) a block of memory of arbitrary size and alignment. The kernel
currently supports using these instructions in userspace applications
[1] and KVM guests [2] but does not use them within the kernel.
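As a rough illustration of what such a copy looks like (a minimal sketch, not code from these patches): a CPY operation is issued as three consecutive instructions (prologue, main, epilogue) that share the destination, source and size registers. The helper name mops_copy_sketch() is hypothetical, the choice of the forward-only CPYF* variants is an assumption of this sketch, and the C library fallback is only there so the example builds on toolchains that do not target FEAT_MOPS (detected here via the ACLE macro __ARM_FEATURE_MOPS).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from this series): copy n bytes using the
 * FEAT_MOPS forward-copy instructions when the compiler targets them,
 * otherwise fall back to the C library memcpy().
 */
static void *mops_copy_sketch(void *dst, const void *src, size_t n)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MOPS)
    void *d = dst;

    /*
     * Prologue, main and epilogue are issued back to back and in order.
     * All three registers are updated by the instructions, and the
     * condition flags are modified, hence the "+r" operands and "cc".
     */
    __asm__ volatile(
        "cpyfp [%0]!, [%1]!, %2!\n\t"
        "cpyfm [%0]!, [%1]!, %2!\n\t"
        "cpyfe [%0]!, [%1]!, %2!"
        : "+r" (d), "+r" (src), "+r" (n)
        :
        : "memory", "cc");
    return dst;
#else
    /* Assumption: no MOPS support at compile time, use the generic routine. */
    return memcpy(dst, src, n);
#endif
}
```

There is no size or alignment special-casing in the MOPS path; that work is left to the CPU.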

CPUs are expected to implement the CPY/SET instructions close to
optimally for their microarchitecture (i.e. close to the performance of
the best load/store sequence performing a generic copy/set). Using the
instructions in the kernel's copy/set routines would therefore make the
routines optimal and avoid the need to rewrite them. It could also lead
to a performance improvement for some CPUs and systems.

This series makes the memcpy(), memmove() and memset() routines use the
CPY/SET instructions, as well as copy_page() and clear_page(). I'll send
a follow-up series to update the usercopy routines (copy_to_user() etc)
"soon", as it needs a bit more work.
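For the set side, a similar hedged sketch (again not taken from the patches): the SET* instructions follow the same prologue/main/epilogue pattern, taking a destination, a size and a data register. The names mops_set_sketch(), clear_page_sketch() and SKETCH_PAGE_SIZE are hypothetical, and the 4 KiB page size is an assumption made only for the example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumption: 4 KiB pages, for illustration only */

/*
 * Hypothetical helper (not from this series): fill n bytes with the low
 * byte of c using the FEAT_MOPS set instructions when available at
 * compile time, otherwise fall back to the C library memset().
 */
static void *mops_set_sketch(void *dst, int c, size_t n)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MOPS)
    void *d = dst;
    unsigned long val = (unsigned char)c;  /* only the low byte is stored */

    /* Destination and size are updated in place; the flags are modified. */
    __asm__ volatile(
        "setp [%0]!, %1!, %2\n\t"
        "setm [%0]!, %1!, %2\n\t"
        "sete [%0]!, %1!, %2"
        : "+r" (d), "+r" (n)
        : "r" (val)
        : "memory", "cc");
    return dst;
#else
    /* Assumption: no MOPS support at compile time, use the generic routine. */
    return memset(dst, c, n);
#endif
}

/* Hypothetical page-clearing wrapper built on the helper above. */
static void clear_page_sketch(void *page)
{
    mops_set_sketch(page, 0, SKETCH_PAGE_SIZE);
}
```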

The patches were tested on an Arm FVP.

Thanks,
Kristina

[1] https://lore.kernel.org/lkml/20230509142235.3284028-1-kristina.martsenko@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/20230922112508.1774352-1-kristina.martsenko@arm.com/

Kristina Martsenko (5):
  arm64: probes: Disable kprobes/uprobes on MOPS instructions
  arm64: mops: Handle MOPS exceptions from EL1
  arm64: mops: Document booting requirement for HCR_EL2.MCE2
  arm64: lib: Use MOPS for memcpy() routines
  arm64: lib: Use MOPS for copy_page() and clear_page()

 Documentation/arch/arm64/booting.rst    |  3 +++
 arch/arm64/Kconfig                      |  3 +++
 arch/arm64/include/asm/debug-monitors.h |  1 +
 arch/arm64/include/asm/exception.h      |  1 +
 arch/arm64/include/asm/insn.h           |  1 +
 arch/arm64/kernel/debug-monitors.c      |  5 +++++
 arch/arm64/kernel/entry-common.c        | 12 ++++++++++++
 arch/arm64/kernel/probes/decode-insn.c  |  7 +++++--
 arch/arm64/kernel/traps.c               |  7 +++++++
 arch/arm64/lib/clear_page.S             | 13 +++++++++++++
 arch/arm64/lib/copy_page.S              | 13 +++++++++++++
 arch/arm64/lib/memcpy.S                 | 19 ++++++++++++++++++-
 arch/arm64/lib/memset.S                 | 20 +++++++++++++++++++-
 13 files changed, 101 insertions(+), 4 deletions(-)


base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc

Comments

Catalin Marinas Oct. 2, 2024, 4:20 p.m. UTC | #1
On Mon, Sep 30, 2024 at 05:10:46PM +0100, Kristina Martsenko wrote:
> This series makes the memcpy(), memmove() and memset() routines use the
> CPY/SET instructions, as well as copy_page() and clear_page().

Another function we should optimise is mte_zero_clear_page_tags() using
SETG*.

Kristina Martšenko Oct. 3, 2024, 4:49 p.m. UTC | #2
On 02/10/2024 17:20, Catalin Marinas wrote:
> On Mon, Sep 30, 2024 at 05:10:46PM +0100, Kristina Martsenko wrote:
>> This series makes the memcpy(), memmove() and memset() routines use the
>> CPY/SET instructions, as well as copy_page() and clear_page().
> 
> Another function we should optimise is mte_zero_clear_page_tags() using
> SETG*.

I can send that as a separate series. What about mte_set_mem_tag_range()?

Thanks,
Kristina

Catalin Marinas Oct. 4, 2024, 10:10 a.m. UTC | #3
On Thu, Oct 03, 2024 at 05:49:57PM +0100, Kristina Martsenko wrote:
> On 02/10/2024 17:20, Catalin Marinas wrote:
> > On Mon, Sep 30, 2024 at 05:10:46PM +0100, Kristina Martsenko wrote:
> >> This series makes the memcpy(), memmove() and memset() routines use the
> >> CPY/SET instructions, as well as copy_page() and clear_page().
> > 
> > Another function we should optimise is mte_zero_clear_page_tags() using
> > SETG*.
> 
> I can send that as a separate series.

Sounds good, thanks.

> What about mte_set_mem_tag_range()?

Ah, that as well but I guess only the init == true path since I don't
think we have any FEAT_MOPS instruction for only setting the tags.

Catalin Marinas Oct. 17, 2024, 6 p.m. UTC | #4
On Mon, 30 Sep 2024 17:10:46 +0100, Kristina Martsenko wrote:
> Here is a small series to make memcpy() and related functions use the
> memory copy/set instructions (Armv8.8 FEAT_MOPS).
> 
> The kernel uses several library routines for copying or initializing
> memory, for example copy_to_user() and memset(). These routines have
> been optimized to make their load/store sequence perform well across a
> range of CPUs. However the chosen sequence can't be the fastest possible
> for every CPU microarchitecture nor for heterogeneous systems, and needs
> to be rewritten periodically as hardware changes.
> 
> [...]

Applied to arm64 (for-next/mops), thanks!

I left the documentation patch as is, but it may be helpful to add a
dedicated mops.txt document with some explanations around the hypervisor
requirements. Not urgent though.

[1/5] arm64: probes: Disable kprobes/uprobes on MOPS instructions
      https://git.kernel.org/arm64/c/c56c599d9002
[2/5] arm64: mops: Handle MOPS exceptions from EL1
      https://git.kernel.org/arm64/c/13840229d6bd
[3/5] arm64: mops: Document booting requirement for HCR_EL2.MCE2
      https://git.kernel.org/arm64/c/b616058c6613
[4/5] arm64: lib: Use MOPS for memcpy() routines
      https://git.kernel.org/arm64/c/836ed3c4e473
[5/5] arm64: lib: Use MOPS for copy_page() and clear_page()
      https://git.kernel.org/arm64/c/ce6b5ff5f16d