[v5,0/8] KVM: arm64: permit MAP_SHARED mappings with MTE enabled

Message ID 20221104011041.290951-1-pcc@google.com (mailing list archive)

Message

Peter Collingbourne Nov. 4, 2022, 1:10 a.m. UTC
Hi,

This patch series allows VMMs to use shared mappings in MTE enabled
guests. The first five patches were taken from Catalin's tree [1] which
addressed some review feedback from when they were previously sent out
as v3 of this series. The first patch from Catalin's tree makes room
for an additional PG_arch_3 flag by making the newer PG_arch_* flags
arch-dependent. The next four patches are based on a series that
Catalin sent out prior to v3, whose cover letter [2] I quote from below:

> This series aims to fix the races between initialising the tags on a
> page and setting the PG_mte_tagged flag. Currently the flag is set
> either before or after that tag initialisation and this can lead to CoW
> copying stale tags. The first patch moves the flag setting after the
> tags have been initialised, solving the CoW issue. However, concurrent
> mprotect() on a shared mapping may (very rarely) lead to valid tags
> being zeroed.
>
> The second skips the sanitise_mte_tags() call in kvm_set_spte_gfn(),
> deferring it to user_mem_abort(). The outcome is that now
> sanitise_mte_tags() can be simplified to skip the pfn_to_online_page()
> check and only rely on VM_MTE_ALLOWED vma flag that can be checked in
> user_mem_abort().
>
> The third and fourth patches use PG_arch_3 as a lock for page tagging,
> based on Peter Collingbourne's idea of a two-bit lock.
>
> I think the first patch can be queued but the rest needs some in depth
> review and test. With this series (if correct) we could allow MAP_SHARED
> on KVM guest memory but this is to be discussed separately as there are
> some KVM ABI implications.
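
To make the two-bit lock idea above concrete, here is a minimal userspace
sketch using C11 atomics. It is not the code from the series: the
sketch_* names, the flag values and the 256-byte tag array are all
assumptions made for illustration. One bit claims the right to initialise
the tags, the other publishes that the tags are valid, so a second caller
can never zero tags that the first caller has already written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two page flag bits (not the kernel's values). */
enum {
	SKETCH_PG_MTE_TAGGED = 1u << 0, /* tags are initialised and valid */
	SKETCH_PG_MTE_LOCK   = 1u << 1, /* a caller has claimed tag initialisation */
};

/* Hypothetical model of a page: flag word plus one tag byte per granule. */
struct sketch_page {
	atomic_uint flags;
	unsigned char tags[256];
};

static void sketch_page_init(struct sketch_page *page)
{
	atomic_init(&page->flags, 0);
	memset(page->tags, 0, sizeof(page->tags));
}

/*
 * Returns true if the caller won the lock bit and must now initialise the
 * tags and publish them. Returns false if someone else owns initialisation;
 * in that case this spins until the tags have been published. A real
 * implementation would need a proper wait primitive instead of a bare spin.
 */
static bool sketch_try_page_mte_tagging(struct sketch_page *page)
{
	unsigned int old = atomic_fetch_or_explicit(&page->flags,
						    SKETCH_PG_MTE_LOCK,
						    memory_order_acquire);
	if (!(old & SKETCH_PG_MTE_LOCK))
		return true;

	while (!(atomic_load_explicit(&page->flags, memory_order_acquire) &
		 SKETCH_PG_MTE_TAGGED))
		; /* wait for the lock owner to publish the tags */
	return false;
}

/* Publish the tags: the release ordering makes the tag writes visible first. */
static void sketch_set_page_mte_tagged(struct sketch_page *page)
{
	atomic_fetch_or_explicit(&page->flags, SKETCH_PG_MTE_TAGGED,
				 memory_order_release);
}

static bool sketch_page_mte_tagged(struct sketch_page *page)
{
	return (atomic_load_explicit(&page->flags, memory_order_acquire) &
		SKETCH_PG_MTE_TAGGED) != 0;
}

/* Initialise the tags exactly once, however many callers race here. */
static void sketch_init_tags_once(struct sketch_page *page, unsigned char tag)
{
	if (sketch_try_page_mte_tagging(page)) {
		memset(page->tags, tag, sizeof(page->tags));
		sketch_set_page_mte_tagged(page);
	}
	/* Either way, the tags are valid by the time we get here. */
	assert(sketch_page_mte_tagged(page));
}
```

The block has no main(); a caller would run sketch_page_init() on a page
and then call sketch_init_tags_once() from any number of threads. Only the
first caller writes the tags, and later callers return with the existing
tags intact.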

In this v5 I rebased Catalin's tree onto -next again. Please double check
my rebase, which resolved the conflict with commit a8e5e5146ad0 ("arm64:
mte: Avoid setting PG_mte_tagged if no tags cleared or restored").

I now have Reviewed-by for all patches except for the last one, which adds
the documentation. Thanks for the reviews so far, and please take a look!

I've tested it on QEMU as well as on MTE-capable hardware by booting a
Linux kernel and userspace under a crosvm with MTE support [3].
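
For readers less familiar with the VMM side, the kind of guest-memory
setup this series is meant to permit could look roughly like the sketch
below. It is an illustration under assumptions, not code from crosvm or
QEMU: the sketch_* helpers and the "guest-ram" name are hypothetical, and
the KVM registration step is described only in a comment so that the
example runs without /dev/kvm.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous, RAM-backed file of the given
 * size to hold guest memory. Returns a file descriptor, or -1 on failure.
 */
static int sketch_create_guest_memfd(size_t size)
{
	int fd;

	assert(size > 0);

	fd = memfd_create("guest-ram", MFD_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: map the file with MAP_SHARED, so that every mapping
 * of the same fd (in this or another process) sees the same pages.
 * Returns NULL on failure.
 *
 * A real VMM would then pass the returned address as userspace_addr in a
 * struct kvm_userspace_memory_region to the KVM_SET_USER_MEMORY_REGION
 * ioctl. Whether KVM accepts such a mapping for a guest with
 * KVM_CAP_ARM_MTE enabled is what this series changes.
 */
static void *sketch_map_guest_ram(int fd, size_t size)
{
	void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	return addr == MAP_FAILED ? NULL : addr;
}
```

Mapping the same fd twice and writing through one mapping makes the data
visible through the other, which is the property a VMM relies on when it
shares guest memory with another process.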

[1] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-pg-flags
[2] https://lore.kernel.org/all/20220705142619.4135905-1-catalin.marinas@arm.com/
[3] https://chromium-review.googlesource.com/c/crosvm/crosvm/+/3892141

Catalin Marinas (4):
  mm: Do not enable PG_arch_2 for all 64-bit architectures
  arm64: mte: Fix/clarify the PG_mte_tagged semantics
  KVM: arm64: Simplify the sanitise_mte_tags() logic
  arm64: mte: Lock a page for MTE tag initialisation

Peter Collingbourne (4):
  mm: Add PG_arch_3 page flag
  KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled
  KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled
  Documentation: document the ABI changes for KVM_CAP_ARM_MTE

 Documentation/virt/kvm/api.rst    |  5 ++-
 arch/arm64/Kconfig                |  1 +
 arch/arm64/include/asm/mte.h      | 65 ++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/pgtable.h  |  4 +-
 arch/arm64/kernel/cpufeature.c    |  4 +-
 arch/arm64/kernel/elfcore.c       |  2 +-
 arch/arm64/kernel/hibernate.c     |  2 +-
 arch/arm64/kernel/mte.c           | 21 +++++-----
 arch/arm64/kvm/guest.c            | 18 +++++----
 arch/arm64/kvm/mmu.c              | 55 +++++++++++---------------
 arch/arm64/mm/copypage.c          |  7 +++-
 arch/arm64/mm/fault.c             |  4 +-
 arch/arm64/mm/mteswap.c           | 16 +++-----
 fs/proc/page.c                    |  3 +-
 include/linux/kernel-page-flags.h |  1 +
 include/linux/page-flags.h        |  3 +-
 include/trace/events/mmflags.h    |  9 +++--
 mm/Kconfig                        |  8 ++++
 mm/huge_memory.c                  |  3 +-
 19 files changed, 152 insertions(+), 79 deletions(-)

Comments

Marc Zyngier Nov. 4, 2022, 4:23 p.m. UTC | #1
On Fri, 04 Nov 2022 01:10:33 +0000,
Peter Collingbourne <pcc@google.com> wrote:
> 
> [...]
> 
> In this v5 I rebased Catalin's tree onto -next again. Please double check

Please don't use -next as a base. In-flight series should be based
on a *stable* tag, either 6.0 or one of the early -RCs. If there is a
known conflict with -next, do mention it in the cover letter and
provide a resolution.

> my rebase, which resolved the conflict with commit a8e5e5146ad0 ("arm64:
> mte: Avoid setting PG_mte_tagged if no tags cleared or restored").

This commit seems part of -rc1, so I guess the patches directly apply
on top of that tag?

> I now have Reviewed-by for all patches except for the last one, which adds
> the documentation. Thanks for the reviews so far, and please take a look!

I'd really like the MM folks (list now cc'd) to look at the relevant
patches (1 and 5) and ack them before I take this.

Thanks,

	M.

Peter Collingbourne Nov. 4, 2022, 5:42 p.m. UTC | #2
On Fri, Nov 4, 2022 at 9:23 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 04 Nov 2022 01:10:33 +0000,
> Peter Collingbourne <pcc@google.com> wrote:
> >
> > [...]
> >
> > In this v5 I rebased Catalin's tree onto -next again. Please double check
>
> Please don't use -next as a base. In-flight series should be based
> on a *stable* tag, either 6.0 or one of the early -RCs. If there is a
> known conflict with -next, do mention it in the cover letter and
> provide a resolution.

Okay, I will keep that in mind.

> > my rebase, which resolved the conflict with commit a8e5e5146ad0 ("arm64:
> > mte: Avoid setting PG_mte_tagged if no tags cleared or restored").
>
> This commit seems part of -rc1, so I guess the patches directly apply
> on top of that tag?

Yes, sorry, this also applies cleanly to -rc1.

> > I now have Reviewed-by for all patches except for the last one, which adds
> > the documentation. Thanks for the reviews so far, and please take a look!
>
> I'd really like the MM folks (list now cc'd) to look at the relevant
> patches (1 and 5) and ack them before I take this.

Okay, here are the lore links for the convenience of the MM folks:
https://lore.kernel.org/all/20221104011041.290951-2-pcc@google.com/
https://lore.kernel.org/all/20221104011041.290951-6-pcc@google.com/

Peter

Marc Zyngier Nov. 24, 2022, 10:39 a.m. UTC | #3
On Fri, 04 Nov 2022 17:42:27 +0000,
Peter Collingbourne <pcc@google.com> wrote:
> 
> On Fri, Nov 4, 2022 at 9:23 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Fri, 04 Nov 2022 01:10:33 +0000,
> > Peter Collingbourne <pcc@google.com> wrote:
> > >
> > > [...]
> > >
> > > In this v5 I rebased Catalin's tree onto -next again. Please double check
> >
> > Please don't use -next as a base. In-flight series should be based
> > on a *stable* tag, either 6.0 or one of the early -RCs. If there is a
> > known conflict with -next, do mention it in the cover letter and
> > provide a resolution.
> 
> Okay, I will keep that in mind.
> 
> > > my rebase, which resolved the conflict with commit a8e5e5146ad0 ("arm64:
> > > mte: Avoid setting PG_mte_tagged if no tags cleared or restored").
> >
> > This commit seems part of -rc1, so I guess the patches directly apply
> > on top of that tag?
> 
> Yes, sorry, this also applies cleanly to -rc1.
> 
> > > I now have Reviewed-by for all patches except for the last one, which adds
> > > the documentation. Thanks for the reviews so far, and please take a look!
> >
> > I'd really like the MM folks (list now cc'd) to look at the relevant
> > patches (1 and 5) and ack them before I take this.
> 
> Okay, here are the lore links for the convenience of the MM folks:
> https://lore.kernel.org/all/20221104011041.290951-2-pcc@google.com/
> https://lore.kernel.org/all/20221104011041.290951-6-pcc@google.com/

I have not seen any Ack from the MM folks so far, and we're really
running out of runway for this merge window.

Short of someone shouting now, I'll take the series into the kvmarm
tree early next week.

Thanks,

Marc Zyngier Nov. 29, 2022, 9:33 a.m. UTC | #4
On Thu, 3 Nov 2022 18:10:33 -0700, Peter Collingbourne wrote:
> This patch series allows VMMs to use shared mappings in MTE enabled
> guests. The first five patches were taken from Catalin's tree [1] which
> addressed some review feedback from when they were previously sent out
> as v3 of this series. The first patch from Catalin's tree makes room
> for an additional PG_arch_3 flag by making the newer PG_arch_* flags
> arch-dependent. The next four patches are based on a series that
> Catalin sent out prior to v3, whose cover letter [2] I quote from below:
> 
> [...]

No feedback has been received, so this code is obviously perfect.

Applied to next, thanks!

[1/8] mm: Do not enable PG_arch_2 for all 64-bit architectures
      commit: b0284cd29a957e62d60c2886fd663be93c56f9c0
[2/8] arm64: mte: Fix/clarify the PG_mte_tagged semantics
      commit: e059853d14ca4ed0f6a190d7109487918a22a976
[3/8] KVM: arm64: Simplify the sanitise_mte_tags() logic
      commit: 2dbf12ae132cc78048615cfa19c9be64baaf0ced
[4/8] mm: Add PG_arch_3 page flag
      commit: ef6458b1b6ca3fdb991ce4182e981a88d4c58c0f
[5/8] arm64: mte: Lock a page for MTE tag initialisation
      commit: d77e59a8fccde7fb5dd8c57594ed147b4291c970
[6/8] KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled
      commit: d89585fbb30869011b326ef26c94c3137d228df9
[7/8] KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled
      commit: c911f0d4687947915f04024aa01803247fcf7f1a
[8/8] Documentation: document the ABI changes for KVM_CAP_ARM_MTE
      commit: a4baf8d2639f24d4d31983ff67c01878e7a5393f

Cheers,

	M.