Message ID | 20250404154352.23078-1-kalyazin@amazon.com (mailing list archive) |
---|---|
Series | KVM: guest_memfd: support for uffd minor |
On Fri, Apr 04, 2025 at 03:43:46PM +0000, Nikita Kalyazin wrote:
> This series is built on top of the Fuad's v7 "mapping guest_memfd backed
> memory at the host" [1].

Hm if this is based on an unmerged series this seems quite speculative and
should maybe be an RFC? I mean that series at least still seems quite under
discussion/experiencing issues?

Maybe worth RFC'ing until that one settles down first to avoid complexity
in review/application to tree?

Thanks!

>
> With James's KVM userfault [2], it is possible to handle stage-2 faults
> in guest_memfd in userspace.  However, KVM itself also triggers faults
> in guest_memfd in some cases, for example: PV interfaces like kvmclock,
> PV EOI and page table walking code when fetching the MMIO instruction on
> x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
> that KVM would be accessing those pages via userspace page tables.  In
> order for such faults to be handled in userspace, guest_memfd needs to
> support userfaultfd.
>
> Changes since v2 [4]:
>  - James: Fix sgp type when calling shmem_get_folio_gfp
>  - James: Improved vm_ops->fault() error handling
>  - James: Add and make use of the can_userfault() VMA operation
>  - James: Add UFFD_FEATURE_MINOR_GUEST_MEMFD feature flag
>  - James: Fix typos and add more checks in the test
>
> Nikita
>
> [1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
> [2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
> [3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3
> [4] https://lore.kernel.org/kvm/20250402160721.97596-1-kalyazin@amazon.com/T/
>
> Nikita Kalyazin (6):
>   mm: userfaultfd: generic continue for non hugetlbfs
>   mm: provide can_userfault vma operation
>   mm: userfaultfd: use can_userfault vma operation
>   KVM: guest_memfd: add support for userfaultfd minor
>   mm: userfaultfd: add UFFD_FEATURE_MINOR_GUEST_MEMFD
>   KVM: selftests: test userfaultfd minor for guest_memfd
>
>  fs/userfaultfd.c                              |  3 +-
>  include/linux/mm.h                            |  5 +
>  include/linux/mm_types.h                      |  4 +
>  include/linux/userfaultfd_k.h                 | 10 +-
>  include/uapi/linux/userfaultfd.h              |  8 +-
>  mm/hugetlb.c                                  |  9 +-
>  mm/shmem.c                                    | 17 +++-
>  mm/userfaultfd.c                              | 47 ++++++---
>  .../testing/selftests/kvm/guest_memfd_test.c  | 99 +++++++++++++++++++
>  virt/kvm/guest_memfd.c                        | 10 ++
>  10 files changed, 188 insertions(+), 24 deletions(-)
>
>
> base-commit: 3cc51efc17a2c41a480eed36b31c1773936717e0
> --
> 2.47.1
>
On 04/04/2025 17:33, Lorenzo Stoakes wrote:
> On Fri, Apr 04, 2025 at 03:43:46PM +0000, Nikita Kalyazin wrote:
>> This series is built on top of the Fuad's v7 "mapping guest_memfd backed
>> memory at the host" [1].
>
> Hm if this is based on an unmerged series this seems quite speculative and
> should maybe be an RFC? I mean that series at least still seems quite under
> discussion/experiencing issues?
>
> Maybe worth RFC'ing until that one settles down first to avoid complexity
> in review/application to tree?

Hi,

I dropped the RFC tag because I saw similar examples before, but I'm happy
to bring it back next time if the dependency is not merged until then.

>
> Thanks!

Thanks!

>
>>
>> With James's KVM userfault [2], it is possible to handle stage-2 faults
>> in guest_memfd in userspace.  However, KVM itself also triggers faults
>> in guest_memfd in some cases, for example: PV interfaces like kvmclock,
>> PV EOI and page table walking code when fetching the MMIO instruction on
>> x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
>> that KVM would be accessing those pages via userspace page tables.  In
>> order for such faults to be handled in userspace, guest_memfd needs to
>> support userfaultfd.
On Fri, Apr 04, 2025 at 05:56:58PM +0100, Nikita Kalyazin wrote:
>
>
> On 04/04/2025 17:33, Lorenzo Stoakes wrote:
> > On Fri, Apr 04, 2025 at 03:43:46PM +0000, Nikita Kalyazin wrote:
> > > This series is built on top of the Fuad's v7 "mapping guest_memfd backed
> > > memory at the host" [1].
> >
> > Hm if this is based on an unmerged series this seems quite speculative and
> > should maybe be an RFC? I mean that series at least still seems quite under
> > discussion/experiencing issues?
> >
> > Maybe worth RFC'ing until that one settles down first to avoid complexity
> > in review/application to tree?
>
> Hi,
>
> I dropped the RFC tag because I saw similar examples before, but I'm happy
> to bring it back next time if the dependency is not merged until then.

Yeah really sorry to be a pain haha, I realise this particular situation is a
bit unclear, but I think just for the sake of getting our ducks in a row and
ensuring things are settled on the baseline (and it's sort of a fairly big
baseline), it'd be best to bring it back!

> >
> > Thanks!
>
> Thanks!

Cheers!

> >
> > >
> > > With James's KVM userfault [2], it is possible to handle stage-2 faults
> > > in guest_memfd in userspace.  However, KVM itself also triggers faults
> > > in guest_memfd in some cases, for example: PV interfaces like kvmclock,
> > > PV EOI and page table walking code when fetching the MMIO instruction on
> > > x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
> > > that KVM would be accessing those pages via userspace page tables.  In
> > > order for such faults to be handled in userspace, guest_memfd needs to
> > > support userfaultfd.
+To authors of v7 series referenced in [1]

* Nikita Kalyazin <kalyazin@amazon.com> [250404 11:44]:
> This series is built on top of the Fuad's v7 "mapping guest_memfd backed
> memory at the host" [1].

I didn't see their addresses in the to/cc, so I added them to my response as
I reference the v7 patch set below.

>
> With James's KVM userfault [2], it is possible to handle stage-2 faults
> in guest_memfd in userspace.  However, KVM itself also triggers faults
> in guest_memfd in some cases, for example: PV interfaces like kvmclock,
> PV EOI and page table walking code when fetching the MMIO instruction on
> x86.  It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
> that KVM would be accessing those pages via userspace page tables.

Thanks for being open about the technical call, but it would be better to
capture the reasons and not the call date.  I explain why in the linking
section as well.

> In
> order for such faults to be handled in userspace, guest_memfd needs to
> support userfaultfd.
>
> Changes since v2 [4]:
>  - James: Fix sgp type when calling shmem_get_folio_gfp
>  - James: Improved vm_ops->fault() error handling
>  - James: Add and make use of the can_userfault() VMA operation
>  - James: Add UFFD_FEATURE_MINOR_GUEST_MEMFD feature flag
>  - James: Fix typos and add more checks in the test
>
> Nikita

Please slow down...

This patch is at v3, the v7 patch that you are building off has lockdep
issues [1] reported by one of the authors, and (sorry for sounding harsh
about the v7 of that patch) the cover letter reads a bit more like an RFC
than a set ready to go into linux-mm.

Maybe the lockdep issue is just a patch ordering thing or removed in a
later patch set, but that's not mentioned in the discovery email?

What exactly is the goal here and the path forward for the rest of us
trying to build on this once it's in mm-new/mm-unstable?

Note that mm-unstable is shared with a lot of other people through
linux-next, and we are really trying to stop breaking stuff on them.

Obviously v7 cannot go in until it works with lockdep - otherwise none of
us can use lockdep which is not okay.

Also, I am concerned about the amount of testing in the v7 and v3 patch
sets that did not bring up a lockdep issue.

>
> [1] https://lore.kernel.org/kvm/20250318161823.4005529-1-tabba@google.com/T/
> [2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/T/
> [3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAosPOk/edit?tab=t.0#heading=h.w1126rgli5e3

If there is anything we need to know about the decisions in the call and
that document, can you please pull it into this change log?

I don't think anyone can ensure Google will not rename docs to some other
office theme tomorrow - as they famously ditch basically every name and
application.

Also, most of the community does not want to go to a 17 page (and growing)
spreadsheet to hunt down the facts when there is an acceptable and ideal
place to document them in git.  It's another barrier to entry on reviewing
your code as well.

But please, don't take this suggestion as carte blanche for copying a
conversation from the doc, just give us the technical reasons for your
decisions as briefly as possible.

> [4] https://lore.kernel.org/kvm/20250402160721.97596-1-kalyazin@amazon.com/T/

[1]. https://lore.kernel.org/all/diqz1puanquh.fsf@ackerleytng-ctop.c.googlers.com/

Thanks,
Liam