Message ID | 20210316153303.3216674-4-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | switch to unsafe_follow_pfn | expand |
On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > follow_pte()")) have lost their callsites of follow_pfn(). All the > other ones have been switched over to unsafe_follow_pfn because they > cannot be fixed without breaking userspace api. > > Argueably the vfio code is still racy, but that's kinda a bigger > picture. But since it does leak the pte beyond where it drops the pt > lock, without anything else like an mmu notifier guaranteeing > coherence, the problem is at least clearly visible in the vfio code. > So good enough with me. > > I've decided to keep the explanation that after dropping the pt lock > you must have an mmu notifier if you keep using the pte somehow by > adjusting it and moving it into the kerneldoc for the new follow_pte() > function. > > Cc: 3pvd@google.com > Cc: Jann Horn <jannh@google.com> > Cc: Paolo Bonzini <pbonzini@redhat.com> > Cc: Jason Gunthorpe <jgg@nvidia.com> > Cc: Cornelia Huck <cohuck@redhat.com> > Cc: Peter Xu <peterx@redhat.com> > Cc: Alex Williamson <alex.williamson@redhat.com> > Cc: linux-mm@kvack.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-samsung-soc@vger.kernel.org > Cc: linux-media@vger.kernel.org > Cc: kvm@vger.kernel.org > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > --- > include/linux/mm.h | 2 -- > mm/memory.c | 26 +++++--------------------- > mm/nommu.c | 13 +------------ > 3 files changed, 6 insertions(+), 35 deletions(-) I think this is the right thing to do. Alex is working on fixing VFIO and while kvm is still racy using follow pte, I think they are working on it too? Jason
On 24/03/21 13:52, Jason Gunthorpe wrote: > I think this is the right thing to do. > > Alex is working on fixing VFIO and while kvm is still racy using > follow pte, I think they are working on it too? Yeah, or at least we have a plan. Paolo
On Wed, Mar 24, 2021 at 09:52:11AM -0300, Jason Gunthorpe wrote: > On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > > follow_pte()")) have lost their callsites of follow_pfn(). All the > > other ones have been switched over to unsafe_follow_pfn because they > > cannot be fixed without breaking userspace api. > > > > Argueably the vfio code is still racy, but that's kinda a bigger > > picture. But since it does leak the pte beyond where it drops the pt > > lock, without anything else like an mmu notifier guaranteeing > > coherence, the problem is at least clearly visible in the vfio code. > > So good enough with me. > > > > I've decided to keep the explanation that after dropping the pt lock > > you must have an mmu notifier if you keep using the pte somehow by > > adjusting it and moving it into the kerneldoc for the new follow_pte() > > function. > > > > Cc: 3pvd@google.com > > Cc: Jann Horn <jannh@google.com> > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > Cc: Jason Gunthorpe <jgg@nvidia.com> > > Cc: Cornelia Huck <cohuck@redhat.com> > > Cc: Peter Xu <peterx@redhat.com> > > Cc: Alex Williamson <alex.williamson@redhat.com> > > Cc: linux-mm@kvack.org > > Cc: linux-arm-kernel@lists.infradead.org > > Cc: linux-samsung-soc@vger.kernel.org > > Cc: linux-media@vger.kernel.org > > Cc: kvm@vger.kernel.org > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > --- > > include/linux/mm.h | 2 -- > > mm/memory.c | 26 +++++--------------------- > > mm/nommu.c | 13 +------------ > > 3 files changed, 6 insertions(+), 35 deletions(-) > > I think this is the right thing to do. Was just about to smash this into the topic branch for testing in linux-next. Feel like an ack on the series, or at least the two mm patches? -Daniel > > Alex is working on fixing VFIO and while kvm is still racy using > follow pte, I think they are working on it too? > > Jason
On Wed, Mar 24, 2021 at 08:17:10PM +0100, Daniel Vetter wrote: > On Wed, Mar 24, 2021 at 09:52:11AM -0300, Jason Gunthorpe wrote: > > On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > > > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > > > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > > > follow_pte()")) have lost their callsites of follow_pfn(). All the > > > other ones have been switched over to unsafe_follow_pfn because they > > > cannot be fixed without breaking userspace api. > > > > > > Argueably the vfio code is still racy, but that's kinda a bigger > > > picture. But since it does leak the pte beyond where it drops the pt > > > lock, without anything else like an mmu notifier guaranteeing > > > coherence, the problem is at least clearly visible in the vfio code. > > > So good enough with me. > > > > > > I've decided to keep the explanation that after dropping the pt lock > > > you must have an mmu notifier if you keep using the pte somehow by > > > adjusting it and moving it into the kerneldoc for the new follow_pte() > > > function. > > > > > > Cc: 3pvd@google.com > > > Cc: Jann Horn <jannh@google.com> > > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > > Cc: Jason Gunthorpe <jgg@nvidia.com> > > > Cc: Cornelia Huck <cohuck@redhat.com> > > > Cc: Peter Xu <peterx@redhat.com> > > > Cc: Alex Williamson <alex.williamson@redhat.com> > > > Cc: linux-mm@kvack.org > > > Cc: linux-arm-kernel@lists.infradead.org > > > Cc: linux-samsung-soc@vger.kernel.org > > > Cc: linux-media@vger.kernel.org > > > Cc: kvm@vger.kernel.org > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > > --- > > > include/linux/mm.h | 2 -- > > > mm/memory.c | 26 +++++--------------------- > > > mm/nommu.c | 13 +------------ > > > 3 files changed, 6 insertions(+), 35 deletions(-) > > > > I think this is the right thing to do. > > Was just about to smash this into the topic branch for testing in > linux-next. Feel like an ack on the series, or at least the two mm > patches? Pushed them to my topic branch for a bit of testing in linux-next, hopefully goes all fine for a pull for 5.13. -Daniel
On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > follow_pte()")) have lost their callsites of follow_pfn(). All the > other ones have been switched over to unsafe_follow_pfn because they > cannot be fixed without breaking userspace api. > > Argueably the vfio code is still racy, but that's kinda a bigger vfio and kvm > picture. But since it does leak the pte beyond where it drops the pt > lock, without anything else like an mmu notifier guaranteeing > coherence, the problem is at least clearly visible in the vfio code. > So good enough with me. > > I've decided to keep the explanation that after dropping the pt lock > you must have an mmu notifier if you keep using the pte somehow by > adjusting it and moving it into the kerneldoc for the new follow_pte() > function. > > Cc: 3pvd@google.com > Cc: Jann Horn <jannh@google.com> > Cc: Paolo Bonzini <pbonzini@redhat.com> > Cc: Jason Gunthorpe <jgg@nvidia.com> > Cc: Cornelia Huck <cohuck@redhat.com> > Cc: Peter Xu <peterx@redhat.com> > Cc: Alex Williamson <alex.williamson@redhat.com> > Cc: linux-mm@kvack.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-samsung-soc@vger.kernel.org > Cc: linux-media@vger.kernel.org > Cc: kvm@vger.kernel.org > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > --- > include/linux/mm.h | 2 -- > mm/memory.c | 26 +++++--------------------- > mm/nommu.c | 13 +------------ > 3 files changed, 6 insertions(+), 35 deletions(-) Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Jason
On Mon, Mar 29, 2021 at 10:31:01AM -0300, Jason Gunthorpe wrote: > On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > > follow_pte()")) have lost their callsites of follow_pfn(). All the > > other ones have been switched over to unsafe_follow_pfn because they > > cannot be fixed without breaking userspace api. > > > > Argueably the vfio code is still racy, but that's kinda a bigger > > vfio and kvm Hm I thought kvm is non-racy due to the mmu notifier catch races? > > > picture. But since it does leak the pte beyond where it drops the pt > > lock, without anything else like an mmu notifier guaranteeing > > coherence, the problem is at least clearly visible in the vfio code. > > So good enough with me. > > > > I've decided to keep the explanation that after dropping the pt lock > > you must have an mmu notifier if you keep using the pte somehow by > > adjusting it and moving it into the kerneldoc for the new follow_pte() > > function. > > > > Cc: 3pvd@google.com > > Cc: Jann Horn <jannh@google.com> > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > Cc: Jason Gunthorpe <jgg@nvidia.com> > > Cc: Cornelia Huck <cohuck@redhat.com> > > Cc: Peter Xu <peterx@redhat.com> > > Cc: Alex Williamson <alex.williamson@redhat.com> > > Cc: linux-mm@kvack.org > > Cc: linux-arm-kernel@lists.infradead.org > > Cc: linux-samsung-soc@vger.kernel.org > > Cc: linux-media@vger.kernel.org > > Cc: kvm@vger.kernel.org > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > --- > > include/linux/mm.h | 2 -- > > mm/memory.c | 26 +++++--------------------- > > mm/nommu.c | 13 +------------ > > 3 files changed, 6 insertions(+), 35 deletions(-) > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Thanks for your r-b tags, I'll add them. -Daniel > > Jason
On 08/04/21 12:05, Daniel Vetter wrote: > On Mon, Mar 29, 2021 at 10:31:01AM -0300, Jason Gunthorpe wrote: >> On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: >>> Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after >>> follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use >>> follow_pte()")) have lost their callsites of follow_pfn(). All the >>> other ones have been switched over to unsafe_follow_pfn because they >>> cannot be fixed without breaking userspace api. >>> >>> Argueably the vfio code is still racy, but that's kinda a bigger >> >> vfio and kvm > > Hm I thought kvm is non-racy due to the mmu notifier catch races? No, but the plan is indeed to have some struct for each page that uses follow_pfn and update it from the MMU notifiers. Paolo >> >>> picture. But since it does leak the pte beyond where it drops the pt >>> lock, without anything else like an mmu notifier guaranteeing >>> coherence, the problem is at least clearly visible in the vfio code. >>> So good enough with me. >>> >>> I've decided to keep the explanation that after dropping the pt lock >>> you must have an mmu notifier if you keep using the pte somehow by >>> adjusting it and moving it into the kerneldoc for the new follow_pte() >>> function. >>> >>> Cc: 3pvd@google.com >>> Cc: Jann Horn <jannh@google.com> >>> Cc: Paolo Bonzini <pbonzini@redhat.com> >>> Cc: Jason Gunthorpe <jgg@nvidia.com> >>> Cc: Cornelia Huck <cohuck@redhat.com> >>> Cc: Peter Xu <peterx@redhat.com> >>> Cc: Alex Williamson <alex.williamson@redhat.com> >>> Cc: linux-mm@kvack.org >>> Cc: linux-arm-kernel@lists.infradead.org >>> Cc: linux-samsung-soc@vger.kernel.org >>> Cc: linux-media@vger.kernel.org >>> Cc: kvm@vger.kernel.org >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> >>> --- >>> include/linux/mm.h | 2 -- >>> mm/memory.c | 26 +++++--------------------- >>> mm/nommu.c | 13 +------------ >>> 3 files changed, 6 insertions(+), 35 deletions(-) >> >> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > Thanks for your r-b tags, I'll add them. > -Daniel > >> >> Jason >
On Thu, Apr 08, 2021 at 01:40:59PM +0200, Paolo Bonzini wrote: > On 08/04/21 12:05, Daniel Vetter wrote: > > On Mon, Mar 29, 2021 at 10:31:01AM -0300, Jason Gunthorpe wrote: > > > On Tue, Mar 16, 2021 at 04:33:03PM +0100, Daniel Vetter wrote: > > > > Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after > > > > follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use > > > > follow_pte()")) have lost their callsites of follow_pfn(). All the > > > > other ones have been switched over to unsafe_follow_pfn because they > > > > cannot be fixed without breaking userspace api. > > > > > > > > Argueably the vfio code is still racy, but that's kinda a bigger > > > > > > vfio and kvm > > > > Hm I thought kvm is non-racy due to the mmu notifier catch races? > > No, but the plan is indeed to have some struct for each page that uses > follow_pfn and update it from the MMU notifiers. Thanks for clarifying, I've fixed the commit message to mention both vfio and kvm as Jason suggested. I didn't know that the follow_pte usage in kvm still has some gaps wrt what's invalidated with mmu notifiers. Thanks, Daniel > > Paolo > > > > > > > > picture. But since it does leak the pte beyond where it drops the pt > > > > lock, without anything else like an mmu notifier guaranteeing > > > > coherence, the problem is at least clearly visible in the vfio code. > > > > So good enough with me. > > > > > > > > I've decided to keep the explanation that after dropping the pt lock > > > > you must have an mmu notifier if you keep using the pte somehow by > > > > adjusting it and moving it into the kerneldoc for the new follow_pte() > > > > function. > > > > > > > > Cc: 3pvd@google.com > > > > Cc: Jann Horn <jannh@google.com> > > > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > > > Cc: Jason Gunthorpe <jgg@nvidia.com> > > > > Cc: Cornelia Huck <cohuck@redhat.com> > > > > Cc: Peter Xu <peterx@redhat.com> > > > > Cc: Alex Williamson <alex.williamson@redhat.com> > > > > Cc: linux-mm@kvack.org > > > > Cc: linux-arm-kernel@lists.infradead.org > > > > Cc: linux-samsung-soc@vger.kernel.org > > > > Cc: linux-media@vger.kernel.org > > > > Cc: kvm@vger.kernel.org > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > > > --- > > > > include/linux/mm.h | 2 -- > > > > mm/memory.c | 26 +++++--------------------- > > > > mm/nommu.c | 13 +------------ > > > > 3 files changed, 6 insertions(+), 35 deletions(-) > > > > > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> > > > > Thanks for your r-b tags, I'll add them. > > -Daniel > > > > > > > > Jason > > > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff --git a/include/linux/mm.h b/include/linux/mm.h index caec8b25d66f..304588e2f829 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1693,8 +1693,6 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, spinlock_t **ptlp); -int follow_pfn(struct vm_area_struct *vma, unsigned long address, - unsigned long *pfn); int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn); int follow_phys(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index e8a145505b69..317e653c8aeb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4724,7 +4724,10 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address, * should be taken for read. * * KVM uses this function. While it is arguably less bad than ``follow_pfn``, - * it is not a good general-purpose API. + * it is not a good general-purpose API: If callers use the pte after they've + * unlocked @ptlp they must ensure coherency with pte updates by using a + * &mmu_notifier to follow updates. Any caller not following these requirements + * must use unsafe_follow_pfn() instead. * * Return: zero on success, -ve otherwise. */ @@ -4735,25 +4738,7 @@ int follow_pte(struct mm_struct *mm, unsigned long address, } EXPORT_SYMBOL_GPL(follow_pte); -/** - * follow_pfn - look up PFN at a user virtual address - * @vma: memory mapping - * @address: user virtual address - * @pfn: location to store found PFN - * - * Only IO mappings and raw PFN mappings are allowed. Note that callers must - * ensure coherency with pte updates by using a &mmu_notifier to follow updates. - * If this is not feasible, or the access to the @pfn is only very short term, - * use follow_pte_pmd() instead and hold the pagetable lock for the duration of - * the access instead. Any caller not following these requirements must use - * unsafe_follow_pfn() instead. - * - * This function does not allow the caller to read the permissions - * of the PTE. Do not use it. - * - * Return: zero and the pfn at @pfn on success, -ve otherwise. - */ -int follow_pfn(struct vm_area_struct *vma, unsigned long address, +static int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn) { int ret = -EINVAL; @@ -4770,7 +4755,6 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address, pte_unmap_unlock(ptep, ptl); return 0; } -EXPORT_SYMBOL_GPL(follow_pfn); /** * unsafe_follow_pfn - look up PFN at a user virtual address diff --git a/mm/nommu.c b/mm/nommu.c index 1dc983f50e2c..cee29d0791b3 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -111,17 +111,7 @@ unsigned int kobjsize(const void *objp) return page_size(page); } -/** - * follow_pfn - look up PFN at a user virtual address - * @vma: memory mapping - * @address: user virtual address - * @pfn: location to store found PFN - * - * Only IO mappings and raw PFN mappings are allowed. - * - * Returns zero and the pfn at @pfn on success, -ve otherwise. - */ -int follow_pfn(struct vm_area_struct *vma, unsigned long address, +static int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn) { if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) @@ -130,7 +120,6 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address, *pfn = address >> PAGE_SHIFT; return 0; } -EXPORT_SYMBOL_GPL(follow_pfn); /** * unsafe_follow_pfn - look up PFN at a user virtual address
Both kvm (in bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn")) and vfio (in 07956b6269d3 ("vfio/type1: Use follow_pte()")) have lost their callsites of follow_pfn(). All the other ones have been switched over to unsafe_follow_pfn because they cannot be fixed without breaking userspace api. Argueably the vfio code is still racy, but that's kinda a bigger picture. But since it does leak the pte beyond where it drops the pt lock, without anything else like an mmu notifier guaranteeing coherence, the problem is at least clearly visible in the vfio code. So good enough with me. I've decided to keep the explanation that after dropping the pt lock you must have an mmu notifier if you keep using the pte somehow by adjusting it and moving it into the kerneldoc for the new follow_pte() function. Cc: 3pvd@google.com Cc: Jann Horn <jannh@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: kvm@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> --- include/linux/mm.h | 2 -- mm/memory.c | 26 +++++--------------------- mm/nommu.c | 13 +------------ 3 files changed, 6 insertions(+), 35 deletions(-)