Message ID | 1590526802-3008-1-git-send-email-jrdr.linux@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/radeon: Convert get_user_pages() --> pin_user_pages() | expand |
On Tue, May 26, 2020 at 03:57:45PM -0700, John Hubbard wrote: > On 2020-05-26 14:00, Souptick Joarder wrote: > > This code was using get_user_pages(), in a "Case 2" scenario > > (DMA/RDMA), using the categorization from [1]. That means that it's > > time to convert the get_user_pages() + release_pages() calls to > > pin_user_pages() + unpin_user_pages() calls. > > > > There is some helpful background in [2]: basically, this is a small > > part of fixing a long-standing disconnect between pinning pages, and > > file systems' use of those pages. > > > > [1] Documentation/core-api/pin_user_pages.rst > > > > [2] "Explicit pinning of user-space pages": > > https://lwn.net/Articles/807108/ I don't think this is a case 2 here, nor is it any of the others. Feels like not covered at all by the doc. radeon has a mmu notifier (might be a bit broken, but hey whatever there's other drivers which have the same concept, but less broken). So when you do an munmap, radeon will release the page refcount. Which case it that? Note that currently only amdgpu doesn't work like that for gpu dma directly to userspace ranges, it uses hmm and afaiui doens't hold a full page pin refcount. Cheers, Daniel > > > > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> > > Cc: John Hubbard <jhubbard@nvidia.com> > > > > Hi, > > > > I'm compile tested this, but unable to run-time test, so any testing > > help is much appriciated. > > --- > > drivers/gpu/drm/radeon/radeon_ttm.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > > index 5d50c9e..e927de2 100644 > > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > > @@ -506,7 +506,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > > uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; > > struct page **pages = ttm->pages + pinned; > > - r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > > + r = pin_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > > pages, NULL); > > if (r < 0) > > goto release_pages; > > @@ -535,7 +535,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > > kfree(ttm->sg); > > release_pages: > > - release_pages(ttm->pages, pinned); > > + unpin_user_pages(ttm->pages, pinned); > > return r; > > } > > @@ -562,7 +562,7 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm) > > set_page_dirty(page); > > > Maybe we also need a preceding patch, to fix the above? It should be > set_page_dirty_lock(), rather than set_page_dirty(), unless I'm overlooking > something (which is very possible!). > > Either way, from a tunnel vision perspective of changing gup to pup, this > looks good to me, so > > Acked-by: John Hubbard <jhubbard@nvidia.com> > > > thanks, > -- > John Hubbard > NVIDIA > > > mark_page_accessed(page); > > - put_page(page); > > + unpin_user_page(page); > > } > > sg_free_table(ttm->sg); > > >
On Wed, May 27, 2020 at 10:48:52AM +0200, Daniel Vetter wrote: > On Tue, May 26, 2020 at 03:57:45PM -0700, John Hubbard wrote: > > On 2020-05-26 14:00, Souptick Joarder wrote: > > > This code was using get_user_pages(), in a "Case 2" scenario > > > (DMA/RDMA), using the categorization from [1]. That means that it's > > > time to convert the get_user_pages() + release_pages() calls to > > > pin_user_pages() + unpin_user_pages() calls. > > > > > > There is some helpful background in [2]: basically, this is a small > > > part of fixing a long-standing disconnect between pinning pages, and > > > file systems' use of those pages. > > > > > > [1] Documentation/core-api/pin_user_pages.rst > > > > > > [2] "Explicit pinning of user-space pages": > > > https://lwn.net/Articles/807108/ > > I don't think this is a case 2 here, nor is it any of the others. Feels > like not covered at all by the doc. > > radeon has a mmu notifier (might be a bit broken, but hey whatever there's > other drivers which have the same concept, but less broken). So when you > do an munmap, radeon will release the page refcount. I forgot to add: It's also not case 3, since there's no hw page fault support. It's all faked in software, and explicitly synchronizes against pending io (or preempts it, that depends a bit upon the jobs running). > Which case it that? > > Note that currently only amdgpu doesn't work like that for gpu dma > directly to userspace ranges, it uses hmm and afaiui doens't hold a full > page pin refcount. > > Cheers, Daniel > > > > > > > > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> > > > Cc: John Hubbard <jhubbard@nvidia.com> > > > > > > Hi, > > > > > > I'm compile tested this, but unable to run-time test, so any testing > > > help is much appriciated. > > > --- > > > drivers/gpu/drm/radeon/radeon_ttm.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > > > index 5d50c9e..e927de2 100644 > > > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > > > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > > > @@ -506,7 +506,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > > > uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; > > > struct page **pages = ttm->pages + pinned; > > > - r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > > > + r = pin_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > > > pages, NULL); > > > if (r < 0) > > > goto release_pages; > > > @@ -535,7 +535,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > > > kfree(ttm->sg); > > > release_pages: > > > - release_pages(ttm->pages, pinned); > > > + unpin_user_pages(ttm->pages, pinned); > > > return r; > > > } > > > @@ -562,7 +562,7 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm) > > > set_page_dirty(page); > > > > > > Maybe we also need a preceding patch, to fix the above? It should be > > set_page_dirty_lock(), rather than set_page_dirty(), unless I'm overlooking > > something (which is very possible!). > > > > Either way, from a tunnel vision perspective of changing gup to pup, this > > looks good to me, so > > > > Acked-by: John Hubbard <jhubbard@nvidia.com> > > > > > > thanks, > > -- > > John Hubbard > > NVIDIA > > > > > mark_page_accessed(page); > > > - put_page(page); > > > + unpin_user_page(page); > > > } > > > sg_free_table(ttm->sg); > > > > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On 2020-05-27 01:51, Daniel Vetter wrote: > On Wed, May 27, 2020 at 10:48:52AM +0200, Daniel Vetter wrote: >> On Tue, May 26, 2020 at 03:57:45PM -0700, John Hubbard wrote: >>> On 2020-05-26 14:00, Souptick Joarder wrote: >>>> This code was using get_user_pages(), in a "Case 2" scenario >>>> (DMA/RDMA), using the categorization from [1]. That means that it's >>>> time to convert the get_user_pages() + release_pages() calls to >>>> pin_user_pages() + unpin_user_pages() calls. >>>> >>>> There is some helpful background in [2]: basically, this is a small >>>> part of fixing a long-standing disconnect between pinning pages, and >>>> file systems' use of those pages. >>>> >>>> [1] Documentation/core-api/pin_user_pages.rst >>>> >>>> [2] "Explicit pinning of user-space pages": >>>> https://lwn.net/Articles/807108/ >> >> I don't think this is a case 2 here, nor is it any of the others. Feels >> like not covered at all by the doc. >> >> radeon has a mmu notifier (might be a bit broken, but hey whatever there's >> other drivers which have the same concept, but less broken). So when you >> do an munmap, radeon will release the page refcount. > Aha, thanks Daniel. I withdraw my misinformed ACK, then. > I forgot to add: It's also not case 3, since there's no hw page fault > support. It's all faked in software, and explicitly synchronizes against > pending io (or preempts it, that depends a bit upon the jobs running). > This is what case 3 was *intended* to cover, but it looks like case 3 needs to be written a little better. I'll attempt that, and Cc you on the actual patch to -mm. (I think we also need a case 5 for an unrelated scenario, too, so it's time.) thanks,
On Thu, May 28, 2020 at 12:37 AM John Hubbard <jhubbard@nvidia.com> wrote: > > On 2020-05-27 01:51, Daniel Vetter wrote: > > On Wed, May 27, 2020 at 10:48:52AM +0200, Daniel Vetter wrote: > >> On Tue, May 26, 2020 at 03:57:45PM -0700, John Hubbard wrote: > >>> On 2020-05-26 14:00, Souptick Joarder wrote: > >>>> This code was using get_user_pages(), in a "Case 2" scenario > >>>> (DMA/RDMA), using the categorization from [1]. That means that it's > >>>> time to convert the get_user_pages() + release_pages() calls to > >>>> pin_user_pages() + unpin_user_pages() calls. > >>>> > >>>> There is some helpful background in [2]: basically, this is a small > >>>> part of fixing a long-standing disconnect between pinning pages, and > >>>> file systems' use of those pages. > >>>> > >>>> [1] Documentation/core-api/pin_user_pages.rst > >>>> > >>>> [2] "Explicit pinning of user-space pages": > >>>> https://lwn.net/Articles/807108/ > >> > >> I don't think this is a case 2 here, nor is it any of the others. Feels > >> like not covered at all by the doc. > >> > >> radeon has a mmu notifier (might be a bit broken, but hey whatever there's > >> other drivers which have the same concept, but less broken). So when you > >> do an munmap, radeon will release the page refcount. > > > > Aha, thanks Daniel. I withdraw my misinformed ACK, then. > > > I forgot to add: It's also not case 3, since there's no hw page fault > > support. It's all faked in software, and explicitly synchronizes against > > pending io (or preempts it, that depends a bit upon the jobs running). > > > > This is what case 3 was *intended* to cover, but it looks like case 3 needs to > be written a little better. I'll attempt that, and Cc you on the actual patch > to -mm. (I think we also need a case 5 for an unrelated scenario, too, so > it's time.) There were no *case 5* in the other patch posted in -mm. Do we need to add it ? > > > thanks, > -- > John Hubbard > NVIDIA > > > >> Which case it that? > >> > >> Note that currently only amdgpu doesn't work like that for gpu dma > >> directly to userspace ranges, it uses hmm and afaiui doens't hold a full > >> page pin refcount. > >> > >> Cheers, Daniel > >> > >> > >>>> > >>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> > >>>> Cc: John Hubbard <jhubbard@nvidia.com> > >>>> > >>>> Hi, > >>>> > >>>> I'm compile tested this, but unable to run-time test, so any testing > >>>> help is much appriciated. > >>>> --- > >>>> drivers/gpu/drm/radeon/radeon_ttm.c | 6 +++--- > >>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > >>>> index 5d50c9e..e927de2 100644 > >>>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c > >>>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > >>>> @@ -506,7 +506,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > >>>> uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; > >>>> struct page **pages = ttm->pages + pinned; > >>>> - r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > >>>> + r = pin_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, > >>>> pages, NULL); > >>>> if (r < 0) > >>>> goto release_pages; > >>>> @@ -535,7 +535,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) > >>>> kfree(ttm->sg); > >>>> release_pages: > >>>> - release_pages(ttm->pages, pinned); > >>>> + unpin_user_pages(ttm->pages, pinned); > >>>> return r; > >>>> } > >>>> @@ -562,7 +562,7 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm) > >>>> set_page_dirty(page); > >>> > >>> > >>> Maybe we also need a preceding patch, to fix the above? It should be > >>> set_page_dirty_lock(), rather than set_page_dirty(), unless I'm overlooking > >>> something (which is very possible!). > >>> > >>> Either way, from a tunnel vision perspective of changing gup to pup, this > >>> looks good to me, so > >>> > >>> Acked-by: John Hubbard <jhubbard@nvidia.com> > >>> > >>> > >>> thanks, > >>> -- > >>> John Hubbard > >>> NVIDIA > >>> > >>>> mark_page_accessed(page); > >>>> - put_page(page); > >>>> + unpin_user_page(page); > >>>> } > >>>> sg_free_table(ttm->sg); > >>>> > >>> > >> > >> -- > >> Daniel Vetter > >> Software Engineer, Intel Corporation > >> http://blog.ffwll.ch > > >
On 2020-05-28 23:49, Souptick Joarder wrote: ... >> This is what case 3 was *intended* to cover, but it looks like case 3 needs to >> be written a little better. I'll attempt that, and Cc you on the actual patch >> to -mm. (I think we also need a case 5 for an unrelated scenario, too, so >> it's time.) > > There were no *case 5* in the other patch posted in -mm. Do we need to add it ? > Working on figuring that out [1], but it's not directly relevant to this thread. Maybe I shouldn't have brought it up here. :) [1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz thanks, John Hubbard NVIDIA
On Fri, May 29, 2020 at 12:58 PM John Hubbard <jhubbard@nvidia.com> wrote: > > On 2020-05-28 23:49, Souptick Joarder wrote: > ... > >> This is what case 3 was *intended* to cover, but it looks like case 3 needs to > >> be written a little better. I'll attempt that, and Cc you on the actual patch > >> to -mm. (I think we also need a case 5 for an unrelated scenario, too, so > >> it's time.) > > > > There were no *case 5* in the other patch posted in -mm. Do we need to add it ? > > > > Working on figuring that out [1], but it's not directly relevant to this thread. > Maybe I shouldn't have brought it up here. :) > > > [1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz > > thanks, > John Hubbard > NVIDIA > > > As this conversion is not relevant ( mentioned above), I have dropped this patch.
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 5d50c9e..e927de2 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -506,7 +506,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) uint64_t userptr = gtt->userptr + pinned * PAGE_SIZE; struct page **pages = ttm->pages + pinned; - r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, + r = pin_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0, pages, NULL); if (r < 0) goto release_pages; @@ -535,7 +535,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm) kfree(ttm->sg); release_pages: - release_pages(ttm->pages, pinned); + unpin_user_pages(ttm->pages, pinned); return r; } @@ -562,7 +562,7 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm) set_page_dirty(page); mark_page_accessed(page); - put_page(page); + unpin_user_page(page); } sg_free_table(ttm->sg);
This code was using get_user_pages(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages() + release_pages() calls to pin_user_pages() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: John Hubbard <jhubbard@nvidia.com> Hi, I'm compile tested this, but unable to run-time test, so any testing help is much appriciated. --- drivers/gpu/drm/radeon/radeon_ttm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)