Message ID | 20210813063150.2938-9-alex.sierra@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Support DEVICE_GENERIC memory in migrate_vma_* | expand |
On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote: > Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. > Device generic type memory case is now able to free its pages properly. How is this going to work for the two existing MEMORY_DEVICE_GENERIC that now change behavior? And which don't have a ->page_free callback at all? > > Signed-off-by: Alex Sierra <alex.sierra@amd.com> > --- > mm/memremap.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/mm/memremap.c b/mm/memremap.c > index 5aa8163fd948..5773e15b6ac9 100644 > --- a/mm/memremap.c > +++ b/mm/memremap.c > @@ -459,7 +459,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn, > EXPORT_SYMBOL_GPL(get_dev_pagemap); > > #ifdef CONFIG_DEV_PAGEMAP_OPS > -static void free_device_private_page(struct page *page) > +static void free_device_page(struct page *page) > { > > __ClearPageWaiters(page); > @@ -498,7 +498,8 @@ void free_zone_device_page(struct page *page) > wake_up_var(&page->_refcount); > return; > case MEMORY_DEVICE_PRIVATE: > - free_device_private_page(page); > + case MEMORY_DEVICE_GENERIC: > + free_device_page(page); > return; > default: > return; > -- > 2.32.0 ---end quoted text---
Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig: > On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote: >> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. >> Device generic type memory case is now able to free its pages properly. > How is this going to work for the two existing MEMORY_DEVICE_GENERIC > that now change behavior? And which don't have a ->page_free callback > at all? That's a good catch. Existing drivers shouldn't need a page_free callback if they didn't have one before. That means we need to add a NULL-pointer check in free_device_page. Regards, Felix > >> Signed-off-by: Alex Sierra <alex.sierra@amd.com> >> --- >> mm/memremap.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/mm/memremap.c b/mm/memremap.c >> index 5aa8163fd948..5773e15b6ac9 100644 >> --- a/mm/memremap.c >> +++ b/mm/memremap.c >> @@ -459,7 +459,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn, >> EXPORT_SYMBOL_GPL(get_dev_pagemap); >> >> #ifdef CONFIG_DEV_PAGEMAP_OPS >> -static void free_device_private_page(struct page *page) >> +static void free_device_page(struct page *page) >> { >> >> __ClearPageWaiters(page); >> @@ -498,7 +498,8 @@ void free_zone_device_page(struct page *page) >> wake_up_var(&page->_refcount); >> return; >> case MEMORY_DEVICE_PRIVATE: >> - free_device_private_page(page); >> + case MEMORY_DEVICE_GENERIC: >> + free_device_page(page); >> return; >> default: >> return; >> -- >> 2.32.0 > ---end quoted text--- >
On Mon, Aug 16, 2021 at 03:00:49PM -0400, Felix Kuehling wrote: > > Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig: > > On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote: > >> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. > >> Device generic type memory case is now able to free its pages properly. > > How is this going to work for the two existing MEMORY_DEVICE_GENERIC > > that now change behavior? And which don't have a ->page_free callback > > at all? > > That's a good catch. Existing drivers shouldn't need a page_free > callback if they didn't have one before. That means we need to add a > NULL-pointer check in free_device_page. Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/ ->mapping = NULL). In many ways this seems like you want to bring back the DEVICE_PUBLIC pgmap type that was removed a while ago due to the lack of users instead of overloading the generic type.
Am 2021-08-17 um 1:50 a.m. schrieb Christoph Hellwig: > On Mon, Aug 16, 2021 at 03:00:49PM -0400, Felix Kuehling wrote: >> Am 2021-08-15 um 11:40 a.m. schrieb Christoph Hellwig: >>> On Fri, Aug 13, 2021 at 01:31:45AM -0500, Alex Sierra wrote: >>>> Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. >>>> Device generic type memory case is now able to free its pages properly. >>> How is this going to work for the two existing MEMORY_DEVICE_GENERIC >>> that now change behavior? And which don't have a ->page_free callback >>> at all? >> That's a good catch. Existing drivers shouldn't need a page_free >> callback if they didn't have one before. That means we need to add a >> NULL-pointer check in free_device_page. > Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/ > ->mapping = NULL). > > In many ways this seems like you want to bring back the DEVICE_PUBLIC > pgmap type that was removed a while ago due to the lack of users > instead of overloading the generic type. I think so. I'm not clear about how DEVICE_PUBLIC differed from what DEVICE_GENERIC is today. As I understand it, DEVICE_PUBLIC was removed because it was unused and also known to be broken in some ways. DEVICE_GENERIC seemed close enough to what we need, other than not being supported in the migration helpers. Would you see benefit in re-introducing DEVICE_PUBLIC as a distinct memory type from DEVICE_GENERIC? What would be the benefits of making that distinction? Thanks, Felix
On Tue, Aug 17, 2021 at 11:44:54AM -0400, Felix Kuehling wrote: > >> That's a good catch. Existing drivers shouldn't need a page_free > >> callback if they didn't have one before. That means we need to add a > >> NULL-pointer check in free_device_page. > > Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/ > > ->mapping = NULL). > > > > In many ways this seems like you want to bring back the DEVICE_PUBLIC > > pgmap type that was removed a while ago due to the lack of users > > instead of overloading the generic type. > > I think so. I'm not clear about how DEVICE_PUBLIC differed from what > DEVICE_GENERIC is today. As I understand it, DEVICE_PUBLIC was removed > because it was unused and also known to be broken in some ways. > DEVICE_GENERIC seemed close enough to what we need, other than not being > supported in the migration helpers. > > Would you see benefit in re-introducing DEVICE_PUBLIC as a distinct > memory type from DEVICE_GENERIC? What would be the benefits of making > that distinction? The old DEVICE_PUBLIC mostly different in that it allowed the page to be returned from vm_normal_page, which I think was horribly buggy. But the point is not to bring back these old semantics. The idea is to be able to differeniate between your new coherent on-device memory and the existing DEVICE_GENERIC. That is call the code in free_devmap_managed_page that is currently only used for device private pages also for your new public device pages without affecting the devdax and xen use cases.
On Thu, Aug 19, 2021 at 10:05 PM Christoph Hellwig <hch@lst.de> wrote: > > On Tue, Aug 17, 2021 at 11:44:54AM -0400, Felix Kuehling wrote: > > >> That's a good catch. Existing drivers shouldn't need a page_free > > >> callback if they didn't have one before. That means we need to add a > > >> NULL-pointer check in free_device_page. > > > Also the other state clearing (__ClearPageWaiters/mem_cgroup_uncharge/ > > > ->mapping = NULL). > > > > > > In many ways this seems like you want to bring back the DEVICE_PUBLIC > > > pgmap type that was removed a while ago due to the lack of users > > > instead of overloading the generic type. > > > > I think so. I'm not clear about how DEVICE_PUBLIC differed from what > > DEVICE_GENERIC is today. As I understand it, DEVICE_PUBLIC was removed > > because it was unused and also known to be broken in some ways. > > DEVICE_GENERIC seemed close enough to what we need, other than not being > > supported in the migration helpers. > > > > Would you see benefit in re-introducing DEVICE_PUBLIC as a distinct > > memory type from DEVICE_GENERIC? What would be the benefits of making > > that distinction? > > The old DEVICE_PUBLIC mostly different in that it allowed the page > to be returned from vm_normal_page, which I think was horribly buggy. Why was that buggy ? If I were to do it now, i would return DEVICE_PUBLIC page from vm_normal_page but i would ban pinning as pinning is exceptionally wrong for GPU. If you migrate some random anonymous/file back to your GPU memory and it gets pinned there then there is no way for the GPU to migrate the page out. Quickly you will run out of physically contiguous memory and things like big graphic buffer allocation (anything that needs physically contiguous memory) will fail. It is less of an issue on some hardware that rely less and less on physically contiguous memory but i do not think it is completely gone from all hw. > But the point is not to bring back these old semantics. The idea > is to be able to differeniate between your new coherent on-device > memory and the existing DEVICE_GENERIC. That is call the > code in free_devmap_managed_page that is currently only used > for device private pages also for your new public device pages without > affecting the devdax and xen use cases. Yes, I would rather bring back DEVICE_PUBLIC then try to use DEVICE_GENERIC, the GENERIC change was done for users that closely matched DAX semantics and it is not the case here, at least not from my point of view. Jerome
diff --git a/mm/memremap.c b/mm/memremap.c index 5aa8163fd948..5773e15b6ac9 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -459,7 +459,7 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn, EXPORT_SYMBOL_GPL(get_dev_pagemap); #ifdef CONFIG_DEV_PAGEMAP_OPS -static void free_device_private_page(struct page *page) +static void free_device_page(struct page *page) { __ClearPageWaiters(page); @@ -498,7 +498,8 @@ void free_zone_device_page(struct page *page) wake_up_var(&page->_refcount); return; case MEMORY_DEVICE_PRIVATE: - free_device_private_page(page); + case MEMORY_DEVICE_GENERIC: + free_device_page(page); return; default: return;
Add MEMORY_DEVICE_GENERIC case to free_zone_device_page callback. Device generic type memory case is now able to free its pages properly. Signed-off-by: Alex Sierra <alex.sierra@amd.com> --- mm/memremap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)