Message ID | 20250219231337.364133-1-balbirs@nvidia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/migrate_device: Do not access pgmap for non zone device pages | expand |
On 20.02.25 00:13, Balbir Singh wrote: > page_pgmap() is referenced before checking if the page is a zone > device page and this triggers the warning in page_pgmap(). Refactor > the code to use the helper function after relevant checks. > > Cc: Alistair Popple <apopple@nvidia.com> > Cc: Jason Gunthorpe <jgg@ziepe.ca> > Cc: David Hildenbrand <david@redhat.com> > Cc: Dan Williams <dan.j.williams@intel.com> > > Signed-off-by: Balbir Singh <balbirs@nvidia.com> > --- > > Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on > mm-unstable Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. > > mm/migrate_device.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/mm/migrate_device.c b/mm/migrate_device.c > index 6771893d4601..e0bf771edb6f 100644 > --- a/mm/migrate_device.c > +++ b/mm/migrate_device.c > @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > goto next; > } > page = vm_normal_page(migrate->vma, addr, pte); > - pgmap = page_pgmap(page); > if (page && !is_zone_device_page(page) && > !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) > goto next; > - else if (page && is_device_coherent_page(page) && > - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || > - pgmap->owner != migrate->pgmap_owner)) > - goto next; > + else if (page && is_device_coherent_page(page)) { > + pgmap = page_pgmap(page); > + > + if (!(migrate->flags & > + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || > + pgmap->owner != migrate->pgmap_owner) > + goto next; > + } Coding style wants you to use if () { } else if { } Not if () else if { } Something simpler might be page_pgmap(page)->owner != migrate->pgmap_owner > mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; > mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; > }
On 2/20/25 22:48, David Hildenbrand wrote: > On 20.02.25 00:13, Balbir Singh wrote: >> page_pgmap() is referenced before checking if the page is a zone >> device page and this triggers the warning in page_pgmap(). Refactor >> the code to use the helper function after relevant checks. >> >> Cc: Alistair Popple <apopple@nvidia.com> >> Cc: Jason Gunthorpe <jgg@ziepe.ca> >> Cc: David Hildenbrand <david@redhat.com> >> Cc: Dan Williams <dan.j.williams@intel.com> >> >> Signed-off-by: Balbir Singh <balbirs@nvidia.com> >> --- >> >> Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on >> mm-unstable > > Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. > It triggers the VM_WARN_ON_ONCE_PAGE static inline struct dev_pagemap *page_pgmap(const struct page *page) { VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); return page_folio(page)->pgmap; } Nothing is broken, because the code below has checks for is_device_coherent_page(), but in general I think the WARN_ON is correct because it warns us against garbage and it's propagation if the correct checks are not in place. >> >> mm/migrate_device.c | 13 ++++++++----- >> 1 file changed, 8 insertions(+), 5 deletions(-) >> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c >> index 6771893d4601..e0bf771edb6f 100644 >> --- a/mm/migrate_device.c >> +++ b/mm/migrate_device.c >> @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >> goto next; >> } >> page = vm_normal_page(migrate->vma, addr, pte); >> - pgmap = page_pgmap(page); >> if (page && !is_zone_device_page(page) && >> !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) >> goto next; >> - else if (page && is_device_coherent_page(page) && >> - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >> - pgmap->owner != migrate->pgmap_owner)) >> - goto next; >> + else if (page && is_device_coherent_page(page)) { >> + pgmap = page_pgmap(page); >> + >> + if (!(migrate->flags & >> + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >> + pgmap->owner != migrate->pgmap_owner) >> + goto next; >> + } > > Coding style wants you to use > > if () { > > } else if { > > } > > Not > > if () > else if { > > } > Ack, checkpatch.pl missed it, but agreed > > Something simpler might be > > page_pgmap(page)->owner != migrate->pgmap_owner > Yep, I had that and dropped it, the four clauses made it feel that it might benefit from a split. >> mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; >> mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; >> } > > Thanks for the review! Balbir
On 20.02.25 12:58, Balbir Singh wrote: > On 2/20/25 22:48, David Hildenbrand wrote: >> On 20.02.25 00:13, Balbir Singh wrote: >>> page_pgmap() is referenced before checking if the page is a zone >>> device page and this triggers the warning in page_pgmap(). Refactor >>> the code to use the helper function after relevant checks. >>> >>> Cc: Alistair Popple <apopple@nvidia.com> >>> Cc: Jason Gunthorpe <jgg@ziepe.ca> >>> Cc: David Hildenbrand <david@redhat.com> >>> Cc: Dan Williams <dan.j.williams@intel.com> >>> >>> Signed-off-by: Balbir Singh <balbirs@nvidia.com> >>> --- >>> >>> Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on >>> mm-unstable >> >> Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. >> > > It triggers the VM_WARN_ON_ONCE_PAGE > > static inline struct dev_pagemap *page_pgmap(const struct page *page) > { > VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); > return page_folio(page)->pgmap; > } > > Nothing is broken, because the code below has checks for is_device_coherent_page(), > but in general I think the WARN_ON is correct because it warns us against garbage > and it's propagation if the correct checks are not in place. Ah! Now I read your "triggers the warning in page_pgmap()" in the description. It's usually a good idea to just include the splat you observed, if you did, and call it "triggers the VM_WARN_ON_ONCE_PAGE". The "Fixes:" should go above the "---" in that case. > > >>> >>> mm/migrate_device.c | 13 ++++++++----- >>> 1 file changed, 8 insertions(+), 5 deletions(-) >>> >>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c >>> index 6771893d4601..e0bf771edb6f 100644 >>> --- a/mm/migrate_device.c >>> +++ b/mm/migrate_device.c >>> @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >>> goto next; >>> } >>> page = vm_normal_page(migrate->vma, addr, pte); >>> - pgmap = page_pgmap(page); >>> if (page && !is_zone_device_page(page) && >>> !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) >>> goto next; >>> - else if (page && is_device_coherent_page(page) && >>> - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >>> - pgmap->owner != migrate->pgmap_owner)) >>> - goto next; >>> + else if (page && is_device_coherent_page(page)) { >>> + pgmap = page_pgmap(page); >>> + >>> + if (!(migrate->flags & >>> + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >>> + pgmap->owner != migrate->pgmap_owner) >>> + goto next; >>> + } >> >> Coding style wants you to use >> >> if () { >> >> } else if { >> >> } >> >> Not >> >> if () >> else if { >> >> } >> > > Ack, checkpatch.pl missed it, but agreed > >> >> Something simpler might be >> >> page_pgmap(page)->owner != migrate->pgmap_owner >> > > Yep, I had that and dropped it, the four clauses made it feel that it might > benefit from a split. Right, the mixture of && and || is confusing. So with the {} Acked-by: David Hildenbrand <david@redhat.com> Thanks!
On 2/20/25 23:05, David Hildenbrand wrote: > On 20.02.25 12:58, Balbir Singh wrote: >> On 2/20/25 22:48, David Hildenbrand wrote: >>> On 20.02.25 00:13, Balbir Singh wrote: >>>> page_pgmap() is referenced before checking if the page is a zone >>>> device page and this triggers the warning in page_pgmap(). Refactor >>>> the code to use the helper function after relevant checks. >>>> >>>> Cc: Alistair Popple <apopple@nvidia.com> >>>> Cc: Jason Gunthorpe <jgg@ziepe.ca> >>>> Cc: David Hildenbrand <david@redhat.com> >>>> Cc: Dan Williams <dan.j.williams@intel.com> >>>> >>>> Signed-off-by: Balbir Singh <balbirs@nvidia.com> >>>> --- >>>> >>>> Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on >>>> mm-unstable >>> >>> Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. >>> >> >> It triggers the VM_WARN_ON_ONCE_PAGE >> >> static inline struct dev_pagemap *page_pgmap(const struct page *page) >> { >> VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); >> return page_folio(page)->pgmap; >> } >> >> Nothing is broken, because the code below has checks for is_device_coherent_page(), >> but in general I think the WARN_ON is correct because it warns us against garbage >> and it's propagation if the correct checks are not in place. > > Ah! Now I read your "triggers the warning in page_pgmap()" in the description. > > It's usually a good idea to just include the splat you observed, if you did, and call it "triggers the VM_WARN_ON_ONCE_PAGE". Sure > > The "Fixes:" should go above the "---" in that case. I was not sure if the commit-ids in mm-unstable are persistent, so I put it below the --- to avoid confusion > >> >> >>>> >>>> mm/migrate_device.c | 13 ++++++++----- >>>> 1 file changed, 8 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c >>>> index 6771893d4601..e0bf771edb6f 100644 >>>> --- a/mm/migrate_device.c >>>> +++ b/mm/migrate_device.c >>>> @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, >>>> goto next; >>>> } >>>> page = vm_normal_page(migrate->vma, addr, pte); >>>> - pgmap = page_pgmap(page); >>>> if (page && !is_zone_device_page(page) && >>>> !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) >>>> goto next; >>>> - else if (page && is_device_coherent_page(page) && >>>> - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >>>> - pgmap->owner != migrate->pgmap_owner)) >>>> - goto next; >>>> + else if (page && is_device_coherent_page(page)) { >>>> + pgmap = page_pgmap(page); >>>> + >>>> + if (!(migrate->flags & >>>> + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || >>>> + pgmap->owner != migrate->pgmap_owner) >>>> + goto next; >>>> + } >>> >>> Coding style wants you to use >>> >>> if () { >>> >>> } else if { >>> >>> } >>> >>> Not >>> >>> if () >>> else if { >>> >>> } >>> >> >> Ack, checkpatch.pl missed it, but agreed >> >>> >>> Something simpler might be >>> >>> page_pgmap(page)->owner != migrate->pgmap_owner >>> >> >> Yep, I had that and dropped it, the four clauses made it feel that it might >> benefit from a split. > > Right, the mixture of && and || is confusing. > > > So with the {} > > Acked-by: David Hildenbrand <david@redhat.com> > > Thanks! > Thanks, I can respin it with the {} as needed. Balbir
On Thu, Feb 20, 2025 at 11:38:10PM +1100, Balbir Singh wrote: > On 2/20/25 23:05, David Hildenbrand wrote: > > On 20.02.25 12:58, Balbir Singh wrote: > >> On 2/20/25 22:48, David Hildenbrand wrote: > >>> On 20.02.25 00:13, Balbir Singh wrote: > >>>> page_pgmap() is referenced before checking if the page is a zone > >>>> device page and this triggers the warning in page_pgmap(). Refactor > >>>> the code to use the helper function after relevant checks. > >>>> > >>>> Cc: Alistair Popple <apopple@nvidia.com> > >>>> Cc: Jason Gunthorpe <jgg@ziepe.ca> > >>>> Cc: David Hildenbrand <david@redhat.com> > >>>> Cc: Dan Williams <dan.j.williams@intel.com> > >>>> > >>>> Signed-off-by: Balbir Singh <balbirs@nvidia.com> > >>>> --- > >>>> > >>>> Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on > >>>> mm-unstable > >>> > >>> Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. > >>> > >> > >> It triggers the VM_WARN_ON_ONCE_PAGE > >> > >> static inline struct dev_pagemap *page_pgmap(const struct page *page) > >> { > >> VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); > >> return page_folio(page)->pgmap; > >> } > >> > >> Nothing is broken, because the code below has checks for is_device_coherent_page(), > >> but in general I think the WARN_ON is correct because it warns us against garbage > >> and it's propagation if the correct checks are not in place. > > > > Ah! Now I read your "triggers the warning in page_pgmap()" in the description. > > > > It's usually a good idea to just include the splat you observed, if you did, and call it "triggers the VM_WARN_ON_ONCE_PAGE". > > Sure > > > > > The "Fixes:" should go above the "---" in that case. > > I was not sure if the commit-ids in mm-unstable are persistent, so I put it > below the --- to avoid confusion They are not, but I'm unsure what the process is here. I guess either I roll this fix into my series and repost or maybe Andrew takes it as is a separate fix on top of my series? Either way the fix looks good, so with the code style comment below fixed feel free to add: Reviewed-by: Alistair Popple <apopple@nvidia.com> > > > >> > >> > >>>> > >>>> mm/migrate_device.c | 13 ++++++++----- > >>>> 1 file changed, 8 insertions(+), 5 deletions(-) > >>>> > >>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c > >>>> index 6771893d4601..e0bf771edb6f 100644 > >>>> --- a/mm/migrate_device.c > >>>> +++ b/mm/migrate_device.c > >>>> @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > >>>> goto next; > >>>> } > >>>> page = vm_normal_page(migrate->vma, addr, pte); > >>>> - pgmap = page_pgmap(page); > >>>> if (page && !is_zone_device_page(page) && > >>>> !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) > >>>> goto next; > >>>> - else if (page && is_device_coherent_page(page) && > >>>> - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || > >>>> - pgmap->owner != migrate->pgmap_owner)) > >>>> - goto next; > >>>> + else if (page && is_device_coherent_page(page)) { > >>>> + pgmap = page_pgmap(page); > >>>> + > >>>> + if (!(migrate->flags & > >>>> + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || > >>>> + pgmap->owner != migrate->pgmap_owner) > >>>> + goto next; > >>>> + } > >>> > >>> Coding style wants you to use > >>> > >>> if () { > >>> > >>> } else if { > >>> > >>> } > >>> > >>> Not > >>> > >>> if () > >>> else if { > >>> > >>> } > >>> > >> > >> Ack, checkpatch.pl missed it, but agreed > >> > >>> > >>> Something simpler might be > >>> > >>> page_pgmap(page)->owner != migrate->pgmap_owner > >>> > >> > >> Yep, I had that and dropped it, the four clauses made it feel that it might > >> benefit from a split. > > > > Right, the mixture of && and || is confusing. > > > > > > So with the {} > > > > Acked-by: David Hildenbrand <david@redhat.com> > > > > Thanks! > > > > Thanks, I can respin it with the {} as needed. > > Balbir
On 20.02.25 23:45, Alistair Popple wrote: > On Thu, Feb 20, 2025 at 11:38:10PM +1100, Balbir Singh wrote: >> On 2/20/25 23:05, David Hildenbrand wrote: >>> On 20.02.25 12:58, Balbir Singh wrote: >>>> On 2/20/25 22:48, David Hildenbrand wrote: >>>>> On 20.02.25 00:13, Balbir Singh wrote: >>>>>> page_pgmap() is referenced before checking if the page is a zone >>>>>> device page and this triggers the warning in page_pgmap(). Refactor >>>>>> the code to use the helper function after relevant checks. >>>>>> >>>>>> Cc: Alistair Popple <apopple@nvidia.com> >>>>>> Cc: Jason Gunthorpe <jgg@ziepe.ca> >>>>>> Cc: David Hildenbrand <david@redhat.com> >>>>>> Cc: Dan Williams <dan.j.williams@intel.com> >>>>>> >>>>>> Signed-off-by: Balbir Singh <balbirs@nvidia.com> >>>>>> --- >>>>>> >>>>>> Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on >>>>>> mm-unstable >>>>> >>>>> Is there actually something broken? At least for now, reading folio->pgmap should just work, although it might be garbage. >>>>> >>>> >>>> It triggers the VM_WARN_ON_ONCE_PAGE >>>> >>>> static inline struct dev_pagemap *page_pgmap(const struct page *page) >>>> { >>>> VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page); >>>> return page_folio(page)->pgmap; >>>> } >>>> >>>> Nothing is broken, because the code below has checks for is_device_coherent_page(), >>>> but in general I think the WARN_ON is correct because it warns us against garbage >>>> and it's propagation if the correct checks are not in place. >>> >>> Ah! Now I read your "triggers the warning in page_pgmap()" in the description. >>> >>> It's usually a good idea to just include the splat you observed, if you did, and call it "triggers the VM_WARN_ON_ONCE_PAGE". >> >> Sure >> >>> >>> The "Fixes:" should go above the "---" in that case. >> >> I was not sure if the commit-ids in mm-unstable are persistent, so I put it >> below the --- to avoid confusion > > They are not, but I'm unsure what the process is here. I guess either I roll > this fix into my series and repost or maybe Andrew takes it as is a separate fix > on top of my series? Ah, I completely ignored that this is not in mm/mm-stable yet. Yes, that should best just be squashed into the original commit as long as it is not in mm/mm-stable.
diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 6771893d4601..e0bf771edb6f 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -153,14 +153,17 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, goto next; } page = vm_normal_page(migrate->vma, addr, pte); - pgmap = page_pgmap(page); if (page && !is_zone_device_page(page) && !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) goto next; - else if (page && is_device_coherent_page(page) && - (!(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) || - pgmap->owner != migrate->pgmap_owner)) - goto next; + else if (page && is_device_coherent_page(page)) { + pgmap = page_pgmap(page); + + if (!(migrate->flags & + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || + pgmap->owner != migrate->pgmap_owner) + goto next; + } mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; }
page_pgmap() is referenced before checking if the page is a zone device page and this triggers the warning in page_pgmap(). Refactor the code to use the helper function after relevant checks. Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: David Hildenbrand <david@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Balbir Singh <balbirs@nvidia.com> --- Fixes: 7f1cfd71153b ("mm: allow compound zone device pages") on mm-unstable mm/migrate_device.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)