Message ID | 359fdf0103b61014bf811d88d4ce36bc793d18f2.1530716899.git.yi.z.zhang@linux.intel.com (mailing list archive)
State      | New, archived
[ adding Jerome ]

On Wed, Jul 4, 2018 at 8:30 AM, Zhang Yi <yi.z.zhang@linux.intel.com> wrote:
> For device-specific memory space, when we move these areas of pfns to a
> memory zone, we set the page reserved flag at that time; some of these
> are reserved for device MMIO, and some are not, such as NVDIMM pmem.
>
> Now, we map these dev_dax or fs_dax pages to KVM for a DIMM/NVDIMM
> backend. Since these pages are reserved, the check in
> kvm_is_reserved_pfn() misconceives those pages as MMIO. Therefore, we
> introduce 2 page map types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX,
> to identify that these pages are from NVDIMM pmem, and let KVM treat
> them as normal pages.
>
> Without this patch, many operations will be missed due to this
> mistreatment of pmem pages. For example, a page may not have a chance
> to be unpinned for a KVM guest (in kvm_release_pfn_clean), or be marked
> as dirty/accessed (in kvm_set_pfn_dirty/accessed), etc.
>
> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> Signed-off-by: Zhang Yu <yu.c.zhang@linux.intel.com>
> ---
>  virt/kvm/kvm_main.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index afb2e6e..1365d18 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -140,10 +140,23 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>  {
>  }
>
> +static bool kvm_is_nd_pfn(kvm_pfn_t pfn)
> +{
> +        struct page *page = pfn_to_page(pfn);
> +
> +        return is_zone_device_page(page) &&
> +                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
> +                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));

Jerome, might there be any use case to pass MEMORY_DEVICE_PUBLIC
memory to a guest vm?

> +}
> +
>  bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
>  {
> -        if (pfn_valid(pfn))
> -                return PageReserved(pfn_to_page(pfn));
> +        struct page *page;
> +
> +        if (pfn_valid(pfn)) {
> +                page = pfn_to_page(pfn);
> +                return kvm_is_nd_pfn(pfn) ? false : PageReserved(page);
> +        }
>
>          return true;
>  }
> --
> 2.7.4
>
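To make the commit message's claim concrete: kvm_is_reserved_pfn() gates the page-lifetime helpers the message cites, so a pmem pfn misclassified as "reserved" never gets released or dirtied. The following is a simplified sketch of those call sites, based on the mainline kvm_main.c helpers of that era; it is context for the discussion, not part of the posted patch:

```c
#include <linux/kvm_host.h>
#include <linux/mm.h>

/*
 * Simplified sketch of the existing call sites that the reserved-pfn
 * check short-circuits.  With pmem pages misclassified as reserved,
 * neither the put_page() nor the SetPageDirty() below is ever reached.
 */
void kvm_release_pfn_clean(kvm_pfn_t pfn)
{
	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
		put_page(pfn_to_page(pfn));	/* skipped for pmem today */
}

void kvm_set_pfn_dirty(kvm_pfn_t pfn)
{
	if (!kvm_is_reserved_pfn(pfn)) {
		struct page *page = pfn_to_page(pfn);

		if (!PageReserved(page))
			SetPageDirty(page);	/* likewise never reached */
	}
}
```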
On 04/07/2018 17:30, Zhang Yi wrote:
> For device-specific memory space, when we move these areas of pfns to a
> memory zone, we set the page reserved flag at that time; some of these
> are reserved for device MMIO, and some are not, such as NVDIMM pmem.
>
> Now, we map these dev_dax or fs_dax pages to KVM for a DIMM/NVDIMM
> backend. Since these pages are reserved, the check in
> kvm_is_reserved_pfn() misconceives those pages as MMIO. Therefore, we
> introduce 2 page map types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX,
> to identify that these pages are from NVDIMM pmem, and let KVM treat
> them as normal pages.
>
> Without this patch, many operations will be missed due to this
> mistreatment of pmem pages. For example, a page may not have a chance
> to be unpinned for a KVM guest (in kvm_release_pfn_clean), or be marked
> as dirty/accessed (in kvm_set_pfn_dirty/accessed), etc.
>
> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> Signed-off-by: Zhang Yu <yu.c.zhang@linux.intel.com>
> ---
>  virt/kvm/kvm_main.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index afb2e6e..1365d18 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -140,10 +140,23 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>  {
>  }
>
> +static bool kvm_is_nd_pfn(kvm_pfn_t pfn)
> +{
> +        struct page *page = pfn_to_page(pfn);
> +
> +        return is_zone_device_page(page) &&
> +                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
> +                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));
> +}

If the mm people agree, I'd prefer something that takes a struct page *
and is exported by include/linux/mm.h. Then KVM can just do something like

        struct page *page;
        if (!pfn_valid(pfn))
                return true;

        page = pfn_to_page(pfn);
        return PageReserved(page) && !is_dax_page(page);

Thanks,

Paolo
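A minimal sketch of what such an mm.h helper could look like, built from the same pgmap-type test the patch open-codes in KVM; note that the is_dax_page() name is only Paolo's suggestion here, not an existing kernel API:

```c
/*
 * Hypothetical helper following Paolo's suggestion; it would sit in
 * include/linux/mm.h next to is_zone_device_page() and needs the
 * MEMORY_DEVICE_* types from <linux/memremap.h> (how that header
 * dependency is resolved is left open in this sketch).
 */
static inline bool is_dax_page(const struct page *page)
{
	/* Same pgmap-type test the patch open-codes in kvm_is_nd_pfn(). */
	return is_zone_device_page(page) &&
		(page->pgmap->type == MEMORY_DEVICE_FS_DAX ||
		 page->pgmap->type == MEMORY_DEVICE_DEV_DAX);
}
```

With that exported, KVM no longer needs to know anything about dev_pagemap internals and can use Paolo's two-line check verbatim.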
On 04/07/2018 16:50, Dan Williams wrote:
>> +        return is_zone_device_page(page) &&
>> +                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
>> +                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));
> Jerome, might there be any use case to pass MEMORY_DEVICE_PUBLIC
> memory to a guest vm?

An even better reason to place this in mm.h. :)  There should be a
function to tell you whether a reserved page has accessed/dirty bits
etc.; that's all that KVM needs to know.

Paolo
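One possible shape for the predicate Paolo is describing, phrased around "does this reserved page behave like normal memory" rather than around DAX specifically. Both the function name and the inclusion of MEMORY_DEVICE_PUBLIC are illustrative only; whether public HMM memory belongs in the list is exactly the open question to Jerome:

```c
/*
 * Hypothetical mm.h predicate: true for ZONE_DEVICE pages that are
 * refcounted and support dirty/accessed tracking, i.e. pages a caller
 * such as KVM should treat as ordinary memory even though
 * PageReserved() is set on them.
 */
static inline bool zone_device_page_is_normal_memory(const struct page *page)
{
	if (!is_zone_device_page(page))
		return false;

	switch (page->pgmap->type) {
	case MEMORY_DEVICE_FS_DAX:
	case MEMORY_DEVICE_DEV_DAX:
	case MEMORY_DEVICE_PUBLIC:	/* only if guests can map HMM public memory */
		return true;
	default:			/* e.g. MEMORY_DEVICE_PRIVATE */
		return false;
	}
}
```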
On 2018-07-04 23:25, Paolo Bonzini wrote:
> On 04/07/2018 17:30, Zhang Yi wrote:
>> For device-specific memory space, when we move these areas of pfns to a
>> memory zone, we set the page reserved flag at that time; some of these
>> are reserved for device MMIO, and some are not, such as NVDIMM pmem.
>>
>> Now, we map these dev_dax or fs_dax pages to KVM for a DIMM/NVDIMM
>> backend. Since these pages are reserved, the check in
>> kvm_is_reserved_pfn() misconceives those pages as MMIO. Therefore, we
>> introduce 2 page map types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX,
>> to identify that these pages are from NVDIMM pmem, and let KVM treat
>> them as normal pages.
>>
>> Without this patch, many operations will be missed due to this
>> mistreatment of pmem pages. For example, a page may not have a chance
>> to be unpinned for a KVM guest (in kvm_release_pfn_clean), or be marked
>> as dirty/accessed (in kvm_set_pfn_dirty/accessed), etc.
>>
>> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
>> Signed-off-by: Zhang Yu <yu.c.zhang@linux.intel.com>
>> ---
>>  virt/kvm/kvm_main.c | 17 +++++++++++++++--
>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index afb2e6e..1365d18 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -140,10 +140,23 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>>  {
>>  }
>>
>> +static bool kvm_is_nd_pfn(kvm_pfn_t pfn)
>> +{
>> +        struct page *page = pfn_to_page(pfn);
>> +
>> +        return is_zone_device_page(page) &&
>> +                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
>> +                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));
>> +}
> If the mm people agree, I'd prefer something that takes a struct page *
> and is exported by include/linux/mm.h. Then KVM can just do something like
>
>         struct page *page;
>         if (!pfn_valid(pfn))
>                 return true;
>
>         page = pfn_to_page(pfn);
>         return PageReserved(page) && !is_dax_page(page);
>
> Thanks,
>
> Paolo

Yeah, that could be much better. Thanks for your comments, Paolo.

Hi Kara, do you have any opinions/ideas on adding such a definition in mm?

Regards,
Yi
On Thu 05-07-18 21:19:30, Zhang,Yi wrote:
> On 2018-07-04 23:25, Paolo Bonzini wrote:
> > On 04/07/2018 17:30, Zhang Yi wrote:
> >> For device-specific memory space, when we move these areas of pfns to a
> >> memory zone, we set the page reserved flag at that time; some of these
> >> are reserved for device MMIO, and some are not, such as NVDIMM pmem.
> >>
> >> Now, we map these dev_dax or fs_dax pages to KVM for a DIMM/NVDIMM
> >> backend. Since these pages are reserved, the check in
> >> kvm_is_reserved_pfn() misconceives those pages as MMIO. Therefore, we
> >> introduce 2 page map types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX,
> >> to identify that these pages are from NVDIMM pmem, and let KVM treat
> >> them as normal pages.
> >>
> >> Without this patch, many operations will be missed due to this
> >> mistreatment of pmem pages. For example, a page may not have a chance
> >> to be unpinned for a KVM guest (in kvm_release_pfn_clean), or be marked
> >> as dirty/accessed (in kvm_set_pfn_dirty/accessed), etc.
> >>
> >> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> >> Signed-off-by: Zhang Yu <yu.c.zhang@linux.intel.com>
> >> ---
> >>  virt/kvm/kvm_main.c | 17 +++++++++++++++--
> >>  1 file changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >> index afb2e6e..1365d18 100644
> >> --- a/virt/kvm/kvm_main.c
> >> +++ b/virt/kvm/kvm_main.c
> >> @@ -140,10 +140,23 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
> >>  {
> >>  }
> >>
> >> +static bool kvm_is_nd_pfn(kvm_pfn_t pfn)
> >> +{
> >> +        struct page *page = pfn_to_page(pfn);
> >> +
> >> +        return is_zone_device_page(page) &&
> >> +                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
> >> +                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));
> >> +}
> > If the mm people agree, I'd prefer something that takes a struct page *
> > and is exported by include/linux/mm.h. Then KVM can just do something like
> >
> >         struct page *page;
> >         if (!pfn_valid(pfn))
> >                 return true;
> >
> >         page = pfn_to_page(pfn);
> >         return PageReserved(page) && !is_dax_page(page);
> >
> > Thanks,
> >
> > Paolo
> Yeah, that could be much better. Thanks for your comments, Paolo.
>
> Hi Kara, do you have any opinions/ideas on adding such a definition in mm?

What Paolo suggests sounds good to me.

Honza
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index afb2e6e..1365d18 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -140,10 +140,23 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
 {
 }
 
+static bool kvm_is_nd_pfn(kvm_pfn_t pfn)
+{
+        struct page *page = pfn_to_page(pfn);
+
+        return is_zone_device_page(page) &&
+                ((page->pgmap->type == MEMORY_DEVICE_FS_DAX) ||
+                 (page->pgmap->type == MEMORY_DEVICE_DEV_DAX));
+}
+
 bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 {
-        if (pfn_valid(pfn))
-                return PageReserved(pfn_to_page(pfn));
+        struct page *page;
+
+        if (pfn_valid(pfn)) {
+                page = pfn_to_page(pfn);
+                return kvm_is_nd_pfn(pfn) ? false : PageReserved(page);
+        }
 
         return true;
 }
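For reference, the ternary in the hunk above can be folded into a single boolean expression; a sketch of the equivalent form follows (same behavior as the posted change, and it becomes Paolo's suggested version if kvm_is_nd_pfn() is replaced by the proposed is_dax_page() mm.h helper):

```c
bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
{
	if (!pfn_valid(pfn))
		return true;

	/* Reserved, unless it is a DAX page that merely carries the flag. */
	return PageReserved(pfn_to_page(pfn)) && !kvm_is_nd_pfn(pfn);
}
```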