Message ID | 20191125030631.7716-2-bharata@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: PPC: Driver to manage pages of secure guest | expand |
On Mon, Nov 25, 2019 at 08:36:25AM +0530, Bharata B Rao wrote: > On PEF-enabled POWER platforms that support running of secure guests, > secure pages of the guest are represented by device private pages > in the host. Such pages needn't participate in KSM merging. This is > achieved by using ksm_madvise() call which need to be exported > since KVM PPC can be a kernel module. > > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> > Acked-by: Paul Mackerras <paulus@ozlabs.org> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Cc: Hugh Dickins <hughd@google.com> Just want to point out that I observe a kernel crash when KSM is dealing with device private pages. More details about the crash here: https://lore.kernel.org/linuxppc-dev/20191115141006.GA21409@in.ibm.com/ Regards, Bharata.
On Mon, 25 Nov 2019, Bharata B Rao wrote: > On PEF-enabled POWER platforms that support running of secure guests, > secure pages of the guest are represented by device private pages > in the host. Such pages needn't participate in KSM merging. This is > achieved by using ksm_madvise() call which need to be exported > since KVM PPC can be a kernel module. > > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> > Acked-by: Paul Mackerras <paulus@ozlabs.org> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Cc: Hugh Dickins <hughd@google.com> I can say Acked-by: Hugh Dickins <hughd@google.com> to this one. But not to your 2/7 which actually makes use of it: because sadly it needs down_write(&kvm->mm->mmap_sem) for the case when it switches off VM_MERGEABLE in vma->vm_flags. That's frustrating, since I think it's the only operation for which down_read() is not good enough. I have no idea how contended that mmap_sem is likely to be, nor how many to-be-secured pages that vma is likely to contain: you might find it okay simply to go with it down_write throughout, or you might want to start out with it down_read, and only restart with down_write (then perhaps downgrade_write later) when you see VM_MERGEABLE is set. The crash you got (thanks for the link): that will be because your migrate_vma_pages() had already been applied to a page that was already being shared via KSM. But if these secure pages are expected to be few and far between, maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks of some kind into mm/ksm.c, to skip over these surprising hybrids. Hugh > --- > mm/ksm.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/ksm.c b/mm/ksm.c > index dbee2eb4dd05..e45b02ad3f0b 100644 > --- a/mm/ksm.c > +++ b/mm/ksm.c > @@ -2478,6 +2478,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start, > > return 0; > } > +EXPORT_SYMBOL_GPL(ksm_madvise); > > int __ksm_enter(struct mm_struct *mm) > { > -- > 2.21.0
On Tue, Nov 26, 2019 at 07:59:49PM -0800, Hugh Dickins wrote: > On Mon, 25 Nov 2019, Bharata B Rao wrote: > > > On PEF-enabled POWER platforms that support running of secure guests, > > secure pages of the guest are represented by device private pages > > in the host. Such pages needn't participate in KSM merging. This is > > achieved by using ksm_madvise() call which need to be exported > > since KVM PPC can be a kernel module. > > > > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> > > Acked-by: Paul Mackerras <paulus@ozlabs.org> > > Cc: Andrea Arcangeli <aarcange@redhat.com> > > Cc: Hugh Dickins <hughd@google.com> > > I can say > Acked-by: Hugh Dickins <hughd@google.com> > to this one. > > But not to your 2/7 which actually makes use of it: because sadly it > needs down_write(&kvm->mm->mmap_sem) for the case when it switches off > VM_MERGEABLE in vma->vm_flags. That's frustrating, since I think it's > the only operation for which down_read() is not good enough. Oh ok! Thanks for pointing this out. > > I have no idea how contended that mmap_sem is likely to be, nor how > many to-be-secured pages that vma is likely to contain: you might find > it okay simply to go with it down_write throughout, or you might want > to start out with it down_read, and only restart with down_write (then > perhaps downgrade_write later) when you see VM_MERGEABLE is set. Using down_write throughtout is not easy as we do migrate_vma_pages() from fault path (->migrate_to_ram()) too. Here we come with down_read already held. Starting with down_read and restarting with down_write if VM_MERGEABLE is set -- this also looks a bit difficult as we will have challenges with locking order if we release mmap_sem in between and re-acquire. So I think I will start with down_write in this particular case and will downgrade_write as soon as ksm_madvise() is complete. > > The crash you got (thanks for the link): that will be because your > migrate_vma_pages() had already been applied to a page that was > already being shared via KSM. > > But if these secure pages are expected to be few and far between, > maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks > of some kind into mm/ksm.c, to skip over these surprising hybrids. I did bail out from a few routines in mm/ksm.c with is_device_private_page(page) check, but that wasn't good enough and I encountered crashes in different code paths. Guess a bit more understanding of KSM internals would be required before retrying that. However since all the pages of the guest except for a few will be turned into secure pages early during boot, it appears better if secure guests don't participate in in KSM merging at all. Regards, Bharata.
diff --git a/mm/ksm.c b/mm/ksm.c index dbee2eb4dd05..e45b02ad3f0b 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2478,6 +2478,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start, return 0; } +EXPORT_SYMBOL_GPL(ksm_madvise); int __ksm_enter(struct mm_struct *mm) {