
postcopy migration hangs while loading virtio state

Message ID 45954ce5-fae8-5709-da3c-045ff74f740c@de.ibm.com (mailing list archive)
State New, archived

Commit Message

Christian Borntraeger July 10, 2017, 8:05 a.m. UTC
On 07/04/2017 10:16 AM, Martin Schwidefsky wrote:
> On Tue, 4 Jul 2017 09:48:11 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> On 07/03/2017 09:07 PM, Dr. David Alan Gilbert wrote:
>>> * Michael S. Tsirkin (mst@redhat.com) wrote:  
>>>> On Fri, Jun 30, 2017 at 05:31:39PM +0100, Dr. David Alan Gilbert wrote:  
>>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:  
>>>>>> On 04/26/2017 01:45 PM, Christian Borntraeger wrote:
>>>>>>  
>>>>>>>> Hmm, I have a theory: if the flags field has bit 1 set, i.e. RAM_SAVE_FLAG_COMPRESS,
>>>>>>>> then try changing ram_handle_compressed to always do the memset.
>>>>>>>
>>>>>>> FWIW, changing ram_handle_compressed to always memset makes the problem go away.  
>>>>>>
>>>>>> It is still running fine now with the "always memset" change.
>>>>>
>>>>> Did we ever nail down a fix for this? As I remember, Andrea said
>>>>> we shouldn't need to do that memset, but we came to the conclusion
>>>>> it was something specific to how s390 protection keys worked.
>>>>>
>>>>> Dave  
>>>>
>>>> No we didn't. Let's merge that for now, with a comment that
>>>> we don't really understand what's going on?  
>>>
>>> Hmm no, I don't really want to change the !s390 behaviour, especially
>>> since it causes allocation that we otherwise avoid and Andrea's
>>> reply to the original post pointed out it's not necessary.
>>
>>
>> Since storage keys are per physical page, we must not have shared pages.
>> Therefore in s390_enable_skey we already do a break_ksm (via ksm_madvise);
>> in other words, we already allocate pages on the keyless->keyed switch.
>>
>> So why not do the same for zero pages - instead of invalidating the page
>> table entry, let's just do a proper COW.
>>
>> Something like this maybe (Andrea, Martin any better way to do that?)
>>
>>
>> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
>> index 4fb3d3c..11475c7 100644
>> --- a/arch/s390/mm/gmap.c
>> +++ b/arch/s390/mm/gmap.c
>> @@ -2149,13 +2149,18 @@ EXPORT_SYMBOL_GPL(s390_enable_sie);
>>  static int __s390_enable_skey(pte_t *pte, unsigned long addr,
>>                               unsigned long next, struct mm_walk *walk)
>>  {
>> +       struct page *page;
>>         /*
>> -        * Remove all zero page mappings,
>> +        * Remove all zero page mappings with a COW
>>          * after establishing a policy to forbid zero page mappings
>>          * following faults for that page will get fresh anonymous pages
>>          */
>> -       if (is_zero_pfn(pte_pfn(*pte)))
>> -               ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
>> +       if (is_zero_pfn(pte_pfn(*pte))) {
>> +               if (get_user_pages(addr, 1, FOLL_WRITE, &page, NULL) == 1)
>> +                       put_page(page);
>> +               else
>> +                       return -ENOMEM;
>> +       }
>>         /* Clear storage key */
>>         ptep_zap_key(walk->mm, addr, pte);
>>         return 0;
> 
> I do not quite get the problem here. The zero-page mappings are always
> marked as _PAGE_SPECIAL. These should be safe to replace with an empty
> pte. We do mark all VMAs as unmergeable prior to the page table walk
> that replaces all zero-page mappings. What is the get_user_pages() call
> supposed to do?
> 


After talking to Martin, we decided that it might be a good trial to simply
not use the empty zero page at all for KVM guests. Something like the patch
below seems to do the trick. Will have a look at what that means for memory
usage in the usual cases.
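
A note on why the one-line change below works (not part of the patch, just
paraphrasing the generic mm code of that era): the fault path consults
mm_forbids_zeropage() before installing the shared zero page, so once it is
tied to mm_has_pgste instead of mm_use_skey, a KVM guest's address space
never receives zero-page mappings in the first place and __s390_enable_skey()
no longer has anything to replace. Roughly:

/*
 * include/linux/mm.h (sketch): architectures may override this macro;
 * the generic default never forbids the zero page. s390 overrides it in
 * asm/pgtable.h - mm_use_skey before this patch, mm_has_pgste after it.
 */
#ifndef mm_forbids_zeropage
#define mm_forbids_zeropage(X)	(0)
#endif

/*
 * mm/memory.c, do_anonymous_page() (sketch): a read fault only maps the
 * shared zero page when the mm does not forbid it; otherwise a fresh
 * anonymous page is allocated right away.
 */
if (!(vmf->flags & FAULT_FLAG_WRITE) &&
    !mm_forbids_zeropage(vma->vm_mm)) {
        /* map the read-only zero page */
} else {
        /* allocate and map a new anonymous page */
}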

Patch

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 57057fb..65ab11d 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -505,7 +505,7 @@  static inline int mm_alloc_pgste(struct mm_struct *mm)
  * In the case that a guest uses storage keys
  * faults should no longer be backed by zero pages
  */
-#define mm_forbids_zeropage mm_use_skey
+#define mm_forbids_zeropage mm_has_pgste
 static inline int mm_use_skey(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 4fb3d3c..88f502a 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2149,13 +2149,6 @@  EXPORT_SYMBOL_GPL(s390_enable_sie);
 static int __s390_enable_skey(pte_t *pte, unsigned long addr,
                              unsigned long next, struct mm_walk *walk)
 {
-       /*
-        * Remove all zero page mappings,
-        * after establishing a policy to forbid zero page mappings
-        * following faults for that page will get fresh anonymous pages
-        */
-       if (is_zero_pfn(pte_pfn(*pte)))
-               ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
        /* Clear storage key */
        ptep_zap_key(walk->mm, addr, pte);
        return 0;
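
For reference, the QEMU-side workaround tested earlier in the thread ("always
do the memset" in ram_handle_compressed) would look roughly like the sketch
below. This is an assumed shape of migration/ram.c from that time, shown only
to document the experiment that confirmed the diagnosis; it is not the fix
being proposed here.

/*
 * migration/ram.c (sketch): upstream skips the memset when the incoming
 * page is zero-filled and the destination already reads as zero, roughly:
 *
 *     if (ch != 0 || !is_zero_range(host, size))
 *         memset(host, ch, size);
 *
 * The experiment was to drop the check and always write the page, which
 * forces the destination to allocate it.
 */
void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
{
    memset(host, ch, size);
}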