From patchwork Mon Jul 10 08:05:55 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Borntraeger
X-Patchwork-Id: 9832563
To: Martin Schwidefsky
References: <75a4c5ff-385e-31ac-5f86-883b082cd94e@de.ibm.com>
 <20170424143516.GD2075@work-vm>
 <5c0608dd-ba22-dc09-71a1-bb95c977f77d@de.ibm.com>
 <20170424191202.GQ2362@work-vm>
 <097a5085-1128-cf2d-abc4-54660a608f36@de.ibm.com>
 <20170426110144.GF2098@work-vm>
 <20170630163139.GC2437@work-vm>
 <20170703192353-mutt-send-email-mst@kernel.org>
 <20170703190728.GE2206@work-vm>
 <55708cff-cdda-9b1d-2106-1c6d2774f890@de.ibm.com>
 <20170704101643.3c20f745@mschwideX1>
From: Christian Borntraeger
Date: Mon, 10 Jul 2017 10:05:55 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0
In-Reply-To: <20170704101643.3c20f745@mschwideX1>
Content-Language: en-IE
Message-Id: <45954ce5-fae8-5709-da3c-045ff74f740c@de.ibm.com>
Subject: Re: [Qemu-devel] postcopy migration hangs while loading virtio state
Cc: Andrea Arcangeli, thuth@redhat.com, Juan Quintela, "Michael S. Tsirkin", qemu-devel, "Dr. David Alan Gilbert"

On 07/04/2017 10:16 AM, Martin Schwidefsky wrote:
> On Tue, 4 Jul 2017 09:48:11 +0200
> Christian Borntraeger wrote:
>
>> On 07/03/2017 09:07 PM, Dr. David Alan Gilbert wrote:
>>> * Michael S. Tsirkin (mst@redhat.com) wrote:
>>>> On Fri, Jun 30, 2017 at 05:31:39PM +0100, Dr.
David Alan Gilbert wrote:
>>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>>>>>> On 04/26/2017 01:45 PM, Christian Borntraeger wrote:
>>>>>>
>>>>>>>> Hmm, I have a theory, if the flags field has bit 1 set, i.e. RAM_SAVE_FLAG_COMPRESS
>>>>>>>> then try changing ram_handle_compressed to always do the memset.
>>>>>>>
>>>>>>> FWIW, changing ram_handle_compressed to always memset makes the problem go away.
>>>>>>
>>>>>> It is still running fine now with the "always memset change"
>>>>>
>>>>> Did we ever nail down a fix for this; as I remember Andrea said
>>>>> we shouldn't need to do that memset, but we came to the conclusion
>>>>> it was something specific to how s390 protection keys worked.
>>>>>
>>>>> Dave
>>>>
>>>> No we didn't. Let's merge that for now, with a comment that
>>>> we don't really understand what's going on?
>>>
>>> Hmm no, I don't really want to change the !s390 behaviour, especially
>>> since it causes allocation that we otherwise avoid and Andrea's
>>> reply to the original post pointed out it's not necessary.
>>
>>
>> Since storage keys are per physical page we must not have shared pages.
>> Therefore in s390_enable_skey we already do a break_ksm (via ksm_madvise),
>> in other words we already allocate pages on the keyless->keyed switch.
>>
>> So why not do the same for zero pages - instead of invalidating the page
>> table entry, let's just do a proper COW.
>>
>> Something like this maybe (Andrea, Martin any better way to do that?)
>>
>>
>> diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
>> index 4fb3d3c..11475c7 100644
>> --- a/arch/s390/mm/gmap.c
>> +++ b/arch/s390/mm/gmap.c
>> @@ -2149,13 +2149,18 @@ EXPORT_SYMBOL_GPL(s390_enable_sie);
>>  static int __s390_enable_skey(pte_t *pte, unsigned long addr,
>>  			      unsigned long next, struct mm_walk *walk)
>>  {
>> +	struct page *page;
>>  	/*
>> -	 * Remove all zero page mappings,
>> +	 * Remove all zero page mappings with a COW
>>  	 * after establishing a policy to forbid zero page mappings
>>  	 * following faults for that page will get fresh anonymous pages
>>  	 */
>> -	if (is_zero_pfn(pte_pfn(*pte)))
>> -		ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
>> +	if (is_zero_pfn(pte_pfn(*pte))) {
>> +		if (get_user_pages(addr, 1, FOLL_WRITE, &page, NULL) == 1)
>> +			put_page(page);
>> +		else
>> +			return -ENOMEM;
>> +	}
>>  	/* Clear storage key */
>>  	ptep_zap_key(walk->mm, addr, pte);
>>  	return 0;
>
> I do not quite get the problem here. The zero-page mappings are always
> marked as _PAGE_SPECIAL. These should be safe to replace with an empty
> pte. We do mark all VMAs as unmergeable prior to the page table walk
> that replaces all zero-page mappings. What is the get_user_pages() call
> supposed to do?
>

After talking to Martin, we decided that it might be a good trial to simply
not use the empty zero page at all for KVM guests. Something like this seems
to do the trick. Will have a look at what that means for the memory usage in
the usual cases.

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 57057fb..65ab11d 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -505,7 +505,7 @@ static inline int mm_alloc_pgste(struct mm_struct *mm)
  * In the case that a guest uses storage keys
  * faults should no longer be backed by zero pages
  */
-#define mm_forbids_zeropage mm_use_skey
+#define mm_forbids_zeropage mm_has_pgste
 static inline int mm_use_skey(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 4fb3d3c..88f502a 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2149,13 +2149,6 @@ EXPORT_SYMBOL_GPL(s390_enable_sie);
 static int __s390_enable_skey(pte_t *pte, unsigned long addr,
 			      unsigned long next, struct mm_walk *walk)
 {
-	/*
-	 * Remove all zero page mappings,
-	 * after establishing a policy to forbid zero page mappings
-	 * following faults for that page will get fresh anonymous pages
-	 */
-	if (is_zero_pfn(pte_pfn(*pte)))
-		ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
 	/* Clear storage key */
 	ptep_zap_key(walk->mm, addr, pte);
 	return 0;
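
For readers following along, the effect of switching mm_forbids_zeropage from mm_use_skey to mm_has_pgste can be modelled in a few lines of userspace C. This is only a rough sketch and not kernel code: struct mm_model, the two policy helpers, read_fault_maps_zero_page() and check_model() are hypothetical names made up for illustration, and the model assumes that a read fault on untouched anonymous memory maps the shared zero page unless the policy forbids it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the two s390 mm context flags. */
struct mm_model {
	bool has_pgste;	/* address space belongs to a KVM guest */
	bool use_skey;	/* guest has started using storage keys */
};

/* Policy before the patch: mm_forbids_zeropage is mm_use_skey. */
static bool forbids_zeropage_old(const struct mm_model *mm)
{
	return mm->use_skey;
}

/* Policy after the patch: mm_forbids_zeropage is mm_has_pgste. */
static bool forbids_zeropage_new(const struct mm_model *mm)
{
	return mm->has_pgste;
}

typedef bool (*zeropage_policy)(const struct mm_model *mm);

/*
 * Model of a read fault on untouched anonymous memory (an assumption of
 * this sketch): the shared zero page is mapped only if the policy does
 * not forbid it.
 */
static bool read_fault_maps_zero_page(const struct mm_model *mm,
				      zeropage_policy forbids)
{
	return !forbids(mm);
}

/* Walk the three interesting cases and check both policies. */
static void check_model(void)
{
	const struct mm_model plain = { .has_pgste = false, .use_skey = false };
	const struct mm_model guest = { .has_pgste = true,  .use_skey = false };
	const struct mm_model keyed = { .has_pgste = true,  .use_skey = true  };

	/* Ordinary processes keep using the zero page under both policies. */
	assert(read_fault_maps_zero_page(&plain, forbids_zeropage_old));
	assert(read_fault_maps_zero_page(&plain, forbids_zeropage_new));

	/* KVM guest without storage keys: the only case that changes. */
	assert(read_fault_maps_zero_page(&guest, forbids_zeropage_old));
	assert(!read_fault_maps_zero_page(&guest, forbids_zeropage_new));

	/* KVM guest with storage keys: forbidden under both policies. */
	assert(!read_fault_maps_zero_page(&keyed, forbids_zeropage_old));
	assert(!read_fault_maps_zero_page(&keyed, forbids_zeropage_new));
}
```

In this model the only case that differs is a guest that has PGSTEs but has not yet enabled storage keys. Under the old policy such a guest could still accumulate zero-page mappings, which is why __s390_enable_skey had to find and invalidate them later; if no zero-page mappings can exist for a guest in the first place, that walk has nothing to remove, which matches the hunk deleted from gmap.c above.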