Subject: Re: [PATCH v7 04/11] KVM: MMU: zap pages in batch
From: Xiao Guangrong
Date: Thu, 30 May 2013 00:03:44 +0800
To: Xiao Guangrong
Cc: Marcelo Tosatti, gleb@redhat.com, avi.kivity@gmail.com,
    pbonzini@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Message-ID: <51A626E0.9030308@linux.vnet.ibm.com>
In-Reply-To: <51A60A64.2080509@linux.vnet.ibm.com>

On 05/29/2013 10:02 PM, Xiao Guangrong wrote:
> On 05/29/2013 09:32 PM, Marcelo Tosatti wrote:
>> On Wed, May 29, 2013 at 09:09:09PM +0800, Xiao Guangrong wrote:
>>> This information is from my reply to Gleb's mail, where he raised the
>>> question of why the "collapse tlb flush" is needed:
>>>
>>> ======
>>> It seems not.
>>> Since we have reloaded the mmu before zapping the obsolete pages, the
>>> mmu-lock is easily contended.
>>> I did this simple tracking:
>>>
>>> +	int num = 0;
>>>  restart:
>>>  	list_for_each_entry_safe_reverse(sp, node,
>>>  	      &kvm->arch.active_mmu_pages, link) {
>>> @@ -4265,6 +4265,7 @@ restart:
>>>  		if (batch >= BATCH_ZAP_PAGES &&
>>>  		      cond_resched_lock(&kvm->mmu_lock)) {
>>>  			batch = 0;
>>> +			num++;
>>>  			goto restart;
>>>  		}
>>>
>>> @@ -4277,6 +4278,7 @@ restart:
>>>  	 * may use the pages.
>>>  	 */
>>>  	kvm_mmu_commit_zap_page(kvm, &invalid_list);
>>> +	printk("lock-break: %d.\n", num);
>>>  }
>>>
>>> I read the pci rom while doing a kernel build in the guest, which has
>>> 1G memory and 4 vcpus with ept enabled; this is a normal workload and
>>> a normal configuration.
>>>
>>> # dmesg
>>> [ 2338.759099] lock-break: 8.
>>> [ 2339.732442] lock-break: 5.
>>> [ 2340.904446] lock-break: 3.
>>> [ 2342.513514] lock-break: 3.
>>> [ 2343.452229] lock-break: 3.
>>> [ 2344.981599] lock-break: 4.
>>>
>>> Basically, we need to break many times.
>>
>> Should measure kvm_mmu_zap_all latency.
>>
>>> ======
>>>
>>> You can see we still need to break 3 times to zap all the pages even
>>> though we zap 10 pages in a batch. Obviously it would need to break
>>> more times without batch-zapping.
>>
>> Again, breaking should be no problem, what matters is latency. Please
>> measure kvm_mmu_zap_all latency after all optimizations to justify
>> this minimum batching.
>
> Okay, okay. I will benchmark the latency.

Okay, I have done the test. The test environment is the same as above:
"I read the pci rom while doing a kernel build in the guest, which has
1G memory and 4 vcpus with ept enabled; this is a normal workload and a
normal configuration."

Batch-zapped:

Guest:
# cat /sys/bus/pci/devices/0000\:00\:03.0/rom
# free -m
             total       used       free     shared    buffers     cached
Mem:           975        793        181          0          6        438
-/+ buffers/cache:        347        627
Swap:         2015         43       1972

Host shows:
[ 2229.918558] lock-break: 5.
[ 2229.918564] kvm_mmu_invalidate_zap_all_pages: 174706e.

No-batch:

Guest:
# cat /sys/bus/pci/devices/0000\:00\:03.0/rom
# free -m
             total       used       free     shared    buffers     cached
Mem:           975        843        131          0         17        476
-/+ buffers/cache:        348        626
Swap:         2015          2

Host shows:
[ 2931.675285] lock-break: 13.
[ 2931.675291] kvm_mmu_invalidate_zap_all_pages: 69c1676.

That means, with nearly the same memory accessed in the guest:
- batch-zapped needs to break 5 times; the latency is 174706e.
- no-batch needs to break 13 times; the latency is 69c1676.
(The latencies are printed with %llx, i.e. they are hexadecimal
nanoseconds; see the decoding note after the patch below.)

The code change to track the latency:

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 055d675..a66f21b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4233,13 +4233,13 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 }
 
-#define BATCH_ZAP_PAGES	10
+#define BATCH_ZAP_PAGES	0
 static void kvm_zap_obsolete_pages(struct kvm *kvm)
 {
 	struct kvm_mmu_page *sp, *node;
 	LIST_HEAD(invalid_list);
 	int batch = 0;
-
+	int num = 0;
 restart:
 	list_for_each_entry_safe_reverse(sp, node,
 	      &kvm->arch.active_mmu_pages, link) {
@@ -4265,6 +4265,7 @@ restart:
 		if (batch >= BATCH_ZAP_PAGES &&
 		      cond_resched_lock(&kvm->mmu_lock)) {
 			batch = 0;
+			num++;
 			goto restart;
 		}
 
@@ -4277,6 +4278,7 @@ restart:
 	 * may use the pages.
 	 */
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	printk("lock-break: %d.\n", num);
 }
 
 /*
@@ -4290,7 +4292,12 @@ restart:
  */
 void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm)
 {
+	u64 start;
+
 	spin_lock(&kvm->mmu_lock);
+
+	start = local_clock();
+
 	trace_kvm_mmu_invalidate_zap_all_pages(kvm);
 	kvm->arch.mmu_valid_gen++;
 
@@ -4306,6 +4313,9 @@ void kvm_mmu_invalidate_zap_all_pages(struct kvm *kvm)
 	kvm_reload_remote_mmus(kvm);
 
 	kvm_zap_obsolete_pages(kvm);
+
+	printk("%s: %llx.\n", __FUNCTION__, local_clock() - start);
+
 	spin_unlock(&kvm->mmu_lock);
 }
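
A note on reading the numbers above: local_clock() returns a time in
nanoseconds, and the patch prints the delta with %llx, so the dmesg
figures are hexadecimal nanoseconds. A small standalone program (purely
illustrative; the two constants are simply copied from the dmesg lines
above) decodes them:

#include <stdio.h>

int main(void)
{
	/* local_clock() deltas printed by the patch above, in hex ns */
	unsigned long long batched  = 0x174706eULL; /* BATCH_ZAP_PAGES = 10 */
	unsigned long long no_batch = 0x69c1676ULL; /* BATCH_ZAP_PAGES = 0  */

	printf("batched : %llu ns (%.1f ms)\n", batched, batched / 1e6);
	printf("no-batch: %llu ns (%.1f ms)\n", no_batch, no_batch / 1e6);
	printf("ratio   : %.1fx\n", (double)no_batch / batched);
	return 0;
}

This prints about 24.4 ms for the batched run and about 110.9 ms for the
no-batch run, so batching cuts the kvm_mmu_invalidate_zap_all_pages
latency by roughly 4.5x in this test.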
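
For background on the pattern being benchmarked: kvm_zap_obsolete_pages()
zaps obsolete shadow pages in batches of BATCH_ZAP_PAGES and, between
batches, lets cond_resched_lock() drop mmu_lock when it is contended or a
reschedule is pending, restarting the list walk afterwards. Below is a
minimal userspace sketch of that shape, not the kernel code itself:
pthread_mutex_t and sched_yield() stand in for the spinlock and
cond_resched_lock(), the page list is a plain counter, and
zap_all_batched() is a made-up name.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define BATCH_ZAP_PAGES 10

static pthread_mutex_t zap_lock = PTHREAD_MUTEX_INITIALIZER;
static int obsolete_pages = 47;	/* stand-in for the obsolete-page list */

static void zap_all_batched(void)
{
	int batch = 0, num = 0;

	pthread_mutex_lock(&zap_lock);
restart:
	while (obsolete_pages > 0) {
		obsolete_pages--;	/* "zap" one page */

		if (++batch >= BATCH_ZAP_PAGES) {
			/*
			 * Drop the lock so waiters (vcpu threads taking
			 * mmu_lock, in the real code) can get in, then
			 * restart the walk: the list may have changed
			 * while the lock was dropped.  The kernel's
			 * cond_resched_lock() drops the lock only when
			 * it is contended or a reschedule is pending;
			 * this sketch drops it unconditionally.
			 */
			pthread_mutex_unlock(&zap_lock);
			sched_yield();
			pthread_mutex_lock(&zap_lock);
			batch = 0;
			num++;
			goto restart;
		}
	}
	pthread_mutex_unlock(&zap_lock);
	printf("lock-break: %d.\n", num);
}

int main(void)
{
	zap_all_batched();
	return 0;
}

With BATCH_ZAP_PAGES set to 0, the kernel code attempts the lock break on
every page, which is why the no-batch run above breaks 13 times instead
of 5 and ends up with the much larger total latency.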