From patchwork Thu Dec 13 12:40:37 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 1872001
From: Paolo Bonzini
To: kvm@vger.kernel.org
Cc: mtosatti@redhat.com, gleb@redhat.com
Subject: [PATCH kvm-unit-tests v2] vmexit: time the number of cycles for simple PIO
Date: Thu, 13 Dec 2012 13:40:37 +0100
Message-Id: <1355402437-21935-1-git-send-email-pbonzini@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

This patch adds three scenarios to the vmexit test. Two are very
simple PIO cases that are handled in the kernel (reading from and
writing to ELCR). The third is an unmapped PIO that is handled in
userspace.

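As a rough summary of how the three scenarios differ, the sketch below
lists each new test with the port it touches, the direction of the
access, and where the access is expected to be handled. The struct,
enum, and helper names are hypothetical and are not part of
x86/vmexit.c; only the test names and port numbers are taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary types for illustration; not part of x86/vmexit.c. */
enum handler { HANDLED_IN_KERNEL, HANDLED_IN_USERSPACE };
enum direction { PIO_READ, PIO_WRITE };

struct pio_scenario {
	const char *test_name;	/* name registered in the test table */
	unsigned short port;	/* I/O port the test touches */
	enum direction dir;	/* read (in) or write (out) */
	enum handler where;	/* expected handler, per the commit message */
};

static const struct pio_scenario scenarios[] = {
	/* 0x1234 is unmapped, so the access is handled in userspace. */
	{ "inl_from_qemu",   0x1234, PIO_READ,  HANDLED_IN_USERSPACE },
	/* 0x4d0 is ELCR, which is handled in the kernel. */
	{ "inl_from_kernel", 0x4d0,  PIO_READ,  HANDLED_IN_KERNEL },
	{ "outl_to_kernel",  0x4d0,  PIO_WRITE, HANDLED_IN_KERNEL },
};

/* Look up a scenario by test name; returns NULL if it is not one of the three. */
static const struct pio_scenario *find_scenario(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(scenarios) / sizeof(scenarios[0]); i++)
		if (strcmp(scenarios[i].test_name, name) == 0)
			return &scenarios[i];
	return NULL;
}
```

Note that although two of the test names say "inl" and "outl", the
ELCR tests in the patch issue byte-wide accesses (inb and outb) to
port 0x4d0.
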
The difference between the two read scenarios is roughly the cost of
a userspace exit; the existing inl_from_pmtimer test is not precise
enough for this, because the device model itself has a fairly high
cost. The difference between the kernel read and write is the cost of
emulation, because inl_from_kernel goes through the full instruction
emulator while outl does not (outl is used for virtio, while the speed
of inl matters less).

Example:

	vmcall              3898
	inl_from_pmtimer   24615
	inl_from_qemu      20574
	inl_from_kernel     7237
	outl_to_kernel      4451

So the cost of exiting to userspace is about 13000 cycles on this
machine (20574 - 7237), and the cost of emulation is roughly 2800 to
3300 cycles (7237 - 4451 against outl_to_kernel, 7237 - 3898 against
vmcall).

Suggested-by: Avi Kivity
Signed-off-by: Paolo Bonzini
---
 x86/vmexit.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/x86/vmexit.c b/x86/vmexit.c
index ad8ab55..98f0ead 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -4,6 +4,18 @@
 #include "processor.h"
 #include "atomic.h"
 
+static void outb(unsigned short port, int val)
+{
+	asm volatile("outb %b0, %w1" : : "a"(val), "Nd"(port));
+}
+
+static unsigned int inb(unsigned short port)
+{
+	unsigned int val;
+	asm volatile("xorl %0, %0; inb %w1, %b0" : "=a"(val) : "Nd"(port));
+	return val;
+}
+
 static unsigned int inl(unsigned short port)
 {
 	unsigned int val;
@@ -82,6 +94,21 @@ static void inl_pmtimer(void)
 	inl(0xb008);
 }
 
+static void inl_nop_qemu(void)
+{
+	inl(0x1234);
+}
+
+static void inl_nop_kernel(void)
+{
+	inb(0x4d0);
+}
+
+static void outl_elcr_kernel(void)
+{
+	outb(0x4d0, 0);
+}
+
 static void ple_round_robin(void)
 {
 	struct counter {
@@ -116,6 +143,9 @@ static struct test {
 	{ mov_to_cr8, "mov_to_cr8" , .parallel = 1, },
 #endif
 	{ inl_pmtimer, "inl_from_pmtimer", .parallel = 1, },
+	{ inl_nop_qemu, "inl_from_qemu", .parallel = 1 },
+	{ inl_nop_kernel, "inl_from_kernel", .parallel = 1 },
+	{ outl_elcr_kernel, "outl_to_kernel", .parallel = 1 },
 	{ ipi, "ipi", is_smp, .parallel = 0, },
 	{ ipi_halt, "ipi+halt", is_smp, .parallel = 0, },
 	{ ple_round_robin, "ple-round-robin", .parallel = 1 },