From patchwork Fri Feb 5 07:17:55 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gonglei (Arei)"
X-Patchwork-Id: 8232221
From: "Gonglei (Arei)"
To: Paolo Bonzini, "qemu-devel@nongnu.org"
Cc: "cornelia.huck@de.ibm.com", "v.maffione@gmail.com", "mst@redhat.com"
Date: Fri, 5 Feb 2016 07:17:55 +0000
Message-ID: <33183CC9F5247A488A2544077AF19020B02DC1BD@SZXEMA503-MBS.china.huawei.com>
References: <1454236146-23293-1-git-send-email-pbonzini@redhat.com>
 <33183CC9F5247A488A2544077AF19020B02DA7EA@SZXEMA503-MBS.china.huawei.com>
 <56B325B0.8050903@redhat.com>
In-Reply-To: <56B325B0.8050903@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 00/10] virtio/vring: optimization patches

Dear Paolo,

> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> Sent: Thursday, February 04, 2016 6:19 PM
>
> On 03/02/2016 13:08, Gonglei (Arei) wrote:
> > 22.56% qemu-kvm [.] address_space_translate
> > 13.29% qemu-kvm [.] qemu_get_ram_ptr
>
> We could get rid of qemu_get_ram_ptr by storing the RAMBlock pointer
> into the memory region, instead of the ram_addr_t value. I'm happy to
> answer any question if you want to do it.
>

Good point! I made a simple implementation of this change and got
nearly 8MB/s of extra throughput.

Testing AES-128-CBC cipher:
        Encrypting in chunks of 256 bytes: done. 412.16 MiB in 5.02 secs: 82.17 MiB/sec (1688202 packets)
        Encrypting in chunks of 256 bytes: done. 412.15 MiB in 5.02 secs: 82.16 MiB/sec (1688158 packets)
        Encrypting in chunks of 256 bytes: done. 412.32 MiB in 5.02 secs: 82.20 MiB/sec (1688876 packets)
        Encrypting in chunks of 256 bytes: done. 412.47 MiB in 5.02 secs: 82.23 MiB/sec (1689491 packets)
        Encrypting in chunks of 256 bytes: done. 412.31 MiB in 5.02 secs: 82.20 MiB/sec (1688825 packets)
        Encrypting in chunks of 256 bytes: done. 411.30 MiB in 5.01 secs: 82.15 MiB/sec (1684671 packets)
        Encrypting in chunks of 256 bytes: done. 412.08 MiB in 5.01 secs: 82.18 MiB/sec (1687864 packets)
        Encrypting in chunks of 256 bytes: done. 412.49 MiB in 5.02 secs: 82.23 MiB/sec (1689564 packets)

Now, 'perf top' shows me:

 16.32%  qemu-kvm             [.] address_space_translate
  5.39%  libpthread-2.19.so   [.] __pthread_mutex_unlock_usercnt
  4.13%  qemu-kvm             [.] qemu_ram_addr_from_host
  4.01%  qemu-kvm             [.] address_space_map
  3.82%  libc-2.19.so         [.] _int_malloc
  3.70%  libc-2.19.so         [.] _int_free
  3.49%  libc-2.19.so         [.] malloc
  3.18%  libpthread-2.19.so   [.] pthread_mutex_lock
  3.10%  qemu-kvm             [.] phys_page_find
  2.93%  qemu-kvm             [.] address_space_translate_internal
  2.74%  libc-2.19.so         [.] malloc_consolidate
  2.71%  libc-2.19.so         [.] __memcpy_sse2_unaligned
  1.92%  qemu-kvm             [.] find_next_zero_bit
  1.65%  qemu-kvm             [.] object_unref
  1.61%  qemu-kvm             [.] address_space_rw
  1.35%  qemu-kvm             [.] virtio_notify
  1.33%  qemu-kvm             [.] object_ref
  1.22%  libc-2.19.so         [.] memset

Please review the patch below (based on qemu-2.3, which I'm using), thanks!
If it's OK, I can rebase it onto the master branch.

[PATCH] exec: store RAMBlock pointer into memory region

Signed-off-by: Gonglei
---
 exec.c                  | 39 ++++++++++++++++++++++++---------------
 include/exec/memory.h   |  1 +
 include/exec/ram_addr.h |  1 +
 memory.c                |  4 +++-
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/exec.c b/exec.c
index 4a16769..51d6f30 100644
--- a/exec.c
+++ b/exec.c
@@ -1544,6 +1544,7 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
         error_propagate(errp, local_err);
         return -1;
     }
+    mr->ram_block = new_block;
     return addr;
 }

@@ -1817,6 +1818,11 @@ found:
     return mr;
 }

+void *qemu_get_ram_ptr_from_block(RAMBlock *block, hwaddr addr)
+{
+    return ramblock_ptr(block, addr - block->offset);
+}
+
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
@@ -2350,7 +2356,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
             } else {
                 addr1 += memory_region_get_ram_addr(mr);
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(addr1);
+                ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
                 memcpy(ptr, buf, l);
                 invalidate_and_set_dirty(addr1, l);
             }
@@ -2384,7 +2390,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
                 }
             } else {
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(mr->ram_addr + addr1);
+                ptr = qemu_get_ram_ptr_from_block(mr->ram_block, mr->ram_addr + addr1);
                 memcpy(buf, ptr, l);
             }
         }
@@ -2437,7 +2443,7 @@ static inline void cpu_physical_memory_write_rom_internal(AddressSpace *as,
         } else {
             addr1 += memory_region_get_ram_addr(mr);
             /* ROM/RAM case */
-            ptr = qemu_get_ram_ptr(addr1);
+            ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
             switch (type) {
             case WRITE_DATA:
                 memcpy(ptr, buf, l);
@@ -2681,9 +2687,10 @@ static inline uint32_t ldl_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = ldl_le_p(ptr);
@@ -2740,9 +2747,10 @@ static inline uint64_t ldq_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = ldq_le_p(ptr);
@@ -2807,9 +2815,10 @@ static inline uint32_t lduw_phys_internal(AddressSpace *as, hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
-                                & TARGET_PAGE_MASK)
-                               + addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block,
+                                          (memory_region_get_ram_addr(mr)
+                                           & TARGET_PAGE_MASK)
+                                          + addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             val = lduw_le_p(ptr);
@@ -2856,7 +2865,7 @@ void stl_phys_notdirty(AddressSpace *as, hwaddr addr, uint32_t val)
         io_mem_write(mr, addr1, val, 4);
     } else {
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         stl_p(ptr, val);

         if (unlikely(in_migration)) {
@@ -2896,7 +2905,7 @@ static inline void stl_phys_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stl_le_p(ptr, val);
@@ -2959,7 +2968,7 @@ static inline void stw_phys_internal(AddressSpace *as,
     } else {
         /* RAM case */
         addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
-        ptr = qemu_get_ram_ptr(addr1);
+        ptr = qemu_get_ram_ptr_from_block(mr->ram_block, addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
             stw_le_p(ptr, val);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 06ffa1d..bd9ddea 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -146,6 +146,7 @@ struct MemoryRegion {
     Int128 size;
     hwaddr addr;
     void (*destructor)(MemoryRegion *mr);
+    void *ram_block;   /* RAMBlock pointer */
     ram_addr_t ram_addr;
     uint64_t align;
     bool subpage;
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ff558a4..cc8d769 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -38,6 +38,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 void qemu_ram_free_from_ptr(ram_addr_t addr);
+void *qemu_get_ram_ptr_from_block(RAMBlock *block, hwaddr addr);

 int qemu_ram_resize(ram_addr_t base, ram_addr_t newsize, Error **errp);

diff --git a/memory.c b/memory.c
index ee3f2a8..31bd84a 100644
--- a/memory.c
+++ b/memory.c
@@ -877,6 +877,7 @@ void memory_region_init(MemoryRegion *mr,
         mr->size = int128_2_64();
     }
     mr->name = g_strdup(name);
+    mr->ram_block = NULL;

     if (name) {
         char *escaped_name = memory_region_escape_name(name);
@@ -1449,7 +1450,8 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)

     assert(mr->terminates);

-    return qemu_get_ram_ptr(mr->ram_addr & TARGET_PAGE_MASK);
+    return qemu_get_ram_ptr_from_block(mr->ram_block,
+                                       mr->ram_addr & TARGET_PAGE_MASK);
 }

 static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpace *as)
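
Note for reviewers: a minimal standalone sketch of why caching the block
pointer should help. This is not QEMU code. DemoRAMBlock and both demo_*
functions are hypothetical stand-ins, and the list walk is only a rough
approximation of what a lookup by ram_addr_t has to do. With the block
already known, the translation reduces to offset arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not QEMU's real definitions. */
typedef uint64_t ram_addr_t;

typedef struct DemoRAMBlock {
    uint8_t *host;              /* host address of the block's memory */
    ram_addr_t offset;          /* start of the block in ram_addr_t space */
    ram_addr_t length;          /* size of the block in bytes */
    struct DemoRAMBlock *next;  /* next block in the list */
} DemoRAMBlock;

/* Lookup by address: search the list for the block that contains addr. */
void *demo_get_ram_ptr(DemoRAMBlock *list, ram_addr_t addr)
{
    for (DemoRAMBlock *b = list; b != NULL; b = b->next) {
        if (addr >= b->offset && addr - b->offset < b->length) {
            return b->host + (addr - b->offset);
        }
    }
    return NULL; /* no block covers addr */
}

/* Lookup with a cached block pointer: no search, only a range check. */
void *demo_get_ram_ptr_from_block(DemoRAMBlock *block, ram_addr_t addr)
{
    assert(block != NULL);
    assert(addr >= block->offset && addr - block->offset < block->length);
    return block->host + (addr - block->offset);
}
```

Both functions return the same host pointer for an address inside a known
block; the second one just skips the search, which is the cost that showed
up as qemu_get_ram_ptr in the earlier profile.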