From patchwork Sun Jun 30 19:40:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717310 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AD108C3065B for ; Sun, 30 Jun 2024 19:41:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0Pz-00005d-4P; Sun, 30 Jun 2024 15:40:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0008S5-D7 for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:54 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pn-0004O1-Lq for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:51 -0400 Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJcvCO020058; Sun, 30 Jun 2024 19:40:40 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=3raiXCm8kRFGnYk1UPCeOPFpx20xzLtvntIeQ9Dkyxg=; b= nTOyazqKQSjW8LQP4azBJFbllje1B+NYVtLw44T/XhuvOyoWtet5VNLEyKbXbVr4 krdeu1XUMY1iLnaOPV8XBj+pwhXTkfBeEsGPQGpR99ZuaSSVDVknXLlXaaU1AHiV t06F565AqlLOS0Nac/lu7gnGPL9gW2V3rGFE23e6A2f4hLapW5eHYx4cynWE03/n AJGJP/RcnEUQjOFE4dFU5+XFhWFT3hQ+iMXWY2q0k/CjCPirrs+ClPXhwGTBwL+r fbWTHNMG3bEKp2dKlwVOCDcskzWg4nZiNu1LHOSk57K4c5oSKILoRAHfSiIwm8qs L1HvQg/a2hpj9byramVirw== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402att9e1v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:39 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UIwiMG018459; Sun, 30 Jun 2024 19:40:38 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16cw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:38 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSQ014044; Sun, 30 Jun 2024 19:40:37 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-2; Sun, 30 Jun 2024 19:40:37 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 01/11] machine: alloc-anon option Date: Sun, 30 Jun 2024 12:40:24 -0700 Message-Id: <1719776434-435013-2-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: LLqZLkOEv4-f0KPQWrpRiUV3jHhgo8HX X-Proofpoint-ORIG-GUID: LLqZLkOEv4-f0KPQWrpRiUV3jHhgo8HX Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Allocate anonymous memory using mmap MAP_ANON or memfd_create depending on the value of the anon-alloc machine property. This affects memory-backend-ram objects, guest RAM created with the global -m option but without an associated memory-backend object and without the -mem-path option, and various memory regions such as ROMs that are allocated when devices are created. This option does not affect memory-backend-file, memory-backend-memfd, or memory-backend-epc objects. The memfd option is intended to support new migration modes, in which the memory region can be transferred in place to a new QEMU process, by sending the memfd file descriptor to the process. Memory contents are preserved, and if the mode also transfers device descriptors, then pages that are locked in memory for DMA remain locked. This behavior is a pre-requisite for supporting vfio, vdpa, and iommufd devices with the new modes. To access the same memory in the old and new QEMU processes, the memory must be mapped shared. Therefore, the implementation always sets RAM_SHARED if alloc-anon=memfd, except for memory-backend-ram, where the user must explicitly specify the share option. In lieu of defining a new RAM flag, at the lowest level the implementation uses RAM_SHARED with fd=-1 as the condition for calling memfd_create. Signed-off-by: Steve Sistare Reviewed-by: Fabiano Rosas --- hw/core/machine.c | 24 ++++++++++++++++++++++++ include/hw/boards.h | 1 + qapi/machine.json | 14 ++++++++++++++ qemu-options.hx | 13 +++++++++++++ system/memory.c | 12 +++++++++--- system/physmem.c | 38 +++++++++++++++++++++++++++++++++++++- system/trace-events | 3 +++ 7 files changed, 101 insertions(+), 4 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 655d75c..7ca2ad0 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -454,6 +454,20 @@ static void machine_set_mem_merge(Object *obj, bool value, Error **errp) ms->mem_merge = value; } +static int machine_get_anon_alloc(Object *obj, Error **errp) +{ + MachineState *ms = MACHINE(obj); + + return ms->anon_alloc; +} + +static void machine_set_anon_alloc(Object *obj, int value, Error **errp) +{ + MachineState *ms = MACHINE(obj); + + ms->anon_alloc = value; +} + static bool machine_get_usb(Object *obj, Error **errp) { MachineState *ms = MACHINE(obj); @@ -1066,6 +1080,11 @@ static void machine_class_init(ObjectClass *oc, void *data) object_class_property_set_description(oc, "mem-merge", "Enable/disable memory merge support"); + object_class_property_add_enum(oc, "anon-alloc", "AnonAllocOption", + &AnonAllocOption_lookup, + machine_get_anon_alloc, + machine_set_anon_alloc); + object_class_property_add_bool(oc, "usb", machine_get_usb, machine_set_usb); object_class_property_set_description(oc, "usb", @@ -1416,6 +1435,11 @@ static bool create_default_memdev(MachineState *ms, const char *path, Error **er if (!object_property_set_int(obj, "size", ms->ram_size, errp)) { goto out; } + if (!object_property_set_bool(obj, "share", + ms->anon_alloc == ANON_ALLOC_OPTION_MEMFD, + errp)) { + goto out; + } object_property_add_child(object_get_objects_root(), mc->default_ram_id, obj); /* Ensure backend's memory region name is equal to mc->default_ram_id */ diff --git a/include/hw/boards.h b/include/hw/boards.h index 73ad319..77f16ad 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -383,6 +383,7 @@ struct MachineState { bool enable_graphics; ConfidentialGuestSupport *cgs; HostMemoryBackend *memdev; + AnonAllocOption anon_alloc; /* * convenience alias to ram_memdev_id backend memory region * or to numa container memory region diff --git a/qapi/machine.json b/qapi/machine.json index 2fd3e9c..9173953 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1881,3 +1881,17 @@ { 'command': 'x-query-interrupt-controllers', 'returns': 'HumanReadableText', 'features': [ 'unstable' ]} + +## +# @AnonAllocOption: +# +# An enumeration of the options for allocating anonymous guest memory. +# +# @mmap: allocate using mmap MAP_ANON +# +# @memfd: allocate using memfd_create +# +# Since: 9.1 +## +{ 'enum': 'AnonAllocOption', + 'data': [ 'mmap', 'memfd' ] } diff --git a/qemu-options.hx b/qemu-options.hx index 8ca7f34..595b693 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -38,6 +38,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \ " nvdimm=on|off controls NVDIMM support (default=off)\n" " memory-encryption=@var{} memory encryption object to use (default=none)\n" " hmat=on|off controls ACPI HMAT support (default=off)\n" + " anon-alloc=mmap|memfd allocate anonymous guest RAM using mmap MAP_ANON or memfd_create (default: mmap)\n" " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n" " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n", QEMU_ARCH_ALL) @@ -101,6 +102,18 @@ SRST Enables or disables ACPI Heterogeneous Memory Attribute Table (HMAT) support. The default is off. + ``anon-alloc=mmap|memfd`` + Allocate anonymous guest RAM using mmap MAP_ANON (the default) + or memfd_create. This affects memory-backend-ram objects, + RAM created with the global -m option but without an + associated memory-backend object and without the -mem-path + option, and various memory regions such as ROMs that are + allocated when devices are created. This option does not + affect memory-backend-file, memory-backend-memfd, or + memory-backend-epc objects. + + Some migration modes require anon-alloc=memfd. + ``memory-backend='id'`` An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options. Allows to use a memory backend as main RAM. diff --git a/system/memory.c b/system/memory.c index 2d69521..28a837d 100644 --- a/system/memory.c +++ b/system/memory.c @@ -1552,8 +1552,10 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr, uint64_t size, Error **errp) { + uint32_t flags = (current_machine->anon_alloc == ANON_ALLOC_OPTION_MEMFD) ? + RAM_SHARED : 0; return memory_region_init_ram_flags_nomigrate(mr, owner, name, - size, 0, errp); + size, flags, errp); } bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr, @@ -1713,8 +1715,10 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr, uint64_t size, Error **errp) { + uint32_t flags = (current_machine->anon_alloc == ANON_ALLOC_OPTION_MEMFD) ? + RAM_SHARED : 0; if (!memory_region_init_ram_flags_nomigrate(mr, owner, name, - size, 0, errp)) { + size, flags, errp)) { return false; } mr->readonly = true; @@ -1731,6 +1735,8 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr, Error **errp) { Error *err = NULL; + uint32_t flags = (current_machine->anon_alloc == ANON_ALLOC_OPTION_MEMFD) ? + RAM_SHARED : 0; assert(ops); memory_region_init(mr, owner, name, size); mr->ops = ops; @@ -1738,7 +1744,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr, mr->terminates = true; mr->rom_device = true; mr->destructor = memory_region_destructor_ram; - mr->ram_block = qemu_ram_alloc(size, 0, mr, &err); + mr->ram_block = qemu_ram_alloc(size, flags, mr, &err); if (err) { mr->size = int128_zero(); object_unparent(OBJECT(mr)); diff --git a/system/physmem.c b/system/physmem.c index 33d09f7..efe95ff 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -47,6 +47,7 @@ #include "qemu/qemu-print.h" #include "qemu/log.h" #include "qemu/memalign.h" +#include "qemu/memfd.h" #include "exec/memory.h" #include "exec/ioport.h" #include "sysemu/dma.h" @@ -54,6 +55,7 @@ #include "sysemu/hw_accel.h" #include "sysemu/xen-mapcache.h" #include "trace/trace-root.h" +#include "trace.h" #ifdef CONFIG_FALLOCATE_PUNCH_HOLE #include @@ -69,6 +71,8 @@ #include "qemu/pmem.h" +#include "qapi/qapi-types-migration.h" +#include "migration/options.h" #include "migration/vmstate.h" #include "qemu/range.h" @@ -1828,6 +1832,32 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) qemu_mutex_unlock_ramlist(); return; } + + } else if (new_block->flags & RAM_SHARED) { + size_t max_length = new_block->max_length; + MemoryRegion *mr = new_block->mr; + const char *name = memory_region_name(mr); + + new_block->mr->align = QEMU_VMALLOC_ALIGN; + + if (new_block->fd == -1) { + new_block->fd = qemu_memfd_create(name, max_length + mr->align, + 0, 0, 0, errp); + } + + if (new_block->fd >= 0) { + int mfd = new_block->fd; + qemu_set_cloexec(mfd); + new_block->host = file_ram_alloc(new_block, max_length, mfd, + false, 0, errp); + } + if (!new_block->host) { + qemu_mutex_unlock_ramlist(); + return; + } + memory_try_enable_merging(new_block->host, new_block->max_length); + free_on_error = true; + } else { new_block->host = qemu_anon_ram_alloc(new_block->max_length, &new_block->mr->align, @@ -1911,6 +1941,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) ram_block_notify_add(new_block->host, new_block->used_length, new_block->max_length); } + trace_ram_block_add(memory_region_name(new_block->mr), new_block->flags, + new_block->fd, new_block->used_length, + new_block->max_length); return; out_free: @@ -2097,8 +2130,11 @@ RAMBlock *qemu_ram_alloc_resizeable(ram_addr_t size, ram_addr_t maxsz, void *host), MemoryRegion *mr, Error **errp) { + uint32_t flags = (current_machine->anon_alloc == ANON_ALLOC_OPTION_MEMFD) ? + RAM_SHARED : 0; + flags |= RAM_RESIZEABLE; return qemu_ram_alloc_internal(size, maxsz, resized, NULL, - RAM_RESIZEABLE, mr, errp); + flags, mr, errp); } static void reclaim_ramblock(RAMBlock *block) diff --git a/system/trace-events b/system/trace-events index 69c9044..f8ebf42 100644 --- a/system/trace-events +++ b/system/trace-events @@ -38,3 +38,6 @@ dirtylimit_state_finalize(void) dirtylimit_throttle_pct(int cpu_index, uint64_t pct, int64_t time_us) "CPU[%d] throttle percent: %" PRIu64 ", throttle adjust time %"PRIi64 " us" dirtylimit_set_vcpu(int cpu_index, uint64_t quota) "CPU[%d] set dirty page rate limit %"PRIu64 dirtylimit_vcpu_execute(int cpu_index, int64_t sleep_time_us) "CPU[%d] sleep %"PRIi64 " us" + +#physmem.c +ram_block_add(const char *name, uint32_t flags, int fd, size_t used_length, size_t max_length) "%s, flags %u, fd %d, len %lu, maxlen %lu" From patchwork Sun Jun 30 19:40:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5382FC30653 for ; Sun, 30 Jun 2024 19:42:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0Q1-000072-5B; Sun, 30 Jun 2024 15:41:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0008SB-E9 for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:54 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pn-0004O2-Jv for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:51 -0400 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJW3BS026542; Sun, 30 Jun 2024 19:40:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=7+QL5ZrrpPDnlyBxW0st4Wn/YKyjCdy8c6YwBxJSlJI=; b= JMYo2HVd42mWYrnCWpGbMD657/sjXoB8OAlMJmyB+Kh0EczHvef2CpXnFzVqCyse Xg7y7CrmHGv4kLaqAKLxAsWJXAq3q+stU2+3fz91cscpGTQj9jTKu6E08Qy3HkQP Zuw+qyRZxn/gnEXY2c6Lp2XL6iHiNzpsRVrTuF7Fxk9ZsdysrAlf6EAExrJVH0nx yWgxJQYy8hipl7kBs0vQjs71bQZPyvuUNaYeNA4Ox5HNx4UV2q1Gpk4sMk3CDF6I KolCOLG18bnNhd+zjwk/KkSx8AY24S9LuMhzA5TmmVV270KQh7TkPhhk6oDxER7c mon2K3wE7nq0HnI3TDLaTg== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 4028v0hgjs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:40 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UHtoa4019083; Sun, 30 Jun 2024 19:40:39 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16d2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:39 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSS014044; Sun, 30 Jun 2024 19:40:38 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-3; Sun, 30 Jun 2024 19:40:38 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 02/11] migration: cpr-state Date: Sun, 30 Jun 2024 12:40:25 -0700 Message-Id: <1719776434-435013-3-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-ORIG-GUID: zfmTs7vXi1T1LP0RHQR8hE53gHD2ecPu X-Proofpoint-GUID: zfmTs7vXi1T1LP0RHQR8hE53gHD2ecPu Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org CPR must save state that is needed after QEMU is restarted, when devices are realized. Thus the extra state cannot be saved in the migration stream, as objects must already exist before that stream can be loaded. Instead, define auxilliary state structures and vmstate descriptions, not associated with any registered object, and serialize the aux state to a cpr-specific stream in cpr_state_save. Deserialize in cpr_state_load after QEMU restarts, before devices are realized. Provide accessors for clients to register file descriptors for saving. The mechanism for passing the fd's to the new process will be specific to each migration mode, and added in subsequent patches. Signed-off-by: Steve Sistare Reviewed-by: Fabiano Rosas --- include/migration/cpr.h | 21 ++++++ migration/cpr.c | 188 ++++++++++++++++++++++++++++++++++++++++++++++++ migration/meson.build | 1 + migration/migration.c | 6 ++ migration/trace-events | 5 ++ system/vl.c | 3 + 6 files changed, 224 insertions(+) create mode 100644 include/migration/cpr.h create mode 100644 migration/cpr.c diff --git a/include/migration/cpr.h b/include/migration/cpr.h new file mode 100644 index 0000000..8e7e705 --- /dev/null +++ b/include/migration/cpr.h @@ -0,0 +1,21 @@ +/* + * Copyright (c) 2021, 2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef MIGRATION_CPR_H +#define MIGRATION_CPR_H + +typedef int (*cpr_walk_fd_cb)(int fd); +void cpr_save_fd(const char *name, int id, int fd); +void cpr_delete_fd(const char *name, int id); +int cpr_find_fd(const char *name, int id); +int cpr_walk_fd(cpr_walk_fd_cb cb); +void cpr_resave_fd(const char *name, int id, int fd); + +int cpr_state_save(Error **errp); +int cpr_state_load(Error **errp); + +#endif diff --git a/migration/cpr.c b/migration/cpr.c new file mode 100644 index 0000000..313e74e --- /dev/null +++ b/migration/cpr.c @@ -0,0 +1,188 @@ +/* + * Copyright (c) 2021-2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "migration/cpr.h" +#include "migration/misc.h" +#include "migration/qemu-file.h" +#include "migration/savevm.h" +#include "migration/vmstate.h" +#include "sysemu/runstate.h" +#include "trace.h" + +/*************************************************************************/ +/* cpr state container for all information to be saved. */ + +typedef QLIST_HEAD(CprFdList, CprFd) CprFdList; + +typedef struct CprState { + CprFdList fds; +} CprState; + +static CprState cpr_state; + +/****************************************************************************/ + +typedef struct CprFd { + char *name; + unsigned int namelen; + int id; + int fd; + QLIST_ENTRY(CprFd) next; +} CprFd; + +static const VMStateDescription vmstate_cpr_fd = { + .name = "cpr fd", + .version_id = 1, + .minimum_version_id = 1, + .fields = (VMStateField[]) { + VMSTATE_UINT32(namelen, CprFd), + VMSTATE_VBUFFER_ALLOC_UINT32(name, CprFd, 0, NULL, namelen), + VMSTATE_INT32(id, CprFd), + VMSTATE_INT32(fd, CprFd), + VMSTATE_END_OF_LIST() + } +}; + +void cpr_save_fd(const char *name, int id, int fd) +{ + CprFd *elem = g_new0(CprFd, 1); + + trace_cpr_save_fd(name, id, fd); + elem->name = g_strdup(name); + elem->namelen = strlen(name) + 1; + elem->id = id; + elem->fd = fd; + QLIST_INSERT_HEAD(&cpr_state.fds, elem, next); +} + +static CprFd *find_fd(CprFdList *head, const char *name, int id) +{ + CprFd *elem; + + QLIST_FOREACH(elem, head, next) { + if (!strcmp(elem->name, name) && elem->id == id) { + return elem; + } + } + return NULL; +} + +void cpr_delete_fd(const char *name, int id) +{ + CprFd *elem = find_fd(&cpr_state.fds, name, id); + + if (elem) { + QLIST_REMOVE(elem, next); + g_free(elem->name); + g_free(elem); + } + + trace_cpr_delete_fd(name, id); +} + +int cpr_find_fd(const char *name, int id) +{ + CprFd *elem = find_fd(&cpr_state.fds, name, id); + int fd = elem ? elem->fd : -1; + + trace_cpr_find_fd(name, id, fd); + return fd; +} + +int cpr_walk_fd(cpr_walk_fd_cb cb) +{ + CprFd *elem; + + QLIST_FOREACH(elem, &cpr_state.fds, next) { + if (elem->fd >= 0 && cb(elem->fd)) { + return 1; + } + } + return 0; +} + +void cpr_resave_fd(const char *name, int id, int fd) +{ + CprFd *elem = find_fd(&cpr_state.fds, name, id); + int old_fd = elem ? elem->fd : -1; + + if (old_fd < 0) { + cpr_save_fd(name, id, fd); + } else if (old_fd != fd) { + error_setg(&error_fatal, + "internal error: cpr fd '%s' id %d value %d " + "already saved with a different value %d", + name, id, fd, old_fd); + } +} +/*************************************************************************/ +#define CPR_STATE "CprState" + +static const VMStateDescription vmstate_cpr_state = { + .name = CPR_STATE, + .version_id = 1, + .minimum_version_id = 1, + .fields = (VMStateField[]) { + VMSTATE_QLIST_V(fds, CprState, 1, vmstate_cpr_fd, CprFd, next), + VMSTATE_END_OF_LIST() + } +}; +/*************************************************************************/ + +int cpr_state_save(Error **errp) +{ + int ret; + QEMUFile *f; + + /* set f based on mode in a later patch in this series */ + return 0; + + qemu_put_be32(f, QEMU_VM_FILE_MAGIC); + qemu_put_be32(f, QEMU_VM_FILE_VERSION); + + ret = vmstate_save_state(f, &vmstate_cpr_state, &cpr_state, 0); + if (ret) { + error_setg(errp, "vmstate_save_state error %d", ret); + } + + qemu_fclose(f); + return ret; +} + +int cpr_state_load(Error **errp) +{ + int ret; + uint32_t v; + QEMUFile *f; + + /* set f based on mode in a later patch in this series */ + return 0; + + v = qemu_get_be32(f); + if (v != QEMU_VM_FILE_MAGIC) { + error_setg(errp, "Not a migration stream (bad magic %x)", v); + qemu_fclose(f); + return -EINVAL; + } + v = qemu_get_be32(f); + if (v != QEMU_VM_FILE_VERSION) { + error_setg(errp, "Unsupported migration stream version %d", v); + qemu_fclose(f); + return -ENOTSUP; + } + + ret = vmstate_load_state(f, &vmstate_cpr_state, &cpr_state, 1); + if (ret) { + error_setg(errp, "vmstate_load_state error %d", ret); + } + + qemu_fclose(f); + return ret; +} + diff --git a/migration/meson.build b/migration/meson.build index 5ce2acb4..87feb4c 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -13,6 +13,7 @@ system_ss.add(files( 'block-dirty-bitmap.c', 'channel.c', 'channel-block.c', + 'cpr.c', 'dirtyrate.c', 'exec.c', 'fd.c', diff --git a/migration/migration.c b/migration/migration.c index 3dea06d..e394ad7 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -27,6 +27,7 @@ #include "sysemu/cpu-throttle.h" #include "rdma.h" #include "ram.h" +#include "migration/cpr.h" #include "migration/global_state.h" #include "migration/misc.h" #include "migration.h" @@ -2118,6 +2119,10 @@ void qmp_migrate(const char *uri, bool has_channels, } } + if (cpr_state_save(&local_err)) { + goto out; + } + if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) { SocketAddress *saddr = &addr->u.socket; if (saddr->type == SOCKET_ADDRESS_TYPE_INET || @@ -2142,6 +2147,7 @@ void qmp_migrate(const char *uri, bool has_channels, MIGRATION_STATUS_FAILED); } +out: if (local_err) { if (!resume_requested) { yank_unregister_instance(MIGRATION_YANK_INSTANCE); diff --git a/migration/trace-events b/migration/trace-events index 0b7c332..173f2c0 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -340,6 +340,11 @@ colo_receive_message(const char *msg) "Receive '%s' message" # colo-failover.c colo_failover_set_state(const char *new_state) "new state %s" +# cpr.c +cpr_save_fd(const char *name, int id, int fd) "%s, id %d, fd %d" +cpr_delete_fd(const char *name, int id) "%s, id %d" +cpr_find_fd(const char *name, int id, int fd) "%s, id %d returns %d" + # block-dirty-bitmap.c send_bitmap_header_enter(void) "" send_bitmap_bits(uint32_t flags, uint64_t start_sector, uint32_t nr_sectors, uint64_t data_size) "flags: 0x%x, start_sector: %" PRIu64 ", nr_sectors: %" PRIu32 ", data_size: %" PRIu64 diff --git a/system/vl.c b/system/vl.c index 03951be..6521ee3 100644 --- a/system/vl.c +++ b/system/vl.c @@ -77,6 +77,7 @@ #include "hw/block/block.h" #include "hw/i386/x86.h" #include "hw/i386/pc.h" +#include "migration/cpr.h" #include "migration/misc.h" #include "migration/snapshot.h" #include "sysemu/tpm.h" @@ -3713,6 +3714,8 @@ void qemu_init(int argc, char **argv) qemu_create_machine(machine_opts_dict); + cpr_state_load(&error_fatal); + suspend_mux_open(); qemu_disable_default_devices(); From patchwork Sun Jun 30 19:40:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717307 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 52CA5C41513 for ; Sun, 30 Jun 2024 19:41:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0Pw-0008T7-S9; Sun, 30 Jun 2024 15:40:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pp-0008R1-RD for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:49 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pn-0004O9-KJ for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:49 -0400 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UIuFOu001690; Sun, 30 Jun 2024 19:40:42 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=n4MgAr16sdNb4ng3UGdhpWQ4/HtGee3oPi/m+dImwfc=; b= h/5wp/zJGkemQXwtQdDuJktV/BOjkyxdW82U3N8cgW65nBHZCqXAeP1sBfpwcfBX f3d4eaXb97n/vH+vRbo7Ei4hWL9oxfbssbuDx9290wAXLGNnnKVpnWNoqAtBrAiX iIBbpLQ4GwbrFouwINfRTkbmCKBn8PiKFe0YRmCB+0lCA2oEh75NNRTKwh3fs8LB wo4H/0w1FDi1jeFvq3HecYSrWxjSNcJeDVK7KIYhQujL3JEZnbuG/hhuPxTlq+Iz iKzw39auTGEocvaQrIkDa2uGoNrgHHgzfW9yy/iy+rJ6L3LlSzOP6YwW5bDTWzKn zcFU7U8zb5FmUhsMiykH3A== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 40296asg9n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:42 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UIwiMH018459; Sun, 30 Jun 2024 19:40:40 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16d9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:40 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSU014044; Sun, 30 Jun 2024 19:40:40 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-4; Sun, 30 Jun 2024 19:40:40 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 03/11] migration: save cpr mode Date: Sun, 30 Jun 2024 12:40:26 -0700 Message-Id: <1719776434-435013-4-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: 3rqkujtzAlanIjf-eP2bWH66TW-GJ6T2 X-Proofpoint-ORIG-GUID: 3rqkujtzAlanIjf-eP2bWH66TW-GJ6T2 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Save the mode in CPR state, so the user does not need to explicitly specify it for the target. Modify migrate_mode() so it returns the incoming mode on the target. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 7 +++++++ migration/cpr.c | 23 ++++++++++++++++++++++- migration/migration.c | 1 + migration/options.c | 9 +++++++-- 4 files changed, 37 insertions(+), 3 deletions(-) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 8e7e705..42b4019 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -8,6 +8,13 @@ #ifndef MIGRATION_CPR_H #define MIGRATION_CPR_H +#include "qapi/qapi-types-migration.h" + +#define MIG_MODE_NONE MIG_MODE__MAX + +MigMode cpr_get_incoming_mode(void); +void cpr_set_incoming_mode(MigMode mode); + typedef int (*cpr_walk_fd_cb)(int fd); void cpr_save_fd(const char *name, int id, int fd); void cpr_delete_fd(const char *name, int id); diff --git a/migration/cpr.c b/migration/cpr.c index 313e74e..1c296c6 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -21,10 +21,23 @@ typedef QLIST_HEAD(CprFdList, CprFd) CprFdList; typedef struct CprState { + MigMode mode; CprFdList fds; } CprState; -static CprState cpr_state; +static CprState cpr_state = { + .mode = MIG_MODE_NONE, +}; + +MigMode cpr_get_incoming_mode(void) +{ + return cpr_state.mode; +} + +void cpr_set_incoming_mode(MigMode mode) +{ + cpr_state.mode = mode; +} /****************************************************************************/ @@ -124,11 +137,19 @@ void cpr_resave_fd(const char *name, int id, int fd) /*************************************************************************/ #define CPR_STATE "CprState" +static int cpr_state_presave(void *opaque) +{ + cpr_state.mode = migrate_mode(); + return 0; +} + static const VMStateDescription vmstate_cpr_state = { .name = CPR_STATE, .version_id = 1, .minimum_version_id = 1, + .pre_save = cpr_state_presave, .fields = (VMStateField[]) { + VMSTATE_UINT32(mode, CprState), VMSTATE_QLIST_V(fds, CprState, 1, vmstate_cpr_fd, CprFd, next), VMSTATE_END_OF_LIST() } diff --git a/migration/migration.c b/migration/migration.c index e394ad7..0f47765 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -411,6 +411,7 @@ void migration_incoming_state_destroy(void) mis->postcopy_qemufile_dst = NULL; } + cpr_set_incoming_mode(MIG_MODE_NONE); yank_unregister_instance(MIGRATION_YANK_INSTANCE); } diff --git a/migration/options.c b/migration/options.c index 645f550..305397a 100644 --- a/migration/options.c +++ b/migration/options.c @@ -22,6 +22,7 @@ #include "qapi/qmp/qnull.h" #include "sysemu/runstate.h" #include "migration/colo.h" +#include "migration/cpr.h" #include "migration/misc.h" #include "migration.h" #include "migration-stats.h" @@ -758,8 +759,12 @@ uint64_t migrate_max_postcopy_bandwidth(void) MigMode migrate_mode(void) { - MigrationState *s = migrate_get_current(); - MigMode mode = s->parameters.mode; + MigMode mode = cpr_get_incoming_mode(); + + if (mode == MIG_MODE_NONE) { + MigrationState *s = migrate_get_current(); + mode = s->parameters.mode; + } assert(mode >= 0 && mode < MIG_MODE__MAX); return mode; From patchwork Sun Jun 30 19:40:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AD0D0C30659 for ; Sun, 30 Jun 2024 19:41:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0Pz-00005g-CD; Sun, 30 Jun 2024 15:40:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0008SD-Ev for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:54 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Po-0004OD-3s for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:50 -0400 Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJaJeC017664; Sun, 30 Jun 2024 19:40:43 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=YMzHOnPI4SJdDNBTFEQliIPT3ScsAM18YJJIq7Tjh00=; b= ElT7uQztWXxHez8ZaMIP929QloRKHJPST6zCN9a5pukfEMJwamxanSFPyY7TEM12 YBDzl+WARLzh0pXSMzBAtns8EByPc+uTGoWpRQtdGkxtUCSN0en86ZvUEMy0BvtB 6se2hvFI1qR5F2npHs7SJz9DSmdUMJlE+YrB13p19izm8NmyfCx+ZWazAJYhIAMi S9+KjI39sN24tV+jxGXBqiexW9cAep2b0L8fF2YzMEFV0l4T67X8KbWEgz+X+J67 T6ipp+4NrcQp8AkVNF1r81zMBHel5d9WKEObjEWwZIor3cqEGk7H+uigJnmUhVLy Ma4AKbA043m9akUqCcgj9Q== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402att9e1w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:43 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UIF7Pd018995; Sun, 30 Jun 2024 19:40:41 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16de-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:41 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSW014044; Sun, 30 Jun 2024 19:40:41 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-5; Sun, 30 Jun 2024 19:40:41 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 04/11] migration: stop vm earlier for cpr Date: Sun, 30 Jun 2024 12:40:27 -0700 Message-Id: <1719776434-435013-5-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: qF5bfRn0hc62MZWOB_mm7YLhsdzvD6lP X-Proofpoint-ORIG-GUID: qF5bfRn0hc62MZWOB_mm7YLhsdzvD6lP Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Stop the vm earlier for cpr, to guarantee consistent device state when CPR state is saved. Signed-off-by: Steve Sistare --- migration/migration.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 0f47765..8a8e927 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2077,6 +2077,7 @@ void qmp_migrate(const char *uri, bool has_channels, MigrationState *s = migrate_get_current(); g_autoptr(MigrationChannel) channel = NULL; MigrationAddress *addr = NULL; + bool stopped = false; /* * Having preliminary checks for uri and channel @@ -2120,6 +2121,15 @@ void qmp_migrate(const char *uri, bool has_channels, } } + if (migrate_mode_is_cpr(s)) { + int ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); + if (ret < 0) { + error_setg(&local_err, "migration_stop_vm failed, error %d", -ret); + goto out; + } + stopped = true; + } + if (cpr_state_save(&local_err)) { goto out; } @@ -2155,6 +2165,9 @@ out: } migrate_fd_error(s, local_err); error_propagate(errp, local_err); + if (stopped && runstate_is_live(s->vm_old_state)) { + vm_start(); + } return; } } @@ -3738,7 +3751,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) Error *local_err = NULL; uint64_t rate_limit; bool resume = (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP); - int ret; /* * If there's a previous error, free it and prepare for another one. @@ -3810,14 +3822,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } - if (migrate_mode_is_cpr(s)) { - ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); - if (ret < 0) { - error_setg(&local_err, "migration_stop_vm failed, error %d", -ret); - goto fail; - } - } - if (migrate_background_snapshot()) { qemu_thread_create(&s->thread, "mig/snapshot", bg_migration_thread, s, QEMU_THREAD_JOINABLE); From patchwork Sun Jun 30 19:40:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B8460C30653 for ; Sun, 30 Jun 2024 19:42:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QH-0000EZ-9V; Sun, 30 Jun 2024 15:41:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0008SC-Ed for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:54 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Po-0004OF-6w for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:50 -0400 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJdPwI028776; Sun, 30 Jun 2024 19:40:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=LIBmEiC+D9GTXCGgocbHmU9iP3J5h3etyy3WxDqWsM8=; b= T/nc/yS0j9+J3xWkCXKvnA8AhFCbDMmDGs3RNBmvY4/JOpJbQSC9nE0TavUSv2qd 7JyH+vY67FDO7lwoK+UarT2rTo7wh8YRjxLus6Vk4zYzc2MaNqY+yamSjaxKA/jP Diz/dWxUiAkTy1tWQQi7tq0OEljdyH1LycY5fUoF4fMjHGhm2FGQLZ6pEbHQuB7T s/WuOHzh8BELQ4gQyTKTU6s3gxSH4QaqKWo4/z5G6462CVtMeZiegWOMu6W3Eg0v WvqM9Hk/L6bktpwCKULRHbtD0LJbZw7t1rvZbWU0GGSsXaKOWXulS67WyyNnyt+V ZXeLkX3AdK+u+jqDt0jU8g== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402a591erd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:44 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UEU59V018387; Sun, 30 Jun 2024 19:40:43 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16dn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:43 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSY014044; Sun, 30 Jun 2024 19:40:42 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-6; Sun, 30 Jun 2024 19:40:42 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 05/11] physmem: preserve ram blocks for cpr Date: Sun, 30 Jun 2024 12:40:28 -0700 Message-Id: <1719776434-435013-6-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: 3RMlwOa6TIJr3HL8RT4KpdnV9-266g-2 X-Proofpoint-ORIG-GUID: 3RMlwOa6TIJr3HL8RT4KpdnV9-266g-2 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Save the memfd for anonymous ramblocks in CPR state, along with a name that uniquely identifies it. The block's idstr is not yet set, so it cannot be used for this purpose. Find the saved memfd in new QEMU when creating a block. QEMU hard-codes the length of some internally-created blocks, so to guard against that length changing, use lseek to get the actual length of an incoming memfd. Signed-off-by: Steve Sistare --- system/physmem.c | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/system/physmem.c b/system/physmem.c index efe95ff..e37352e 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -73,6 +73,7 @@ #include "qapi/qapi-types-migration.h" #include "migration/options.h" +#include "migration/cpr.h" #include "migration/vmstate.h" #include "qemu/range.h" @@ -1641,6 +1642,19 @@ void qemu_ram_unset_idstr(RAMBlock *block) } } +static char *cpr_name(RAMBlock *block) +{ + MemoryRegion *mr = block->mr; + const char *mr_name = memory_region_name(mr); + g_autofree char *id = mr->dev ? qdev_get_dev_path(mr->dev) : NULL; + + if (id) { + return g_strdup_printf("%s/%s", id, mr_name); + } else { + return g_strdup(mr_name); + } +} + size_t qemu_ram_pagesize(RAMBlock *rb) { return rb->page_size; @@ -1836,13 +1850,17 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) } else if (new_block->flags & RAM_SHARED) { size_t max_length = new_block->max_length; MemoryRegion *mr = new_block->mr; - const char *name = memory_region_name(mr); + g_autofree char *name = cpr_name(new_block); new_block->mr->align = QEMU_VMALLOC_ALIGN; + new_block->fd = cpr_find_fd(name, 0); if (new_block->fd == -1) { new_block->fd = qemu_memfd_create(name, max_length + mr->align, 0, 0, 0, errp); + cpr_save_fd(name, 0, new_block->fd); + } else { + new_block->max_length = lseek(new_block->fd, 0, SEEK_END); } if (new_block->fd >= 0) { @@ -1852,6 +1870,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp) false, 0, errp); } if (!new_block->host) { + cpr_delete_fd(name, 0); qemu_mutex_unlock_ramlist(); return; } @@ -2162,6 +2181,8 @@ static void reclaim_ramblock(RAMBlock *block) void qemu_ram_free(RAMBlock *block) { + g_autofree char *name = NULL; + if (!block) { return; } @@ -2172,6 +2193,8 @@ void qemu_ram_free(RAMBlock *block) } qemu_mutex_lock_ramlist(); + name = cpr_name(block); + cpr_delete_fd(name, 0); QLIST_REMOVE_RCU(block, next); ram_list.mru_block = NULL; /* Write list before version */ From patchwork Sun Jun 30 19:40:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1590CC27C4F for ; Sun, 30 Jun 2024 19:42:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QB-0000Ah-Aq; Sun, 30 Jun 2024 15:41:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0008S6-DW for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:54 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pp-0004OO-Kc for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:51 -0400 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJeY21019514; Sun, 30 Jun 2024 19:40:46 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=rHRV/raAz2FvaOpg5bcnOa8jzjLwVEUJugMChsrTwKY=; b= BUaCgZX/tkJ7r0n7WCGIv50nbcR3AFqypOAN5nHmjTDmWpd7xE+5GLffYqAN0FUd Go2VN4dM0UQ+bcDWqFMmC9oYDtrsvH9KQlMiOfuq8JGK2L1O0/N4j1CSbIA5jNsJ zIx5p55DVGxOiBkmO0Hh2hmJ3bNdnD9Q8iwe+5xNI8AwRuKy7fmWqHfVeE95s6mG hVaw8O4xwkTq4+iYu3fiGbL8sSVG757Hv5oSA/JgoK6xGEu9WjVoM0SF0HFpjBB+ QCNtPEMc6VBfZ+YbYzdYW3RjtTB0Y7Mv9xFdU2QFI4FbZ0K7xE4XDid466rKjgST 0z2qUJrtB7e/NPJKrOKTtw== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 40296asg9p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:45 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UFckWb018336; Sun, 30 Jun 2024 19:40:44 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16dv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:44 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSa014044; Sun, 30 Jun 2024 19:40:43 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-7; Sun, 30 Jun 2024 19:40:43 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 06/11] migration: fix mismatched GPAs during cpr Date: Sun, 30 Jun 2024 12:40:29 -0700 Message-Id: <1719776434-435013-7-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: BpDX3Ai9npk0ioyPus9WVjdZe8Pf9jCv X-Proofpoint-ORIG-GUID: BpDX3Ai9npk0ioyPus9WVjdZe8Pf9jCv Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org For new cpr modes, ramblock_is_ignored will always be true, because the memory is preserved in place rather than copied. However, for an ignored block, parse_ramblock currently requires that the received address of the block must match the address of the statically initialized region on the target. This fails for a PCI rom block, because the memory region address is set when the guest writes to a BAR on the source, which does not occur on the target, causing a "Mismatched GPAs" error during cpr migration. To fix, unconditionally set the target's address to the source's address if the target region does not have an address yet. Signed-off-by: Steve Sistare Reviewed-by: Fabiano Rosas --- include/exec/memory.h | 12 ++++++++++++ migration/ram.c | 15 +++++++++------ system/memory.c | 10 ++++++++-- 3 files changed, 29 insertions(+), 8 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index c26ede3..227169e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -811,6 +811,7 @@ struct MemoryRegion { bool ram_device; bool enabled; bool warning_printed; /* For reservations */ + bool has_addr; uint8_t vga_logging_count; MemoryRegion *alias; hwaddr alias_offset; @@ -2408,6 +2409,17 @@ void memory_region_set_enabled(MemoryRegion *mr, bool enabled); void memory_region_set_address(MemoryRegion *mr, hwaddr addr); /* + * memory_region_set_address_only: set the address of a region. + * + * Same as memory_region_set_address, but without causing transaction side + * effects. + * + * @mr: the region to be updated + * @addr: new address, relative to container region + */ +void memory_region_set_address_only(MemoryRegion *mr, hwaddr addr); + +/* * memory_region_set_size: dynamically update the size of a region. * * Dynamically updates the size of a region. diff --git a/migration/ram.c b/migration/ram.c index edec1a2..eaf3151 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -4059,12 +4059,15 @@ static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length) } if (migrate_ignore_shared()) { hwaddr addr = qemu_get_be64(f); - if (migrate_ram_is_ignored(block) && - block->mr->addr != addr) { - error_report("Mismatched GPAs for block %s " - "%" PRId64 "!= %" PRId64, block->idstr, - (uint64_t)addr, (uint64_t)block->mr->addr); - return -EINVAL; + if (migrate_ram_is_ignored(block)) { + if (!block->mr->has_addr) { + memory_region_set_address_only(block->mr, addr); + } else if (block->mr->addr != addr) { + error_report("Mismatched GPAs for block %s " + "%" PRId64 "!= %" PRId64, block->idstr, + (uint64_t)addr, (uint64_t)block->mr->addr); + return -EINVAL; + } } } ret = rdma_block_notification_handle(f, block->idstr); diff --git a/system/memory.c b/system/memory.c index 28a837d..b7548bf 100644 --- a/system/memory.c +++ b/system/memory.c @@ -2655,7 +2655,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr, for (alias = subregion->alias; alias; alias = alias->alias) { alias->mapped_via_alias++; } - subregion->addr = offset; + memory_region_set_address_only(subregion, offset); memory_region_update_container_subregions(subregion); } @@ -2735,10 +2735,16 @@ static void memory_region_readd_subregion(MemoryRegion *mr) } } +void memory_region_set_address_only(MemoryRegion *mr, hwaddr addr) +{ + mr->addr = addr; + mr->has_addr = true; +} + void memory_region_set_address(MemoryRegion *mr, hwaddr addr) { if (addr != mr->addr) { - mr->addr = addr; + memory_region_set_address_only(mr, addr); memory_region_readd_subregion(mr); } } From patchwork Sun Jun 30 19:40:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717318 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CFA24C27C4F for ; Sun, 30 Jun 2024 19:43:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QH-0000En-Cr; Sun, 30 Jun 2024 15:41:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Px-0008VJ-Lg for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:57 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0004Op-4q for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:57 -0400 Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJbKi2018314; Sun, 30 Jun 2024 19:40:47 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s= corp-2023-11-20; bh=i5+JyvSJETfbDxmWNrKcfGvEku7jqW3wlnKNZ9xh0Mg=; b= LtTGZmBMFrWNAKRpn7Xo3R+wucLhFJg7/e5JjDMWU9fOSHj1VEcdcURIQzFeMtSp kcFlVE4zq/zx7m42wcFwSdMKzA2jOT5xQXniaVzmVwl4u0I6LqqwvvhzDBdBlpgI j9VSA0UJK0OzF8KmJWfXROd2wXg9kKBvpwIY6qQaG+ITlsPiS2OUgu6utrErh3EL ZOPobomSxiMb8oMNAL/T+7dOUn2uCVf7iCmz8PcqgEXX8Thz8xOuwrulz+Vo81rr LZeX3xHoMbMM8SGoL6U/ZD4XMuLgzOluYdP7FWLMEB7yNTsj6hEu1ELJ3MdT5B9j O478EYKPNHlmY+B8WmvVww== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402att9e20-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:47 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UEQnGW018357; Sun, 30 Jun 2024 19:40:45 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16e1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:45 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSc014044; Sun, 30 Jun 2024 19:40:45 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-8; Sun, 30 Jun 2024 19:40:45 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 07/11] oslib: qemu_clear_cloexec Date: Sun, 30 Jun 2024 12:40:30 -0700 Message-Id: <1719776434-435013-8-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: nYpOXhzlORUaiaL8QVXBAr9kQ7dWBiUR X-Proofpoint-ORIG-GUID: nYpOXhzlORUaiaL8QVXBAr9kQ7dWBiUR Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Define qemu_clear_cloexec, analogous to qemu_set_cloexec. Signed-off-by: Steve Sistare Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Marc-AndrĂ© Lureau Reviewed-by: Fabiano Rosas --- include/qemu/osdep.h | 9 +++++++++ util/oslib-posix.c | 9 +++++++++ util/oslib-win32.c | 4 ++++ 3 files changed, 22 insertions(+) diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 191916f..6ebf192 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -662,6 +662,15 @@ ssize_t qemu_write_full(int fd, const void *buf, size_t count) void qemu_set_cloexec(int fd); +/* + * Clear FD_CLOEXEC for a descriptor. + * + * The caller must guarantee that no other fork+exec's occur before the + * exec that is intended to inherit this descriptor, eg by suspending CPUs + * and blocking monitor commands. + */ +void qemu_clear_cloexec(int fd); + /* Return a dynamically allocated directory path that is appropriate for storing * local state. * diff --git a/util/oslib-posix.c b/util/oslib-posix.c index e764416..614c3e5 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -272,6 +272,15 @@ int qemu_socketpair(int domain, int type, int protocol, int sv[2]) return ret; } +void qemu_clear_cloexec(int fd) +{ + int f; + f = fcntl(fd, F_GETFD); + assert(f != -1); + f = fcntl(fd, F_SETFD, f & ~FD_CLOEXEC); + assert(f != -1); +} + char * qemu_get_local_state_dir(void) { diff --git a/util/oslib-win32.c b/util/oslib-win32.c index b623830..c3e969a 100644 --- a/util/oslib-win32.c +++ b/util/oslib-win32.c @@ -222,6 +222,10 @@ void qemu_set_cloexec(int fd) { } +void qemu_clear_cloexec(int fd) +{ +} + int qemu_get_thread_id(void) { return GetCurrentThreadId(); From patchwork Sun Jun 30 19:40:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717308 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4FC92C27C4F for ; Sun, 30 Jun 2024 19:41:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0Q5-00008R-7Y; Sun, 30 Jun 2024 15:41:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pw-0008UA-Fa for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:56 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0004Oq-5L for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:56 -0400 Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UIwcCx022697; Sun, 30 Jun 2024 19:40:48 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=4uKXgAn4d6hJh3JzwucC45gtPiookX4h8suJNmSSDmA=; b= AwV3ZTBLF7NOyDdrwxzRUONH+cwp7tO2g/DpZV1KW686YsfmGsHfFHfGmFzOEjSU aQJE2tmp6/ZqLJ7T9TKbBwlTQb8XGxKPpykWueZ1OuSJqAGAtDXjQwAUHgw8KVqM rJciOGYJyUOEFSeTo86EEy6vat0k0aymvwOxXAS90GQNgAnCcgm9S0S6BYd6vDd0 GlLBNlko90mGWppRpmhv5Ofb1cUgIHzLTOiquHD37Je//66N1PeLw4PfqCyyx7Sr Y1lCh/v0+tsxv4OtwsVxZMwvF7blNwpwVeKwABns3imTm1BWELQc1O1X+BhAa9DI 4GkOcSDOGOapIoTCye3PNw== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 4028v0hgjw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:48 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UEQnGX018357; Sun, 30 Jun 2024 19:40:47 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16e6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:46 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSe014044; Sun, 30 Jun 2024 19:40:46 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-9; Sun, 30 Jun 2024 19:40:46 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 08/11] vl: helper to request exec Date: Sun, 30 Jun 2024 12:40:31 -0700 Message-Id: <1719776434-435013-9-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-ORIG-GUID: JLC7LN32UQinUQ8IaB4vS5uDV9YeJhZc X-Proofpoint-GUID: JLC7LN32UQinUQ8IaB4vS5uDV9YeJhZc Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add a qemu_system_exec_request() hook that causes the main loop to exit and exec a command using the specified arguments. This will be used during CPR to exec a new version of QEMU. Signed-off-by: Steve Sistare --- include/sysemu/runstate.h | 3 +++ system/runstate.c | 29 +++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h index 0117d24..cb669cf 100644 --- a/include/sysemu/runstate.h +++ b/include/sysemu/runstate.h @@ -80,6 +80,8 @@ typedef enum WakeupReason { QEMU_WAKEUP_REASON_OTHER, } WakeupReason; +typedef void (*qemu_exec_func)(char **exec_argv); + void qemu_system_reset_request(ShutdownCause reason); void qemu_system_suspend_request(void); void qemu_register_suspend_notifier(Notifier *notifier); @@ -91,6 +93,7 @@ void qemu_register_wakeup_support(void); void qemu_system_shutdown_request_with_code(ShutdownCause reason, int exit_code); void qemu_system_shutdown_request(ShutdownCause reason); +void qemu_system_exec_request(qemu_exec_func func, const strList *args); void qemu_system_powerdown_request(void); void qemu_register_powerdown_notifier(Notifier *notifier); void qemu_register_shutdown_notifier(Notifier *notifier); diff --git a/system/runstate.c b/system/runstate.c index ec32e27..afc56e4 100644 --- a/system/runstate.c +++ b/system/runstate.c @@ -40,6 +40,7 @@ #include "qapi/error.h" #include "qapi/qapi-commands-run-state.h" #include "qapi/qapi-events-run-state.h" +#include "qapi/type-helpers.h" #include "qemu/accel.h" #include "qemu/error-report.h" #include "qemu/job.h" @@ -400,6 +401,8 @@ static NotifierList wakeup_notifiers = static NotifierList shutdown_notifiers = NOTIFIER_LIST_INITIALIZER(shutdown_notifiers); static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE); +qemu_exec_func exec_func; +static char **exec_argv; ShutdownCause qemu_shutdown_requested_get(void) { @@ -416,6 +419,11 @@ static int qemu_shutdown_requested(void) return qatomic_xchg(&shutdown_requested, SHUTDOWN_CAUSE_NONE); } +static int qemu_exec_requested(void) +{ + return exec_argv != NULL; +} + static void qemu_kill_report(void) { if (!qtest_driver() && shutdown_signal) { @@ -693,6 +701,23 @@ void qemu_system_shutdown_request(ShutdownCause reason) qemu_notify_event(); } +static void qemu_system_exec(void) +{ + exec_func(exec_argv); + + /* exec failed */ + g_strfreev(exec_argv); + exec_argv = NULL; + exec_func = NULL; +} + +void qemu_system_exec_request(qemu_exec_func func, const strList *args) +{ + exec_func = func; + exec_argv = strv_from_str_list(args); + qemu_notify_event(); +} + static void qemu_system_powerdown(void) { qapi_event_send_powerdown(); @@ -739,6 +764,10 @@ static bool main_loop_should_exit(int *status) if (qemu_suspend_requested()) { qemu_system_suspend(); } + if (qemu_exec_requested()) { + qemu_system_exec(); + return false; + } request = qemu_shutdown_requested(); if (request) { qemu_kill_report(); From patchwork Sun Jun 30 19:40:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E556AC27C4F for ; Sun, 30 Jun 2024 19:42:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QI-0000Er-3y; Sun, 30 Jun 2024 15:41:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pw-0008UF-Fy for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:56 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pu-0004PE-5I for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:56 -0400 Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJOOO6012862; Sun, 30 Jun 2024 19:40:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=0niKNmTF4WrGuB6zYfEq+bYLo3vZ/K4GncHP23R0cUo=; b= TnkizE5xSCefKzkKTzjUW540sd9EViOrK3E6ZWCBEz1O2k6/hTxRzgjTJCxPLJCk et1z6/HgtLx/HGMo4pg2Gc8WfHEEPUFPybdQqbQEGjw33SNAouDoBkBpfA7FJUp/ 43ROB9trnt7cjN1PGj6n++uRfIwBCyIJkxP4NdmVwmXbVT5989oISjRU+Ul0rv8h h29X3epDhvtfYO9fBB4kV5u+nhxKSA9u9qHMNzgmIPf2i0KIIcjcS7F5uX67B/Gs EOvL3NqYzP5f+l3Ak7PV2gsbT+JFU6bBbiFC+3a6rHke6mmjKPp44okq6DGf8gZg iQ/y599jwkFNC575OUgprQ== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402a591erf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:49 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UG0nJ0019121; Sun, 30 Jun 2024 19:40:48 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16ee-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:48 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSg014044; Sun, 30 Jun 2024 19:40:47 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-10; Sun, 30 Jun 2024 19:40:47 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 09/11] migration: cpr-exec-command parameter Date: Sun, 30 Jun 2024 12:40:32 -0700 Message-Id: <1719776434-435013-10-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: gnP3ihmOBfFg8Rd5CkNMWkvpBjVzPV0P X-Proofpoint-ORIG-GUID: gnP3ihmOBfFg8Rd5CkNMWkvpBjVzPV0P Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Create the cpr-exec-command migration parameter, defined as a list of strings. It will be used for cpr-exec migration mode in a subsequent patch, and contains forward references to cpr-exec mode in the qapi doc. No functional change, except that cpr-exec-command is shown by the 'info migrate' command. Signed-off-by: Steve Sistare --- hmp-commands.hx | 2 +- migration/migration-hmp-cmds.c | 25 +++++++++++++++++++++++++ migration/options.c | 14 ++++++++++++++ qapi/migration.json | 21 ++++++++++++++++++--- 4 files changed, 58 insertions(+), 4 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index 06746f0..0e04eac 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1009,7 +1009,7 @@ ERST { .name = "migrate_set_parameter", - .args_type = "parameter:s,value:s", + .args_type = "parameter:s,value:S", .params = "parameter value", .help = "Set the parameter for migration", .cmd = hmp_migrate_set_parameter, diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 7d608d2..16a4b00 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -229,6 +229,18 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict) qapi_free_MigrationCapabilityStatusList(caps); } +static void monitor_print_cpr_exec_command(Monitor *mon, strList *args) +{ + monitor_printf(mon, "%s:", + MigrationParameter_str(MIGRATION_PARAMETER_CPR_EXEC_COMMAND)); + + while (args) { + monitor_printf(mon, " %s", args->value); + args = args->next; + } + monitor_printf(mon, "\n"); +} + void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) { MigrationParameters *params; @@ -358,6 +370,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) MIGRATION_PARAMETER_DIRECT_IO), params->direct_io ? "on" : "off"); } + + assert(params->has_cpr_exec_command); + monitor_print_cpr_exec_command(mon, params->cpr_exec_command); } qapi_free_MigrationParameters(params); @@ -635,6 +650,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_direct_io = true; visit_type_bool(v, param, &p->direct_io, &err); break; + case MIGRATION_PARAMETER_CPR_EXEC_COMMAND: { + g_autofree char **strv = g_strsplit(valuestr ?: "", " ", -1); + strList **tail = &p->cpr_exec_command; + + for (int i = 0; strv[i]; i++) { + QAPI_LIST_APPEND(tail, strv[i]); + } + p->has_cpr_exec_command = true; + break; + } default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 305397a..b8d5f72 100644 --- a/migration/options.c +++ b/migration/options.c @@ -931,6 +931,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->zero_page_detection = s->parameters.zero_page_detection; params->has_direct_io = true; params->direct_io = s->parameters.direct_io; + params->has_cpr_exec_command = true; + params->cpr_exec_command = QAPI_CLONE(strList, + s->parameters.cpr_exec_command); return params; } @@ -964,6 +967,7 @@ void migrate_params_init(MigrationParameters *params) params->has_mode = true; params->has_zero_page_detection = true; params->has_direct_io = true; + params->has_cpr_exec_command = true; } /* @@ -1252,6 +1256,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_direct_io) { dest->direct_io = params->direct_io; } + + if (params->has_cpr_exec_command) { + dest->cpr_exec_command = params->cpr_exec_command; + } } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1381,6 +1389,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_direct_io) { s->parameters.direct_io = params->direct_io; } + + if (params->has_cpr_exec_command) { + qapi_free_strList(s->parameters.cpr_exec_command); + s->parameters.cpr_exec_command = + QAPI_CLONE(strList, params->cpr_exec_command); + } } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 0f24206..20092d2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -829,6 +829,10 @@ # only has effect if the @mapped-ram capability is enabled. # (Since 9.1) # +# @cpr-exec-command: Command to start the new QEMU process when @mode +# is @cpr-exec. The first list element is the program's filename, +# the remainder its arguments. (Since 9.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @@ -854,7 +858,8 @@ 'vcpu-dirty-limit', 'mode', 'zero-page-detection', - 'direct-io'] } + 'direct-io', + 'cpr-exec-command'] } ## # @MigrateSetParameters: @@ -1004,6 +1009,10 @@ # only has effect if the @mapped-ram capability is enabled. # (Since 9.1) # +# @cpr-exec-command: Command to start the new QEMU process when @mode +# is @cpr-exec. The first list element is the program's filename, +# the remainder its arguments. (Since 9.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @@ -1044,7 +1053,8 @@ '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', '*zero-page-detection': 'ZeroPageDetection', - '*direct-io': 'bool' } } + '*direct-io': 'bool', + '*cpr-exec-command': [ 'str' ]} } ## # @migrate-set-parameters: @@ -1208,6 +1218,10 @@ # only has effect if the @mapped-ram capability is enabled. # (Since 9.1) # +# @cpr-exec-command: Command to start the new QEMU process when @mode +# is @cpr-exec. The first list element is the program's filename, +# the remainder its arguments. (Since 9.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @@ -1245,7 +1259,8 @@ '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', '*zero-page-detection': 'ZeroPageDetection', - '*direct-io': 'bool' } } + '*direct-io': 'bool', + '*cpr-exec-command': [ 'str' ]} } ## # @query-migrate-parameters: From patchwork Sun Jun 30 19:40:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B85DCC27C4F for ; Sun, 30 Jun 2024 19:41:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QC-0000At-TC; Sun, 30 Jun 2024 15:41:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Px-0008Vc-Ql for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:58 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Pv-0004Pf-AM for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:40:57 -0400 Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UJDtaG021987; Sun, 30 Jun 2024 19:40:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=b028g9TMxwjL28USZRQNrGWTPltZfSI206ZNnMeo+FE=; b= V81i7Hi0rFgrkpqkfcmY9IJ2tBsxa4ZD9ONcztXag4fvRq8MNWq2o4pxnYS454bx p2xkL+WQ+7crQHEJjsd30eTXjpcA/DTgrdPg8byljydCVdD8ASxiuqOEFS+VVtuy oL7m9dpmPw8OyBhznyRsRVIvmv9Zj2239M4d+eKs2H1jv2dxr8u72KvSGcASNtwF +KPPOtRxZDI1y7m4410VaXZG3+1iDLqvybdZAFeBQrAY/YXNKKwlQJFoc5XJcMM2 i/AkxTiv+Tatrdfu/nnbkpIFJSUB4y8byjlSJlCQdi3LbLYCdFDthynahO7H5BD/ wnTxWndv1NzzYRsPvRNHag== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 40296asg9q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:50 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UI4VL7018993; Sun, 30 Jun 2024 19:40:49 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16ek-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:49 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSi014044; Sun, 30 Jun 2024 19:40:48 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-11; Sun, 30 Jun 2024 19:40:48 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 10/11] migration: cpr-exec save and load Date: Sun, 30 Jun 2024 12:40:33 -0700 Message-Id: <1719776434-435013-11-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: dfX4xSzsgK0SbDrY6mNJmAPiS-CwiTHy X-Proofpoint-ORIG-GUID: dfX4xSzsgK0SbDrY6mNJmAPiS-CwiTHy Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org To preserve CPR state across exec, create a QEMUFile based on a memfd, and keep the memfd open across exec. Save the value of the memfd in an environment variable so post-exec QEMU can find it. These new functions are called in a subsequent patch. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 5 +++ migration/cpr-exec.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++ migration/meson.build | 1 + 3 files changed, 101 insertions(+) create mode 100644 migration/cpr-exec.c diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 42b4019..76d6ccb 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -25,4 +25,9 @@ void cpr_resave_fd(const char *name, int id, int fd); int cpr_state_save(Error **errp); int cpr_state_load(Error **errp); +QEMUFile *cpr_exec_output(Error **errp); +QEMUFile *cpr_exec_input(Error **errp); +void cpr_exec_persist_state(QEMUFile *f); +bool cpr_exec_has_state(void); +void cpr_exec_unpersist_state(void); #endif diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c new file mode 100644 index 0000000..5c40457 --- /dev/null +++ b/migration/cpr-exec.c @@ -0,0 +1,95 @@ +/* + * Copyright (c) 2021-2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/cutils.h" +#include "qemu/memfd.h" +#include "qapi/error.h" +#include "io/channel-file.h" +#include "io/channel-socket.h" +#include "migration/cpr.h" +#include "migration/qemu-file.h" +#include "migration/misc.h" +#include "migration/vmstate.h" +#include "sysemu/runstate.h" + +#define CPR_EXEC_STATE_NAME "QEMU_CPR_EXEC_STATE" + +static QEMUFile *qemu_file_new_fd_input(int fd, const char *name) +{ + g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd); + QIOChannel *ioc = QIO_CHANNEL(fioc); + qio_channel_set_name(ioc, name); + return qemu_file_new_input(ioc); +} + +static QEMUFile *qemu_file_new_fd_output(int fd, const char *name) +{ + g_autoptr(QIOChannelFile) fioc = qio_channel_file_new_fd(fd); + QIOChannel *ioc = QIO_CHANNEL(fioc); + qio_channel_set_name(ioc, name); + return qemu_file_new_output(ioc); +} + +void cpr_exec_persist_state(QEMUFile *f) +{ + QIOChannelFile *fioc = QIO_CHANNEL_FILE(qemu_file_get_ioc(f)); + int mfd = dup(fioc->fd); + char val[16]; + + /* Remember mfd in environment for post-exec load */ + qemu_clear_cloexec(mfd); + snprintf(val, sizeof(val), "%d", mfd); + g_setenv(CPR_EXEC_STATE_NAME, val, 1); +} + +static int cpr_exec_find_state(void) +{ + const char *val = g_getenv(CPR_EXEC_STATE_NAME); + int mfd; + + assert(val); + g_unsetenv(CPR_EXEC_STATE_NAME); + assert(!qemu_strtoi(val, NULL, 10, &mfd)); + return mfd; +} + +bool cpr_exec_has_state(void) +{ + return g_getenv(CPR_EXEC_STATE_NAME) != NULL; +} + +void cpr_exec_unpersist_state(void) +{ + int mfd; + const char *val = g_getenv(CPR_EXEC_STATE_NAME); + + g_unsetenv(CPR_EXEC_STATE_NAME); + assert(val); + assert(!qemu_strtoi(val, NULL, 10, &mfd)); + close(mfd); +} + +QEMUFile *cpr_exec_output(Error **errp) +{ + int mfd = memfd_create(CPR_EXEC_STATE_NAME, 0); + + if (mfd < 0) { + error_setg_errno(errp, errno, "memfd_create failed"); + return NULL; + } + + return qemu_file_new_fd_output(mfd, CPR_EXEC_STATE_NAME); +} + +QEMUFile *cpr_exec_input(Error **errp) +{ + int mfd = cpr_exec_find_state(); + + lseek(mfd, 0, SEEK_SET); + return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME); +} diff --git a/migration/meson.build b/migration/meson.build index 87feb4c..dd1d315 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -14,6 +14,7 @@ system_ss.add(files( 'channel.c', 'channel-block.c', 'cpr.c', + 'cpr-exec.c', 'dirtyrate.c', 'exec.c', 'fd.c', From patchwork Sun Jun 30 19:40:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Sistare X-Patchwork-Id: 13717315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4C986C30653 for ; Sun, 30 Jun 2024 19:42:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sO0QO-0000G7-Qf; Sun, 30 Jun 2024 15:41:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0QB-0000As-Cs for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:41:12 -0400 Received: from mx0a-00069f02.pphosted.com ([205.220.165.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sO0Q6-0004Pp-7W for qemu-devel@nongnu.org; Sun, 30 Jun 2024 15:41:11 -0400 Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 45UIt2tN001382; Sun, 30 Jun 2024 19:40:52 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references; s= corp-2023-11-20; bh=s2nPXNUcWaNjXppRfhXzp3VvXyVUj6SXPjOtMadq9Ac=; b= fZmQ1aySNvNYlSxjrzZFDf0dYFq6HcZkiLb1G6ZeTsmsOWuKT9aZg15zmybe4CCX qMEGX9EhStLeL6blSncH6p0vzory5i21y0VDnGimNSjg2MK49DEkTv95rxH6EXBy QUk5uM7XmIg4zOhhWGlw1udKhrU1lD9tf/auw6PRCWkhI7I/uGq2fLhWbAzEEl8Y 9bcg3bz9qiueT3cAxN9HEryud0dJM1BIIma52W/EKb48/4LGOGf/KE1hHlnYg+0s rrmQaJGmq9k/kcQVDwT90imlT8eixWtjfwnL3rVnA/+LszTpqMtUVrneU2zaHmLD lWkMOOz23oY3Suw4BDPgtw== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 402att9e21-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:52 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 45UIF7Pe018995; Sun, 30 Jun 2024 19:40:50 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 4028qc16ev-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 30 Jun 2024 19:40:50 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 45UJeaSk014044; Sun, 30 Jun 2024 19:40:50 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 4028qc16cc-12; Sun, 30 Jun 2024 19:40:50 +0000 From: Steve Sistare To: qemu-devel@nongnu.org Cc: Peter Xu , Fabiano Rosas , David Hildenbrand , Marcel Apfelbaum , Eduardo Habkost , Philippe Mathieu-Daude , Paolo Bonzini , "Daniel P. Berrange" , Markus Armbruster , Steve Sistare Subject: [PATCH V2 11/11] migration: cpr-exec mode Date: Sun, 30 Jun 2024 12:40:34 -0700 Message-Id: <1719776434-435013-12-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> References: <1719776434-435013-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-06-30_16,2024-06-28_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 adultscore=0 bulkscore=0 mlxlogscore=999 phishscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2406300157 X-Proofpoint-GUID: 5kF2f7jvzqH_aire-mOMf0vaaCopLgJ4 X-Proofpoint-ORIG-GUID: 5kF2f7jvzqH_aire-mOMf0vaaCopLgJ4 Received-SPF: pass client-ip=205.220.165.32; envelope-from=steven.sistare@oracle.com; helo=mx0a-00069f02.pphosted.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add the cpr-exec migration mode. Usage: qemu-system-$arch -machine anon-alloc=memfd ... migrate_set_parameter mode cpr-exec migrate_set_parameter cpr-exec-command \ ... -incoming \ migrate -d The migrate command stops the VM, saves state to uri-1, directly exec's a new version of QEMU on the same host, replacing the original process while retaining its PID, and loads state from uri-1. Guest RAM is preserved in place, albeit with new virtual addresses. The new QEMU process is started by exec'ing the command specified by the @cpr-exec-command parameter. The first word of the command is the binary, and the remaining words are its arguments. The command may be a direct invocation of new QEMU, or may be a non-QEMU command that exec's the new QEMU binary. This mode creates a second migration channel that is not visible to the user. At the start of migration, old QEMU saves CPR state to the second channel, and at the end of migration, it tells the main loop to call cpr_exec. New QEMU loads CPR state early, before objects are created. Because old QEMU terminates when new QEMU starts, one cannot stream data between the two, so uri-1 must be a type, such as a file, that accepts all data before old QEMU exits. Otherwise, old QEMU may quietly block writing to the channel. Memory-backend objects must have the share=on attribute, but memory-backend-epc is not supported. The VM must be started with the '-machine anon-alloc=memfd' option, which allows anonymous memory to be transferred in place to the new process. The memfds are kept open across exec by clearing the close-on-exec flag, their values are saved in CPR state, and they are mmap'd in new qemu. Note that the anon-alloc option is not related to memory-backend-memfd. Later patches add support for memory-backend-memfd. Signed-off-by: Steve Sistare --- include/migration/cpr.h | 2 ++ migration/cpr-exec.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++++ migration/cpr.c | 37 ++++++++++++++++++--- migration/migration.c | 14 ++++++-- migration/ram.c | 2 ++ qapi/migration.json | 24 +++++++++++++- 6 files changed, 157 insertions(+), 7 deletions(-) diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 76d6ccb..c6c60f8 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -30,4 +30,6 @@ QEMUFile *cpr_exec_input(Error **errp); void cpr_exec_persist_state(QEMUFile *f); bool cpr_exec_has_state(void); void cpr_exec_unpersist_state(void); +void cpr_mig_init(void); +void cpr_unpreserve_fds(void); #endif diff --git a/migration/cpr-exec.c b/migration/cpr-exec.c index 5c40457..fd65435 100644 --- a/migration/cpr-exec.c +++ b/migration/cpr-exec.c @@ -11,8 +11,11 @@ #include "qapi/error.h" #include "io/channel-file.h" #include "io/channel-socket.h" +#include "block/block-global-state.h" +#include "qemu/main-loop.h" #include "migration/cpr.h" #include "migration/qemu-file.h" +#include "migration/migration.h" #include "migration/misc.h" #include "migration/vmstate.h" #include "sysemu/runstate.h" @@ -93,3 +96,85 @@ QEMUFile *cpr_exec_input(Error **errp) lseek(mfd, 0, SEEK_SET); return qemu_file_new_fd_input(mfd, CPR_EXEC_STATE_NAME); } + +static int preserve_fd(int fd) +{ + qemu_clear_cloexec(fd); + return 0; +} + +static int unpreserve_fd(int fd) +{ + qemu_set_cloexec(fd); + return 0; +} + +static void cpr_preserve_fds(void) +{ + cpr_walk_fd(preserve_fd); +} + +void cpr_unpreserve_fds(void) +{ + cpr_walk_fd(unpreserve_fd); +} + +static void cpr_exec(char **argv) +{ + MigrationState *s = migrate_get_current(); + Error *err = NULL; + + /* + * Clear the close-on-exec flag for all preserved fd's. We cannot do so + * earlier because they should not persist across miscellaneous fork and + * exec calls that are performed during normal operation. + */ + cpr_preserve_fds(); + + execvp(argv[0], argv); + + cpr_unpreserve_fds(); + + error_setg_errno(&err, errno, "execvp %s failed", argv[0]); + error_report_err(error_copy(err)); + migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED); + migrate_set_error(s, err); + + migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL); + + if (s->block_inactive) { + Error *local_err = NULL; + bdrv_activate_all(&local_err); + if (local_err) { + error_report_err(local_err); + return; + } else { + s->block_inactive = false; + } + } + + if (runstate_is_live(s->vm_old_state)) { + vm_start(); + } +} + +static int cpr_exec_notifier(NotifierWithReturn *notifier, MigrationEvent *e, + Error **errp) +{ + MigrationState *s = migrate_get_current(); + + if (e->type == MIG_EVENT_PRECOPY_DONE) { + assert(s->state == MIGRATION_STATUS_COMPLETED); + qemu_system_exec_request(cpr_exec, s->parameters.cpr_exec_command); + } else if (e->type == MIG_EVENT_PRECOPY_FAILED) { + cpr_exec_unpersist_state(); + } + return 0; +} + +void cpr_mig_init(void) +{ + static NotifierWithReturn notifier; + migration_add_notifier_mode(¬ifier, cpr_exec_notifier, + MIG_MODE_CPR_EXEC); +} diff --git a/migration/cpr.c b/migration/cpr.c index 1c296c6..f756c15 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -9,6 +9,7 @@ #include "qapi/error.h" #include "migration/cpr.h" #include "migration/misc.h" +#include "migration/options.h" #include "migration/qemu-file.h" #include "migration/savevm.h" #include "migration/vmstate.h" @@ -160,9 +161,16 @@ int cpr_state_save(Error **errp) { int ret; QEMUFile *f; + MigMode mode = migrate_mode(); - /* set f based on mode in a later patch in this series */ - return 0; + if (mode == MIG_MODE_CPR_EXEC) { + f = cpr_exec_output(errp); + } else { + return 0; + } + if (!f) { + return -1; + } qemu_put_be32(f, QEMU_VM_FILE_MAGIC); qemu_put_be32(f, QEMU_VM_FILE_VERSION); @@ -170,8 +178,14 @@ int cpr_state_save(Error **errp) ret = vmstate_save_state(f, &vmstate_cpr_state, &cpr_state, 0); if (ret) { error_setg(errp, "vmstate_save_state error %d", ret); + goto out; } + if (mode == MIG_MODE_CPR_EXEC) { + cpr_exec_persist_state(f); + } + +out: qemu_fclose(f); return ret; } @@ -182,8 +196,18 @@ int cpr_state_load(Error **errp) uint32_t v; QEMUFile *f; - /* set f based on mode in a later patch in this series */ - return 0; + /* + * Mode will be loaded in CPR state, so cannot use it to decide which + * form of state to load. + */ + if (cpr_exec_has_state()) { + f = cpr_exec_input(errp); + } else { + return 0; + } + if (!f) { + return -1; + } v = qemu_get_be32(f); if (v != QEMU_VM_FILE_MAGIC) { @@ -203,6 +227,11 @@ int cpr_state_load(Error **errp) error_setg(errp, "vmstate_load_state error %d", ret); } + if (migrate_mode() == MIG_MODE_CPR_EXEC) { + /* Set cloexec to prevent fd leaks from fork until the next cpr-exec */ + cpr_unpreserve_fds(); + } + qemu_fclose(f); return ret; } diff --git a/migration/migration.c b/migration/migration.c index 8a8e927..a4a020e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -73,9 +73,10 @@ #define INMIGRATE_DEFAULT_EXIT_ON_ERROR true -static NotifierWithReturnList migration_state_notifiers[] = { +static NotifierWithReturnList migration_state_notifiers[MIG_MODE__MAX] = { NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_NORMAL), NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_REBOOT), + NOTIFIER_ELEM_INIT(migration_state_notifiers, MIG_MODE_CPR_EXEC), }; /* Messages sent on the return path from destination to source */ @@ -264,6 +265,7 @@ void migration_object_init(void) ram_mig_init(); dirty_bitmap_mig_init(); + cpr_mig_init(); } typedef struct { @@ -1693,7 +1695,9 @@ bool migration_thread_is_self(void) bool migrate_mode_is_cpr(MigrationState *s) { - return s->parameters.mode == MIG_MODE_CPR_REBOOT; + MigMode mode = s->parameters.mode; + return mode == MIG_MODE_CPR_REBOOT || + mode == MIG_MODE_CPR_EXEC; } int migrate_init(MigrationState *s, Error **errp) @@ -2028,6 +2032,12 @@ static bool migrate_prepare(MigrationState *s, bool resume, Error **errp) return false; } + if (migrate_mode() == MIG_MODE_CPR_EXEC && + !s->parameters.has_cpr_exec_command) { + error_setg(errp, "cpr-exec mode requires setting cpr-exec-command"); + return false; + } + if (migration_is_blocked(errp)) { return false; } diff --git a/migration/ram.c b/migration/ram.c index eaf3151..45b8f00 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -216,7 +216,9 @@ static bool postcopy_preempt_active(void) bool migrate_ram_is_ignored(RAMBlock *block) { + MigMode mode = migrate_mode(); return !qemu_ram_is_migratable(block) || + mode == MIG_MODE_CPR_EXEC || (migrate_ignore_shared() && qemu_ram_is_shared(block) && qemu_ram_is_named_file(block)); } diff --git a/qapi/migration.json b/qapi/migration.json index 20092d2..4e626df 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -604,9 +604,31 @@ # or COLO. # # (since 8.2) +# +# @cpr-exec: The migrate command stops the VM, saves state to the +# migration channel, directly exec's a new version of QEMU on the +# same host, replacing the original process while retaining its +# PID, and loads state from the channel. Guest RAM is preserved +# in place. +# +# Old QEMU starts new QEMU by exec'ing the command specified by +# the @cpr-exec-command parameter. The command may be a direct +# invocation of new QEMU, or may be a non-QEMU command that exec's +# the new QEMU binary. +# +# Because old QEMU terminates when new QEMU starts, one cannot +# stream data between the two, so the channel must be a type, +# such as a file, that accepts all data before old QEMU exits. +# Otherwise, old QEMU may quietly block writing to the channel. +# +# Memory-backend objects must have the share=on attribute, but +# memory-backend-epc is not supported. The VM must be started +# with the '-machine anon-alloc=memfd' option. +# +# (since 9.1) ## { 'enum': 'MigMode', - 'data': [ 'normal', 'cpr-reboot' ] } + 'data': [ 'normal', 'cpr-reboot', 'cpr-exec' ] } ## # @ZeroPageDetection: