From patchwork Thu Jul 13 14:55:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13312327
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, virtualization@lists.linux-foundation.org,
 David Hildenbrand, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo
Subject: [PATCH v1 3/4] virtio-mem: keep retrying on offline_and_remove_memory() errors in Sub Block Mode (SBM)
Date: Thu, 13 Jul 2023 16:55:50 +0200
Message-ID: <20230713145551.2824980-4-david@redhat.com>
In-Reply-To: <20230713145551.2824980-1-david@redhat.com>
References: <20230713145551.2824980-1-david@redhat.com>

In case offline_and_remove_memory() fails in SBM, we leave a completely
unplugged Linux memory block sticking around until we try plugging memory
again. We won't try removing that memory block again.

offline_and_remove_memory() may, for example, fail if we're racing with
another alloc_contig_range() user, if allocating temporary memory fails,
or if some memory notifier rejected the offlining request.

Let's handle that case better, by simply retrying to offline and remove
such memory.

Tested using CONFIG_MEMORY_NOTIFIER_ERROR_INJECT.

Signed-off-by: David Hildenbrand
---
 drivers/virtio/virtio_mem.c | 92 +++++++++++++++++++++++++++++--------
 1 file changed, 73 insertions(+), 19 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 1a76ba2bc118..a5cf92e3e5af 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -168,6 +168,13 @@ struct virtio_mem {
 			/* The number of subblocks per Linux memory block. */
 			uint32_t sbs_per_mb;
 
+			/*
+			 * Some of the Linux memory blocks tracked as "partially
+			 * plugged" are completely unplugged and can be offlined
+			 * and removed -- which previously failed.
+			 */
+			bool have_unplugged_mb;
+
 			/* Summary of all memory block states. */
 			unsigned long mb_count[VIRTIO_MEM_SBM_MB_COUNT];
 
@@ -765,6 +772,34 @@ static int virtio_mem_sbm_offline_and_remove_mb(struct virtio_mem *vm,
 	return virtio_mem_offline_and_remove_memory(vm, addr, size);
 }
 
+/*
+ * Try (offlining and) removing memory from Linux in case all subblocks are
+ * unplugged. Can be called on online and offline memory blocks.
+ *
+ * May modify the state of memory blocks in virtio-mem.
+ */
+static int virtio_mem_sbm_try_remove_unplugged_mb(struct virtio_mem *vm,
+						  unsigned long mb_id)
+{
+	int rc;
+
+	/*
+	 * Once all subblocks of a memory block were unplugged, offline and
+	 * remove it.
+	 */
+	if (!virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb))
+		return 0;
+
+	/* offline_and_remove_memory() works for online and offline memory. */
+	mutex_unlock(&vm->hotplug_mutex);
+	rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
+	mutex_lock(&vm->hotplug_mutex);
+	if (!rc)
+		virtio_mem_sbm_set_mb_state(vm, mb_id,
+					    VIRTIO_MEM_SBM_MB_UNUSED);
+	return rc;
+}
+
 /*
  * See virtio_mem_offline_and_remove_memory(): Try to offline and remove a
  * all Linux memory blocks covered by the big block.
@@ -1988,20 +2023,10 @@ static int virtio_mem_sbm_unplug_any_sb_online(struct virtio_mem *vm,
 	}
 
 unplugged:
-	/*
-	 * Once all subblocks of a memory block were unplugged, offline and
-	 * remove it. This will usually not fail, as no memory is in use
-	 * anymore - however some other notifiers might NACK the request.
-	 */
-	if (virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) {
-		mutex_unlock(&vm->hotplug_mutex);
-		rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
-		mutex_lock(&vm->hotplug_mutex);
-		if (!rc)
-			virtio_mem_sbm_set_mb_state(vm, mb_id,
-						    VIRTIO_MEM_SBM_MB_UNUSED);
-	}
-
+	rc = virtio_mem_sbm_try_remove_unplugged_mb(vm, mb_id);
+	if (rc)
+		vm->sbm.have_unplugged_mb = true;
+	/* Ignore errors, this is not critical. We'll retry later. */
 	return 0;
 }
 
@@ -2253,12 +2278,13 @@ static int virtio_mem_unplug_request(struct virtio_mem *vm, uint64_t diff)
 
 /*
  * Try to unplug all blocks that couldn't be unplugged before, for example,
- * because the hypervisor was busy.
+ * because the hypervisor was busy. Further, offline and remove any memory
+ * blocks where we previously failed.
  */
-static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
+static int virtio_mem_cleanup_pending_mb(struct virtio_mem *vm)
 {
 	unsigned long id;
-	int rc;
+	int rc = 0;
 
 	if (!vm->in_sbm) {
 		virtio_mem_bbm_for_each_bb(vm, id,
@@ -2280,6 +2306,27 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
 					    VIRTIO_MEM_SBM_MB_UNUSED);
 	}
 
+	if (!vm->sbm.have_unplugged_mb)
+		return 0;
+
+	/*
+	 * Let's retry (offlining and) removing completely unplugged Linux
+	 * memory blocks.
+	 */
+	vm->sbm.have_unplugged_mb = false;
+
+	mutex_lock(&vm->hotplug_mutex);
+	virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL)
+		rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+	virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL)
+		rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+	virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL)
+		rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
+	mutex_unlock(&vm->hotplug_mutex);
+
+	if (rc)
+		vm->sbm.have_unplugged_mb = true;
+	/* Ignore errors, this is not critical. We'll retry later. */
 	return 0;
 }
 
@@ -2361,9 +2408,9 @@ static void virtio_mem_run_wq(struct work_struct *work)
 		virtio_mem_refresh_config(vm);
 	}
 
-	/* Unplug any leftovers from previous runs */
+	/* Cleanup any leftovers from previous runs */
 	if (!rc)
-		rc = virtio_mem_unplug_pending_mb(vm);
+		rc = virtio_mem_cleanup_pending_mb(vm);
 
 	if (!rc && vm->requested_size != vm->plugged_size) {
 		if (vm->requested_size > vm->plugged_size) {
@@ -2375,6 +2422,13 @@ static void virtio_mem_run_wq(struct work_struct *work)
 		}
 	}
 
+	/*
+	 * Keep retrying to offline and remove completely unplugged Linux
+	 * memory blocks.
+	 */
+	if (!rc && vm->in_sbm && vm->sbm.have_unplugged_mb)
+		rc = -EBUSY;
+
 	switch (rc) {
 	case 0:
 		vm->retry_timer_ms = VIRTIO_MEM_RETRY_TIMER_MIN_MS;