From patchwork Mon Mar 10 12:03:15 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Ryabinin
X-Patchwork-Id: 14009707
From: Andrey Ryabinin
To: linux-kernel@vger.kernel.org
Cc: Alexander Graf, James Gowans, Mike Rapoport, Andrew Morton,
 linux-mm@kvack.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H . Peter Anvin", Eric Biederman,
 kexec@lists.infradead.org, Pratyush Yadav, Jason Gunthorpe,
 Pasha Tatashin, David Rientjes, Andrey Ryabinin
Subject: [PATCH v2 4/7] kexec, kstate: delay loading of kexec segments
Date: Mon, 10 Mar 2025 13:03:15 +0100
Message-ID: <20250310120318.2124-5-arbn@yandex-team.com>
X-Mailer: git-send-email 2.45.3
In-Reply-To: <20250310120318.2124-1-arbn@yandex-team.com>
References: <20250310120318.2124-1-arbn@yandex-team.com>

KSTATE's purpose is to preserve some memory across kexec. For this to
work, kexec has to choose its destination ranges after KSTATE knows what
it preserves, so that these ranges don't collide with KSTATE-preserved
memory.

Kexec chooses destination ranges at the kexec load stage, which might
happen long before the actual reboot into the new kernel. This means
that KSTATE would have to know all preserved memory before
kexec_file_load(), unless we delay the loading of kexec segments and
the choice of destination addresses until later, at the point of reboot
into the new kernel.

So let's do that.

Signed-off-by: Andrey Ryabinin
---
 include/linux/kexec.h   |   1 +
 kernel/kexec_core.c     |   6 ++
 kernel/kexec_file.c     | 144 ++++++++++++++++++++++++++--------------
 kernel/kexec_internal.h |   6 ++
 4 files changed, 108 insertions(+), 49 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index bd82f04888a1..539aaacfd3fd 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -377,6 +377,7 @@ extern void machine_kexec(struct kimage *image);
 extern int machine_kexec_prepare(struct kimage *image);
 extern void machine_kexec_cleanup(struct kimage *image);
 extern int kernel_kexec(void);
+extern int kexec_file_load_segments(struct kimage *image);
 extern struct page *kimage_alloc_control_pages(struct kimage *image,
 						unsigned int order);
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 647ab5705c37..7c79addeb93b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1017,6 +1017,12 @@ int kernel_kexec(void)
 		goto Unlock;
 	}
 
+	if (kexec_late_load(kexec_image)) {
+		error = kexec_file_load_segments(kexec_image);
+		if (error)
+			goto Unlock;
+	}
+
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
 		/*
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 8ecd34071bfa..634e2ed4cc4c 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -187,6 +187,34 @@ kimage_validate_signature(struct kimage *image)
 }
 #endif
 
+static int kimage_add_buffers(struct kimage *image)
+{
+	void *ldata;
+	int ret = 0;
+
+	/* IMA needs to pass the measurement list to the next kernel. */
+	ima_add_kexec_buffer(image);
+
+	ret = kstate_load_migrate_buf(image);
+	if (ret)
+		goto out;
+
+	/* Call image load handler */
+	ldata = kexec_image_load_default(image);
+
+	if (IS_ERR(ldata)) {
+		ret = PTR_ERR(ldata);
+		goto out;
+	}
+
+	image->image_loader_data = ldata;
+out:
+	/* In case of error, free up all allocated memory in this function */
+	if (ret)
+		kimage_file_post_load_cleanup(image);
+	return ret;
+
+}
 /*
  * In file mode list of segments is prepared by kernel. Copy relevant
  * data from user space, do error checking, prepare segment list
@@ -197,7 +225,6 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 			     unsigned long cmdline_len, unsigned flags)
 {
 	ssize_t ret;
-	void *ldata;
 
 	ret = kernel_read_file_from_fd(kernel_fd, 0, &image->kernel_buf,
 				       KEXEC_FILE_SIZE_MAX, NULL,
@@ -251,22 +278,6 @@ kimage_file_prepare_segments(struct kimage *image, int kernel_fd, int initrd_fd,
 				  image->cmdline_buf_len - 1);
 	}
 
-	/* IMA needs to pass the measurement list to the next kernel. */
-	ima_add_kexec_buffer(image);
-
-	ret = kstate_load_migrate_buf(image);
-	if (ret)
-		goto out;
-
-	/* Call image load handler */
-	ldata = kexec_image_load_default(image);
-
-	if (IS_ERR(ldata)) {
-		ret = PTR_ERR(ldata);
-		goto out;
-	}
-
-	image->image_loader_data = ldata;
 out:
 	/* In case of error, free up all allocated memory in this function */
 	if (ret)
@@ -303,10 +314,6 @@ kimage_file_alloc_init(struct kimage **rimage, int kernel_fd,
 	if (ret)
 		goto out_free_image;
 
-	ret = sanity_check_segment_list(image);
-	if (ret)
-		goto out_free_post_load_bufs;
-
 	ret = -ENOMEM;
 	image->control_code_page = kimage_alloc_control_pages(image,
 					   get_order(KEXEC_CONTROL_PAGE_SIZE));
@@ -334,6 +341,70 @@ kimage_file_alloc_init(struct kimage **rimage, int kernel_fd,
 	return ret;
 }
 
+static int kimage_post_load(struct kimage *image)
+{
+	int ret, i;
+
+	ret = kexec_calculate_store_digests(image);
+	if (ret)
+		goto out;
+
+	kexec_dprintk("nr_segments = %lu\n", image->nr_segments);
+	for (i = 0; i < image->nr_segments; i++) {
+		struct kexec_segment *ksegment;
+
+		ksegment = &image->segment[i];
+		kexec_dprintk("segment[%d]: buf=0x%p bufsz=0x%zx mem=0x%lx memsz=0x%zx\n",
+			      i, ksegment->buf, ksegment->bufsz, ksegment->mem,
+			      ksegment->memsz);
+
+		ret = kimage_load_segment(image, &image->segment[i]);
+		if (ret)
+			goto out;
+	}
+
+	kimage_terminate(image);
+
+	ret = machine_kexec_post_load(image);
+	if (ret)
+		goto out;
+
+	kexec_dprintk("kexec_file_load: type:%u, start:0x%lx head:0x%lx\n",
+		      image->type, image->start, image->head);
+out:
+	return ret;
+}
+
+int kexec_file_load_segments(struct kimage *image)
+{
+	int ret;
+
+	ret = kimage_add_buffers(image);
+	if (ret) {
+		pr_err("failed to add kimage buffers %d\n", ret);
+		goto out;
+	}
+
+	ret = sanity_check_segment_list(image);
+	if (ret) {
+		pr_err("sanity check failed %d\n", ret);
+		goto out;
+	}
+
+	ret = kimage_post_load(image);
+	if (ret)
+		pr_err("kimage post load failed %d\n", ret);
+
+out:
+	/*
+	 * Free up any temporary buffers allocated which are not needed
+	 * after image has been loaded
+	 */
+	kimage_file_post_load_cleanup(image);
+
+	return ret;
+}
+
 SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 		unsigned long, cmdline_len, const char __user *, cmdline_ptr,
 		unsigned long, flags)
@@ -341,7 +412,7 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 	int image_type = (flags & KEXEC_FILE_ON_CRASH) ?
 			 KEXEC_TYPE_CRASH : KEXEC_TYPE_DEFAULT;
 	struct kimage **dest_image, *image;
-	int ret = 0, i;
+	int ret = 0;
 
 	/* We only trust the superuser with rebooting the system. */
 	if (!kexec_load_permitted(image_type))
@@ -398,37 +469,12 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
 	if (ret)
 		goto out;
 
-	ret = kexec_calculate_store_digests(image);
-	if (ret)
-		goto out;
-
-	kexec_dprintk("nr_segments = %lu\n", image->nr_segments);
-	for (i = 0; i < image->nr_segments; i++) {
-		struct kexec_segment *ksegment;
-
-		ksegment = &image->segment[i];
-		kexec_dprintk("segment[%d]: buf=0x%p bufsz=0x%zx mem=0x%lx memsz=0x%zx\n",
-			      i, ksegment->buf, ksegment->bufsz, ksegment->mem,
-			      ksegment->memsz);
-
-		ret = kimage_load_segment(image, &image->segment[i]);
+	if (!kexec_late_load(image)) {
+		ret = kexec_file_load_segments(image);
 		if (ret)
 			goto out;
 	}
 
-	kimage_terminate(image);
-
-	ret = machine_kexec_post_load(image);
-	if (ret)
-		goto out;
-
-	kexec_dprintk("kexec_file_load: type:%u, start:0x%lx head:0x%lx flags:0x%lx\n",
-		      image->type, image->start, image->head, flags);
-	/*
-	 * Free up any temporary buffers allocated which are not needed
-	 * after image has been loaded
-	 */
-	kimage_file_post_load_cleanup(image);
 exchange:
 	image = xchg(dest_image, image);
 out:
diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
index 12e655a70e25..690b1c21b642 100644
--- a/kernel/kexec_internal.h
+++ b/kernel/kexec_internal.h
@@ -34,6 +34,12 @@ static inline void kexec_unlock(void)
 	atomic_set_release(&__kexec_lock, 0);
 }
 
+static inline bool kexec_late_load(struct kimage *image)
+{
+	return IS_ENABLED(CONFIG_KSTATE) && image->file_mode &&
+		(image->type == KEXEC_TYPE_DEFAULT);
+}
+
 #ifdef CONFIG_KEXEC_FILE
 #include
 void kimage_file_post_load_cleanup(struct kimage *image);
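
For readers following the control flow, here is a minimal user-space
sketch of how segment loading is split between kexec_file_load() and
kernel_kexec() after this patch. It is a model under stated assumptions,
not kernel code: every model_* name, struct model_image and the
kstate_enabled flag are hypothetical stand-ins (the flag plays the role
of IS_ENABLED(CONFIG_KSTATE)), and loading segments is reduced to
setting a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for KEXEC_TYPE_DEFAULT / KEXEC_TYPE_CRASH. */
enum model_type { MODEL_TYPE_DEFAULT, MODEL_TYPE_CRASH };

/* Hypothetical, heavily reduced stand-in for struct kimage. */
struct model_image {
	enum model_type type;
	bool file_mode;
	bool segments_loaded;
};

/* Assumption: models IS_ENABLED(CONFIG_KSTATE) as a runtime flag. */
static bool kstate_enabled = true;

/* Mirrors the condition in kexec_late_load() from the patch. */
bool model_late_load(const struct model_image *img)
{
	return kstate_enabled && img->file_mode &&
		img->type == MODEL_TYPE_DEFAULT;
}

/* Stands in for kexec_file_load_segments(); must run only once. */
int model_load_segments(struct model_image *img)
{
	assert(!img->segments_loaded);
	img->segments_loaded = true;
	return 0;
}

/* Load stage: segments are loaded here only when not deferred. */
int model_file_load(struct model_image *img)
{
	if (!model_late_load(img))
		return model_load_segments(img);
	return 0;
}

/* Reboot stage: deferred images get their segments loaded now. */
int model_kernel_kexec(struct model_image *img)
{
	if (model_late_load(img)) {
		int err = model_load_segments(img);

		if (err)
			return err;
	}
	return 0;
}
```

In this model, a default file-mode image is still unloaded after
model_file_load() and only gets its segments in model_kernel_kexec(),
while a crash image is loaded at the load stage, as before the patch.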