From patchwork Fri May 24 04:10:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13672705
From: Andrii Nakryiko <andrii@kernel.org>
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org,
	viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	gregkh@linuxfoundation.org, linux-mm@kvack.org,
	liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org,
	Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH v2 5/9] fs/procfs: add build ID fetching to PROCMAP_QUERY API
Date: Thu, 23 May 2024 21:10:27 -0700
Message-ID: <20240524041032.1048094-6-andrii@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240524041032.1048094-1-andrii@kernel.org>
References: <20240524041032.1048094-1-andrii@kernel.org>

The need to get ELF build ID reliably is an important aspect when
dealing with profiling and stack trace symbolization, and the
/proc/<pid>/maps textual representation doesn't help with this.

To get the backing file's ELF build ID, an application has to first
resolve the VMA, then use its start/end address range to follow a
special /proc/<pid>/map_files/<start>-<end> symlink to open the ELF file
(this is necessary because the backing file might have been removed from
disk or already replaced with another binary at the same file path).

Such an approach, beyond just adding the complexity of having to do a
bunch of extra work, has extra security implications. Because the
application opens the underlying ELF file and needs read access to its
entire contents (as far as the kernel is concerned), the kernel puts
additional capable() checks on following the
/proc/<pid>/map_files/<start>-<end> symlink. And that makes sense in
general.

But in the case of build ID, a profiler/symbolizer doesn't need the
contents of the ELF file, per se. It's only the build ID that is of
interest, and the ELF build ID itself doesn't provide any sensitive
information.

So this patch adds a way to request the backing file's ELF build ID
along with the rest of the VMA information in the same API. The user
controls whether this piece of information is requested by setting the
build_id_size field either to zero or to the non-zero maximum size of
the buffer they provide through the build_id_addr field (which encodes a
user pointer as a __u64 field). This is a completely optional piece of
information, and so has no performance implications for use cases that
don't care about build ID, while improving performance and simplifying
the setup for those applications that do need it.

Kernel already implements build ID fetching, which is used from the BPF
subsystem.
We are reusing this code here, but plan follow-up changes to make it
work better under the more relaxed assumption (compared to what the
existing code assumes) of being called from user process context, in
which page faults are allowed. The BPF-specific implementation currently
bails out if the necessary part of the ELF file is not paged in, all due
to extra BPF-specific restrictions (like the need to fetch build ID in
restrictive contexts such as an NMI handler).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 fs/proc/task_mmu.c      | 25 ++++++++++++++++++++++++-
 include/uapi/linux/fs.h | 28 ++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2b14d06d1def..c8f783644d36 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -453,6 +454,7 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
 		vma_end_read(vma);
 	if (flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA)
 		goto next_vma;
+
 no_vma:
 	if (mmap_locked)
 		mmap_read_unlock(mm);
@@ -474,7 +476,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	struct mm_struct *mm;
 	bool mm_locked;
 	const char *name = NULL;
-	char *name_buf = NULL;
+	char build_id_buf[BUILD_ID_SIZE_MAX], *name_buf = NULL;
 	__u64 usize;
 	int err;
 
@@ -496,6 +498,8 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	/* either both buffer address and size are set, or both should be zero */
 	if (!!karg.vma_name_size != !!karg.vma_name_addr)
 		return -EINVAL;
+	if (!!karg.build_id_size != !!karg.build_id_addr)
+		return -EINVAL;
 
 	mm = priv->mm;
 	if (!mm || !mmget_not_zero(mm))
@@ -534,6 +538,21 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	if (vma->vm_flags & VM_MAYSHARE)
 		karg.vma_flags |= PROCMAP_QUERY_VMA_SHARED;
 
+	if (karg.build_id_size) {
+		__u32 build_id_sz;
+
+		err = build_id_parse(vma, build_id_buf, &build_id_sz);
+		if (err) {
+			karg.build_id_size = 0;
+		} else {
+			if (karg.build_id_size < build_id_sz) {
+				err = -ENAMETOOLONG;
+				goto out;
+			}
+			karg.build_id_size = build_id_sz;
+		}
+	}
+
 	if (karg.vma_name_size) {
 		size_t name_buf_sz = min_t(size_t, PATH_MAX, karg.vma_name_size);
 		const struct path *path;
@@ -578,6 +597,10 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	}
 	kfree(name_buf);
 
+	if (karg.build_id_size && copy_to_user((void __user *)karg.build_id_addr,
+					       build_id_buf, karg.build_id_size))
+		return -EFAULT;
+
 	if (copy_to_user(uarg, &karg, min_t(size_t, sizeof(karg), usize)))
 		return -EFAULT;
 
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index f25e7004972d..7306022780d3 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -509,6 +509,26 @@ struct procmap_query {
 	 * If set to zero, vma_name_addr should be set to zero as well
 	 */
 	__u32 vma_name_size;		/* in/out */
+	/*
+	 * If set to non-zero value, signals the request to extract and return
+	 * VMA's backing file's build ID, if the backing file is an ELF file
+	 * and it contains embedded build ID.
+	 *
+	 * Kernel will set this field to zero, if VMA has no backing file,
+	 * backing file is not an ELF file, or ELF file has no build ID
+	 * embedded.
+	 *
+	 * Build ID is a binary value (not a string). Kernel will set
+	 * build_id_size field to exact number of bytes used for build ID.
+	 * If build ID is requested and present, but needs more bytes than
+	 * user-supplied maximum buffer size (see build_id_addr field below),
+	 * -E2BIG error will be returned.
+	 *
+	 * If this field is set to non-zero value, build_id_addr should point
+	 * to valid user space memory buffer of at least build_id_size bytes.
+	 * If set to zero, build_id_addr should be set to zero as well
+	 */
+	__u32 build_id_size;		/* in/out */
 	/*
 	 * User-supplied address of a buffer of at least vma_name_size bytes
 	 * for kernel to fill with matched VMA's name (see vma_name_size field
@@ -517,6 +537,14 @@ struct procmap_query {
 	 * Should be set to zero if VMA name should not be returned.
 	 */
 	__u64 vma_name_addr;		/* in */
+	/*
+	 * User-supplied address of a buffer of at least build_id_size bytes
+	 * for kernel to fill with matched VMA's ELF build ID, if available
+	 * (see build_id_size field description above for details).
+	 *
+	 * Should be set to zero if build ID should not be returned.
+	 */
+	__u64 build_id_addr;		/* in */
 };
 
 #endif /* _UAPI_LINUX_FS_H */
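
The in/out contract of build_id_size and build_id_addr described in the
patch can be illustrated with a small userspace model. This is only a
sketch and not the kernel code: model_fetch_build_id(), build_id_to_hex()
and MODEL_BUILD_ID_MAX are hypothetical names introduced for illustration,
and the value 20 for the maximum is an assumption (a SHA-1 sized ID). The
model follows the task_mmu.c hunk, which returns -ENAMETOOLONG for a
too-small buffer, whereas the UAPI comment in the same patch mentions
-E2BIG; callers may want to be prepared for either.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumption: upper bound on build ID length, sized for a SHA-1 digest. */
#define MODEL_BUILD_ID_MAX 20

/*
 * Hypothetical userspace model of the build_id_size/build_id_addr handshake.
 *
 * id, id_len:  build ID of the backing file; id == NULL means "no ELF
 *              backing file or no build ID embedded".
 * buf:         caller's buffer (plays the role of build_id_addr).
 * size_inout:  buffer capacity on entry, exact build ID length on exit
 *              (plays the role of build_id_size).
 */
int model_fetch_build_id(const uint8_t *id, uint32_t id_len,
			 uint8_t *buf, uint32_t *size_inout)
{
	assert(size_inout != NULL);

	/* Either both size and buffer are set, or both must be zero/NULL. */
	if ((*size_inout != 0) != (buf != NULL))
		return -EINVAL;

	/* Build ID was not requested: nothing to do. */
	if (*size_inout == 0)
		return 0;

	/* Requested but unavailable: report zero length, not an error. */
	if (id == NULL) {
		*size_inout = 0;
		return 0;
	}

	/* Requested and present, but the caller's buffer is too small. */
	if (*size_inout < id_len)
		return -ENAMETOOLONG;

	memcpy(buf, id, id_len);
	*size_inout = id_len;
	return 0;
}

/*
 * Build ID is a binary value, so it needs explicit formatting before it
 * can be printed. out must have room for 2 * len + 1 bytes.
 */
void build_id_to_hex(const uint8_t *id, uint32_t len, char *out)
{
	static const char digits[] = "0123456789abcdef";
	uint32_t i;

	for (i = 0; i < len; i++) {
		out[2 * i] = digits[id[i] >> 4];
		out[2 * i + 1] = digits[id[i] & 0xf];
	}
	out[2 * len] = '\0';
}
```

A caller would typically pass a buffer of MODEL_BUILD_ID_MAX bytes, check
the return value, and treat a returned size of zero as "no build ID
available" rather than as a failure.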